Cloud Native Stack (CNS) provides validated configuration guidance for deploying GPU-accelerated Kubernetes infrastructure. It captures known-good combinations of software, configuration, and system requirements and makes them consumable as documentation and generated deployment artifacts.
Running NVIDIA-accelerated Kubernetes clusters reliably is hard. Small differences in kernel versions, drivers, container runtimes, operators, and Kubernetes releases can cause failures that are difficult to diagnose and expensive to reproduce.
Historically, this knowledge has lived in internal validation pipelines, playbooks, and tribal knowledge. Cloud Native Stack exists to externalize that experience. Its goal is to make validated configurations visible, repeatable, and reusable across environments.
Cloud Native Stack is a source of validated configuration knowledge for NVIDIA-accelerated Kubernetes environments.
It is:
- A curated set of tested and validated component combinations
- A reference for how NVIDIA-accelerated Kubernetes clusters are expected to be configured
- A foundation for generating reproducible deployment artifacts
- Designed to integrate with existing provisioning, CI/CD, and GitOps workflows
It is not:
- A Kubernetes distribution
- A cluster provisioning or lifecycle management system
- A managed control plane or hosted service
- A replacement for cloud provider or OEM platforms
Earlier versions of Cloud Native Stack focused primarily on manual installation guides and playbooks. Those materials remain available under `~archive/cns-v1`. The current repository reflects a transition toward structured configuration data and generated artifacts.
Cloud Native Stack separates validated configuration knowledge from how that knowledge is consumed.
- Human-readable documentation lives under `docs/`.
- Version-locked configuration definitions (“recipes”) capture known-good system states.
- Those definitions can be rendered into concrete artifacts such as Helm values, Kubernetes manifests, or install scripts.
- Recipes can be validated against actual system configurations to verify compatibility.

This separation allows the same validated configuration to be applied consistently across different environments and automation systems.
For example, a configuration validated for gb200 on Ubuntu 22.04 with Kubernetes 1.29 can be rendered into Helm values and manifests suitable for use in an existing GitOps pipeline.
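To make the recipe-to-artifact flow concrete, here is a minimal sketch of rendering a version-locked recipe into Helm-style values. The recipe shape, field names, and component versions below are invented for illustration only; they do not reflect the actual CNS recipe schema.

```python
# Hypothetical recipe: one validated combination of platform and components.
# Field names and versions are illustrative, not the real CNS schema.
recipe = {
    "platform": {"gpu": "gb200", "os": "ubuntu22.04", "kubernetes": "1.29"},
    "components": {
        "gpu-operator": {
            "version": "v24.9.0",  # placeholder version pin
            "values": {"driver": {"enabled": True}},
        },
    },
}

def render_helm_values(recipe: dict) -> dict:
    """Project the recipe's version-locked component pins into per-chart Helm values."""
    values = {}
    for name, spec in recipe["components"].items():
        chart_values = dict(spec.get("values", {}))
        chart_values["version"] = spec["version"]  # carry the validated version pin
        values[name] = chart_values
    return values

print(render_helm_values(recipe))
```

The same recipe data could equally be serialized into Kubernetes manifests or an install script; the point is that the validated combination is defined once and rendered per consumer.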
Some tooling and APIs are under active development; documentation reflects current and near-term capabilities.
Get started quickly with CNS:
- Review the documentation under `docs/` to understand supported platforms and required components.
- Identify your target environment:
- GPU architecture
- Operating system and kernel
- Kubernetes distribution and version
- Workload intent (for example, training or inference)
- Apply the validated configuration guidance using your existing tools (Helm, kubectl, CI/CD, or GitOps).
- Validate and iterate as platforms and workloads evolve.
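The identify-then-validate loop above can be sketched as a simple compatibility check between a system snapshot and a recipe's expected platform. The snapshot and recipe shapes here are hypothetical, not the actual CNS data model:

```python
# Hypothetical shapes: a snapshot of the actual system, and the platform a
# recipe was validated for. Field names are illustrative, not the CNS schema.
snapshot = {"gpu": "gb200", "os": "ubuntu22.04", "kubernetes": "1.29", "kernel": "5.15"}
recipe_platform = {"gpu": "gb200", "os": "ubuntu22.04", "kubernetes": "1.29"}

def validate(snapshot: dict, expected: dict) -> list:
    """Return a list of mismatches between the system and the validated recipe."""
    mismatches = []
    for key, want in expected.items():
        got = snapshot.get(key)
        if got != want:
            mismatches.append(f"{key}: expected {want}, found {got}")
    return mismatches

issues = validate(snapshot, recipe_platform)
print("compatible" if not issues else issues)
```

An empty result means the environment matches the validated combination; any mismatches point at exactly which component drifted, which is the signal to re-check before applying the configuration.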
These use cases reflect common ways teams interact with Cloud Native Stack.
Platform and Infrastructure Operators
You are responsible for deploying and operating GPU-accelerated Kubernetes clusters.
- Installation Guide – Install the cnsctl CLI (automated script, manual, or build from source)
- CLI Reference – Complete command reference with examples
- API Reference – Complete API reference with examples
- Agent Deployment – Deploy the Kubernetes agent to get automated configuration snapshots
Developers and Contributors
You are contributing code, extending functionality, or working on CNS internals.
- Contributing Guide – Development setup, testing, and PR process
- Architecture Overview – System design and components
- Bundler Development – How to create new bundlers
- Data Architecture – Recipe data model and query matching
Integrators and Automation Engineers
You are integrating CNS into CI/CD pipelines, GitOps workflows, or a larger product or service.
- API Reference – REST API endpoints and usage examples
- Data Flow – Understanding snapshots, recipes, and bundles
- Automation Guide – CI/CD integration patterns
- Kubernetes Deployment – Self-hosted API server setup
- `api/` — OpenAPI specifications for the REST API
- `cmd/` — Entry points for CLI (`cnsctl`) and API server (`cnsd`)
- `deployments/` — Kubernetes manifests for agent deployment
- `docs/` — User-facing documentation, guides, and architecture docs
- `examples/` — Example snapshots, recipes, and comparisons
- `infra/` — Infrastructure as code (Terraform) for deployments
- `pkg/` — Core Go packages (collectors, recipe engine, bundlers, serializers)
- `tools/` — Build scripts, E2E testing, and utilities
- `~archive/` — Archived v1 installation guides and playbooks
- Documentation – Guides, references, and examples
- Roadmap – Feature priorities and development timeline
- Transition – Migration to CLI/API-based bundle generation
- Security – Security-related resources
- Releases – Binaries, SBOMs, and other artifacts
- Issues – Bugs, feature requests, and questions
Contributions are welcome. See contributing for development setup, contribution guidelines, and the pull request process.