AI Runway

Deploy and manage large language models on Kubernetes — no YAML required.

Note

AI Runway is still under heavy development and the APIs are not currently considered stable. Feedback is welcome! ❤️

AI Runway gives you a web UI and a unified Kubernetes CRD (ModelDeployment) to deploy models across multiple inference providers. Browse HuggingFace, pick a model, click deploy.

Demo

Highlights

🚀 One-Click Deploy — Browse models, check GPU fit, and deploy from the UI
🎯 Unified CRD — Single ModelDeployment API across all providers
🔧 Multiple Engines — vLLM, SGLang, TensorRT-LLM, llama.cpp
📈 Live Monitoring — Real-time status, logs, and Prometheus metrics
💰 Cost Estimation — GPU pricing and capacity guidance
🌐 Gateway API Integration — Unified inference endpoint via Gateway API Inference Extension with auto-detected setup
🔌 Headlamp Plugin — Full-featured Headlamp dashboard plugin

Supported Providers

Provider	Description	Provider Shim
NVIDIA Dynamo	GPU-accelerated inference with aggregated or disaggregated serving	dynamo.yaml
KubeRay	Ray-based distributed inference	kuberay.yaml
KAITO	vLLM (GPU) and llama.cpp (CPU/GPU) support	kaito.yaml
LLM-D	vLLM (GPU) with aggregated or disaggregated serving	llmd.yaml

Quick Start

Prerequisites

Kubernetes cluster with kubectl configured
helm CLI installed
GPU nodes with NVIDIA drivers (KAITO also supports CPU-only)

Option A: Run Locally

Download the latest release and run:

./airunway

Open http://localhost:3001

macOS: Remove quarantine if needed: xattr -dr com.apple.quarantine airunway

Option B: Deploy to Kubernetes

# Install CRDs and controller (required)
kubectl apply -f https://raw.githubusercontent.com/kaito-project/airunway/main/deploy/controller.yaml

# Install dashboard UI (optional)
kubectl apply -f https://raw.githubusercontent.com/kaito-project/airunway/main/deploy/dashboard.yaml
kubectl port-forward -n airunway-system svc/airunway 3001:80

Open http://localhost:3001 — see deployment docs for more options.

Getting Started

Install a provider shim — Apply one or more provider shims to register providers with AI Runway. See Supported Providers for available options.
Install the provider — Go to the Installation page and install the upstream provider via Helm
Connect HuggingFace — Sign in via Settings → HuggingFace (optional for non-gated models)
Deploy a model — Browse the catalog, pick a model, configure, and deploy
Monitor — Track status, stream logs, and view metrics on the Deployments page

Access Your Model

Deployed models expose an OpenAI-compatible API:

kubectl port-forward svc/<deployment-name> 8000:8000 -n <namespace>
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "<model-name>", "messages": [{"role": "user", "content": "Hello!"}]}'

ModelDeployment CRD

apiVersion: airunway.ai/v1alpha1
kind: ModelDeployment
metadata:
  name: my-model
spec:
  model:
    id: "Qwen/Qwen3-0.6B"

The controller automatically selects the best engine and provider, creates provider-specific resources, and reports unified status. See CRD Reference for details.

Documentation

Topic	Link
Architecture	docs/architecture.md
CRD Reference	docs/crd-reference.md
Providers	docs/providers.md
Observability	docs/observability.md
Development	docs/development.md
Kubernetes Deployment	deploy/README.md
Gateway Integration	docs/gateway.md
Headlamp Plugin	docs/headlamp-plugin.md

Contributing

See CONTRIBUTING.md for development setup. We also accept AI-assisted prompt requests.

Name		Name	Last commit message	Last commit date
Latest commit History 143 Commits
.github		.github
backend		backend
controller		controller
demos/gateway-bbr		demos/gateway-bbr
deploy		deploy
docs		docs
frontend		frontend
images/model-downloader		images/model-downloader
plugins/headlamp		plugins/headlamp
providers		providers
shared		shared
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
agents.md		agents.md
bun.lock		bun.lock
bunfig.toml		bunfig.toml
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI Runway

Demo

Highlights

Supported Providers

Quick Start

Prerequisites

Option A: Run Locally

Option B: Deploy to Kubernetes

Getting Started

Access Your Model

ModelDeployment CRD

Documentation

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AI Runway

Demo

Highlights

Supported Providers

Quick Start

Prerequisites

Option A: Run Locally

Option B: Deploy to Kubernetes

Getting Started

Access Your Model

ModelDeployment CRD

Documentation

Contributing

About

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages