
# sparkrun

Launch, manage, and stop LLM inference workloads on one or more NVIDIA DGX Spark systems — no Slurm, no Kubernetes, no fuss.

## Get Running in 3 Steps

1. **Install sparkrun**

   One command installs sparkrun, sets up a managed environment, configures tab completion, and starts the setup wizard:

   `uvx sparkrun setup`

2. **Configure your cluster(s)**

   Set up your DGX Spark, or several Sparks, with our best practices by following the wizard.

3. **Run inference**

   Pick a recipe and launch it. Your model is serving; Ctrl+C safely detaches from the logs.

   `sparkrun run qwen3-1.7b-vllm`

## One Command Launch

Pick a recipe and run it. sparkrun handles container orchestration, model distribution, and networking automatically.

## 🔗 Multi-Node Tensor Parallel

Scale across DGX Sparks with `--tp`. Each Spark contributes one GPU; sparkrun handles InfiniBand/RDMA and NCCL configuration.
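As a usage sketch, assuming `--tp` sets the tensor-parallel degree and that, since each Spark contributes one GPU, its value equals the number of participating Sparks (the `2` here is illustrative):

```shell
# Illustrative two-Spark launch: recipe name from the quick-start above,
# --tp value assumed to match the number of Sparks (one GPU each).
sparkrun run qwen3-1.7b-vllm --tp 2
```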

## 📋 Recipe System

YAML configs capture model, container, runtime, and defaults. Override anything at launch time; there are no config files to hunt down.
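A recipe of roughly this shape could capture those four pieces. The field names below are illustrative assumptions, not sparkrun's actual schema:

```yaml
# Hypothetical recipe sketch -- field names are assumptions, not sparkrun's real schema.
name: qwen3-1.7b-vllm          # recipe name used with `sparkrun run`
model: Qwen/Qwen3-1.7B         # HuggingFace model ID (illustrative)
runtime: vllm                  # one of the supported engines: vllm, sglang, llama.cpp
container: <container image>   # placeholder; the real image reference is recipe-specific
defaults:
  tp: 1                        # tensor-parallel degree; overridable at launch time
```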

## 📦 Recipe Registries

Share and collaborate on recipes via git registries. Add community or private registries and search across all of them.

## 📊 VRAM Estimation

Auto-detects model architecture from HuggingFace. Know whether your config fits on a single DGX Spark, or how many you need, before launching.

## 🔄 Multiple Runtimes

First-class support for vLLM, SGLang, and llama.cpp. Same CLI, same recipe format, different engines under the hood.

## 🤖 Claude Code Plugin

AI-assisted inference management. Claude learns your cluster and helps run, monitor, and stop workloads conversationally.

## ⌨️ CLI Tab Completion

Rich shell completions for Bash, Zsh, and Fish. Tab-complete commands, recipe names, cluster names, and options instantly.