A pure Python implementation of Mini-SGLang using Cute-DSL.
```shell
prek install
uv venv
source .venv/bin/activate
uv pip install modal==1.3.5
modal setup
```

For the shell and server clients, you can specify the following environment variables:

- `NNODES`: number of nodes (1..4)
- `N_GPU`: number of GPUs per node (1..8)
- `GPU_TYPE`: GPU type
- `RDMA`: whether to use RDMA (0 or 1)
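As a loose illustration of how these variables combine, here is a minimal sketch of reading and validating them in Python. The `read_config` helper and its defaults are assumptions for illustration, not the project's actual parsing code:

```python
import os

# Hypothetical sketch of env-var handling; the variable names match the
# README, but the helper and its defaults are assumptions.
def read_config(env=os.environ):
    nnodes = int(env.get("NNODES", "1"))    # assumed default: 1
    n_gpu = int(env.get("N_GPU", "1"))      # assumed default: 1
    gpu_type = env.get("GPU_TYPE", "h200")  # assumed default: h200
    rdma = env.get("RDMA", "0") == "1"
    assert 1 <= nnodes <= 4, "NNODES must be in 1..4"
    assert 1 <= n_gpu <= 8, "N_GPU must be in 1..8"
    return nnodes, n_gpu, gpu_type, rdma

print(read_config({"NNODES": "2", "N_GPU": "8", "GPU_TYPE": "h100", "RDMA": "1"}))
# → (2, 8, 'h100', True)
```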
For multi-node deployment on Hopper and Blackwell chips:
- Your Modal workspace must have RDMA support.
- You must pass `--rdma` to the commands below.
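For example, a two-node RDMA launch of the shell client might look like the following sketch; the node and GPU counts are arbitrary examples, and the exact flag placement is an assumption:

```shell
# Hypothetical example: 2 nodes x 8 GPUs with RDMA enabled.
NNODES=2 N_GPU=8 GPU_TYPE=h100 RDMA=1 modal run -i -m llmeng.shell --rdma
```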
Run an interactive shell client:

```shell
modal run -i -m llmeng.shell
```

Serve an OpenAI-compatible API server:

```shell
modal serve llmeng/server.py
```
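Once the server is up, any OpenAI-compatible client can talk to it. Below is a minimal standard-library sketch; the base URL is a placeholder (use the URL that `modal serve` prints) and the model name is an assumption:

```python
import json
import urllib.request

# Placeholder: substitute the URL printed by `modal serve`.
BASE_URL = "https://your-workspace--example.modal.run"

def build_chat_request(model: str, prompt: str) -> dict:
    # Standard OpenAI-style /v1/chat/completions payload.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Show the payload shape only (no network call); the model name is a stand-in.
print(build_chat_request("qwen", "Hello!"))
```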
Run offline benchmarks:

```shell
modal run benchmark/offline/bench.py
modal run benchmark/offline/bench_wildchat.py
```

Run online benchmarks:
- Deploy the server:

  ```shell
  N_GPU=4 GPU_TYPE=h200 modal deploy llmeng/app.py
  ```

- Run the benchmarks:

  ```shell
  modal run benchmark/online/bench_qwen.py
  modal run benchmark/online/bench_simple.py
  ```

Roadmap:

- port mini-sglang to Modal
- replace nccl with penny
- rewrite C++/CUDA/Triton in Cute-DSL
- add speculative speculative decoding (SSD)
References:

- mini-sglang
- Cute-DSL, Simon Veitner's blog posts: 1
- Penny, worklogs 1, 2, 3
- SSD, paper
- Tristan Hume's blog post on profiling