SGLang uses Python's built-in `unittest` library as its testing framework.
```bash
cd sglang/test/srt

# Run a single file
python3 test_srt_endpoint.py

# Run a single test
python3 test_srt_endpoint.py TestSRTEndpoint.test_simple_decode

# Run a suite with multiple files
python3 run_suite.py --suite per-commit
```

```bash
cd sglang/test/lang

# Run a single file
python3 test_choices.py
```

- Create new test files under `test/srt` or `test/lang` depending on the type of test.
- For nightly tests, place them in `test/srt/nightly/`. Use the `NightlyBenchmarkRunner` helper class in `nightly_utils.py` for performance benchmarking tests.
- Ensure they are referenced in the respective `run_suite.py` (e.g., `test/srt/run_suite.py`) so they are picked up in CI. For most small test cases, they can be added to the `per-commit-1-gpu` suite. Sort the test cases alphabetically by name.
- Ensure you added `unittest.main()` for unittest and `sys.exit(pytest.main([__file__]))` for pytest in the scripts. The CI runs them via `python3 test_file.py`.
- The CI will run some suites such as `per-commit-1-gpu`, `per-commit-2-gpu`, and `nightly-1-gpu` automatically. If you need special setup or custom test groups, you may modify the workflows in `.github/workflows/`.
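A minimal skeleton of such a unittest-style test file (the class name, test name, and placeholder URL are illustrative; real tests launch the server with the helpers used by existing files in `sglang/test/srt`):

```python
import unittest


class TestMyFeature(unittest.TestCase):
    """Illustrative skeleton for a new test file under test/srt."""

    @classmethod
    def setUpClass(cls):
        # Launch the server once here and reuse it for every test case
        # below; server startup dominates test time. The URL is a placeholder.
        cls.base_url = "http://127.0.0.1:30000"

    def test_simple_decode(self):
        # Keep each test focused on a single scenario.
        self.assertTrue(self.base_url.startswith("http://"))


if __name__ == "__main__":
    unittest.main()  # required so CI can run `python3 test_file.py`
```

The `unittest.main()` entry point is what lets CI invoke the file directly as a script.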
Tests in `test/registered/` use a registry-based CI system for flexible backend/schedule configuration.
```python
from sglang.test.ci.ci_register import (
    register_cuda_ci,
    register_amd_ci,
    register_cpu_ci,
    register_npu_ci,
)

# Per-commit test (small 1-gpu, runs on 5090)
register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu")

# Per-commit test (large 1-gpu, runs on H100)
register_cuda_ci(est_time=120, suite="stage-b-test-large-1-gpu")

# Per-commit test (2-gpu)
register_cuda_ci(est_time=200, suite="stage-b-test-large-2-gpu")

# Nightly-only test
register_cuda_ci(est_time=200, suite="nightly-1-gpu", nightly=True)

# Multi-backend test
register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu")
register_amd_ci(est_time=120, suite="stage-a-test-1")

# Temporarily disabled test
register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu", disabled="flaky - see #12345")
```

When adding 1-GPU tests, choose the appropriate suite based on hardware compatibility:
| Suite | Runner | GPU | When to Use |
|---|---|---|---|
| `stage-b-test-small-1-gpu` | `1-gpu-5090` | RTX 5090 (32GB, SM120) | 5090-compatible tests (preferred) |
| `stage-b-test-large-1-gpu` | `1-gpu-runner` | H100 (80GB, SM90) | Large models or 5090-incompatible tests |
Use `stage-b-test-small-1-gpu` (5090) whenever possible - this is the preferred suite for most 1-GPU tests.
Use `stage-b-test-large-1-gpu` (H100) if ANY of these apply:
- **Architecture incompatibility (SM120/Blackwell):**
  - FA3 attention backend (requires SM≤90)
  - MLA with FA3 backend
  - FP8/MXFP4 quantization (not supported on SM120)
  - Certain Triton kernels (shared memory limits)
- **Memory requirements:**
  - Models >30B params or large MoE
  - Tests requiring >32GB VRAM
- **Known 5090 failures:**
  - Weight update/sync tests
  - Certain spec decoding tests
If a test cannot run on 5090 due to any of the above, use `stage-b-test-large-1-gpu`, which runs on H100.
Per-Commit (CUDA):
- Stage A: `stage-a-test-1` (locked), `stage-a-test-2`, `stage-a-test-cpu`
- Stage B: `stage-b-test-small-1-gpu` (5090), `stage-b-test-large-1-gpu` (H100), `stage-b-test-large-2-gpu`
- Stage C (4-GPU): `stage-c-test-4-gpu-h100`, `stage-c-test-4-gpu-b200`, `stage-c-test-4-gpu-gb200`, `stage-c-test-deepep-4-gpu`
- Stage C (8-GPU): `stage-c-test-8-gpu-h20`, `stage-c-test-8-gpu-h200`, `stage-c-test-8-gpu-b200`, `stage-c-test-deepep-8-gpu-h200`

Per-Commit (AMD):
- `stage-a-test-1`, `stage-b-test-small-1-gpu-amd`, `stage-b-test-large-2-gpu-amd`

Nightly:
- `nightly-1-gpu`, `nightly-2-gpu`, `nightly-4-gpu`, `nightly-8-gpu`, etc.
```bash
# Run per-commit tests
python test/run_suite.py --hw cuda --suite stage-b-test-small-1-gpu

# Run nightly tests
python test/run_suite.py --hw cuda --suite nightly-1-gpu --nightly

# With auto-partitioning (for parallel CI jobs)
python test/run_suite.py --hw cuda --suite stage-b-test-small-1-gpu \
    --auto-partition-id 0 --auto-partition-size 4
```

- Learn from existing examples in `sglang/test/srt`.
- Reduce test time by using smaller models and reusing the server for multiple test cases; launching a server is slow.
- Use as few GPUs as possible. Do not run long tests on 8-GPU runners.
- If the test cases take too long, consider adding them to nightly tests instead of per-commit tests.
- Keep each test function focused on a single scenario or piece of functionality.
- Give tests descriptive names reflecting their purpose.
- Use robust assertions (e.g., `assert`, `unittest` methods) to validate outcomes.
- Clean up resources to avoid side effects and preserve test independence.
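The reuse-and-cleanup pattern can be sketched as follows (the `sleep` subprocess is a stand-in for a real server launch; actual tests use the launch helpers found in `sglang/test/srt`):

```python
import subprocess
import unittest


class TestWithSharedServer(unittest.TestCase):
    """Illustrative pattern: one server process shared by all test cases."""

    @classmethod
    def setUpClass(cls):
        # Stand-in for a real server launch command; started once for the class.
        cls.proc = subprocess.Popen(
            ["python3", "-c", "import time; time.sleep(60)"]
        )

    @classmethod
    def tearDownClass(cls):
        # Clean up so later tests start from a fresh state.
        cls.proc.terminate()
        cls.proc.wait(timeout=10)

    def test_server_is_running(self):
        # poll() returns None while the process is still alive.
        self.assertIsNone(self.proc.poll())


if __name__ == "__main__":
    unittest.main()
```

`setUpClass`/`tearDownClass` run once per class, so every test method shares the same process, and teardown guarantees it is gone before the next test file runs.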
- For text models: extend the global model list variables in `test_utils.py`, or add new model lists.
- For VLMs: extend the `MODEL_THRESHOLDS` global dictionary in `test/srt/nightly/test_vlms_mmmu_eval.py`.
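As a rough illustration, such a threshold dictionary maps a model ID to its minimum acceptable eval score; the keys, values, and new entry below are made up, and the real structure of `MODEL_THRESHOLDS` in `test_vlms_mmmu_eval.py` may differ:

```python
# Hypothetical sketch of a model-to-accuracy-threshold mapping; the actual
# MODEL_THRESHOLDS in test_vlms_mmmu_eval.py may be structured differently.
MODEL_THRESHOLDS = {
    "Qwen/Qwen2-VL-7B-Instruct": 0.45,  # minimum acceptable MMMU score
}

# Adding a new VLM to the nightly eval means adding one entry with its
# expected lower bound (placeholder model ID and threshold).
MODEL_THRESHOLDS["my-org/my-new-vlm"] = 0.50
```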