Goal
Fit the 10 parameters (β₁–β₇, α₀–α₂) of the step-time model in the crossmodel latency model, using only journey tracing data from the training corpus. Step tracing data is excluded because it has a semantic instrumentation bug (see inference-sim/vllm#49).
Context
- Step-time model design: inference-sim/inference-sim#489 (updated comment)
- Training data: 16 experiments in this repo (default_args/)
- Train/test/validate split: split.py (10 train, 3 validate, 3 test)
- Model configs: model_configs/
- Hardware datasheet: datasheets/h100-sxm.json
- Pydantic schemas: schemas.py
Approach
Teacher-forced fitting: reconstruct the actual batch composition at every scheduler step from journey traces, compute predicted step times using the analytical basis functions, sum along each request's path, and compare against measured request-level times.
The detailed design is in a follow-up comment.
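A minimal sketch of the teacher-forced forward pass and loss, for illustration only. Everything named here is hypothetical: the `Step` fields, the three placeholder basis functions in `basis`, and the single `request_overhead` term that stands in for the α parameters. The real basis functions and parameter roles are defined in inference-sim/inference-sim#489, not here.

```python
from dataclasses import dataclass


@dataclass
class Step:
    # Hypothetical batch composition of one scheduler step,
    # as it would be reconstructed from journey traces.
    prefill_tokens: int
    decode_requests: int


def basis(step):
    # Placeholder basis functions (assumption): a constant term,
    # prefill token count, and number of decoding requests.
    return [1.0, float(step.prefill_tokens), float(step.decode_requests)]


def step_time(beta, step):
    # Predicted step time is linear in the basis functions.
    return sum(b * f for b, f in zip(beta, basis(step)))


def predict_request_time(beta, request_overhead, steps, path):
    # path: indices of the scheduler steps in which the request was batched.
    # Teacher forcing: the steps come from the traces, not from a simulated scheduler.
    return request_overhead + sum(step_time(beta, steps[i]) for i in path)


def mse_loss(beta, request_overhead, steps, paths, measured):
    # Mean squared error between predicted and measured request-level times.
    errors = [
        predict_request_time(beta, request_overhead, steps, paths[r]) - measured[r]
        for r in measured
    ]
    return sum(e * e for e in errors) / len(errors)


# Toy data (made up): three steps, two requests.
steps = [Step(100, 0), Step(0, 1), Step(50, 1)]
paths = {"a": [0, 1, 2], "b": [2]}
measured = {"a": 7.5, "b": 5.0}
beta = [1.0, 0.01, 0.5]
request_overhead = 2.0

loss = mse_loss(beta, request_overhead, steps, paths, measured)
```

Because the predicted request time is linear in the parameters under these assumptions, a fit could minimise `mse_loss` with any least-squares solver. The choice of optimiser and any parameter constraints are left to the design.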