
refactor(sim): remove PreemptionProcessingTime from LatencyModel interface #555

Merged
sriumcp merged 1 commit into inference-sim:main from sriumcp:pr554-remove-preemption-processing-time
Mar 6, 2026

sriumcp (Collaborator) commented Mar 6, 2026

Summary

  • Remove PreemptionProcessingTime() from the LatencyModel interface (5→4 methods) — all three backends returned 0
  • Remove PreemptionEvent type — its Execute() was a no-op (debug log only)
  • Remove PreemptionDelay field from PreemptedRequest — always carried value 0
  • Replace PreemptionEvent scheduling with inline logrus.Debugf — preserves debug observability
  • Preserve PreemptionCount++ and all preemption mechanics (KV release, ProgressIndex reset, front-of-queue re-enqueue)
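The preserved preemption path can be illustrated with a minimal sketch. This is not the repository's code: the `Request` and `Sim` types, their field names other than `ProgressIndex` and `PreemptionCount`, and the `Debugf` function field (standing in for `logrus.Debugf` so the example needs only the standard library) are all assumptions made for illustration.

```go
package main

import "fmt"

// Request is a hypothetical stand-in for the simulator's request type.
type Request struct {
	ID            string
	ProgressIndex int // tokens processed so far
	KVBlocks      int // KV cache blocks currently held
}

// Sim is a hypothetical, heavily simplified view of simulator state.
type Sim struct {
	WaitQueue       []*Request
	FreeKVBlocks    int
	PreemptionCount int
	// Debugf stands in for logrus.Debugf (assumption, keeps the sketch stdlib-only).
	Debugf func(format string, args ...any)
}

// preempt evicts a running request inline, with no separate event object:
// release its KV blocks, reset progress, re-enqueue at the queue front,
// bump the metric, and emit a debug line.
func (s *Sim) preempt(r *Request) {
	s.FreeKVBlocks += r.KVBlocks
	r.KVBlocks = 0
	r.ProgressIndex = 0
	s.WaitQueue = append([]*Request{r}, s.WaitQueue...)
	s.PreemptionCount++
	s.Debugf("preempted request %s", r.ID)
}

func main() {
	s := &Sim{
		WaitQueue: []*Request{{ID: "waiting"}},
		Debugf: func(format string, args ...any) {
			fmt.Printf(format+"\n", args...)
		},
	}
	running := &Request{ID: "running", ProgressIndex: 128, KVBlocks: 8}
	s.preempt(running)
	// Prints the front of the queue, the reset progress, freed blocks, and the count.
	fmt.Println(s.WaitQueue[0].ID, running.ProgressIndex, s.FreeKVBlocks, s.PreemptionCount)
}
```

In this sketch nothing is scheduled for later: the debug line is written at the point of eviction, which is the shape the summary describes for the replacement of PreemptionEvent.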

Behavioral Contracts

BC-1: Interface Simplification
GIVEN the LatencyModel interface
WHEN a new backend is implemented
THEN it needs to implement only 4 methods

BC-2: Preemption Metrics Preserved
GIVEN a simulation with KV cache pressure
WHEN preemption occurs
THEN PreemptionCount is still incremented

BC-3: Preemption Behavior Preserved
GIVEN a running request that is preempted
WHEN batch formation evicts it
THEN request is re-enqueued at queue front with ProgressIndex=0 and KV blocks released

BC-6: No Behavioral Change
GIVEN any simulation configuration
WHEN run before and after this change with the same seed
THEN the output is byte-identical (INV-6)
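A byte-identical check of the kind BC-6 describes could be sketched as follows. The `runSim` function is a hypothetical stand-in for a full simulation run, and the output format is invented for illustration. The sketch compares two runs of the same binary; the contract itself concerns builds before and after the change, which would mean comparing saved output from each build in the same way.

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
)

// runSim is a hypothetical stand-in for a simulation run: all randomness
// comes from a generator seeded with the given seed, and the result is
// the serialized output.
func runSim(seed int64) []byte {
	rng := rand.New(rand.NewSource(seed))
	var out bytes.Buffer
	for i := 0; i < 5; i++ {
		fmt.Fprintf(&out, "req=%d latency_us=%d\n", i, rng.Intn(1000))
	}
	return out.Bytes()
}

// sameOutput reports whether two runs with the same seed produce
// byte-identical output.
func sameOutput(seed int64) bool {
	return bytes.Equal(runSim(seed), runSim(seed))
}

func main() {
	fmt.Println(sameOutput(42)) // prints "true"
}
```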

Test plan

  • go build ./... — clean
  • go test ./... -count=1 — all 11 packages pass
  • golangci-lint run ./... — 0 issues
  • Golden dataset tests pass unchanged (BC-5)
  • Grep confirms zero remaining references in Go source

Fixes #554

🤖 Generated with Claude Code

…rface (inference-sim#554)

Remove dead code: PreemptionProcessingTime() always returned 0 in all
three backends, and PreemptionEvent was a no-op event type (Execute()
only logged). The actual cost of preemption (re-prefill) is already
modeled by the ProgressIndex=0 reset in batch formation.
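The re-prefill cost can be made concrete with a small worked example. The helper name and the token counts below are illustrative assumptions, not values from the simulator: the point is only that resetting ProgressIndex to 0 puts the already-processed tokens back into the remaining work, so no separate preemption delay is needed to charge for them.

```go
package main

import "fmt"

// remainingTokens is a hypothetical helper: the work still to be done
// for a request is its total token count minus its progress.
func remainingTokens(total, progressIndex int) int {
	return total - progressIndex
}

func main() {
	total := 512    // illustrative request length in tokens
	progress := 300 // tokens processed before preemption

	before := remainingTokens(total, progress) // work left if not preempted
	after := remainingTokens(total, 0)         // work left after the reset

	// The difference is the redone work, i.e. the cost of preemption.
	fmt.Println(before, after, after-before) // prints "212 512 300"
}
```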

- Remove PreemptionProcessingTime() from LatencyModel interface (5→4 methods)
- Remove PreemptionEvent type from event.go
- Remove PreemptionDelay field from PreemptedRequest struct
- Replace PreemptionEvent scheduling with inline logrus.Debugf
- Preserve PreemptionCount++ metric and all preemption mechanics
- Update docs: CLAUDE.md, core-engine.md, latency-models.md,
  extension-recipes.md, design-guidelines.md, doc.go, FINDINGS.md

Fixes inference-sim#554

Co-Authored-By: Claude <[email protected]>
sriumcp force-pushed the pr554-remove-preemption-processing-time branch from 2438648 to 93f6401 on March 6, 2026 15:21
sriumcp merged commit eeb1848 into inference-sim:main on Mar 6, 2026
4 checks passed
sriumcp deleted the pr554-remove-preemption-processing-time branch on March 6, 2026 21:48