feat(replay): closed-loop session sequencing for trace replay#896
feat(replay): closed-loop session sequencing for trace replay#896idan-friedman wants to merge 5 commits intoinference-sim:mainfrom
Conversation
- SequenceSampler replays a pre-recorded sequence of values in order, wrapping to the beginning on exhaustion - Safe for empty sequences (returns minimum token count 1) - Implements LengthSampler interface for use in SessionBlueprint Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add optional ThinkTimeSampler field (LengthSampler) to SessionBlueprint - OnComplete samples from it when set; falls back to constant ThinkTimeUs - Zero-value nil is safe: existing code paths unchanged Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ay (BC-5, BC-6, BC-7) - Groups trace records by SessionID, builds SessionBlueprint per session - Round-0 requests injected upfront; SessionManager drives follow-up rounds - SequenceSamplers replay exact trace token counts per round - Think time derived from inter-round arrival gaps (or overridden) - Validates consecutive RoundIndex; errors on gaps/duplicates - Non-session records returned as regular requests (BC-7) Co-Authored-By: Claude Opus 4.6 <[email protected]>
…losed-loop session replay Add --session-mode (fixed|closed-loop) and --think-time-ms flags to blis replay. Closed-loop mode calls LoadTraceV2SessionBlueprints to build per-session blueprints from the trace, injects only round-0 requests, and wires SessionManager.OnComplete as the onRequestDone callback so follow-up rounds arrive at completion + think time rather than from pre-baked timestamps. Co-Authored-By: Claude Opus 4.6 <[email protected]>
… review - Add explanatory comment for negative-gap clamp in LoadTraceV2SessionBlueprints (non-monotone timestamps clamped to 0 to prevent INV-3 violation in OnComplete) - Update CLAUDE.md with closed-loop replay examples (--session-mode, --think-time-ms) - Add implementation plan doc for PR 880 Co-Authored-By: Claude Opus 4.6 <[email protected]>
eb5ee7e to
e7fc57e
Compare
|
@claude /pr-review-toolkit:review-pr does this pr tackle the parent issue correctly? Validate correctness, ease of use, separation of concerns, adherence to our @docs/contributing/standards, invariants, behavioral discipline, behavioral tests and non-structural tests, modularity, clean api boundaries and contracts, performance implications, documentation correctness and completeness, including user and developer facing docs. |
|
Claude finished @sriumcp's task in 7m 41s —— View job PR Review: feat(replay): closed-loop session sequencing for trace replayTasks
Overall AssessmentThe PR correctly addresses the parent issue (#880). All four stated deliverables are implemented: Important Issues1. Think-time derived from arrival gaps encodes service time — undocumented semantic [Important]File: gap := rounds[i].ArrivalTimeUs - rounds[i-1].ArrivalTimeUsWhen a trace is produced by The gap is not pure think time — it includes the previous round's service time. Replaying without The docstring for 2. Double invocation of
|
| Rule | Check | Result |
|---|---|---|
| R1 (no silent data loss) | Non-session continue path appends requests ✓; empty sequence returns 1 ✓; session-not-found logs Warn+returns nil ✓ |
✓ |
| R2 (sort map keys) | sessionMap iteration in sort/validate loop doesn't produce output; output controlled by sessionOrder slice (insertion-ordered) ✓ |
✓ |
| R3 (validate numeric params) | replayThinkTimeMs < 0 guarded; replaySessionMode validated ✓ |
✓ |
| R4 (construction site audit) | SessionBlueprint{} zero-value for ThinkTimeSampler correctly falls back (nil check) ✓; ScheduledStepIdx omission noted above |
⚠ minor |
| R6 (no logrus.Fatal in library) | New code in sim/workload/replay.go returns errors ✓ |
✓ |
| R7 (invariant tests alongside golden) | No new golden tests added ✓ | ✓ |
| R8 (no exported mutable maps) | No new exported maps ✓ | ✓ |
| INV-3 (clock monotonicity) | Negative gaps clamped to 0 with comment ✓ | ✓ |
| INV-6 (determinism) | sessionOrder preserves insertion order; per-session RNG seeded from same rng.Int63() sequence ✓ |
✓ |
| INV-9 (oracle knowledge boundary) | MaxOutputLen: len(outputTokens) set from round-0 record (not oracle path) ✓ |
✓ |
| INV-10 (session causality) | arrivalTime = tick + thinkTime where thinkTime >= 0 always ✓ |
✓ |
| INV-11 (session completeness) | MaxRounds = len(rounds) ensures session terminates; all existing terminal paths preserved ✓ |
✓ |
Test Quality
Behavioral, non-structural tests: ✅ All 13 tests in the plan are implemented. They assert observable outputs (returned token counts, arrival times, blueprint counts) rather than internal struct layout. The TestSession_ThinkTimeSampler_* tests survive an implementation rewrite.
Test coverage gaps:
- Negative inter-round gap clamping (see issue getting started instructions #3)
- BC-12 warning path (no session records in trace)
SequenceSamplerindex overflow behavior (not critical givenMaxRoundsbounds calls, butintoverflow aftermath.MaxIntcalls would panic ons.index%len(s.values)if index wraps tomath.MinInt). For production use withMaxRoundscapped at trace length this is fine.
API Boundaries and Contracts
LoadTraceV2SessionBlueprints signature is clean and composable. The return of ([]*sim.Request, []SessionBlueprint, error) separates injection concerns cleanly. The SequenceSampler implementing LengthSampler is correct per R13 (the interface already has two implementations: ConstantSampler and now SequenceSampler). The optional ThinkTimeSampler field with nil-fallback is a clean extension point. No new interfaces were introduced unnecessarily.
Summary
| Priority | Count | Items |
|---|---|---|
| Important | 2 | Think-time semantics doc gap, double LoadTraceV2SessionBlueprints call |
| Minor | 3 | Missing clamping test, missing BC-12 test, struct alignment |
| Cosmetic | 1 | ScheduledStepIdx omission |
The PR correctly tackles the parent issue. Recommend merging after addressing the two Important items (especially the think-time documentation — users running the observe→replay pipeline need to know that --think-time-ms override is required to reproduce the actual client think time rather than service-time-inflated gaps).
Summary
--session-mode closed-looptoblis replay: session follow-up rounds arrive atcompletion_tick + think_timeinstead of pre-baked trace timestamps, enabling load-adaptive multi-turn simulation--think-time-msto override inter-round think time (0 = derive from trace inter-round arrival gaps)SequenceSampler— replays trace token counts per-round in order, wraps on exhaustion, safe for empty sequencesThinkTimeSampler LengthSamplerfield toSessionBlueprintfor per-round think time variation; nil falls back to constantThinkTimeUsLoadTraceV2SessionBlueprintswhich groups trace records bySessionID, builds deterministic blueprints withSequenceSamplerfor input/output token counts and think times, and injects only round-0 requests for initial DES injectionSessionManager.OnCompleteas theonRequestDonecallback so follow-up rounds are generated load-adaptivelyTest plan
TestSequenceSampler_ReplayInOrder— values returned in orderTestSequenceSampler_WrapsOnExhaustion— wraps to beginning after exhaustionTestSequenceSampler_SingleValue— single-element sampler always returns same valueTestSequenceSampler_EmptyValues— returns 1 (minimum) without panicTestLoadTraceV2SessionBlueprints_GroupsBySession— 2 sessions → 2 blueprints, 2 round-0 requestsTestLoadTraceV2SessionBlueprints_NonSessionPassThrough— non-session records included as-isTestLoadTraceV2SessionBlueprints_ThinkTimeFromTrace— inter-round gaps derived correctlyTestLoadTraceV2SessionBlueprints_SingleRoundSession— single-round session has nil ThinkTimeSamplerTestLoadTraceV2SessionBlueprints_OverrideThinkTime— override suppresses samplerTestLoadTraceV2SessionBlueprints_NonConsecutiveRoundIndex_Error— error on gap in round indicesTestSession_ThinkTimeSampler_UsedWhenPresent— sampler overrides constant think timeTestSession_ThinkTimeSampler_NilFallsBack— nil sampler falls back to ThinkTimeUsgo test ./... -count=1— all packages passgo build ./...— clean buildCloses #880
🤖 Generated with Claude Code