Skip to content

feat(workload): add multi-turn and advanced field parity to CohortSpec#848

Merged
sriumcp merged 4 commits intoinference-sim:mainfrom
sriumcp:pr20-cohort-multiturn
Mar 27, 2026
Merged

feat(workload): add multi-turn and advanced field parity to CohortSpec#848
sriumcp merged 4 commits intoinference-sim:mainfrom
sriumcp:pr20-cohort-multiturn

Conversation

@sriumcp
Copy link
Copy Markdown
Collaborator

@sriumcp sriumcp commented Mar 27, 2026

Summary

  • CohortSpec was missing 6 fields present on ClientSpec (PrefixLength, Reasoning, ClosedLoop, Timeout, Network, Multimodal), causing them to be silently dropped during cohort expansion and making multi-turn/reasoning workloads impossible to express via cohort syntax
  • Adds all 6 fields to CohortSpec with matching YAML tags and copies them in ExpandCohorts; pointer fields are shared across expanded clients (safe because GenerateRequests only reads them)
  • Adds validateCohort parity for PrefixLength >= 0, Timeout >= 0, and MaxRounds >= 1, matching the existing validateClient constraints

Test plan

  • 6 new behavioural tests in cohort_test.go:
    • TestCohortSpec_YAMLRoundTrip_Reasoning — strict YAML decode preserves MaxRounds/ThinkTimeUs (BC-4)
    • TestExpandCohorts_NewFieldsPropagate — all 6 fields propagate to every expanded client (BC-1, BC-5)
    • TestExpandCohorts_NilNewFields_NoChange — nil/zero fields don't inject accidental values (BC-6)
    • TestCohortValidation_InvalidMaxRounds_ReturnsErrormax_rounds=0 returns error (BC-2)
    • TestCohortValidation_NegativeTimeout_ReturnsError — negative timeout returns error (BC-3)
    • TestCohortValidation_NegativePrefixLength_ReturnsError — negative prefix_length returns error (BC-7)
  • Full suite passes: go test ./... — all packages green
  • Lint clean: golangci-lint run ./sim/workload/... — 0 issues
  • Pre-commit self-audit: INV-6 determinism, R4 constructors, R9 pointer types, separation of concerns — all clear

Fixes #847

🤖 Generated with Claude Code

sriumcp and others added 4 commits March 26, 2026 21:52
CohortSpec was missing 6 fields present on ClientSpec — PrefixLength,
Reasoning, ClosedLoop, Timeout, Network, and Multimodal — causing them
to be silently dropped during cohort expansion. This prevented multi-turn
and reasoning workloads from being expressed via cohort population syntax.

Adds the 6 fields to CohortSpec with matching YAML tags, copies them in
ExpandCohorts (pointer fields are safe to share; GenerateRequests is
read-only), and adds validateCohort parity checks for PrefixLength >= 0,
Timeout >= 0, and MaxRounds >= 1. Adds 6 behavioural tests (BC-1 through
BC-7) covering propagation, nil safety, YAML round-trip with strict
decoding, and all three validation constraints. Documents the new fields
in docs/guide/workloads.md with a warning about spike + multi-turn
lifecycle window interaction.

Fixes inference-sim#847

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…c fixes

Addresses post-review findings:

- Add TestCohortClientSpecFieldParity: reflection-based test that fails if a
  new ClientSpec field is added without mirroring in CohortSpec, preventing
  recurrence of the silent-drop bug class this PR fixes.

- Add TestGenerateRequests_CohortWithMultiTurn_ProducesMultiRoundRequests:
  end-to-end test verifying the core PR contract — a cohort with
  reasoning.multi_turn set produces requests with RoundIndex > 0 through the
  full GenerateRequests pipeline, plus INV-10 session causality check.

- Fix CohortSpec godoc: document the Lifecycle exclusion contract (synthesized
  from Diurnal/Spike/Drain; exposing both would create conflicting authority).

- Fix workloads.md timeout description: qualify "300 s default" as applying
  to multi-turn/session clients only (non-session nil timeout = no deadline).

- Fix workloads.md reasoning row: mention required reason_ratio_distribution
  sibling field to prevent confusing validation errors.

- Add category: reasoning to the YAML example for accuracy.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…alues

Two documentation/test bugs found during self-review:

1. context_growth: "append" is not a recognized value — the only branch in
   reasoning.go and session.go is "accumulate". Using "append" silently
   produces no context growth. Corrected to "accumulate" in both the
   workloads.md example and the end-to-end test.

2. reason_ratio_distribution value: 0.5 — the LengthSampler constant case
   does int(value), so 0.5 truncates to 0, yielding 0% reasoning ratio.
   The convention is a 0-100 percentage scale (divided by 100 in
   GenerateReasoningRequests). Corrected to value: 50 (50%) in the
   workloads.md example and the BC-4 YAML round-trip test.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Four issues from second convergence review:

- validateCohort was missing Weibull CV range [0.01, 10.4] and finite-
  positive checks that validateClient performs (R3: validate ALL numeric
  parameters). A cohort with arrival.process=weibull and an out-of-range
  cv would silently produce invalid expanded clients. Added matching
  checks after the arrival process check.

- Docs table: closed_loop type was listed as "bool" but is *bool — a
  tri-state field where nil uses a context-dependent default. Corrected
  to "*bool" and clarified the three-state semantics.

- Docs table: timeout null-default description was incomplete — it only
  mentioned the 300s default for closed-loop clients but omitted that
  open-loop clients with nil timeout get no deadline at all. Clarified.

- End-to-end test: the INV-10 check was asserting only monotonically
  increasing arrival times (weak). Strengthened to verify each round
  arrives at least ThinkTimeUs after the previous round, and that input
  token length strictly increases across rounds when context_growth is
  "accumulate".

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@sriumcp sriumcp merged commit 96f6185 into inference-sim:main Mar 27, 2026
4 checks passed
namasl added a commit to namasl/inference-sim that referenced this pull request Mar 27, 2026
…ohort parity)

Merge upstream/main (inference-sim#845 normalized exponential arrival sampler, inference-sim#848 cohort
field parity) into feat/concurrency-flag. Conflicts in cohort_test.go (excluded
fields map — both sides added entries) and generator_test.go (both sides
appended new tests at EOF). Resolution: keep both sets of additions.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
vishakha-ramani pushed a commit to vishakha-ramani/inference-sim that referenced this pull request Mar 30, 2026
inference-sim#848)

* feat(workload): add multi-turn and advanced field parity to CohortSpec

CohortSpec was missing 6 fields present on ClientSpec — PrefixLength,
Reasoning, ClosedLoop, Timeout, Network, and Multimodal — causing them
to be silently dropped during cohort expansion. This prevented multi-turn
and reasoning workloads from being expressed via cohort population syntax.

Adds the 6 fields to CohortSpec with matching YAML tags, copies them in
ExpandCohorts (pointer fields are safe to share; GenerateRequests is
read-only), and adds validateCohort parity checks for PrefixLength >= 0,
Timeout >= 0, and MaxRounds >= 1. Adds 6 behavioural tests (BC-1 through
BC-7) covering propagation, nil safety, YAML round-trip with strict
decoding, and all three validation constraints. Documents the new fields
in docs/guide/workloads.md with a warning about spike + multi-turn
lifecycle window interaction.

Fixes inference-sim#847

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(workload): add parity drift guard, end-to-end multi-turn test, doc fixes

Addresses post-review findings:

- Add TestCohortClientSpecFieldParity: reflection-based test that fails if a
  new ClientSpec field is added without mirroring in CohortSpec, preventing
  recurrence of the silent-drop bug class this PR fixes.

- Add TestGenerateRequests_CohortWithMultiTurn_ProducesMultiRoundRequests:
  end-to-end test verifying the core PR contract — a cohort with
  reasoning.multi_turn set produces requests with RoundIndex > 0 through the
  full GenerateRequests pipeline, plus INV-10 session causality check.

- Fix CohortSpec godoc: document the Lifecycle exclusion contract (synthesized
  from Diurnal/Spike/Drain; exposing both would create conflicting authority).

- Fix workloads.md timeout description: qualify "300 s default" as applying
  to multi-turn/session clients only (non-session nil timeout = no deadline).

- Fix workloads.md reasoning row: mention required reason_ratio_distribution
  sibling field to prevent confusing validation errors.

- Add category: reasoning to the YAML example for accuracy.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(workload): correct context_growth and reason_ratio_distribution values

Two documentation/test bugs found during self-review:

1. context_growth: "append" is not a recognized value — the only branch in
   reasoning.go and session.go is "accumulate". Using "append" silently
   produces no context growth. Corrected to "accumulate" in both the
   workloads.md example and the end-to-end test.

2. reason_ratio_distribution value: 0.5 — the LengthSampler constant case
   does int(value), so 0.5 truncates to 0, yielding 0% reasoning ratio.
   The convention is a 0-100 percentage scale (divided by 100 in
   GenerateReasoningRequests). Corrected to value: 50 (50%) in the
   workloads.md example and the BC-4 YAML round-trip test.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(workload): Weibull CV validation parity and strengthen tests/docs

Four issues from second convergence review:

- validateCohort was missing Weibull CV range [0.01, 10.4] and finite-
  positive checks that validateClient performs (R3: validate ALL numeric
  parameters). A cohort with arrival.process=weibull and an out-of-range
  cv would silently produce invalid expanded clients. Added matching
  checks after the arrival process check.

- Docs table: closed_loop type was listed as "bool" but is *bool — a
  tri-state field where nil uses a context-dependent default. Corrected
  to "*bool" and clarified the three-state semantics.

- Docs table: timeout null-default description was incomplete — it only
  mentioned the 300s default for closed-loop clients but omitted that
  open-loop clients with nil timeout get no deadline at all. Clarified.

- End-to-end test: the INV-10 check was asserting only monotonically
  increasing arrival times (weak). Strengthened to verify each round
  arrives at least ThinkTimeUs after the previous round, and that input
  token length strictly increases across rounds when context_growth is
  "accumulate".

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

---------

Co-authored-by: Claude Sonnet 4.6 <[email protected]>
@sriumcp sriumcp deleted the pr20-cohort-multiturn branch March 31, 2026 03:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: cohorts do not support multi-turn (reasoning) workloads

1 participant