fix(enterprise/coderd/x/chatd): harden TestSubscribeRelayEstablishedMidStream against CI flakes #24108

Merged
kylecarbs merged 1 commit into main from
fix/flake-subscribe-relay-mid-stream
Apr 7, 2026

@kylecarbs
Member

Fixes coder/internal#1455

Three changes to eliminate the timing-sensitive flake in TestSubscribeRelayEstablishedMidStream:

  1. Reduce PendingChatAcquireInterval from time.Hour to time.Second.
    The primary trigger is still signalWake() from SendMessage, but a
    short fallback poll ensures the worker picks up the pending chat
    even under heavy CI goroutine scheduling contention.

  2. Increase context timeout from WaitLong (25s) to WaitSuperLong (60s).
    The worker pipeline (model resolution, message loading, LLM call)
    involves multiple DB round-trips that can be slow when PostgreSQL
    is shared with many parallel test packages.

  3. Add a status-polling loop while waiting for the streaming request.
    If the worker errors out during chat processing, the test now
    fails immediately with the error status and message instead of
    silently timing out.

Generated by Coder Agents

@kylecarbs kylecarbs merged commit f3f0a2c into main Apr 7, 2026
28 checks passed
@kylecarbs kylecarbs deleted the fix/flake-subscribe-relay-mid-stream branch April 7, 2026 17:41
@github-actions github-actions bot locked and limited conversation to collaborators Apr 7, 2026
Development

Successfully merging this pull request may close these issues.

flake: TestSubscribeRelayEstablishedMidStream