feat(chatd): add start_workspace tool to agent flow #22646
Merged
When a chat's workspace is stopped, the LLM previously had no way to start it: create_workspace would either create a duplicate workspace or fail silently. This change adds a dedicated start_workspace tool that:
- Detects if the chat's workspace is stopped and starts it via a new build with transition=start
- Reuses the existing waitForBuild and waitForAgent helpers for the shared build-polling and agent-connectivity logic
- Shares the workspace mutex with create_workspace to prevent races
- Is idempotent: returns immediately if the workspace is already running or building

Additionally, create_workspace's checkExistingWorkspace now returns a 'stopped' hint when it finds a stopped workspace, guiding the LLM to use start_workspace instead of creating a new workspace.

Files changed:
- coderd/chatd/chattool/startworkspace.go (new)
- coderd/chatd/chattool/startworkspace_test.go (new)
- coderd/chatd/chattool/createworkspace.go (stopped hint)
- coderd/chatd/chatd.go (wire tool)
- coderd/chats.go (chatStartWorkspace)
- coderd/coderd.go (wire config)
Documentation Check: Updates Needed
Automated review via Coder Tasks
johnstcn
reviewed
Mar 5, 2026
Comment on lines +75 to +99
	tpl := dbgen.Template(t, db, database.Template{
		OrganizationID: org.ID,
		CreatedBy:      user.ID,
	})
	tv := dbgen.TemplateVersion(t, db, database.TemplateVersion{
		OrganizationID: org.ID,
		TemplateID:     uuid.NullUUID{UUID: tpl.ID, Valid: true},
		CreatedBy:      user.ID,
	})
	ws := dbgen.Workspace(t, db, database.WorkspaceTable{
		OwnerID:        user.ID,
		OrganizationID: org.ID,
		TemplateID:     tpl.ID,
	})
	job := dbgen.ProvisionerJob(t, db, nil, database.ProvisionerJob{
		OrganizationID: org.ID,
		CompletedAt:    sql.NullTime{Time: dbtestutil.NowInDefaultTimezone(), Valid: true},
	})
	_ = dbgen.WorkspaceBuild(t, db, database.WorkspaceBuild{
		WorkspaceID:       ws.ID,
		TemplateVersionID: tv.ID,
		JobID:             job.ID,
		Transition:        database.WorkspaceTransitionStart,
		BuildNumber:       1,
	})
Comment on lines +32 to +33
	//nolint:gocritic // Unit test needs system context for DB seeding.
	ctx = dbauthz.AsSystemRestricted(ctx)
I'm not actually sure this is the case, looking at dbtestutil.NewDB.
johnstcn
approved these changes
Mar 5, 2026
- Add TestStartWorkspaceTool_EndToEnd in chatd_test.go: creates a
  workspace, stops it, then verifies the start_workspace tool starts
  it via a mock LLM chat flow.
- Refactor startworkspace_test.go per review feedback:
  - Replace manual dbgen.ProvisionerJob + dbgen.WorkspaceBuild with
    dbfake.WorkspaceBuild builder.
  - Remove unnecessary dbauthz.AsSystemRestricted since dbtestutil.NewDB
    returns an unwrapped database.Store.
After merging main, waitForAgent was renamed to waitForAgentReady with a new signature (returns map[string]any instead of error). Update waitForAgentAndRespond to use the new API.
The test does create+stop+start which involves more provisioner round-trips and agent connectivity checks. WaitSuperLong (60s) gives sufficient headroom vs WaitLong (25s).
The echo provisioner creates new agent rows per build, so the agenttest.New agent (connected to build 1's agent row) cannot serve build 3's agent row after stop/start. This caused waitForAgentReady to retry for 2 min, exceeding the test timeout. The start_workspace tool handles the no-agent case gracefully (returns started=true with agent_status=no_agent), so the test still validates the core start-workspace flow end-to-end.
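The non-fatal agent handling mentioned above might look something like the sketch below. Only started=true and the no_agent / not_ready status values come from this PR; the agentState type, the startResult helper, and the "ready" value are hypothetical and exist purely for illustration.

```go
package main

import "fmt"

// agentState is a hypothetical summary of what the agent wait observed.
type agentState int

const (
	agentMissing  agentState = iota // no agent could be found for the new build
	agentNotReady                   // an agent exists but did not become ready in time
	agentReady                      // the agent connected and is usable
)

// startResult builds a tool-style response. The start build has already
// succeeded at this point, so agent problems are reported as a status
// field instead of being returned as an error.
func startResult(state agentState) map[string]any {
	res := map[string]any{"started": true}
	switch state {
	case agentMissing:
		res["agent_status"] = "no_agent"
	case agentNotReady:
		res["agent_status"] = "not_ready"
	case agentReady:
		res["agent_status"] = "ready" // assumed value, not confirmed by the PR
	}
	return res
}

func main() {
	fmt.Println(startResult(agentMissing))
}
```

Reporting the agent state as data lets the LLM decide whether to retry later, and it is why the end-to-end test can still assert on a successful start when no agent is attached to the latest build.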
Summary
When a chat's workspace is stopped, the LLM previously had no way to start it: create_workspace would either create a duplicate workspace or fail. This adds a dedicated start_workspace tool to the agent flow.

Changes
New: start_workspace tool (coderd/chatd/chattool/startworkspace.go)
- Detects a stopped workspace and starts it via a new build with transition=start
- Reuses the existing waitForBuild and waitForAgent helpers (shared logic)
- Shares the workspace mutex with create_workspace to prevent races
- Returns no_agent/not_ready status if the agent isn't available yet (non-fatal)

Updated: create_workspace stopped-workspace hint
- checkExistingWorkspace now returns a stopped status with message "use start_workspace to start it" when it detects the chat's workspace is stopped, instead of falling through to create a new workspace

Wiring
- chatd.Config/chatd.Server: new StartWorkspace/startWorkspaceFn field
- coderd/chats.go: new chatStartWorkspace method that calls postWorkspaceBuildsInternal with proper RBAC context
- coderd/coderd.go: passes chatStartWorkspace into chatd config
- … create_workspace for root chats only (not subagents)

Tests (startworkspace_test.go)
- NoWorkspace: error when chat has no workspace
- AlreadyRunning: idempotent return for workspace with successful start build
- StoppedWorkspace: verifies StartFn is called, build is waited on, and success response returned
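The stopped-workspace hint returned by create_workspace could be sketched as follows. The "stopped" status and the hint message are taken from the description; the checkExisting function, its boolean inputs, and the "already_exists" value are hypothetical simplifications and not the actual checkExistingWorkspace signature.

```go
package main

import "fmt"

// checkExisting is a hypothetical, simplified version of the existing-workspace
// check. It returns a result map and done=true when create_workspace should
// return that result instead of creating a new workspace.
func checkExisting(exists, stopped bool) (map[string]any, bool) {
	if !exists {
		// No workspace yet: the caller goes on to create one.
		return nil, false
	}
	if stopped {
		// Point the LLM at the dedicated tool instead of creating a duplicate.
		return map[string]any{
			"status":  "stopped",
			"message": "use start_workspace to start it",
		}, true
	}
	// Assumed result for a workspace that is already up.
	return map[string]any{"status": "already_exists"}, true
}

func main() {
	res, done := checkExisting(true, true)
	fmt.Println(done, res["status"], res["message"])
}
```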