fix: exclude provisioner_state from workspace_build_with_user view #22159

Merged
sreya merged 20 commits into main from jon/provisioner-state on Feb 24, 2026

sreya (Collaborator) commented Feb 18, 2026

The provisioner state for a workspace build was being loaded for every long-lived agent RPC connection. Since this state can range from kilobytes to megabytes, it can cause the coderd memory footprint to grow gradually over time. It also causes many unnecessary allocations on every query that fetches a workspace build, since only a few callers ever reference the provisioner state.

This PR removes it from the returned workspace build and adds a query to fetch the provisioner state explicitly.
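The split described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual coderd database layer: the `Store`, `buildRecord`, and `newDemoStore` names and the field layout are assumptions made for the example. Only the query name `GetWorkspaceBuildProvisionerStateByID` comes from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkspaceBuild is a hypothetical, trimmed-down row type. It deliberately
// has no provisioner state field, so keeping one in memory stays cheap.
type WorkspaceBuild struct {
	ID          string
	WorkspaceID string
	BuildNumber int
}

// buildRecord is what this illustrative store keeps internally.
type buildRecord struct {
	build WorkspaceBuild
	state []byte // potentially megabytes of Terraform state
}

// Store is an in-memory stand-in for the database layer (an assumption).
type Store struct {
	records map[string]buildRecord
}

// ErrNotFound is returned when no build matches the requested ID.
var ErrNotFound = errors.New("workspace build not found")

// GetWorkspaceBuildByID returns metadata only; the state blob is not returned.
func (s *Store) GetWorkspaceBuildByID(id string) (WorkspaceBuild, error) {
	rec, ok := s.records[id]
	if !ok {
		return WorkspaceBuild{}, ErrNotFound
	}
	return rec.build, nil
}

// GetWorkspaceBuildProvisionerStateByID is the explicit opt-in path for the
// few callers that actually need the state.
func (s *Store) GetWorkspaceBuildProvisionerStateByID(id string) ([]byte, error) {
	rec, ok := s.records[id]
	if !ok {
		return nil, ErrNotFound
	}
	return rec.state, nil
}

// newDemoStore builds a store holding one build with a 1 KiB state blob.
func newDemoStore() *Store {
	return &Store{records: map[string]buildRecord{
		"build-1": {
			build: WorkspaceBuild{ID: "build-1", WorkspaceID: "ws-1", BuildNumber: 7},
			state: make([]byte, 1024),
		},
	}}
}

func main() {
	s := newDemoStore()
	build, _ := s.GetWorkspaceBuildByID("build-1")
	fmt.Println("build number:", build.BuildNumber)

	state, _ := s.GetWorkspaceBuildProvisionerStateByID("build-1")
	fmt.Println("state bytes:", len(state))
}
```

A long-lived connection that only holds a `WorkspaceBuild` value no longer keeps the blob reachable; the state is held only for as long as a caller that asked for it keeps the returned slice.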

sreya force-pushed the jon/provisioner-state branch from 704926a to ebbebcf on February 20, 2026 00:53
sreya requested a review from Emyrk as a code owner on February 20, 2026 02:01
sreya requested review from spikecurtis and removed the request for Emyrk on February 20, 2026 03:29
sreya added 19 commits February 24, 2026 04:16

Remove provisioner_state (1-5 MB Terraform state per workspace) from
the workspace_build_with_user view. This prevents loading multi-MB blobs
on every query that uses the view (~20+ callers), saving hundreds of MB
of pinned RAM at scale.

The 5 callers that actually need provisioner state now fetch it
explicitly via a new GetWorkspaceBuildProvisionerStateByID query.

Rework the dbauthz authorization for GetWorkspaceBuildProvisionerStateByID
to properly enforce policy.ActionUpdate on the template, matching the
actual security policy that was previously only enforced in the HTTP
handler (workspaceBuildState).

Changes:
- Rewrite the SQL query to JOIN through workspace_builds → workspaces →
  templates, returning template columns needed for RBACObject().
- Add RBACObject() method on GetWorkspaceBuildProvisionerStateByIDRow
  that returns rbac.ResourceTemplate with the correct ID, org, and ACLs.
- Replace the manual three-query dbauthz implementation with a single
  fetchWithAction call using policy.ActionUpdate.
- Remove the handler-level RBAC check from workspaceBuildState since
  dbauthz now handles it properly.
- Elevate wsbuilder's getState() to use dbauthz.AsProvisionerd context
  since internal state copying during build creation should not require
  template update permissions.
- Fix pre-existing rename: GetWorkspaceAgentAndLatestBuildByAuthToken →
  GetAuthenticatedWorkspaceAgentAndBuildByAuthToken (syncs with main).
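The "row exposes RBACObject(), wrapper checks the action" pattern from the list above can be sketched as below. The types here (`Object`, `Authorizer`, `StateRow`) and the signature of `fetchWithAction` are simplified assumptions for illustration; the real dbauthz helper and rbac types in the codebase differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is a simplified stand-in for a policy action.
type Action string

const ActionUpdate Action = "update"

// Object is a simplified stand-in for an RBAC object.
type Object struct {
	Type  string
	ID    string
	OrgID string
}

// Objecter is implemented by rows that know which object guards them.
type Objecter interface {
	RBACObject() Object
}

// StateRow is a hypothetical row carrying the state plus the template
// columns needed to build the RBAC object.
type StateRow struct {
	ProvisionerState []byte
	TemplateID       string
	TemplateOrgID    string
}

// RBACObject reports the template that guards access to this row.
func (r StateRow) RBACObject() Object {
	return Object{Type: "template", ID: r.TemplateID, OrgID: r.TemplateOrgID}
}

// Authorizer decides whether subject may perform action on obj.
type Authorizer func(subject string, action Action, obj Object) bool

// ErrUnauthorized is returned when the authorizer denies the request.
var ErrUnauthorized = errors.New("unauthorized")

// fetchWithAction runs the query, then authorizes the result against the
// object the row reports. Unauthorized callers receive the zero value.
func fetchWithAction[T Objecter](authz Authorizer, subject string, action Action, fetch func() (T, error)) (T, error) {
	var zero T
	row, err := fetch()
	if err != nil {
		return zero, err
	}
	if !authz(subject, action, row.RBACObject()) {
		return zero, ErrUnauthorized
	}
	return row, nil
}

// demoAuthorizer lets only "template-admin" update templates (assumption).
func demoAuthorizer(subject string, action Action, obj Object) bool {
	return subject == "template-admin" && action == ActionUpdate && obj.Type == "template"
}

// demoFetch stands in for the single JOINed query.
func demoFetch() (StateRow, error) {
	return StateRow{ProvisionerState: []byte("tfstate"), TemplateID: "tpl-1", TemplateOrgID: "org-1"}, nil
}

func main() {
	_, err := fetchWithAction(demoAuthorizer, "member", ActionUpdate, demoFetch)
	fmt.Println("member allowed:", err == nil)

	row, err := fetchWithAction(demoAuthorizer, "template-admin", ActionUpdate, demoFetch)
	fmt.Println("template-admin allowed:", err == nil, "bytes:", len(row.ProvisionerState))
}
```

Because the row itself carries the template columns, a single query is enough to both fetch the state and build the object to authorize against, which is the motivation given for dropping the three-query implementation.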
Use workspace_builds.template_version_id → template_versions → templates
instead of workspace_builds.workspace_id → workspaces → templates. The
build already references its template version directly, so this is the
more natural join path to reach the template.

Revert to joining workspace_builds → workspaces → templates since
template_versions.template_id can be NULL in tests where the version
is created before the template.

Also change wsbuilder mock from Times(1) to AnyTimes() since
getState() short-circuits for orphan/explicit state paths.

Split the provisioner state mock out of withLastBuildFound into a
separate withLastBuildState helper with Times(1). Only tests that
actually reach getState() include it:
- Orphan tests skip getState() (returns nil early)
- DoNotModifyImmutables and StartWorkspaceWithLegacyParameterValues
  fail during parameter validation before reaching getState()
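Why only some tests expect the state query can be illustrated with a small sketch of the short-circuit behaviour. The `Builder` fields and the `countingFetcher` helper are assumptions for this example and do not mirror the real wsbuilder struct; the hand-rolled call counter stands in for a mock expectation such as Times(1).

```go
package main

import "fmt"

// Builder is a hypothetical, reduced version of a workspace builder.
type Builder struct {
	orphan        bool
	explicitState []byte
	lastBuildID   string
	fetchState    func(buildID string) ([]byte, error)
}

// getState returns early for orphan builds and for caller-supplied state,
// and only queries the previous build's state otherwise.
func (b *Builder) getState() ([]byte, error) {
	if b.orphan {
		return nil, nil
	}
	if b.explicitState != nil {
		return b.explicitState, nil
	}
	return b.fetchState(b.lastBuildID)
}

// countingFetcher returns a fetch function and a pointer to its call count,
// playing the role of a mock with an expected number of calls.
func countingFetcher(state []byte) (func(string) ([]byte, error), *int) {
	calls := 0
	fetch := func(string) ([]byte, error) {
		calls++
		return state, nil
	}
	return fetch, &calls
}

func main() {
	fetch, calls := countingFetcher([]byte("previous"))

	orphan := &Builder{orphan: true, fetchState: fetch}
	_, _ = orphan.getState()
	fmt.Println("calls after orphan build:", *calls)

	normal := &Builder{lastBuildID: "build-1", fetchState: fetch}
	_, _ = normal.getState()
	fmt.Println("calls after normal build:", *calls)
}
```

A strict "exactly once" expectation attached to every test would fail for the orphan and explicit-state paths, which is why the expectation is moved into a helper that only the relevant tests include.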
The wsbuilder is part of the API server, not a provisioner daemon.
AsSystemRestricted is the correct system context for internal
operations that bypass RBAC.

AsSystemRestricted does not have ActionUpdate on ResourceTemplate,
so it would fail the dbauthz check. AsProvisionerd does have this
permission, and semantically fits since the wsbuilder is preparing
state for the provisioner daemon. Added comment explaining the
elevation.

Introduces a dedicated AsWorkspaceBuilder dbauthz subject with minimal
permissions for the workspace builder:

- ActionRead on ResourceProvisionerDaemon (eligibility checks)
- ActionUpdate on ResourceProvisionerJobs (marking orphan jobs complete)
- ActionUpdate on ResourceTemplate (reading provisioner state)

This replaces the previous use of AsProvisionerd and
AsSystemReadProvisionerDaemons in wsbuilder, giving the workspace
builder its own least-privilege identity instead of borrowing the
provisioner daemon's broad permissions.
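The least-privilege idea can be sketched as a subject that carries exactly the three permissions listed above and nothing else. The `Permission` and `Subject` types, the resource strings, and `workspaceBuilderSubject` are illustrative assumptions, not the real rbac package API.

```go
package main

import "fmt"

// Permission pairs a resource type with an action (simplified assumption).
type Permission struct {
	Resource string
	Action   string
}

// Subject is a named identity with an explicit allow-list of permissions.
type Subject struct {
	Name  string
	perms map[Permission]struct{}
}

// newSubject builds a subject that may do only what is listed.
func newSubject(name string, perms ...Permission) Subject {
	set := make(map[Permission]struct{}, len(perms))
	for _, p := range perms {
		set[p] = struct{}{}
	}
	return Subject{Name: name, perms: set}
}

// Can reports whether the subject holds the given permission.
func (s Subject) Can(resource, action string) bool {
	_, ok := s.perms[Permission{Resource: resource, Action: action}]
	return ok
}

// workspaceBuilderSubject mirrors the three permissions named in the
// commit message; the resource strings are hypothetical.
func workspaceBuilderSubject() Subject {
	return newSubject("workspace-builder",
		Permission{Resource: "provisioner_daemon", Action: "read"},
		Permission{Resource: "provisioner_jobs", Action: "update"},
		Permission{Resource: "template", Action: "update"},
	)
}

func main() {
	s := workspaceBuilderSubject()
	fmt.Println("update template:", s.Can("template", "update"))
	fmt.Println("delete template:", s.Can("template", "delete"))
	fmt.Println("read workspace:", s.Can("workspace", "read"))
}
```

With a dedicated subject, anything outside the allow-list is denied by default, so the builder cannot accidentally rely on a permission that was only present because it borrowed another component's identity.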
sreya force-pushed the jon/provisioner-state branch from f3081c6 to 0dd06c4 on February 24, 2026 04:23
coder-tasks bot (Contributor) commented Feb 24, 2026

Documentation Check

Looks Good

  • docs/admin/security/audit-logs.md — Correctly removes provisioner_state from the WorkspaceBuild tracked fields, matching the removal from enterprise/audit/table.go.
  • docs/ai-coder/agent-boundaries/nsjail/ecs.md — Manifest entry was restored in latest commit; file is now properly referenced in docs/manifest.json.
  • docs/manifest.json — Shared Workspaces state updated from beta to early access (reverted in c301f1b)

Automated review via Coder Tasks

sreya merged commit 0a7a3da into main on Feb 24, 2026
29 checks passed
sreya deleted the jon/provisioner-state branch on February 24, 2026 04:46
github-actions bot locked and limited conversation to collaborators on Feb 24, 2026