@pedropombeiro BE & devopsverify LGTM. Just left a nit.
Scopes user-scoped runner listings by the user's organizations so that multi-org users only see runners from organizations they belong to. The finder additionally accepts an optional organization_id parameter to narrow results to a specific organization.
Problem: The user-scoped path (User#ci_available_runners via GET /runners, currentUser { runners }) unions runners from all groups/projects the user has access to across all organizations. Once a user has multi-org membership, this query returns runners from every organization where they hold the required role -- with no filtering by organization.
Paths not affected:
Ci::Runner.all): Return all runners on the cell -- correct by design (admins are cell-scoped)can?(@current_user, :read_runners, @group/@project)) which verify the user has the required role on that specific resourceOrganization filtering is applied in two layers as a defense-in-depth measure:
Model layer (User#ci_available_runners): Always scopes results to runners belonging to the user's own organizations via .in_organization(organizations.pluck(:id)). The pluck resolves org IDs on the main DB first, avoiding cross-database subqueries. This ensures a user can never see runners from an organization they don't belong to, regardless of what the caller passes.
Finder layer (Ci::RunnersFinder#user_runners): When an organization_id parameter is provided (e.g. from Current.organization), narrows results further to that specific organization. This is the presentation filter for "show me runners in the current org context."
The model layer acts as a security boundary, while the finder layer provides the UX-level scoping. Even if a caller passes a malicious organization_id, the model layer prevents data leakage.
app/models/ci/runner.rb -- Add .in_organization(organization_id) scope. Accepts a scalar ID or an array. Instance runners (which have NULL organization_id by design) are naturally excluded by the WHERE clauseapp/models/user.rb -- ci_available_runners now chains .in_organization(organizations.pluck(:id)) to scope results to the user's own organizations. Uses pluck to resolve IDs on main DB, avoiding cross-database subqueries with the CI DBapp/finders/ci/runners_finder.rb -- Add user_runners method that optionally narrows results further by organization_id param when providedSince ci_runners lives on the CI database while organizations/organization_users live on the main database, organizations.pluck(:id) is resolved on the main DB first. The resulting array of org IDs is then passed as literals to the CI DB query. Normally, this will be a single value, since users normally belong to a single organization.
1. Original query -- no org filter (query plan):
SELECT COUNT(*) FROM ((SELECT "ci_runners".* FROM "ci_runners"
INNER JOIN "ci_runner_projects" ON "ci_runner_projects"."runner_id" = "ci_runners"."id"
WHERE "ci_runner_projects"."project_id" = ANY ('{...}'::integer[]))
UNION
(WITH "cte_namespace_ids" AS MATERIALIZED (
SELECT "ci_namespace_mirrors"."namespace_id" FROM "ci_namespace_mirrors"
WHERE ((traversal_ids[1]) = ANY ('{36425,36427,5268246}'::integer[])))
SELECT "ci_runners".* FROM "ci_runners"
INNER JOIN "ci_runner_namespaces" ON "ci_runner_namespaces"."runner_id" = "ci_runners"."id"
WHERE ci_runner_namespaces.namespace_id IN (SELECT namespace_id FROM cte_namespace_ids)))
ci_runners;
2. Model layer -- scoped to user's organizations (query plan):
SELECT COUNT(*) FROM (...same union...) ci_runners
WHERE "ci_runners"."organization_id" IN (<user's org IDs from main DB>);
3. Both layers -- model + finder organization_id param (query plan):
SELECT COUNT(*) FROM (...same union...) ci_runners
WHERE "ci_runners"."organization_id" IN (<user's org IDs>) AND "ci_runners"."organization_id" = 1;
Current.organization in runner Grape API endpoints)Not applicable -- backend service change only.
org_a = Organizations::Organization.first
org_b = create(:organization)
group_a = create(:group, organization: org_a)
group_b = create(:group, organization: org_b)
runner_a = create(:ci_runner, :group, groups: [group_a])
runner_b = create(:ci_runner, :group, groups: [group_b])
user = create(:user, owner_of: [group_a, group_b])
user.ci_available_runners.to_sql
# => includes WHERE organization_id IN (1, <org_b.id>)
Ci::RunnersFinder.new(current_user: user, params: { user: user, organization_id: org_a.id }).execute
# => returns only runner_a
bundle exec rspec spec/models/ci/runner_spec.rb -e 'in_organization' --format documentation
bundle exec rspec spec/finders/ci/runners_finder_spec.rb -e 'organization' --format documentation
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Changelog: other
Nit / Non-blocking:
We could use other_organization defined in the outer context since it already exists.
Hi @narendran-kannan Could you please do the first backend review here? Please pass to @mbobin for maintainer review if it looks okay to you. Thanks!
FF is enabled by default in specs.
FF is enabled by default in specs.
This is the same error being done in existing code paths in Ci::PlayBuildService and Ci::PlayBridgeService.
We need to change this from a bridge into a build because otherwise :play_job permission fails due to the user not having access to the downstream project. But we're not testing permissions here; we're only testing invalid transition, so we can just change the job type.
Leaminn Ma (ee92503f) at 17 Mar 21:16
Add access check in EnqueueJobService
Closed in favour of Add access check to Ci::EnqueueJobService & add... (!227714).
Relates to #524123
This MR significantly improves test coverage for Ci::EnqueueJobService to address the production incident and prevent future regressions.
spec/services/ci/enqueue_job_service_spec.rb (Enhanced)
StaleObjectError occursspec/services/ci/enqueue_job_service_authorization_spec.rb (New File)
:set_pipeline_variables abilityBefore: ~80 lines, basic coverage After: ~350+ lines in main spec + ~180 lines in authorization spec
These improvements address the production incident (gitlab-com/gl-infra/production#19377) by:
Created [FF] `ci_enqueue_job_authorization` -- Add auth... (#593898) for the changes in the MR: Add access check to Ci::EnqueueJobService & add... (!227714)
Prepare partitioning: swap PK for ci_build_needs
Changelog: changed
| Before | After |
|---|---|
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #583634
Add defense in depth to this service
I agree with doing this. However, I'm concerned about making any change to an integral service in the CI. So I think this requires a feature flag just in case.
Bumping weight to 2.
Ci::EnqueueJobService is a central point for starting manual jobs for features like protected environments, but it doesn't have good test coverage as it was shown in gitlab-com/gl-infra/production#19377.
Moreover, it does not have defense in depth and relies on the caller to enforce authorization checks. We check the ability :play_job in Ci::PlayBuildService and Ci::PlayBridgeService. But it's not enforced in Ci::Pipeline#reset_source_bridge!.
This MR adds an access check inside Ci::EnqueueJobService so that it's applied no matter where the service is called. This prevents jobs from being enqueued in protected environments where the user does not have access.
Test coverage has been added for various environment/access scenarios. They follow the permission chart below (generated by Claude Code):
This change is made behind a feature flag: ci_enqueue_job_authorization. Roll-out issue: #593898
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #524123
Leaminn Ma (11338bf7) at 17 Mar 17:15
Leaminn Ma (0bd3152b) at 17 Mar 17:03
Add defense in depth to EnqueueJobService
Progress this week:
Confidence for current milestone:
/cc @golnazs