Tarun Khandelwal activity https://gitlab.com/tkhandelwal3 2026-03-19T02:31:14Z tag:gitlab.com,2026-03-19:5220155100 Tarun Khandelwal commented on merge request !227043 at GitLab.org / GitLab 2026-03-19T02:31:14Z tkhandelwal3 Tarun Khandelwal [email protected]

Awesome work @marcogreg, apart from 1 nitpick and 2 of Duo's suggestions, the change LGTM 🚀 🚀

I'm requesting a maintainer review from @bmarjanovic to potentially speed up the process here.

@bmarjanovic let us know if you won't have the bandwidth to take a look at this MR, thanks 🙇 🙏

tag:gitlab.com,2026-03-19:5220154825 Tarun Khandelwal approved merge request !227043: Introduce Cells claims verification workers and cron scheduling at GitLab.org / GitLab 2026-03-19T02:31:02Z tkhandelwal3 Tarun Khandelwal [email protected]

What does this MR do and why?

  • Add Cells::ClaimsVerificationWorker to run verification per model, gated by a dynamic ops feature flag per model name
  • Add Cells::ScheduleClaimsVerificationWorker as a cron job that enqueues a ClaimsVerificationWorker for every model registered via Cells::Claimable. For now, this cron job runs once every weekend at 12am UTC to minimize database load.

This worker calls Cells::Claims::VerificationService, introduced in !226233, to backfill and reconcile changes between Rails and the Topology Service.
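The split described above (a cron scheduler that fans out one job per model, and a per-model worker gated by its own flag) can be sketched in plain Ruby. This is only an illustration under stated assumptions: SketchFlags, SketchVerificationWorker, and SketchScheduler are hypothetical stand-ins, not the classes from this MR, and the sketch assumes the scheduler does no gating of its own.

```ruby
require 'set'

# Hypothetical in-memory flag store standing in for a real feature-flag backend.
class SketchFlags
  def initialize
    @enabled = Set.new
  end

  def enable(name)
    @enabled << name.to_s
  end

  def enabled?(name)
    @enabled.include?(name.to_s)
  end
end

# Per-model worker: skips the model unless its flag is enabled.
class SketchVerificationWorker
  def initialize(flags:, flag_names:, verify:)
    @flags = flags
    @flag_names = flag_names # model name => flag name
    @verify = verify         # callable standing in for the verification service
  end

  def perform(model_name)
    return false unless @flags.enabled?(@flag_names.fetch(model_name))

    @verify.call(model_name)
    true
  end
end

# Cron-style scheduler: enqueues one job per registered model.
# Assumption: it applies no gating itself; the worker decides.
class SketchScheduler
  def initialize(model_names:, enqueue:)
    @model_names = model_names
    @enqueue = enqueue
  end

  def perform
    @model_names.each { |name| @enqueue.call(name) }
  end
end

flag_names = {
  'Route' => 'cells_claims_verification_worker_route_model',
  'Email' => 'cells_claims_verification_worker_email_model'
}
flags = SketchFlags.new
flags.enable(flag_names['Route']) # only Route is switched on

queue = []
verified = []
SketchScheduler.new(model_names: flag_names.keys, enqueue: queue.method(:push)).perform

worker = SketchVerificationWorker.new(flags: flags, flag_names: flag_names, verify: verified.method(:push))
queue.each { |model_name| worker.perform(model_name) }
```

With only the Route flag enabled, both models are enqueued but only Route is verified, so models can be rolled out one at a time.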

References

gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#468

How to set up and validate locally

  1. In console, run Rails.application.eager_load! to ensure all models have been loaded

  2. Enable the feature flags:

     %w[
        cells_claims_verification_worker_organizations_organization_model
        cells_claims_verification_worker_project_model
        cells_claims_verification_worker_namespace_model
        cells_claims_verification_worker_user_model
        cells_claims_verification_worker_key_model
        cells_claims_verification_worker_email_model
        cells_claims_verification_worker_gpg_key_model
        cells_claims_verification_worker_redirect_route_model
        cells_claims_verification_worker_route_model
        cells_claims_verification_worker_service_desk_setting_model
      ].each { |flag| Feature.enable(flag) }

  3. Run the Cells::ScheduleClaimsVerificationWorker:

    Cells::ScheduleClaimsVerificationWorker.new.perform

  4. Check that the claims records are backfilled in the Topology Service database:

    gdk psql -d topology_service
    
    SELECT * FROM claims;

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

tag:gitlab.com,2026-03-19:5220154790 Tarun Khandelwal commented on merge request !227043 at GitLab.org / GitLab 2026-03-19T02:31:00Z tkhandelwal3 Tarun Khandelwal [email protected]

nitpick: can we remove the _model suffix from the FF? They feel redundant and just having cells_claims_verification_worker_email reads better
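To make the two naming styles concrete, here is a small sketch of how a flag name could be derived from a model class name. The helper claims_verification_flag is hypothetical and the derivation is an assumption; the MR may build its flag names differently.

```ruby
# Hypothetical helper: derive a flag name from a model class name.
# The suffix defaults to the current "_model" style; pass '' for the proposed style.
def claims_verification_flag(model_name, suffix: '_model')
  snake = model_name
    .gsub('::', '_')                    # Organizations::Organization -> Organizations_Organization
    .gsub(/([a-z\d])([A-Z])/, '\1_\2') # GpgKey -> Gpg_Key
    .downcase
  "cells_claims_verification_worker_#{snake}#{suffix}"
end

current  = claims_verification_flag('Email')             # "cells_claims_verification_worker_email_model"
proposed = claims_verification_flag('Email', suffix: '') # "cells_claims_verification_worker_email"
```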

tag:gitlab.com,2026-03-19:5220117570 Tarun Khandelwal commented on epic #12708 at GitLab.org 2026-03-19T02:09:47Z tkhandelwal3 Tarun Khandelwal [email protected]

@droarty2, we can't complete it by 10th July. We were initially expecting it to take a full quarter, i.e., until the end of July. Given that the milestone and quarter ends don't align very well, we went with 19.3.

I've updated the Due date and Milestone accordingly.

tag:gitlab.com,2026-03-19:5220105906 Tarun Khandelwal commented on merge request !227833 at GitLab.org / GitLab 2026-03-19T02:05:37Z tkhandelwal3 Tarun Khandelwal [email protected]

yeah the other reason why it's unlikely is because we run this rake task automatically as part of loading the DB, and since loading the DB happens before we start writing anything on the DB, this becomes really unlikely.

tag:gitlab.com,2026-03-19:5220101439 Tarun Khandelwal pushed to project branch tk/udpate-alter-cell-sequence-rake at GitLab.org / GitLab 2026-03-19T02:03:16Z tkhandelwal3 Tarun Khandelwal [email protected]

Tarun Khandelwal (2bffa9a3) at 19 Mar 02:03

update log to specify MINVALUE getting skipped.

tag:gitlab.com,2026-03-19:5220099987 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:28Z tkhandelwal3 Tarun Khandelwal [email protected]

Thanks @daveyleach for the work, just a few more comments.

tag:gitlab.com,2026-03-19:5220099880 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:26Z tkhandelwal3 Tarun Khandelwal [email protected]

question: why are we adding this incident? It was not on HTTP Router or Topology Service.

tag:gitlab.com,2026-03-19:5220099876 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:25Z tkhandelwal3 Tarun Khandelwal [email protected]

suggestion: Can we link / explain a bit on how these URLs are formed (managed_domain in the tenant model), and also link the upstream dedicated docs, given the user would also have to extract the creds to access these servers?

tag:gitlab.com,2026-03-19:5220099869 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:25Z tkhandelwal3 Tarun Khandelwal [email protected]

suggestion: Maybe we should link the config along with the ADR that helped us in finalizing these regions?

tag:gitlab.com,2026-03-19:5220099865 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:25Z tkhandelwal3 Tarun Khandelwal [email protected]

suggestion: Should we remove the k8s setup documentation from here, given it's not really applicable for Cell pod access?

tag:gitlab.com,2026-03-19:5220099854 Tarun Khandelwal commented on merge request !4 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastru... 2026-03-19T02:02:24Z tkhandelwal3 Tarun Khandelwal [email protected]

question: What do you mean by this? If it's how Cloudflare environments are created and managed, I think we should link the upstream documentation for that.

tag:gitlab.com,2026-03-18:5218034230 Tarun Khandelwal commented on issue #21546 at GitLab.com / GitLab Infrastructure Team / Production 2026-03-18T14:49:48Z tkhandelwal3 Tarun Khandelwal [email protected]

LGTM 🚀 🚀

Peer Approved

tag:gitlab.com,2026-03-18:5218014901 Tarun Khandelwal commented on epic #13532 at GitLab.org 2026-03-18T14:46:13Z tkhandelwal3 Tarun Khandelwal [email protected]

Hi @vyaklushin, we recently merged: gitlab-org/cells/topology-service!488 (merged), which adds support for classifying with claims in our TS.

We are still working on the documentation, but wanted to give you a heads up first.

PS: If you try this in prod right now, it wouldn't work as we have not backfilled the data yet. However, that is something we will be doing in the next couple of weeks. I'll give you a ping when that happens, thanks 🙇 🙏

tag:gitlab.com,2026-03-18:5217707648 Tarun Khandelwal commented on issue #21546 at GitLab.com / GitLab Infrastructure Team / Production 2026-03-18T13:48:01Z tkhandelwal3 Tarun Khandelwal [email protected]

@vglafirov overall looks good, I have a few things that can be added / updated here.

All authenticated requests will have a cell-1 prefix added to the _gitlab_session cookie

This is not true; all the requests will go through the topology service, and not only authenticated requests, as we can see unauthenticated requests on staging also have that session prefix:

Screenshot_2026-03-18_at_6.31.42_PM

source

gitlab-com/gl-infra/k8s-workloads/gitlab-com!5271 (diffs)

Should the above MR remove the region-specific override, given that we are adding in the global config?

worth linking to the first attempt and mention what changed since then

I think we missed this from Steve's comment below. Can we add it to the description, mentioning that the previous attempt failed and what we have done since then to make sure it doesn't happen this time? Can we also link the load testing that validated that TS can handle 40k RPS in case caching doesn't work?

  • /chatops run canary --disable --gprd

Should this be updated to /chatops gitlab run canary --disable --gprd in light of the recent announcement?

Remove gitlab.com/* routing rule from CloudFlare configuration

Should we also link the config-mgmt config, given that if someone runs TF apply this route will come back again?

ref: https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/blob/cef7158f9e92fae0e8aaa93f7ea78f74f9440dc5/environments/gprd/cloudflare-workers.tf#L9

Topology service logs

Given this metric, the log volume will be too high as soon as we enable this. Should we add a filter to the logs saying level != info, so it's easier to see if any error happened?

ref: https://dashboards.gitlab.net/goto/efge6ngfv7u9sd?orgId=1

tag:gitlab.com,2026-03-18:5216455183 Tarun Khandelwal commented on issue #687 at GitLab.com / GitLab Infrastructure Team / GitLab Tenant Scale / Cells Infrastructure team / Cells Infrastructure ... 2026-03-18T09:21:38Z tkhandelwal3 Tarun Khandelwal [email protected]

If you can get the optional field in quickly and start socializing the need to provide a value with Switchboard, EA, and Pubsec, then you can piggy back on https://gitlab.com/groups/gitlab-com/gl-infra/gitlab-dedicated/-/work_items/876 to make it mandatory

In the spirit of this, I ended up opening: https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/tenant-model-schema/-/merge_requests/632, after which we can start socializing the need to provide a value with the different teams (along with exposing it in the Switchboard UI).