Thank you for documenting all of this, @dmeshcharakou! Just a few more suggestions/questions.
Question: Is the reason we centralize for all repository types deduplication?
As far as I understand, we went with a single table in Rails for packages and that ended up causing a lot of scaling issues; this feels similar in a way.
What kind of deduplication benefits are we expecting from combining container & Maven? Are the files ever that similar?
Question: Wouldn't this mean that we have a single partition on dedicated and self-managed?
I thought we went against that for blobs because it will essentially end up using a single partition, why is it ok for this?
Same question applies to the artifact file tables.
Question: Do we need sha1 at all?
Reading further, it seems like it adds more complexity.
Question: Is it intentional that none of these have soft_deleted_at?
Question: Do we not support soft deletes for virtual repositories?
Meaning, can I soft delete repositories? I see that we don't add the soft_deleted_at field in any of the repository tables.
Adds a Subscription Expiration section to ADR-010 (Data Retention) under Cross-Cutting Concerns.
Defines a four-phase data lifecycle when a customer's Artifact Registry subscription expires. Reuses existing soft delete and garbage collection mechanisms rather than introducing new infrastructure.
Related: https://gitlab.slack.com/archives/C0A4EQE2109/p1773758043723019
One other thing to monitor is request timeouts in Topology Service, to see if the HTTP Router is timing out requests to Topology Service (I know we had this monitored somewhere but forgot where).
This MR contains the following updates:
| Package | Change | Age | Confidence |
|---|---|---|---|
| @cloudflare/vitest-pool-workers (source) | 0.12.21 → 0.13.2 | | |
View the Renovate pipeline for this MR
v0.13.2
Updated dependencies [c9b3184, 13df6c7, df0d112, 81ee98e, c600ce0, f509d13, 3b81fc6, 0a7fef9]
v0.13.1
Updated dependencies [ade0aed, 2b9a186, 65f1092, 7b0d8f5, 351e1e1, 2b9a186]
v0.13.0
#11632 a6ddbdb Thanks @penalosa! - Support Vitest 4 in @cloudflare/vitest-pool-workers.
This is a breaking change to the @cloudflare/vitest-pool-workers integration in order to support Vitest v4. Along with supporting Vitest v4 (and dropping support for Vitest v2 and v3), we've made a number of changes that may require changes to your tests. Our aim has been to improve stability & the foundations of @cloudflare/vitest-pool-workers as we move towards a v1 release of the package.
We've made a codemod to make the migration easier, which will make the required changes to your config file:
```shell
npx jscodeshift -t node_modules/@cloudflare/vitest-pool-workers/dist/codemods/vitest-v3-to-v4.mjs vitest.config.ts
```
Or, without installing the package first:
```shell
npx jscodeshift -t https://unpkg.com/@cloudflare/vitest-pool-workers/dist/codemods/vitest-v3-to-v4.mjs --parser=ts vitest.config.ts
```
Config API: defineWorkersProject and defineWorkersConfig from @cloudflare/vitest-pool-workers/config have been replaced with a cloudflareTest() Vite plugin exported from @cloudflare/vitest-pool-workers. The test.poolOptions.workers options are now passed directly to cloudflareTest():
Before:
```typescript
import { defineWorkersProject } from "@cloudflare/vitest-pool-workers/config";

export default defineWorkersProject({
  test: {
    poolOptions: {
      workers: {
        wrangler: { configPath: "./wrangler.jsonc" },
      },
    },
  },
});
```
After:
```typescript
import { cloudflareTest } from "@cloudflare/vitest-pool-workers";
import { defineConfig } from "vitest/config";

export default defineConfig({
  plugins: [
    cloudflareTest({
      wrangler: { configPath: "./wrangler.jsonc" },
    }),
  ],
});
```
isolatedStorage & singleWorker: These have been removed in favour of a simpler isolation model that more closely matches Vitest. Storage isolation is now on a per test file basis, and you can make your test files share the same storage by using the Vitest flags --max-workers=1 --no-isolate
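For illustration, the same sharing behaviour can be expressed in the config file rather than via CLI flags; a minimal sketch, assuming Vitest's standard `maxWorkers` and `isolate` options (which correspond to the flags above):

```typescript
// Sketch only: config-file equivalent of `--max-workers=1 --no-isolate`
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    maxWorkers: 1, // run all test files in a single worker
    isolate: false, // let test files share the same storage/state
  },
});
```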
import { env, SELF } from "cloudflare:test": These have been removed in favour of import { env, exports } from "cloudflare:workers". exports.default.fetch() has the same behaviour as SELF.fetch(), except that it doesn't expose Assets. To test your assets, write an integration test using startDevWorker()
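As a hedged sketch of what a migrated test might look like (hypothetical route; this only runs inside the Workers test pool, not plain Node):

```typescript
import { exports } from "cloudflare:workers";
import { expect, it } from "vitest";

it("responds like SELF.fetch() used to", async () => {
  // exports.default.fetch() behaves like the old SELF.fetch(),
  // except it does not serve static Assets.
  const res = await exports.default.fetch(new Request("https://example.com/"));
  expect(res.status).toBe(200);
});
```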
import { fetchMock } from "cloudflare:test": This has been removed. Instead, mock globalThis.fetch or use ecosystem libraries like MSW (recommended).
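A minimal sketch of mocking `globalThis.fetch` directly, without extra libraries (the URL and response body here are hypothetical; MSW remains the recommended option):

```typescript
// Sketch: replace the global fetch with a stub, then restore it.
const originalFetch = globalThis.fetch;

globalThis.fetch = (async () =>
  new Response(JSON.stringify({ ok: true }), {
    headers: { "Content-Type": "application/json" },
  })) as typeof fetch;

const res = await fetch("https://example.com/api");
console.log(await res.json()); // { ok: true }

// Restore the real implementation once the test is done.
globalThis.fetch = originalFetch;
```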
Vitest peer dependency: @cloudflare/vitest-pool-workers now requires vitest@^4.1.0.
This MR has been generated by Renovate Bot.
Thank you @dmeshcharakou, this is a solid start. I left some non-blocking questions and am approving this, but I would love Sam's approval here as one of the domain experts around rate limits for GitLab.com, to make sure we didn't miss anything else.
This adds the implementation ADR to the Artifact Registry blueprint. See gitlab-org/gitlab#590282.
Related Issue: gitlab-org/gitlab#590369
Please verify the checklist and ensure the items are ticked off before the MR is merged.
Document storage, artifact size, API rate, concurrency, and entity count limits for the Artifact Registry. Limits align with existing Package Registry and Container Registry defaults where applicable, and are enforced at the organization level consistent with ADR-001.
Question: "Each plan tier (Free, Premium, Ultimate)": will Free users have the ability to buy the Artifact Registry? If not, maybe we should remove it from this list.
We redirect downloads to GCS/S3 or CloudCDN/CloudFront, so this is hard to track.
@jdrpereira for the CDN that makes sense because we would have no knowledge of those requests, but I'm assuming we'll know that a user requested blob X, right? We would know the size of that blob and could count it here?
Nice catch @pishel65, seems like it. Closing!
Enable the Topology Service client for GitLab.com (cell-1) in production. This allows the Rails application to communicate with the Topology Service via mTLS certificates for cell routing functionality.
Can we link to where staging was done?
It might also be worth linking to the first attempt and mentioning what changed since then, as well as the load testing we've done on Topology Service to show that we can handle the traffic.
- code block for this change
I'm not sure what we mean here, can you please double check the description?
Make sure pods are not crashing
Let's specify which pods exactly we need to check.
Merge gprd MR
Do you think it would be worth phasing this rollout further and doing us-east1-b first? The rollout process would then be:
gprd-cny → gprd-us-east1-b → gprd-us-east1-c + gprd-us-east1-d (removing the overrides and using gprd.yaml.gotmpl).
The goal is to ramp up the traffic to Topology Service gradually so we don't cause a downstream effect that creates issues, and so we can see if things are working as expected.
- Merge gprd MR
- Monitor deployment pipeline for any failures and wait for pipeline to finish
- Connect to the gke_gitlab-production_us-east1_gprd-gitlab-gke k8s cluster and watch changes on the gitlab namespace
We are doing gprd here but looking at the regional cluster; we would already have checked the regional cluster for gprd-cny, right? Why do we need to do that again?
Location: topology-rest Service RPS
This link seems to be invalid; I think you need to specify the env. There seem to be some other links like that.
- What changes to this metric should prompt a rollback: Error rate should not exceed the threshold.
I think we need to say "Error rate exceeding the threshold", right? All the answers to "What changes to this metric should prompt a rollback" might need to be reworded a bit.
Key metrics to observe
We might also want to add Topology Service logs here and see if classifications are working as expected: https://dashboards.gitlab.net/goto/afgaltm7pt4aoe?orgId=1
Scheduled Date and Time - 2026-03-18 08:00 (TBD, all depends on review status)
I think if we use stacked merge requests as suggested in gitlab-com/gl-infra/k8s-workloads/gitlab-com!5271 (comment 3166604029) we can get these pre-approved.
Some additional comments:
gprd-cny and the next one?

@vglafirov can we stack this merge request so it's on top of Draft: feat: enabled cells config on gprd-cny (!5270), and have this merge request also remove the duplicate configuration we have in gprd-cny?
Enable the Topology Service client for GitLab.com (cell-1) in production.
- global.appConfig.cell in gprd-cny.yaml.gotmpl
- cell-1-production-mtls-cert secret (already deployed)

Tested in staging - Topology Service client is healthy:
```ruby
Gitlab::TopologyServiceClient::HealthService.new.service_healthy?
# => true
```
Steve Xuereb (3001694b) at 16 Mar 11:10