Thank you for documenting all of this, @dmeshcharakou! Just a few more suggestions/questions.
Question: Is the reason we centralize for all repository types deduplication?
As far as I understand, we went with a single table in Rails for packages and that ended up causing a lot of scaling issues; this feels similar in a way.
What kind of deduplication benefits are we expecting from combining container & Maven? Are the files ever that similar?
Question: Wouldn't this mean that we have a single partition on dedicated and self-managed?
I thought we went against that for blobs because it will essentially end up using a single partition, why is it ok for this?
Same question applies to the artifact file tables.
Question: Do we need sha1 at all?
Reading further, it seems like it adds more complexity.
Question: Is it intentional that none of these have soft_deleted_at?
Question: Do we not support soft deletes for virtual repositories?
Meaning, can I soft delete repositories? I see that we don't add the soft_deleted_at field in any of the repository tables.
Adds a Subscription Expiration section to ADR-010 (Data Retention) under Cross-Cutting Concerns.
Defines a four-phase data lifecycle when a customer's Artifact Registry subscription expires. Reuses existing soft delete and garbage collection mechanisms rather than introducing new infrastructure.
Related: https://gitlab.slack.com/archives/C0A4EQE2109/p1773758043723019
One other thing to monitor is request timeouts in Topology Service, to see if the HTTP Router is timing out requests to Topology Service (I know we had this monitored somewhere but forgot where).
This MR contains the following updates:
| Package | Change | Age | Confidence |
|---|---|---|---|
| @cloudflare/vitest-pool-workers (source) | 0.12.21 → 0.13.2 | | |
View the Renovate pipeline for this MR
v0.13.2
Updated dependencies [c9b3184, 13df6c7, df0d112, 81ee98e, c600ce0, f509d13, 3b81fc6, 0a7fef9]
v0.13.1
Updated dependencies [ade0aed, 2b9a186, 65f1092, 7b0d8f5, 351e1e1, 2b9a186]
v0.13.0
#11632 a6ddbdb Thanks @penalosa! - Support Vitest 4 in @cloudflare/vitest-pool-workers.
This is a breaking change to the @cloudflare/vitest-pool-workers integration in order to support Vitest v4. Along with supporting Vitest v4 (and dropping support for Vitest v2 and v3), we've made a number of changes that may require changes to your tests. Our aim has been to improve stability & the foundations of @cloudflare/vitest-pool-workers as we move towards a v1 release of the package.
We've made a codemod to make the migration easier, which will make the required changes to your config file:
```shell
npx jscodeshift -t node_modules/@cloudflare/vitest-pool-workers/dist/codemods/vitest-v3-to-v4.mjs vitest.config.ts
```
Or, without installing the package first:
```shell
npx jscodeshift -t https://unpkg.com/@cloudflare/vitest-pool-workers/dist/codemods/vitest-v3-to-v4.mjs --parser=ts vitest.config.ts
```
Config API: defineWorkersProject and defineWorkersConfig from @cloudflare/vitest-pool-workers/config have been replaced with a cloudflareTest() Vite plugin exported from @cloudflare/vitest-pool-workers. The test.poolOptions.workers options are now passed directly to cloudflareTest():
Before:
```typescript
import { defineWorkersProject } from "@cloudflare/vitest-pool-workers/config";

export default defineWorkersProject({
  test: {
    poolOptions: {
      workers: {
        wrangler: { configPath: "./wrangler.jsonc" },
      },
    },
  },
});
```
After:
```typescript
import { cloudflareTest } from "@cloudflare/vitest-pool-workers";
import { defineConfig } from "vitest/config";

export default defineConfig({
  plugins: [
    cloudflareTest({
      wrangler: { configPath: "./wrangler.jsonc" },
    }),
  ],
});
```
isolatedStorage & singleWorker: These have been removed in favour of a simpler isolation model that more closely matches Vitest. Storage isolation is now on a per test file basis, and you can make your test files share the same storage by using the Vitest flags --max-workers=1 --no-isolate
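For illustration, the same sharing behaviour can be expressed in the config file rather than via CLI flags; a minimal sketch, assuming Vitest's standard `maxWorkers` and `isolate` options (which correspond to the flags above):

```typescript
// Sketch only: config-file equivalent of `--max-workers=1 --no-isolate`
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    maxWorkers: 1, // run all test files in a single worker
    isolate: false, // let test files share the same storage/state
  },
});
```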
import { env, SELF } from "cloudflare:test": These have been removed in favour of import { env, exports } from "cloudflare:workers". exports.default.fetch() has the same behaviour as SELF.fetch(), except that it doesn't expose Assets. To test your assets, write an integration test using startDevWorker()
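As a hedged sketch of what a migrated test might look like (hypothetical route; this only runs inside the Workers test pool, not plain Node):

```typescript
import { exports } from "cloudflare:workers";
import { expect, it } from "vitest";

it("responds like SELF.fetch() used to", async () => {
  // exports.default.fetch() behaves like the old SELF.fetch(),
  // except it does not serve static Assets.
  const res = await exports.default.fetch(new Request("https://example.com/"));
  expect(res.status).toBe(200);
});
```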
import { fetchMock } from "cloudflare:test": This has been removed. Instead, mock globalThis.fetch or use ecosystem libraries like MSW (recommended).
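A minimal sketch of mocking `globalThis.fetch` directly, without extra libraries (the URL and response body here are hypothetical; MSW remains the recommended option):

```typescript
// Sketch: replace the global fetch with a stub, then restore it.
const originalFetch = globalThis.fetch;

globalThis.fetch = (async () =>
  new Response(JSON.stringify({ ok: true }), {
    headers: { "Content-Type": "application/json" },
  })) as typeof fetch;

const res = await fetch("https://example.com/api");
console.log(await res.json()); // { ok: true }

// Restore the real implementation once the test is done.
globalThis.fetch = originalFetch;
```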
Vitest peer dependency: @cloudflare/vitest-pool-workers now requires vitest@^4.1.0.
This MR has been generated by Renovate Bot.
Thank you @dmeshcharakou, this is a solid start. I left some non-blocking questions and am approving this, but I would love Sam's approval here as one of the domain experts around rate limits for GitLab.com, to make sure we didn't miss anything else.
This adds the implementation ADR to the Artifact Registry blueprint. See gitlab-org/gitlab#590282.
Related Issue: gitlab-org/gitlab#590369
Please verify the checklist and ensure the items are ticked off before the MR is merged.
Document storage, artifact size, API rate, concurrency, and entity count limits for the Artifact Registry. Limits align with existing Package Registry and Container Registry defaults where applicable, and are enforced at the organization level consistent with ADR-001.
Question: "Each plan tier (Free, Premium, Ultimate)": will Free users have the ability to buy the Artifact Registry? If not, maybe we should remove it from this list.
We redirect downloads to GCS/S3 or CloudCDN/CloudFront, so this is hard to track.
@jdrpereira for the CDN that makes sense because we would have no knowledge of those requests, but I'm assuming we'll know that a user requested blob X, right? We would know the size of that blob and could count it here?
Nice catch @pishel65, seems like it. Closing!
Enable the Topology Service client for GitLab.com (cell-1) in production. This allows the Rails application to communicate with the Topology Service via mTLS certificates for cell routing functionality.
Can we link to where staging was done?
It might also be worth linking to the first attempt and mentioning what changed since then, as well as the load testing we've done on Topology Service to show that we can handle the traffic.
- code block for this change
I'm not sure what we mean here, can you please double check the description?
Make sure pods are not crashing
Let's specify which pods exactly we need to check.
Merge gprd MR
Do you think it would be worth phasing this rollout further and doing us-east1-b first? The rollout process would then be:
gprd-cny → gprd-us-east1-b → gprd-us-east1-c + gprd-us-east1-d (removing the overrides and using gprd.yaml.gotmpl).
The goal is to ramp up the traffic to Topology Service gradually so we don't cause a downstream effect that creates issues, and so we can see if things are working as expected.
- Merge gprd MR
- Monitor deployment pipeline for any failures and wait for pipeline to finish
- Connect to the gke_gitlab-production_us-east1_gprd-gitlab-gke k8s cluster and watch changes on the gitlab namespace
We are doing gprd here but looking at the regional cluster; we would already have checked the regional cluster for gprd-cny, right? Why do we need to do that again?
Location: topology-rest Service RPS
This link seems to be invalid; I think you need to specify the env. There seem to be some other links like that.
- What changes to this metric should prompt a rollback: Error rate should not exceed the threshold.
I think we need to say "Error rate exceeding the threshold", right? All the answers to "What changes to this metric should prompt a rollback" might need to be reworded a bit.
Key metrics to observe
We might also want to add Topology Service logs here and see if classifications are working as expected: https://dashboards.gitlab.net/goto/afgaltm7pt4aoe?orgId=1
Scheduled Date and Time - 2026-03-18 08:00 (TBD, all depends on review status)
I think if we use stacked merge requests as suggested in gitlab-com/gl-infra/k8s-workloads/gitlab-com!5271 (comment 3166604029) we can get these pre-approved.
Some additional comments:
gprd-cny and the next one?

@vglafirov can we stack this merge request so it's on top of Draft: feat: enabled cells config on gprd-cny (!5270), and have this merge request also remove the duplicate configuration we have in gprd-cny?
Enable the Topology Service client for GitLab.com (cell-1) in production.
- global.appConfig.cell in gprd-cny.yaml.gotmpl
- cell-1-production-mtls-cert secret (already deployed)

Tested in staging - Topology Service client is healthy:
```ruby
Gitlab::TopologyServiceClient::HealthService.new.service_healthy?
# => true
```
Steve Xuereb (3001694b) at 16 Mar 11:10