@dbalexandre looks good
This code change adds support for tracking and replicating group file uploads in GitLab's Geo feature (which keeps multiple GitLab instances synchronized across different locations).
The changes create a new database table called "group_upload_states" that stores information about whether group uploads have been successfully copied and verified between different GitLab sites. This includes tracking when files were last checked, whether verification passed or failed, and retry counts for failed attempts.
The code also adds the necessary database migrations to create this new table with proper indexes for efficient querying, sets up foreign key relationships to link uploads with their parent groups, and includes sharding support for better performance in large deployments.
Additionally, it updates the GraphQL API to expose information about group upload replication status, adds new monitoring metrics so administrators can track how well group uploads are being synchronized, and updates documentation to reflect these new capabilities.
This enhancement extends GitLab's existing file replication system (which already handled project uploads) to also cover files uploaded at the group level, ensuring better data consistency and backup coverage across geographically distributed GitLab installations.
Related to #589910
This MR is one of many instances following !224245, which was produced by the same generator script. @dbalexandre has been improving the generator script with the MR feedback, and I expect he will continue to do so.
These are all behind a feature flag, so I propose that most release-blocking comments can be handled in a follow-up, which also addresses the generator script and any previous instances.
For more context, see !226569 (comment 3152345538).
rails db:migrate # on the primary
rails db:migrate:geo # on the secondary
# In Rails console on the primary
Feature.enable(:geo_group_upload_replication)
Feature.enable(:geo_group_upload_force_primary_checksumming)
Upload a file to a group (e.g., attach an image to a group-level issue or epic description). Alternatively, use the Rails console:
# In Rails console on the primary
group = Group.first
file = CarrierWaveStringFile.new_file(
file_content: "Seeded upload file in group #{group.full_path}",
filename: 'seeded_upload.txt',
content_type: 'text/plain'
)
UploadService.new(group, file, NamespaceFileUploader).execute
Verify the upload exists in the namespace_uploads partition:
Geo::GroupUpload.count
# Should be > 0
Wait for the verification worker to process, or trigger it manually:
# In Rails console on the primary
Geo::GroupUpload.first.replicator.verify
Geo::GroupUpload.first.group_upload_state.reload
Geo::GroupUpload.first.group_upload_state.verification_state
# Should be 2 (verification_succeeded)
Once the upload is created on the primary, Geo will automatically replicate it to the secondary. Check the sync status in the secondary Rails console:
# In Rails console on the secondary
Geo::GroupUploadRegistry.count
# Should be > 0
registry = Geo::GroupUploadRegistry.last
registry.state
# Should be 2 (synced)
If the registry is empty or not yet synced, you can manually trigger sync:
# In Rails console on the secondary
Geo::GroupUploadReplicator.new(model_record_id: Geo::GroupUpload.first.id).sync
# In Rails console on the secondary
registry = Geo::GroupUploadRegistry.last
registry.reload
registry.verification_state
# Should be 2 (verification_succeeded)
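When reading these integers in the console, a small lookup can make the checks easier to interpret. This is a minimal sketch: only the value 2 of each enum (synced, verification_succeeded) is confirmed by the steps above. The other entries and the helper name `describe_registry` are assumptions to be checked against the models.

```ruby
# Integer-to-name lookups for the two state columns checked above.
# Only the value 2 of each enum is confirmed by the steps in this description;
# the remaining entries are assumptions and should be verified against the models.
REGISTRY_STATES = { 0 => :pending, 1 => :started, 2 => :synced, 3 => :failed }.freeze
VERIFICATION_STATES = {
  0 => :verification_pending,
  1 => :verification_started,
  2 => :verification_succeeded,
  3 => :verification_failed,
  4 => :verification_disabled
}.freeze

# Hypothetical helper: turns the raw integers into a one-line summary.
def describe_registry(state, verification_state)
  sync = REGISTRY_STATES.fetch(state, :unknown)
  verification = VERIFICATION_STATES.fetch(verification_state, :unknown)
  "sync=#{sync} verification=#{verification}"
end

puts describe_registry(2, 2)
# prints "sync=synced verification=verification_succeeded"
```

In a Rails console on the secondary, the same helper could be called with `registry.state` and `registry.verification_state`.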
Note: You must be logged in as an admin user. Non-admin users will get null for Geo-related queries.
Note: When querying from the secondary's GraphQL explorer, add a custom header REQUEST_PATH with the value `/api/v4/geo/node_proxy/{node_id}/graphql`.
Open the GraphQL explorer on the secondary instance (http://<secondary-url>/-/graphql-explorer) and run:
query {
  geoNode {
    name
    primary
    groupUploadRegistries {
      nodes {
        id
        state
        verificationState
        groupUploadId
        lastSyncedAt
        verifiedAt
      }
    }
  }
}
Expected result: you should see registry entries with state: "SYNCED" and verificationState: "VERIFIED".
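If the JSON response is saved from the explorer, the expected-result check could be scripted roughly as follows. This is a sketch, not part of the MR: the helper name `unfinished_registries` and the sample payload are illustrative, and the response shape is assumed to mirror the query above.

```ruby
require 'json'

# Returns the registry nodes that are not yet in the expected final state.
# The expected strings are the ones named in the expected result above.
def unfinished_registries(response)
  nodes = response.dig('data', 'geoNode', 'groupUploadRegistries', 'nodes') || []
  nodes.reject { |n| n['state'] == 'SYNCED' && n['verificationState'] == 'VERIFIED' }
end

# Illustrative payload with placeholder ids; this is not real output.
sample = JSON.parse(<<~PAYLOAD)
  {"data": {"geoNode": {"name": "secondary", "primary": false,
    "groupUploadRegistries": {"nodes": [
      {"id": "a", "state": "SYNCED", "verificationState": "VERIFIED"},
      {"id": "b", "state": "PENDING", "verificationState": "PENDING"}
    ]}}}}
PAYLOAD

puts unfinished_registries(sample).map { |n| n['id'] }.inspect
# prints ["b"]
```

An empty result means every returned registry matches the expected state. Note that it is also empty when `geoNode` is null, which is what a non-admin user would see.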
Check the Geo Sites API includes the new group upload statistics:
curl --header "PRIVATE-TOKEN: <your-token>" "http://<primary-url>/api/v4/geo_sites/status"
Look for the new fields in the response:
group_uploads_count
group_uploads_checksummed_count
group_uploads_checksum_failed_count
group_uploads_synced_count
group_uploads_failed_count
group_uploads_registry_count
group_uploads_synced_in_percentage
group_uploads_verified_in_percentage
Visit /admin/geo/sites on the secondary and confirm that "Group Uploads" appears as a new data type with replication and verification progress.
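The check for the new API fields can also be scripted against a saved response. A minimal sketch, assuming each status is a JSON object keyed by the field names listed above; the helper name `missing_group_upload_fields` and the sample object are illustrative only.

```ruby
require 'json'

# Field names copied from the list in this description.
GROUP_UPLOAD_FIELDS = %w[
  group_uploads_count
  group_uploads_checksummed_count
  group_uploads_checksum_failed_count
  group_uploads_synced_count
  group_uploads_failed_count
  group_uploads_registry_count
  group_uploads_synced_in_percentage
  group_uploads_verified_in_percentage
].freeze

# Returns the expected fields that are absent from one status object.
def missing_group_upload_fields(status)
  GROUP_UPLOAD_FIELDS.reject { |field| status.key?(field) }
end

# Illustrative, truncated status object; this is not real API output.
status = JSON.parse('{"group_uploads_count": 1, "group_uploads_synced_count": 1}')
puts missing_group_upload_fields(status).length
# prints 6
```

If the endpoint returns an array of status objects (an assumption here), apply the helper to each element.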
Selective Sync Disabled:
Raw SQL
SELECT
"namespace_uploads".*
FROM
"namespace_uploads"
WHERE
"namespace_uploads"."id" BETWEEN 1 AND 10000;
Query Plan: https://explain.depesz.com/s/nH3V
Selective Sync by Groups:
Raw SQL
SELECT
"namespace_uploads".*
FROM
"namespace_uploads"
WHERE
"namespace_uploads"."id" BETWEEN 1 AND 10000
AND "namespace_uploads"."namespace_id" IN ( WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
"geo_node_namespace_links"."namespace_id" AS id
FROM
"geo_node_namespace_links"
WHERE
"geo_node_namespace_links"."geo_node_id" = 2)
UNION (
SELECT
"namespaces"."id"
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."parent_id" = "base_and_descendants"."id"))
SELECT
"id"
FROM
"base_and_descendants" AS "namespaces");
Query Plan: https://explain.depesz.com/s/62pU
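For reviewers less familiar with recursive CTEs, the `base_and_descendants` traversal can be pictured in memory as below. This is only an illustration of what the query computes, not how Geo implements it. The data structures and the sample tree are hypothetical.

```ruby
require 'set'

# namespaces: hash of namespace id => parent id (nil for top-level).
# linked_ids: namespace ids linked to the Geo node (geo_node_namespace_links).
def base_and_descendants(namespaces, linked_ids)
  children = Hash.new { |hash, key| hash[key] = [] }
  namespaces.each { |id, parent_id| children[parent_id] << id if parent_id }

  seen = Set.new
  queue = linked_ids.dup
  until queue.empty?
    id = queue.shift
    next unless seen.add?(id) # UNION semantics: each id is kept once
    queue.concat(children[id])
  end
  seen
end

# Hypothetical tree: 1 -> 2 -> 3, plus an unrelated top-level namespace 4.
namespaces = { 1 => nil, 2 => 1, 3 => 2, 4 => nil }
selected = base_and_descendants(namespaces, [1])

uploads = [{ id: 10, namespace_id: 3 }, { id: 11, namespace_id: 4 }]
puts uploads.select { |u| selected.include?(u[:namespace_id]) }.map { |u| u[:id] }.inspect
# prints [10]
```

The linked namespaces form the base case, and each pass adds the namespaces whose `parent_id` is already in the set, which mirrors the two halves of the UNION.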
Selective Sync by Organizations:
Raw SQL
SELECT
"namespace_uploads".*
FROM
"namespace_uploads"
WHERE
"namespace_uploads"."id" BETWEEN 1 AND 10000
AND "namespace_uploads"."namespace_id" IN (
SELECT
"namespaces"."id"
FROM
"namespaces"
WHERE
"namespaces"."organization_id" IN (
SELECT
"organizations"."id"
FROM
"organizations"
INNER JOIN "geo_node_organization_links" ON "organizations"."id" = "geo_node_organization_links"."organization_id"
WHERE
"geo_node_organization_links"."geo_node_id" = 2));
Query Plan: https://explain.depesz.com/s/nyun
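The organization variant is a flat filter with no recursion. As a rough in-memory picture (hypothetical data, not the Geo implementation):

```ruby
# namespaces: hash of namespace id => organization id.
# linked_org_ids: organization ids linked to the Geo node
# (geo_node_organization_links).
def namespace_ids_for_organizations(namespaces, linked_org_ids)
  namespaces.select { |_id, org_id| linked_org_ids.include?(org_id) }.keys
end

# Hypothetical data: namespaces 1 and 2 belong to organization 7, namespace 3 to 8.
namespaces = { 1 => 7, 2 => 7, 3 => 8 }
puts namespace_ids_for_organizations(namespaces, [7]).inspect
# prints [1, 2]
```

Uploads are then kept when their `namespace_id` is in the returned list, as in the outer `IN (...)` clause.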
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
@dbiryukov @pursultani I'll discuss the documentation changes with @Alexand, we'll keep you posted
@eread
Achilleas Pipinellis (a43acd8a) at 16 Mar 20:43
Achilleas Pipinellis (c2ff429b) at 16 Mar 20:43
Merge branch 'sk/document-pacakge-cloud-token' into 'master'
... and 1 more commit
doc: Document the source of PACKAGECLOUD_TOKEN variable
The incident https://gitlab.com/gitlab-com/gl-infra/production/-/work_items/21506 occurred because
it was unclear where the value of PACKAGECLOUD_TOKEN should come from. The value was set to a
custom master token for the gitlab/pre-release repository (from the web page
https://packages.gitlab.com/gitlab/pre-release/tokens), but this is incorrect: according to the
PackageCloud documentation, master tokens cannot be used to push packages.
PackageCloud will be deprecated at the end of March 2026. This change keeps the documentation updated, in case we ever run into this in the future.
See Definition of done.
For anything in this list which will not be completed, please provide a reason in the MR discussion.
If you don't have access to this, the reviewer should trigger these jobs for you during the review process.
Trigger:ee-package jobs have a green pipeline running against latest commit.
config/software or config/patches directories are changed, make sure the build-package-on-all-os job within the Trigger:ee-package downstream pipeline succeeded.
Trigger:package:fips manual job within the Trigger:ee-package downstream pipeline must succeed.
dev.gitlab.org to confirm regular branch builds aren't broken.
10, duration 10s, URI scheme://user:passwd@host:port may require quotation or other special handling when rendered in a template and written to a configuration file.
@eread @marcel.amirault I used yarn add --registry https://gitlab.com/api/v4/packages/npm/ --dev @gitlab-org/gitlab-docs-vale-config in gitlab and it changed package.json
Are you saying that it shouldn't?
Btw, don't let me keep you from merging this
Achilleas Pipinellis (453d95c9) at 16 Mar 16:40
Achilleas Pipinellis (50af797f) at 16 Mar 16:40
Merge branch 'eread/tidy-up-markdown-in-docs' into 'main'
... and 1 more commit
For gitlab-org/technical-writing/team-tasks#1598, let's tidy up some Markdown formatting.
If you are a GitLab team member and only adding documentation, do not add any of the following labels:
~"frontend"
~"backend"
~"type::bug"
~"database"
These labels cause the MR to be added to code verification QA issues.
Documentation-related MRs should be reviewed by a Technical Writer for a non-blocking review, based on Documentation Guidelines and the Style Guide.
If you aren't sure which tech writer to ask, use roulette or ask in the #docs Slack channel.
Default behavior, say something like Default behavior when you close an issue.
Configuring GDK, say something like Configure GDK.
That said, I'm also ok to postpone the Omnibus config docs until we roll out the OAK implementation.
@Alexand yep, if it's not really tested and not recommended, let's not document this now. I'm not sure what percentage of users use things not recommended though, they might give good feedback
Thanks for the links, I read https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/selfmanaged_segmentation/ which cleared things up and I now understand where this Helm-only feature introduction fits in
Achilleas Pipinellis (f770c543) at 16 Mar 16:35
Achilleas Pipinellis (ea4649a2) at 16 Mar 16:35
Merge branch 'eread/tidy-up-markdown-in-docs' into 'master'
... and 1 more commit
For gitlab-org/technical-writing/team-tasks#1598, let's tidy up some Markdown formatting.
docs- or ends with -docs, so only the docs-related CI jobs are included
If you are only adding documentation, do not add any of the following labels:
~"feature"
~"frontend"
~"backend"
~"bug"
~"database"
These labels cause the MR to be added to code verification QA issues.
Documentation-related MRs should be reviewed by a Technical Writer for a non-blocking review, based on Documentation Guidelines and the Style Guide.
Default behavior, say something like Default behavior when you close an issue.
Configuring GDK, say something like Configure GDK.
For gitlab-org/technical-writing/team-tasks#1598, let's tidy up some Markdown formatting.
docs-, docs/ or ends with -docs, so only the docs-related CI jobs are included
If you are only adding documentation, do not add any of the following labels:
~"feature"
~"frontend"
~"backend"
~"bug"
~"database"
These labels cause the MR to be added to code verification QA issues.
Documentation-related MRs should be reviewed by a Technical Writer for a non-blocking review, based on Documentation Guidelines and the Style Guide.
Default behavior, say something like Default behavior when you close an issue.
Configuring GDK, say something like Configure GDK.
For gitlab-org/technical-writing/team-tasks#1598, let's tidy up some Markdown formatting.
If you are a GitLab team member and only adding documentation, do not add any of the following labels:
~"frontend"
~"backend"
~"type::bug"
~"database"
These labels cause the MR to be added to code verification QA issues.
Documentation-related MRs should be reviewed by a Technical Writer for a non-blocking review, based on Documentation Guidelines and the Style Guide.
If you aren't sure which tech writer to ask, use roulette or ask in the #docs Slack channel.
Default behavior, say something like Default behavior when you close an issue.
Configuring GDK, say something like Configure GDK.