Thanks so much for this! Great work.
@gitlab-bot label frontend backend
Even professional Linux and Git administrators can easily be surprised by their /gitlab/var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/blobs/sha256 folder filling up, Gitlab spitting back HTTP 500 / HTTP 502 errors, and runners returning "no space left on device" errors before everything comes screeching to a halt on what's probably the busiest work day of the year.
It doesn't need to be this way.
It may feel optional to enterprise admins who have extensive resource monitoring and alerting set up, and whose policy is to just keep adding resources because surely venture capital money will keep things running forever. But it is very much not optional for small admins and businesses who manage Gitlab part time and trust Gitlab's sleek UI and tech stack to notify them of potential issues. The lack of 50/70/90% disk space warnings, the absence of any easy recovery or login method when the disk is full, and the barely-visible documentation about preventing disk space problems are a huge issue for exactly the kind of self-hosters who are interested in open source VCS.
Firstly, the entire section on maintaining your Gitlab installation appears to omit garbage collection and treats cleanup policies as sufficient. But cleanup policies are cosmetic: untagging images declutters the GUI, but it won't save you from catastrophic disk space exhaustion.
https://docs.gitlab.com/ee/administration/housekeeping.html
Secondly, even if you read the entire document on setting up and maintaining the Container Registry (and presumably other registries, artifact repositories, etc. have the same or similar issues; we know the Runner itself can hit a similar Docker problem if you spin up too many jobs without a quick enough `docker image prune`, which is related to but separate from garbage-collecting the Registry), you would have to read skeptically and in great detail to catch the quiet aside tucked away in one section suggesting that maybe you might think about possibly pruning untagged manifests. This needs to be FRONT AND CENTER IMPORTANT INFORMATION: IF YOU SET UP CI/CD TO AUTO-GENERATE DOCKER IMAGES OR OTHER ASSETS AND DON'T ALSO PRUNE UNTAGGED IMAGES, YOUR HARD DRIVE WILL EXPLODE.
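For anyone landing here with a full disk, a sketch of the commands that actually reclaim space on an Omnibus install (verify the flags against the docs for your GitLab version before running; the registry goes read-only while garbage collection runs, and the `until` filter value below is purely illustrative):

```shell
# Omnibus GitLab: collect layers no longer referenced by any manifest
sudo gitlab-ctl registry-garbage-collect

# The -m variant also deletes untagged manifests -- this is the step that
# actually frees the space eaten by CI-generated, untagged images
sudo gitlab-ctl registry-garbage-collect -m

# On runner hosts, stale job images are the separate-but-similar problem;
# e.g. prune images unused for roughly a week
docker image prune -af --filter "until=168h"
```

None of this runs automatically out of the box, which is the whole point: it has to be scheduled (e.g. via cron) or it never happens.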
Thirdly, some indication inside the GUI of the need for garbage collection (or some visibility, at least for power users, of the space taken up by untagged / "thought we deleted it but not really deleted" resources) would be very helpful. Gitlab and the docs repeat ad nauseam that setting up cleanup rules is the way to go, as if that answers the problem. It doesn't; it mostly just hides the problem.
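Right now the only visibility is the filesystem itself. A quick way to see what those invisible blobs actually occupy (the path assumes a default Omnibus layout; adjust for your mounts):

```shell
# Total on-disk size of registry blobs, tagged and untagged alike --
# the GUI shows you none of this
sudo du -sh /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/blobs
```

Comparing that number to what the GUI reports per project is usually how admins discover the problem, long after it started.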
Finally, I'd suggest kicking this ticket over to the product team for an iteration focused on basic resource monitoring/alerting and better recovery when resources are completely maxed out. It's never fun to figure out how to SSH into something like Gitlab, especially when it's running inside Docker inside a VM inside the cloud behind a VPN, and there are basic, easy-to-integrate CLI checks that could warn admins of problems and handle them as they arise. It doesn't need to be a lot; literally anything in this space is better than nothing.
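To be concrete about how little it would take, here is a minimal sketch of the kind of 50/70/90% warning I mean. The thresholds, messages, and mount point are all hypothetical, not anything GitLab ships:

```shell
#!/bin/sh
# Hypothetical disk-space tiers -- thresholds and wording are illustrative only.
check_disk() {
  pct="$1"   # used percentage, as an integer
  mount="$2" # label to include in the message
  if   [ "$pct" -ge 90 ]; then echo "CRITICAL: ${mount} at ${pct}% used"
  elif [ "$pct" -ge 70 ]; then echo "WARNING: ${mount} at ${pct}% used"
  elif [ "$pct" -ge 50 ]; then echo "NOTICE: ${mount} at ${pct}% used"
  fi
}

# Pull the used-percent column from df for the registry volume, if it exists.
usage=$(df -P /var/opt/gitlab 2>/dev/null | awk 'NR==2 {gsub(/%/,""); print $5}')
if [ -n "$usage" ]; then
  check_disk "$usage" "/var/opt/gitlab"
fi
```

Drop something like that in cron and pipe the output to mail or a webhook; that alone would have saved most of the people who end up in this issue.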