Revert commit that changed LimitNOFILE to infinity to avoid regressions#7566
Revert commit that changed LimitNOFILE to infinity to avoid regressions#7566rayburgemeestre wants to merge 1 commit intocontainerd:mainfrom
Conversation
|
Hi @rayburgemeestre. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…md_updates" This reverts commit 6c74c39, reversing changes made to 09814d4. Signed-off-by: Ray Burgemeestre <[email protected]>
efb4b88 to
d33b2a6
Compare
|
I wonder if this needs some discussion (I'm a bit behind on the linked discussions, and probably should post there as well). So the TL;DR is that; The intent of the
The culprit in the linked issues with MySQL (#3201, #6707, docker-library/mysql#579) is that these containers are started without limits set, so they run with "unconstraint" limits. It looks like MySQL as a default optimizes for performance (?) and consumes "what it can use"; due to the increased limits, this means it consumes way more than before, which is causing the issue. So the "correct" fix for these is to make sure containers are started with the intended limits on resources (cpu, memory, pid limit, ulimits). Those limits vary depending the use-case; some instances may need (expect to need) more resources than others, but in all cases it's good practice to set limits to the expected requirements for it to consume. So, the question is; what should be done?
|
|
Oh this reminds me #6541. Writing a systemd unit file for multiple systemd versions is not the thing. We do need to pick the systemd version first... |
|
I'm curious to know how many people are consuming the unit file from the containerd repo. Typically, distributions that package containerd would be responsible for shipping their own unit file and can make adjustments to it based on the kernel and systemd version available in that distribution. From containerd's perspective, we can't predict what every installed system will have with respect to a kernel and systemd version so we'll need to pick some behavior in our unit file, but that won't necessarily be appropriate for every system. |
|
Hi all,
Took a bit of googling and I found this SO thread and it fixed both issues https://stackoverflow.com/questions/73185002/yum-update-stucks-inside-docker This is whats currently distributed on the fedora 35 machine in /usr/lib/systemd/system/containerd.service and my current version Not sure if this is the place to put it as this might be more of a fedora packaging issue as you guys are discussing but wanted to make it known |
|
I believe the commit should be reverted, as it is a regression in some circumstances. The decision to make the limit infinite (unopiniated) is fine, but it should be communicated as a breaking change in a later release. As far as I know, this change was not made intending to be a breaking change. |
|
Another datapoint: |
Doesn't look like a plausible explanation to me. When you run MySQL without docker, it doesn't have any limits applied and it doesn't try to consume more RAM than is available. When I run MySQL in docker now, it tries to allocate 16.5 Gb of RAM immediately on start, although my laptop has only 16 Gb. There must be something else that causes MySQL to consume RAM in the latest docker so greedily. |
It will have limits applied; Limits on the host: Limits in container: |
Apparently with some recent releases of some OSes where new containerd has been released, default docker community edition causes Airlfow to immediately consume all memory. This happens in Breeze and Docker Compose at least. There is a workaround described in: ttps://github.com/moby/moby/issues/43361#issuecomment-1227617516 to use Docker Desktop instead. The issue is tracked in containerd in this issue - proposing to revert the change (as it impacts other applications run in docker, not only Airlfow): containerd/containerd#7566
Apparently with some recent releases of some OSes where new containerd has been released, default docker community edition causes Airlfow to immediately consume all memory. This happens in Breeze and Docker Compose at least. There is a workaround described in: ttps://github.com/moby/moby/issues/43361#issuecomment-1227617516 to use Docker Desktop instead. The issue is tracked in containerd in this issue - proposing to revert the change (as it impacts other applications run in docker, not only Airlfow): containerd/containerd#7566
|
Just an FYI: This has bitten me in the behind 2 times the last 2 weeks: once for slapd (openldap) and once for airflow (python). |
|
If I may, I would recommend adding a note somewhere to stop this cycle
|
|
This took a bit longer than expected to put together 😅
Personal experience with
|
To
|
History of decisions for
|
|
Almost there, I think I've connected all the dots in this post 👍 March 2019 - Proposal that
|
Summary
Reasons to change away from
|
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
polarathene
left a comment
There was a problem hiding this comment.
@rayburgemeestre if you're seeing this could you please apply the following change? Thanks! 😁
| LimitNPROC=infinity | ||
| LimitCORE=infinity | ||
| LimitNOFILE=infinity | ||
| LimitNOFILE=1048576 |
There was a problem hiding this comment.
| LimitNOFILE=1048576 |
For the equivalent docker.service config, the decision was to drop this line, and it'll be part of the v25 release of moby.
Equivalent change should hopefully follow soon here 🙏
There was a problem hiding this comment.
Sorry for being so slow. I see in the meantime you created a separate PR already (#8924).
Looks like it's about to be merged 🤞
There was a problem hiding this comment.
No worries, I know how it is :)
Thanks for kickstarting the process with the original attempt here! ❤️
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
containerd recently changed the default LimitNOFILE from 1048576 to infinity. This breaks various applications in containers such as cups, which cannot start and/or print with the infinite ulimit. Issue: moby/moby#45204 PR: containerd/containerd#7566 Signed-off-by: Christian Stewart <[email protected]>
|
FWIW, in Home Assistant OS we have been experience issues with the new defaults as well, I've documented the problems in home-assistant/operating-system#2438. For now we fixed/worked around the challenges with the new default in the affected containerized user space applications. That said, reading the conclusion of this research I see that we probably rather should have reverted that change instead 😰 In any case, we'd like to adopt whatever the upstream projects conclude on. Which brings me to...
It seems that the sibling PR got merged meanwhile. Is there anything holding back containerd from merging this change. |
This PR has been superseded by mine, where I asked the same question about a week ago. No response yet. Looks like the maintainers are onboard with the change now, and all feedback / concerns has been addressed. They're just very busy and I imagine drowning in notifications 😅 I'll try reach out more directly later this month if it goes back to radio silence. I've currently directly helped 3 projects It'd be great to not need to push such solutions through all affected projects though 😝 I've provided extensive evidence for why |
Hi there,
This PR is reverting the change introduced by the following PR: #4475
In our case we experienced an issue with it with RHEL 9 and Rocky 9. The system would OOMKill processes running in containers (spawned via Kubernetes, using containerd)
With the limit set to INFINITY the system would become either unresponsive if unlucky, or the process would be killed, similar to below.
This is a very lightweight mysql server, on a very beefy node getting killed right away. This also happened with java stuff, uwsgi, and other processes we have automated tests with.
We tried with Kubernetes 1.24, and 1.21, with cgroups v2 and v1, later tried kubernetes on docker, where it worked. Which led us to the discovery that docker is using
LimitNOFILE=1048576and containerd is usingLimitNOFILE=infinityby default.Others have been running into this as well, for example:
We can work around it, but I believe simply reverting the change will make life easier for others, and maybe therefore worth doing? It would have saved me personally a couple of days troubleshooting.