-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Description
What is the problem you're trying to solve
Per microsoft/hcsshim#1375, the next minor release of hcsshim after 0.9 is going to require that the binary which uses it as a library be manifested so it can recognise the accurate version and build of Windows on which it is running.
At the moment, containerd.exe and its tests do not have a manifest, as none of the APIs used by the snapshotter, the differ, or the mount code (latter is proposed in #4419) have this requirement, but the new version of hcsshim requires version-detection support in an API used by the snapshotter and (at least) the test pkg/os TestResolvePath.
Currently the CRI integration tests are manifested in order to choose the correct container image, which leads to pulling in the whole github.com/Microsoft/hcsshim/test module just for this one thing, which is pretty expensive.
Conversely, the code which uses the changed hcsshim API that triggered this discussion * actively avoids pulling in that module and uses a different method for version identification, that has been considered but ultimately rejected by hcsshim maintainers.
* TBBle@5bfad0d shows that the upgrade to newer hcsshim removes this version-detection code from containerd, as the same adaption behaviour has been lifted into computestorage.FormatWritableLayerVhd itself. However, it broke the ABI without breaking the API, so annoyingly this is a silent breakage until you run the test. I might propose a PR to hcsshim to rename the API so that the breakage is visible at the compiler level, if no one does it first. ^_^
Describe the solution you'd like
Option 1: Manifest all the things
The most straight-forward path seems to be to manifest all executable outputs of the Windows containerd build pipeline, including tests. This would probably involve replacing both of the above examples of Windows version-management (with and without manifest) in containerd with the use of the github.com/Microsoft/hcsshim/osversion package (which requires manifesting) and integrating something like go-winres and // go:generate, e.g., as recently done in moby/moby#43431, and I guess referencing the manifest package any time we reference a package from the hcsshim module.
Option 2: Punt more Windows support into plugins
If there's really a really strong desire to not manifest the containerd.exe binaries and/or the tests, then an alternative solution would be to migrate the Windows snapshotter (which is the part most-likely to be indirectly affected by this change) to be external, and perhaps the differer and mount support must be done too. This is assuming that the external snapshotter is like the runtime shim and relies on an external executable that could be manifested. However, I'm totally unfamiliar with the external snapshotter except that it exists. (For reference, the runhcs external shim is manifested, so that's the model I'm assuming external snapshotters work on). Edit: I see the docs now, and it looks like snapshotters and differs can be external plugins. Edit again: And if I read the docs, I see only snapshot and content external plugins are actually supported.
It seems likely that if this approach is taken, then the differ and mount behaviours calling hcsshim should also be "external", possibly part of the same snapshot binary or a bundled binary, since they rely on hcsshim, and even if they don't have the problem now (I haven't confirmed this either way), there's no reason to expect they won't in future as hcsshim fulfils an aspect of its purpose, adapting API differences and fixes across HCS versions in different Windows releases. I understand the external snapshotter API does not currently support that, so based on the proposed macOS external-mounter implementation in #5935, we'd end up with a known-name binary to call on Windows. wclayer from the hcsshim repo covers about 80% of this already, by non-coincidence, and could easily be extended to cover the rest, by tracking mount source so unmount C:\target works.
This latter approach wouldn't fix the Windows version of os/pkg TestResolvePath so either that test would need to be rewritten to not rely on VHDs, or to do its VHD formatting without the benefit of hcsshim's computestorage.FormatWritableLayerVhd API. Formatting VHDs without hcsshim is "a pain", and this is the only place that containerd wants to do it, so this pain has a pretty low return-on-investment.
This approach would helpfully resolve #1681 and improve #6551 by punting the winio.EnableProcessPrivileges calls in containerd out into external binaries. So this option has a certain attractiveness for that reason, as well as allowing the Windows snapshotter/differ to be iterated on with a separate lifecycle to containerd, similarly to how the runtime-v2 shim does for runtimes.
Option 3: Kick it back to hcsshim for a non-manifested solution for hcsshim-as-a-library
A third option would be to make the case that hcsshim should provide a way to globally specify its version check results from the library user, so that containerd could override the results of github.com/Microsoft/hcsshim/osversion.Build() and then use a solution that does not require manifesting, such as the registry lookup used in os/pkg TestResolvePath, or another solution that isn't considered as "test-only".
Additional context
Note that computestorage.FormatWritableLayerVhd in hcsshim (the ABI change I hit) which now depends on the binary it is part of being manifested, is also used internally by code called from the Windows snapshotter, in the base layer Import path I believe, so simply exposing a version switch in the API is not feasible as it would end up being applied in a bunch of APIs over time.
This version-dependency may well exist in other APIs used by the snapshotter, differ, or proposed mount code; I hit a bug in containerd/continuity that blocked finishing testing with #4419, so I haven't demonstrated this either way.