Description
What is the problem you're trying to solve
Our K8s cluster has a mixed deployment of Kata and runc containers: Kata containers run untrusted workloads and runc runs the trusted ones. For the best performance and security, we use the devmapper snapshotter for Kata containers.
However, with the current containerd CRI implementation, the snapshotter config is global, so we cannot keep certain (trusted) pods on overlayfs while putting only Kata containers on devmapper. This limitation causes the following issues:
- Slower image pulling: devmapper-backed image unpacking is 50% slower than overlayfs, which slows image pulls for the many trusted pods. See "Device mapper pull image performance (5x) is much slower than overlayfs" #6625
- Unnecessary data disk partition reservation: instead of using the OS boot disk for overlayfs snapshot data, we now need to reserve a relatively large disk partition for the devmapper thin pool, even though only a small set of untrusted images needs it. Most of that disk allocation is wasted cost and space.
- Slower snapshot preparation on SandboxChange (usually after a node reboot), which slows container startup. See "mkfs.ext4 is slow in devmapper" #6119
- Device mapper is generally harder to operate and less mature, so we would like to adopt it in as small a scope as possible. For example, thin pool data allocation is not released until the device is removed or fstrim is run, so a container that keeps writing to and deleting from its rootfs will eventually run out of space.
Describe the solution you'd like
A Pod annotation could carry the snapshotter type to the CRI plugin. The global setting appears to be a limitation of the containerd CRI plugin rather than of containerd itself: for example, ctr run --snapshotter already lets us choose devmapper or overlayfs per container. Similar to runtimeClass, it would be great to support a dynamic snapshotter override via Pod annotations.
Would containerd consider accepting such an extension, or something along those lines?
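To make the request concrete, here is a minimal sketch of how such an override might be resolved, assuming the snapshotter name arrives as a pod annotation. The annotation key example.io/snapshotter, the function resolveSnapshotter, and the allow-list are all hypothetical names made up for illustration; none of them are existing containerd or CRI APIs.

```go
package main

import "fmt"

// snapshotterAnnotation is a HYPOTHETICAL annotation key, used only for
// illustration. It is not an existing containerd or Kubernetes annotation.
const snapshotterAnnotation = "example.io/snapshotter"

// resolveSnapshotter is a hypothetical helper. It returns the snapshotter
// named in the pod annotations if present and allowed, and otherwise falls
// back to the globally configured default (today's behaviour).
func resolveSnapshotter(annotations map[string]string, allowed map[string]bool, defaultSnapshotter string) (string, error) {
	name, ok := annotations[snapshotterAnnotation]
	if !ok || name == "" {
		// No override requested: keep the global snapshotter.
		return defaultSnapshotter, nil
	}
	if !allowed[name] {
		// Reject unknown names so a pod cannot select an unconfigured snapshotter.
		return "", fmt.Errorf("snapshotter %q requested by annotation is not allowed", name)
	}
	return name, nil
}

func main() {
	allowed := map[string]bool{"overlayfs": true, "devmapper": true}

	// Untrusted Kata pod asks for devmapper.
	kataPod := map[string]string{snapshotterAnnotation: "devmapper"}
	s, err := resolveSnapshotter(kataPod, allowed, "overlayfs")
	fmt.Println(s, err)

	// Trusted runc pod carries no annotation and stays on the default.
	s, err = resolveSnapshotter(nil, allowed, "overlayfs")
	fmt.Println(s, err)
}
```

With something along these lines, trusted pods would stay on overlayfs by default and only annotated Kata pods would use the devmapper thin pool. The allow-list is one possible guard so that pods cannot request a snapshotter the node operator has not configured.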
Additional context
No response