Published: 10/11/2024
5 minute read
What is new in containerd 2.0?
Containerd recently released its first major update since the 1.0 release over 7 years ago. It provides some exciting new features, improves compatibility, removes some deprecated functions and may even improve performance.
This post aims to give a high-level overview of the changes made in containerd and takes a closer look at the user-namespaces functionality.
ℹ️ Containerd is a Container-Runtime-Interface (CRI) compliant implementation that can manage the entire lifecycle of a container. It is used in Docker and most Kubernetes flavors (including AKS/EKS/GKE) to manage all container processes.
How do I upgrade?
The containerd project is committed to keeping the upgrade path as simple as possible. Most settings should work the same as in containerd v1.7.x. If you use the deprecated aufs snapshotter (most likely you do not), you have to switch to a different snapshotter. Another change is that you have to explicitly re-enable support for image schema V1 if you still depend on such images.
Besides that, there should be few changes that require attention. To upgrade to the newer config version v3, you can just run containerd config migrate.
New functions
The GitHub release of containerd 2.0 gives a first overview of the changes, but does not explain the implications of each change. I picked a few highlights and will cover them in more detail:
User-Namespaces
This is a big one for me and will increase security of containerized workloads by a lot!
Why is it important?
If you run a container today, you can set the UID, the user ID under which the main container process runs. Every single security guide regarding containers recommends running container processes as a non-root, unprivileged user, whose IDs typically start in the 1000 range.
For most applications this is not an issue, but there are certain processes that require root. If processes need to install packages they require root, CI jobs often require root and it is really hard to run an SSH server as non-root. Even one of the most used containers, nginx, starts as root and then drops its permissions to a non-privileged user. The problem with root is that if an attacker manages to escape the container sandbox, they are automatically root on the host system, since user IDs in a container map one-to-one to the ones on the host.
User namespaces change this. If you run a container with the user-namespaces feature enabled, the container can run as root, but has a different, unprivileged user ID on the host. This is a great feature that drastically improves security without a performance penalty.
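To make the remapping a bit more tangible, here is a minimal Python sketch of the lookup the kernel performs. It relies on the uid_map format used under /proc, where each line holds three numbers: the first ID inside the namespace, the first ID on the host, and the length of the range. The concrete mapping in EXAMPLE_MAP and the helper names are made up for illustration and are not taken from containerd.

```python
from typing import List, Optional, Tuple

# One mapping entry: (first ID inside the namespace, first ID on the host, range length)
UidMapEntry = Tuple[int, int, int]


def parse_uid_map(text: str) -> List[UidMapEntry]:
    """Parse text in the /proc/<pid>/uid_map format into mapping entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        inside, outside, length = (int(field) for field in line.split())
        entries.append((inside, outside, length))
    return entries


def to_host_uid(container_uid: int, entries: List[UidMapEntry]) -> Optional[int]:
    """Translate a UID seen inside the container to the UID the host sees."""
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None  # this UID is not mapped in the namespace


# Hypothetical mapping: container UIDs 0..65535 become host UIDs 100000..165535
EXAMPLE_MAP = "0 100000 65536\n"
entries = parse_uid_map(EXAMPLE_MAP)

print(to_host_uid(0, entries))      # 100000 (root in the container, unprivileged on the host)
print(to_host_uid(1000, entries))   # 101000
print(to_host_uid(70000, entries))  # None (outside the mapped range)
```

With a mapping like this, a process that escapes the sandbox as UID 0 would only be UID 100000 on the host, which is an ordinary unprivileged user.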
The 2.0 release of containerd supports user namespaces by default, but you may still have to enable them in your kernel via user.max_user_namespaces=1048576 (suggested value from CoreOS for CRI-O).
While the feature now ships with containerd, it is still in beta for Kubernetes. You can enable it by setting the feature gate UserNamespacesSupport to true on the API server.
To start a pod that uses user namespaces, set the hostUsers parameter to false. That's all you need to do. This can also be enforced via admission controllers.
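If you are not sure whether the kernel setting is already in place, you can read it back from procfs. The following is a small sketch, assuming a Linux host where the sysctl user.max_user_namespaces is exposed as /proc/sys/user/max_user_namespaces. The function names are my own and purely illustrative.

```python
from pathlib import Path
from typing import Optional

# procfs location of the sysctl user.max_user_namespaces
SYSCTL_PATH = Path("/proc/sys/user/max_user_namespaces")


def read_max_user_namespaces(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the configured limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, kernel without user namespaces, or unreadable


def user_namespaces_enabled(path: Path = SYSCTL_PATH) -> bool:
    """A limit of 0 (or a missing file) means no user namespaces can be created."""
    value = read_max_user_namespaces(path)
    return value is not None and value > 0


print("max_user_namespaces:", read_max_user_namespaces())
print("user namespaces usable:", user_namespaces_enabled())
```

If the second line prints False, setting the sysctl as described above is the first thing to try.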
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: user-namespaces-demo
  name: user-namespaces-demo
spec:
  # Enables user-namespaces
  hostUsers: false
  containers:
  - image: nginx
    name: user-namespaces-demo
    resources: {}
Intel ISA-L’s igzip
This is a nice addition which may speed up your container startup by a few hundred milliseconds. But how?
Container images are just a bunch of gzipped archives layered on top of each other. A manifest specifies which archives belong to which image. When an image is pulled it gets transferred via HTTP over the network and has to be extracted and decompressed. Most of the time this is done via gzip.
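To illustrate the layering idea, here is a toy Python sketch that builds two gzip-compressed tar archives in memory and unpacks them in order, so that files from the later layer replace files from the earlier one. It is a heavy simplification and not how containerd unpacks images: it ignores manifests, digests, whiteout files and file metadata, and the helper names make_layer and apply_layers are invented for this example.

```python
import io
import tarfile
from typing import Dict, List


def make_layer(files: Dict[str, bytes]) -> bytes:
    """Pack the given files into a gzip-compressed tar archive (one 'layer')."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def apply_layers(layers: List[bytes]) -> Dict[str, bytes]:
    """Decompress and unpack the layers in order; later layers win on conflicts."""
    rootfs: Dict[str, bytes] = {}
    for blob in layers:
        # The gzip decompression happens here; this is the step a faster
        # implementation speeds up when pulling real images.
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
            for member in tar.getmembers():
                if member.isfile():
                    rootfs[member.name] = tar.extractfile(member).read()
    return rootfs


base = make_layer({"etc/motd": b"base\n", "bin/app": b"v1"})
update = make_layer({"bin/app": b"v2"})

rootfs = apply_layers([base, update])
print(sorted(rootfs))     # ['bin/app', 'etc/motd']
print(rootfs["bin/app"])  # b'v2'
```

Every pulled layer goes through that decompression step, which is why the choice of gzip implementation shows up in startup time.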
But gzip is single-threaded and does not use the full potential of modern processors. To overcome this, there is also a multithreaded implementation of gzip called pigz. While pigz is faster than the default variant, igzip is even more performant. According to a benchmark by simonis, igzip is twice as fast as gzip.
If you want to profit from this improvement, just install igzip (the Debian package is called isal). Containerd automatically checks which implementations are installed on the system and picks the most performant one.
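The "pick the fastest one that is installed" idea can be sketched in a few lines of Python. This is only an illustration of the approach and not containerd's actual lookup code: the candidate order (igzip before pigz) follows the performance ranking described above, and the function name and the "builtin-gzip" fallback label are made up for this example.

```python
import shutil
from typing import Callable, Optional, Sequence

# Assumed preference order, fastest first; not taken from containerd's source.
CANDIDATES = ("igzip", "pigz")


def pick_decompressor(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the first candidate found on PATH, else fall back to the built-in gzip."""
    for name in candidates:
        if which(name):
            return name
    return "builtin-gzip"  # hypothetical label for the single-threaded fallback


print("would use:", pick_decompressor())
```

Passing the lookup function as a parameter keeps the sketch easy to try out without installing anything, for example pick_decompressor(which=lambda name: None) always yields the fallback.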
Container Checkpoints
This is more of a debug function, but with containerd 2.0 you can now freeze an entire container in its current state, including all memory and running processes, and export it. The container can then be restored or further inspected. You can read more about this on the CRIU wiki.
Notable mentions
- The config format version changed to v3. You can migrate it with containerd config migrate
- Containerd now allows container processes to bind to privileged ports by default.
- Enables the Container-Device-Interface (CDI) by default
- Deprecated aufs snapshotter was deleted