Docker RunC Container Escape

Docker containers let developers package an application together with all of its dependencies and components in a single place, so that it runs quickly and reliably across many different computing environments. On the surface, Docker containers can seem safe, because they isolate an application and its dependencies in a self-contained unit. But nothing is entirely secure, and that goes for Docker as well.

RunC is an open source command-line utility for spawning and running containers. At the time of writing, it is the default container runtime for Docker, containerd, Podman and CRI-O. The RunC container breakout, identified as CVE-2019-5736, is a vulnerability that is exploitable in all versions of Docker earlier than 18.09.2. By exploiting it, a malicious actor can run commands on the host machine as a privileged user.

The exploit

There are two ways to exploit this vulnerability. One is to craft a rogue image that triggers the exploit when it is run, whether by an unsuspecting user or by the attacker themselves. The other is to compromise a running container by placing a malicious binary in it.

For the purposes of this post, we will focus on the second method. If you want to learn more about this vulnerability after reading this post, we recommend this blog post authored by the people who discovered it.

How Does it Work?

This exploit works by overwriting and executing the host system’s RunC binary from within a container.

An attacker needs root access in the container to start a malicious binary that waits for a shell to be opened in the container. The exploit, which allows code execution as root on the host, triggers when someone (attacker or victim) uses docker exec to get a shell in the compromised container.

As an example, if the target binary is /bin/bash, it can be replaced with an executable script whose interpreter path is #!/proc/self/exe. When docker exec then has RunC execute /bin/bash inside the container, the target of /proc/self/exe is executed instead. Because the process performing the exec is RunC itself, that target is the RunC binary on the host.
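To see why that interpreter path matters, it helps to look at what /proc/self/exe is: a symbolic link that the Linux kernel resolves to the executable of whichever process looks it up. The minimal Python sketch below assumes a Linux system with /proc mounted, and the function name is our own. Run under Python, it prints the path of the Python interpreter. In the attack, the process doing the lookup is RunC, so the link resolves to RunC.

```python
import os


def current_executable():
    """Return the path of the binary backing the calling process (Linux only)."""
    # /proc/self/exe is a symlink maintained by the kernel. It always refers
    # to the executable image of the process that resolves it.
    return os.readlink("/proc/self/exe")


print(current_executable())
```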

If you’re interested in testing this exploit yourself, there are several working proofs of concept out there to choose from, such as this one or this one.

The fix

As stated in the aforementioned blog post, the following fix can be applied to mitigate this vulnerability:

Create a memfd (a special file that exists only in memory).
Copy the original RunC binary to this fd.
Before entering namespaces, re-exec RunC from this fd.

This fix ensures that if the attacker overwrites the binary pointed to by /proc/self/exe, the host is not harmed: what gets overwritten is only a copy of the host binary, stored entirely in memory (tmpfs).

Mitigation

The blog post also lists several possible mitigations for when an unpatched RunC is in use:

Use Docker containers with SELinux enabled (--selinux-enabled). This prevents processes inside the container from overwriting the host docker-runc binary.
Use a read-only file system on the host, at least for storing the docker-runc binary.
Use a low-privileged user inside the container, or a new user namespace with uid 0 mapped to that user (that user should then not have write access to the RunC binary on the host).
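For the last mitigation, one way to check whether root inside a container is really root outside it is to read the user namespace mapping in /proc/self/uid_map. The Python sketch below parses that file's three-column format (first uid inside the namespace, first uid outside it, length of the range). The function names are our own, and the sketch assumes a single level of user namespaces, so that "outside" means the host.

```python
def host_uid_for(container_uid, uid_map_text):
    """Translate a uid inside a user namespace to the uid it maps to outside.

    Returns None if the uid is not covered by any mapping line.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


def container_root_is_host_root(uid_map_text):
    """True if uid 0 in the namespace maps to uid 0 outside it."""
    return host_uid_for(0, uid_map_text) == 0


# No remapping: uid 0 inside is uid 0 outside.
print(container_root_is_host_root("0 0 4294967295"))   # True
# Remapped: uid 0 inside is an unprivileged uid outside.
print(container_root_is_host_root("0 100000 65536"))   # False
```

Inside a container, the real mapping can be passed in with open("/proc/self/uid_map").read().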

Conclusion

Docker is a great tool for running containers. However, keep in mind that vulnerabilities will inevitably be found in it. That’s why it’s important to keep your software up to date. If you are using Docker version 18.09.2 or later, then this vulnerability shouldn’t pose a threat to you.
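As a quick sanity check, a version string such as the one printed by docker version --format '{{.Server.Version}}' can be compared against that threshold. The Python sketch below is a minimal example with function names of our own. It applies only the 18.09.2 cutoff named in this post and does not account for fixes backported by vendors or distributions to older release lines.

```python
import re

# First Docker release named in this post as not affected.
PATCHED = (18, 9, 2)


def parse_docker_version(text):
    """Extract (major, minor, patch) from strings like '18.09.2' or '18.06.1-ce'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised Docker version: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(text):
    """True if the version is at or above the threshold used in this post."""
    return parse_docker_version(text) >= PATCHED


print(is_patched("18.09.1"))  # False
print(is_patched("18.09.2"))  # True
```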