Capabilities break the super privileges enjoyed by the root user into fine-grained rights (well, just to avoid saying capabilities), so that even as root you are not able to do whatever you want unless you have been granted the corresponding capabilities.
prepare a rootfs
We'll need to install an additional tool (libcap) to explore capabilities, so here are some instructions for preparing such a rootfs.
First, create a Docker container with libcap installed. Then use docker ps -a to find the ID of the container we just ran; it should be the latest one.
Then export the rootfs to create a runc runtime bundle.
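The preparation steps above could be sketched roughly as follows. This is not necessarily the exact command sequence the walkthrough used: the alpine image, the apk package name, the bundle directory name, and the function name prepare_bundle are all assumptions made for illustration. The commands are wrapped in a function so that nothing runs until you call it.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from the original text): the alpine image,
# the "libcap" apk package name (it may differ between Alpine releases),
# and the default bundle directory name.
prepare_bundle() {
    bundle_dir="${1:-capability-bundle}"   # hypothetical directory name

    # Create a container with libcap installed; it exits once apk finishes.
    docker run alpine sh -c 'apk add --no-cache libcap' || return 1

    # "docker ps -lq" prints only the ID of the most recently created container.
    container_id="$(docker ps -lq)"

    # Export the container filesystem into the bundle's rootfs directory.
    mkdir -p "$bundle_dir/rootfs" || return 1
    docker export "$container_id" | tar -C "$bundle_dir/rootfs" -xf - || return 1

    # Generate a default config.json next to rootfs.
    (cd "$bundle_dir" && runc spec)
}
```

Calling prepare_bundle with no argument should leave a directory containing rootfs and config.json, provided docker and runc are installed and the image can be pulled.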
Using the default config.json generated by runc spec, you are not allowed to set the hostname, even as root.
That's because setting the hostname requires the CAP_SYS_ADMIN capability, even for root. We can grant it by adding CAP_SYS_ADMIN to the bounding, permitted, and effective lists of the capabilities attribute of the init process.
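As a rough illustration of that change, the helper below prints what the capabilities section of config.json might look like afterwards. The function name print_caps_fragment is hypothetical, and the three capabilities listed alongside CAP_SYS_ADMIN are assumed to be the defaults that runc spec generates; check your own config.json, since the defaults vary between runc versions.

```shell
#!/bin/sh
# Print a sketch of process.capabilities after adding CAP_SYS_ADMIN.
# Assumption: CAP_AUDIT_WRITE, CAP_KILL and CAP_NET_BIND_SERVICE are the
# defaults from "runc spec"; keep whatever your config.json already lists.
print_caps_fragment() {
    cat <<'EOF'
"capabilities": {
    "bounding": ["CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE", "CAP_SYS_ADMIN"],
    "effective": ["CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE", "CAP_SYS_ADMIN"],
    "permitted": ["CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE", "CAP_SYS_ADMIN"]
}
EOF
}

print_caps_fragment
```

The fragment belongs under the process object of config.json. If jq is available, jq .process.capabilities config.json is one way to compare the real file against it.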
Run another container with the new configuration, and now you are allowed to set the hostname.
Run another command in the same container, and it will be able to set the hostname as well, since it inherits the capabilities of the init process.
get the capability
Get the PIDs of the two processes in the runtime PID namespace.
Check the capabilities of the running processes using their PIDs in the host namespace.
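On Linux, /proc/<pid>/status reports a process's capability sets as hexadecimal bit masks (CapInh, CapPrm, CapEff, CapBnd). The sketch below shows one way to read those lines and to test a single bit. The helper names has_cap and cap_masks are hypothetical, and the two sample masks are worked out by hand from the capability bit numbers rather than copied from a real container.

```shell
#!/bin/sh
CAP_SYS_ADMIN_BIT=21   # bit number from <linux/capability.h>

# has_cap MASK BIT: succeed if bit BIT is set in the hex MASK (no 0x prefix).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# cap_masks PID: print the Cap* lines of a process, given its host PID.
cap_masks() {
    grep '^Cap' "/proc/$1/status"
}

# Illustrative masks computed by hand:
# CAP_KILL (5) + CAP_NET_BIND_SERVICE (10) + CAP_AUDIT_WRITE (29)
default_mask=0000000020000420
# the same set plus CAP_SYS_ADMIN (21)
admin_mask=0000000020200420

for mask in "$default_mask" "$admin_mask"; do
    if has_cap "$mask" "$CAP_SYS_ADMIN_BIT"; then
        echo "$mask: CAP_SYS_ADMIN set"
    else
        echo "$mask: CAP_SYS_ADMIN not set"
    fi
done
```

With a host PID in hand (for example one listed by runc ps <container-id>), cap_masks <pid> prints the real masks. If libcap is installed on the host, capsh --decode=<mask> turns a mask into capability names.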
request additional capability
An exec can request additional capabilities that the container was not started with.
Run another container, xyxy78, without CAP_SYS_ADMIN in the config.json.
Double-check that it really doesn't have the capability.
Start another process in xyxy78, but with the additional CAP_SYS_ADMIN capability, using the --cap option.
Under the hood, the --cap option sets up the capability list for the process that will be exec-ed, just as the config.json does for the init process.
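A minimal sketch of such an exec call is shown below. The wrapper name exec_with_sys_admin and the default hostname value are hypothetical; the container ID xyxy78 is the one used in this walkthrough. The function only wraps a single runc exec invocation, so nothing runs until you call it.

```shell
#!/bin/sh
# exec_with_sys_admin [CONTAINER_ID] [NEW_HOSTNAME]
# Start a process in a running container with CAP_SYS_ADMIN added via --cap,
# and use it to set the hostname.
exec_with_sys_admin() {
    container_id="${1:-xyxy78}"
    new_hostname="${2:-capdemo}"   # hypothetical hostname, for illustration

    runc exec --cap CAP_SYS_ADMIN "$container_id" hostname "$new_hostname"
}
```

Running the same hostname command through a plain runc exec, without --cap, is expected to fail in a container whose config.json lacks CAP_SYS_ADMIN, which makes for a quick before-and-after comparison.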
You can use capsh to explore a little more. Run capsh --print inside the container.
This is the output with default config.json:
This is the output with the added CAP_SYS_ADMIN capability. Compared with the former one, we can see an additional cap_sys_admin+ep in "Current" and cap_sys_admin in the "Bounding Set". The "+ep" means the preceding capabilities are in both the "effective" and "permitted" lists. For more information regarding the capability lists, see capabilities(7).
We have seen how Linux capabilities limit what a process can do and thus increase the security of the container.