Understand Container: OCI Specification

OCI is an industry-wide collaborative effort to define open specifications for the container image format and runtime. That is the official line, and it is true. The history of how it got from the initial disagreements to where it stands today is an interesting case study in open source business models and competition.

But the past is the past. Nowadays OCI is, IMO, indisputably THE container standard. As we'll see later in the article, it is adopted by most mainstream container implementations, including Docker, and by container orchestration systems such as Kubernetes. Plus, it is particularly helpful to anyone trying to understand how containers work internally. Open source code is awesome, but it is doubly awesome with high-quality documentation!

Overview

OCI has two specs, an Image spec and a Runtime spec. Below is an overview of what they cover and how they interact.
   Image                         Runtime
                    |
  config            |   runtime config
  layers            |   rootfs
     |              |     |                    | delete
     |              |     |                    |
     |       unpack |     |      create        |     start/stop/exec
Image (spec) -------|-> Bundle ----------> container -------> process
                    |                          |
                    |                          | hooks
                    |

Image (Spec)

The Image spec defines the archive format of container images. An image is unpacked into a runtime bundle, from which we can run a container.
At the top level, an image is just a tarball. Once untarred, it has the layout below.
├── blobs
│   └── sha256
│       ├── 4297f01aae8e36da1ec85e36a3cc5a4b11aa34bcaa1d88cc9ca09469826cb2bf (image.manifest)
│       └── 7ea0496f252ea46535ea6932dc460cb7d82bfc86875d9d2586b6afa1e8807ad0 (image.config)
├── index.json
└── oci-layout
The layout isn't much use without a specification of what each piece is and how the pieces reference each other.

We can ignore the oci-layout file for simplicity. index.json is the entry point. It primarily refers to a manifest, which lists all the "resources" used by a single container image, similar to the AndroidManifest.xml file of an Android APK.

The manifest primarily references the config and the layers.

The config notably contains 1) the configuration of the image, which can and will be converted to the runtime config file of the runtime bundle, 2) the layers, which make up the root file system of the runtime bundle, and 3) some metadata about the image history.

Layers are what make up the final rootfs. The first layer is the base; every other layer contains only the changes relative to the layer below it.
Put into a diagram, it looks roughly like this.
index.json -----> manifest -> Config
                     |           | ref
                     |           |
                     |-------- Layers --> [ Base, upperlayer1, upperlayer2,...]
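
The reference chain in the diagram can be followed with a few lines of Python. This is only a sketch: the helper names `read_blob` and `resolve_image` are made up for illustration, and it assumes that index.json lists a single manifest and that every digest has the form `algorithm:hex`, matching the blobs/sha256/... paths in the layout above.

```python
import json
from pathlib import Path


def read_blob(layout: Path, digest: str) -> bytes:
    """Return the bytes of the blob named by a digest such as 'sha256:4297...'."""
    algorithm, _, hex_digest = digest.partition(":")
    return (layout / "blobs" / algorithm / hex_digest).read_bytes()


def resolve_image(layout: Path) -> dict:
    """Follow index.json -> manifest -> config and collect the layer digests."""
    index = json.loads((layout / "index.json").read_text())
    # Assumption: the index lists a single manifest, so take the first entry.
    manifest_digest = index["manifests"][0]["digest"]
    manifest = json.loads(read_blob(layout, manifest_digest))
    # The manifest points at the config blob and at each layer blob by digest.
    config = json.loads(read_blob(layout, manifest["config"]["digest"]))
    layers = [layer["digest"] for layer in manifest["layers"]]
    return {"manifest": manifest_digest, "config": config, "layers": layers}
```

Calling `resolve_image(Path("some-untarred-image"))` on an untarred image directory (the path here is a placeholder) returns the manifest digest, the parsed config, and the layer digests in order, base layer first.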

More on Layers

A config file is just JSON and is easy. The interesting part is how to represent a file system as a layer and how to union all the layers, given that the layers are diffs.
  • How to represent a layer?
    • For the base layer, tar all the content.
    • For non-base layers, tar the changeset compared with its base. That is, first detect the changes and form a changeset, then tar the changeset as the representation of this layer.
  • How to union all the layers?
    • Apply all the changesets on top of the base layer. This will give you the rootfs.
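
The two steps above can be sketched in Python. To keep it short, this toy version models a file system snapshot as a dict from path to content instead of a real directory tree, and a changeset as a plain dict rather than a tar archive. The names `make_changeset` and `apply_layers` are made up for illustration. In the real image spec, deletions are recorded inside the layer tar as whiteout entries (files prefixed with `.wh.`) rather than as a separate list.

```python
def make_changeset(base: dict, upper: dict) -> dict:
    """Describe how to turn snapshot `base` into snapshot `upper`."""
    return {
        # Files that are new in `upper` or whose content changed.
        "upsert": {
            path: content
            for path, content in upper.items()
            if path not in base or base[path] != content
        },
        # Files present in `base` but gone from `upper`.
        "delete": sorted(path for path in base if path not in upper),
    }


def apply_layers(base: dict, changesets: list) -> dict:
    """Apply changesets in order on top of the base layer to get the rootfs."""
    rootfs = dict(base)  # Copy so the base layer itself stays untouched.
    for changeset in changesets:
        for path in changeset["delete"]:
            rootfs.pop(path, None)
        rootfs.update(changeset["upsert"])
    return rootfs
```

Order matters: each changeset is relative to the result of the ones before it, so applying them out of order can resurrect deleted files or lose later modifications.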

Runtime Spec

Once the image is unpacked into a runtime bundle on the disk file system, the Runtime spec takes over from there. Roughly, its job is to create a container and run the process(es) in the container.
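
As mentioned earlier, part of unpacking is converting the image config into the runtime config of the bundle. Below is a rough Python sketch of that conversion. The function name `runtime_config_from_image` is made up, the `ociVersion` value is only a placeholder, and just a handful of fields are handled (Entrypoint, Cmd, Env and WorkingDir on the image side; process.args, process.env, process.cwd and root.path on the runtime side). A real converter covers many more.

```python
def runtime_config_from_image(image_config: dict) -> dict:
    """Build a minimal runtime config (config.json) from an image config."""
    settings = image_config.get("config", {})
    # The command to run is the entrypoint followed by the default arguments.
    args = (settings.get("Entrypoint") or []) + (settings.get("Cmd") or [])
    return {
        "ociVersion": "1.0.0",  # Placeholder version string (assumption).
        "process": {
            "args": args,
            "env": settings.get("Env") or [],
            "cwd": settings.get("WorkingDir") or "/",
        },
        # The unpacked layers live in the bundle's rootfs directory.
        "root": {"path": "rootfs"},
    }
```

Dumping the returned dict as JSON into a file named config.json next to the rootfs directory gives the two pieces of the bundle shown in the overview diagram: the runtime config and the rootfs.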

Container lifecycle

A container has a lifecycle. In essence, as you can imagine, it can be modeled as the following state diagram.

   create     +---------+  start      +---------+
  +---------> | created |             | started |
              |         +-----------> |         |
              +---------+             +----+----+
                                           |
                                           v stop
              +---------+             +---------+
              | deleted |             | stopped |
              |         | ------------+         |
              +---------+             +---------+
                            delete

You can throw in a few other actions and states, such as pause and paused, but these are the fundamental ones.
Note: the left arrow (<) somehow breaks the whole diagram in my current Blogspot template, so I have omitted it until I find time to fix it.
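
The state diagram translates almost directly into a transition table. The sketch below uses the state and action names from the diagram above, which are not necessarily the exact names used in the spec, and the `Container` class is a toy that tracks nothing but its state.

```python
# Allowed transitions: (current state, action) -> next state.
TRANSITIONS = {
    (None, "create"): "created",
    ("created", "start"): "started",
    ("started", "stop"): "stopped",
    ("stopped", "delete"): "deleted",
}


class Container:
    """Toy container that only tracks its lifecycle state."""

    def __init__(self):
        self.state = None  # Not created yet.

    def do(self, action: str) -> str:
        """Perform an action, or raise if the current state does not allow it."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} a container in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

Adding pause and paused would just mean two more rows in the table, for example `("started", "pause"): "paused"` and a matching resume row.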

Image, Container, and Processes

Containers are created from a (container) image. You can create more than one container from a single image, and you can repack a container, possibly with changes to the base image, into a new image.

Once you have a container, you can run processes inside it, with all the nice things about a container, most notably being self-contained: the processes don't depend on the host libraries.
   images                container            processes
                 +                       +
                 |                       |
                 | create                |
+------------+   |   +---------+  start  |   +---------+
|runtime     +-----> | created |         |   | started |
|Bundle      |   |   |         +-----------> |         |
+------------+   |   +---------+         |   +----+----+
                 |                       |        |
                 |                       |        v stop
                 |   +---------+         |   +---------+
                 |   | deleted |         |   | stopped |
                 |   |         | ------------+         |
                 |   +---------+         |   +---------+
                 |     delete            |
                 |                       |

Implementations and Ecosystems

runC is the reference implementation of the OCI runtime specification. The diagram below shows its relationship with other projects, mostly of Docker origin. Each entity below follows the format org/project.

                        +------------------------+
                        |                        |
                        |   dockerInc/docker     |
                        |                        |
                        +---------+--------------+
                                  | use
                        +---------v--------------+
                        |                        |
                        |   moby/moby            |
                        |                        |
                        +---------+--------------+
                                  | use
+-------------------+   +---------v--------------+
|                   |   |                        |
| oci/runtime-spec  |   | containerd/containerd  |
|                   |   |                        |
+---------+---------+   +---------+--------------+
          ^                       |
          |                       | use
          | impl                  v
          |             +------------------------+      +---------------------+
          |             |                        |      |                     |
          +-------------|oci/runc                +----> |oci/runc/libcontainer|
                        |                        |      |                     |
                        +------------------------+      +---------------------+
To make things look even more crowded (or flourishing), let's throw in some Kubernetes pieces.

CRI is the Container Runtime Interface defined by Kubernetes to allow pluggable container runtimes for k8s. There are currently several implementations, among them cri-containerd and cri-o, both of which actually end up using oci/runc.
                            +-------------------------------+
                            |            k8s/CRI            |
                            | (container runtime interface) |
                            +-----+------------------+------+
                                  | impl             | impl
                                  |                  |
                          +-------+------+   +-------+-------+
                          |cri-containerd|   |cri-o          |
                          +-------+------+   +-------+-------+        k8s
                                  |                  |
+-------------------+   +---------v--------------+   |                container
|                   |   |                        |   |
| oci/runtime-spec  |   | containerd/containerd  |   | use
|                   |   |                        |   |
+---------+---------+   +---------+--------------+   |
          ^                       |                  |
          |                       | use              |
          | impl                  v                  v
          |             +-------------------------------+       +---------------------+
          |             |                               |       |                     |
          +-------------|oci/runc                       +-----> |oci/runc/libcontainer|
                        |                               |       |                     |
                        +-------------------------------+       +---------------------+

That's it for today.