Understand Container: OCI Specification

OCI (the Open Container Initiative) is an industry-wide collaborative effort to define open specifications for the container image format and runtime. That is the official line, and it is true. How it got from the initial disagreement to where it stands today is a very interesting case study in open source business models and competition.

But the past is the past. Nowadays OCI is, IMO, indisputably THE container standard: as we'll see later in the article, it has been adopted by most mainstream container implementations, including Docker, and by container orchestration systems such as Kubernetes. It is also particularly helpful to anyone trying to understand how containers work internally. Open source code is awesome, but it is doubly awesome with high-quality documentation!

Overview

OCI has two specs, an Image spec and a Runtime spec. Below is an overview of what they cover and how they interact.
       Image       |                  Runtime
     config        |  runtime config
     layers        |  rootfs
                   |
                   |                       delete
                   |                          |
          unpack   |            create        |   start/stop/exec
Image (spec) ------|-> Bundle ----------> container -------> process
                   |                          |
                   |           hooks          |

Image (Spec)

The Image spec defines the archive format of container images, which are unpacked into a runtime bundle from which we can run a container.
At the top level, an image is just a tarball; once untarred, it has the layout below.
├── blobs
│   └── sha256
│       ├── 4297f01aae8e36da1ec85e36a3cc5a4b11aa34bcaa1d88cc9ca09469826cb2bf (image.manifest)
│       └── 7ea0496f252ea46535ea6932dc460cb7d82bfc86875d9d2586b6afa1e8807ad0 (image.config)
├── index.json
└── oci-layout
The layout isn't that useful without a specification of what these files are and how they reference each other.

We can ignore the oci-layout file for simplicity. index.json is the entry point; it primarily contains a reference to a manifest, which lists all the "resources" used by a single container image, similar to the AndroidManifest.xml file in an Android APK.

The manifest contains primarily the config and the layers.

The config notably contains 1) the configuration of the image, which can and will be converted to the runtime config file of the runtime bundle, 2) the layers, which make up the root file system of the runtime bundle, and 3) some metadata regarding the image history.

Layers are what make up the final rootfs. The first layer is the base; every other layer contains only the changes relative to the layer below it.
Put into a diagram, it looks roughly like this.
index.json -----> manifest -> Config
                     |          | ref
                     |          |
                     |-------- Layers --> [ Base, upperlayer1, upperlayer2,...]
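
To make the reference chain concrete, here is a minimal sketch that walks an unpacked image layout from index.json to the manifest and on to the config and the layer digests. It assumes the layout directory shown earlier, assumes the first manifest listed in index.json is the one we want, and the helper names (read_blob, resolve_image) are my own, not part of any OCI tooling. Each blob is stored under its digest, so the sketch also checks that the content hashes to the file name.

```python
import hashlib
import json
from pathlib import Path


def read_blob(layout: Path, digest: str) -> bytes:
    """Read blobs/<algorithm>/<hex> and check the content against its digest."""
    algorithm, hex_value = digest.split(":", 1)
    data = (layout / "blobs" / algorithm / hex_value).read_bytes()
    if hashlib.new(algorithm, data).hexdigest() != hex_value:
        raise ValueError(f"digest mismatch for {digest}")
    return data


def resolve_image(layout: Path) -> tuple[dict, list[str]]:
    """Follow index.json -> manifest -> (config, layer digests)."""
    index = json.loads((layout / "index.json").read_text())
    # Assumption: use the first manifest listed in the index.
    manifest_digest = index["manifests"][0]["digest"]
    manifest = json.loads(read_blob(layout, manifest_digest))
    config = json.loads(read_blob(layout, manifest["config"]["digest"]))
    # Layers are listed base first, then each upper layer in order.
    layer_digests = [layer["digest"] for layer in manifest["layers"]]
    return config, layer_digests
```

Calling resolve_image(Path("some-unpacked-image")) on a directory laid out as above should return the parsed config and the ordered list of layer digests; the directory name here is only a placeholder.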

More on Layers

A config file is just JSON, so that part is easy. The interesting part is how to represent a file system as a layer and how to union all the layers, given that the layers are diffs.
  • How to represent a layer?
    • For the base layer, tar all the content.
    • For non-base layers, tar the changeset compared with its base: first detect the changes to form a changeset, then tar the changeset as the representation of this layer.
  • How to union all the layers?
    Apply all the changesets, in order, on top of the base layer. This gives you the rootfs.
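
As a rough illustration of the union step, the sketch below models each layer as a plain dict from path to file content (an assumption for brevity; real layers are tar archives) and applies the changesets in order. Deletions are represented the way the OCI image layer spec does it, with an empty "whiteout" entry whose name is the deleted name prefixed by .wh.. The function names are hypothetical, and opaque whiteouts, file metadata, and hard links are left out.

```python
WHITEOUT_PREFIX = ".wh."


def apply_changeset(rootfs: dict[str, bytes], changeset: dict[str, bytes]) -> dict[str, bytes]:
    """Return a new file tree with one layer changeset applied on top of rootfs."""
    result = dict(rootfs)
    # Pass 1: whiteouts only remove entries that came from lower layers.
    for path in changeset:
        directory, _, name = path.rpartition("/")
        if name.startswith(WHITEOUT_PREFIX):
            hidden = name[len(WHITEOUT_PREFIX):]
            target = f"{directory}/{hidden}" if directory else hidden
            for existing in list(result):
                if existing == target or existing.startswith(target + "/"):
                    del result[existing]
    # Pass 2: added and modified files replace whatever was there before.
    for path, content in changeset.items():
        if not path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
            result[path] = content
    return result


def union_layers(layers: list[dict[str, bytes]]) -> dict[str, bytes]:
    """Apply every changeset in order, starting from the base layer."""
    rootfs: dict[str, bytes] = {}
    for changeset in layers:
        rootfs = apply_changeset(rootfs, changeset)
    return rootfs
```

Processing whiteouts before additions matters: a whiteout should hide only what lower layers contributed, not files added in the same layer.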

Runtime Spec

Once the image is unpacked to a runtime bundle on disk, the Runtime spec takes over from there. Roughly, its job is to create a container and run the process(es) in the container.
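
A runtime bundle is essentially a rootfs directory plus a runtime config file. As a hedged sketch of how the image config turns into that runtime config, the function below maps a small subset of fields: Entrypoint and Cmd become process.args, Env becomes process.env, and WorkingDir becomes process.cwd. The function name is hypothetical, the ociVersion value is a placeholder, and a real conversion covers many more fields (user, mounts, namespaces, and so on).

```python
def to_runtime_config(image_config: dict, rootfs_path: str = "rootfs") -> dict:
    """Build a minimal runtime config dict from an image config dict."""
    cfg = image_config.get("config") or {}
    # The process command line is the entrypoint followed by the default arguments.
    args = (cfg.get("Entrypoint") or []) + (cfg.get("Cmd") or [])
    return {
        "ociVersion": "1.0.0",  # placeholder value, an assumption for this sketch
        "root": {"path": rootfs_path},
        "process": {
            "args": args,
            "env": cfg.get("Env") or [],
            # Fall back to "/" when the image does not set a working directory.
            "cwd": cfg.get("WorkingDir") or "/",
        },
    }
```

Dumping the returned dict with json.dumps would give the skeleton of the runtime config file that sits next to the rootfs in the bundle.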

Container lifecycle

A container has a lifecycle. In essence, as you can imagine, it can be modeled as the following state diagram.

You can throw in a few other actions and states, such as pause and paused, but those are the fundamental ones.
   create    +---------+   start    +---------+
+----------> | created | +--------> | started |
             |         |            |         |
             +---------+            +----+----+
                                         |
                                         v stop
             +---------+            +---------+
             | deleted |            | stopped |
             |         | -----------+         |
             +---------+   delete   +---------+
Note: the left arrow (<) somehow breaks the whole diagram under my current Blogspot template, so I have omitted it until I find time to fix it.
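
The state diagram above can be sketched as a tiny state machine. This is only an illustration using the state names from this article (the Runtime spec itself uses slightly different names), and "new" is a hypothetical name I made up for the state before create, which the diagram leaves unnamed.

```python
# Transition table for the diagram: (current state, action) -> next state.
# "new" is a hypothetical name for the state before `create`.
TRANSITIONS = {
    ("new", "create"): "created",
    ("created", "start"): "started",
    ("started", "stop"): "stopped",
    ("stopped", "delete"): "deleted",
}


class Container:
    """Tracks where a container is in its lifecycle."""

    def __init__(self) -> None:
        self.state = "new"

    def apply(self, action: str) -> str:
        """Perform an action, rejecting it if the current state does not allow it."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} a container in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

Adding pause and paused would just mean two more rows in the table, which is why a table-driven design is convenient here.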

Image, Container, and Processes

Containers are created from a (container) image. You can create more than one container from a single image, and you can repack a container, possibly with changes to the base image, into a new image.

Once you have a container, you can run processes inside it with all the nice things about a container, most notably being self-contained: they don't depend on the host's libraries.
     images      |        container        |    processes
                 |                         |
           create|                 start   |
+------------+   |    +---------+          |    +---------+
| runtime    +------> | created | ------------> | started |
| Bundle     |   |    |         |          |    |         |
+------------+   |    +---------+          |    +----+----+
                 |                         |         |
                 |                         |         v stop
                 |    +---------+          |    +---------+
                 |    | deleted |          |    | stopped |
                 |    |         | --------------+         |
                 |    +---------+  delete  |    +---------+
                 |                         |

Implementations and Ecosystems

runC is the reference implementation of the OCI runtime specification. The diagram below shows its relationship with other projects, mostly of Docker origin. Each entity below follows the format org/project.

                          +---------------------+
                          |                     |
                          |  dockerInc/docker   |
                          |                     |
                          +----------+----------+
                                     |
                                     | use
                          +----------v-------------+
                          |                        |
                          | moby/moby              |
                          |                        |
                          +----------+-------------+
                                     |
                                     | use
+-------------------+     +----------v-------------+
|                   |     |                        |
| oci/runtime-spec  |     | containerd/containerd  |
|                   |     |                        |
+---------+---------+     +----------+-------------+
          ^                          |
          |                          | use
          | impl                     v
          |               +----------------------+      +---------------------+
          |               |                      |      |                     |
          +---------------+ oci/runc             +----> |oci/runc/libcontainer|
                          |                      |      |                     |
                          +----------------------+      +---------------------+
To make things look even more crowded (or flourishing), let's throw in some Kubernetes pieces.

CRI is the Container Runtime Interface defined by Kubernetes to allow pluggable container runtimes for k8s. There are currently several implementations, among them cri-containerd and cri-o, both of which actually end up using oci/runc.
                          +---------------------------------------+
                          |                                       |
                          | k8s/CRI                               |
                          | (container runtime interface)         |
                          |                                       |
                          +----------+-----------------+----------+
                                     ^                 ^
                                     | impl            | impl
                                     |                 |
                            +--------+-----+   +-------+-------+
                            |cri-containerd|   |cri-o          |
                            |              |   |               |
                            +--------+-----+   +-------+-------+
 k8s                                 |                 |
                                     | use             |
 container                           |                 |
+-------------------+     +----------v-------------+   |
|                   |     |                        |   |
| oci/runtime-spec  |     | containerd/containerd  |   |
|                   |     |                        |   |
+---------+---------+     +----------+-------------+   |
          ^                          |                 |
          |                          | use             | use
          | impl                     v                 v
          |               +---------------------------------+      +---------------------+
          |               |                                 |      |                     |
          +---------------+ oci/runc                        +----> |oci/runc/libcontainer|
                          |                                 |      |                     |
                          +---------------------------------+      +---------------------+

That's it for today.
