Monday, December 1, 2014

Kubernetes and Decking Container Cluster Managers

Fig. 1: Kubernetes Architecture

Kubernetes manages user-defined collections of containers called pods. Note that a "pod" refers to running containers and not to a static image (in Docker terminology). Besides containers, a pod can have persistent storage attached as a volume and can define custom container health checks. Pods themselves can be organized into "groups", a kind of "API object", which in turn can be referenced by label. There are two other main kinds of API object: replication controllers and services. A replication controller maintains a fixed number of replicas of a pod template. A service defines internal and external ports for establishing connectivity across pods.
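To make the label and replica ideas concrete, here is a minimal Python sketch. It is a toy model and not the Kubernetes API: the `Pod` class and the `select` and `reconcile_replicas` functions are hypothetical names invented for illustration, and the real replication controller does considerably more.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Pod:
    """Toy stand-in for a pod: just a name and a set of labels (hypothetical)."""
    name: str
    labels: dict


def select(pods, selector):
    """Return the pods whose labels contain every key/value pair in selector."""
    return [p for p in pods
            if all(p.labels.get(k) == v for k, v in selector.items())]


def reconcile_replicas(pods, selector, template_labels, desired, ids):
    """Add or remove pods in place so exactly `desired` pods match the selector."""
    matching = select(pods, selector)
    # Too few replicas: create new pods from the template labels.
    for _ in range(desired - len(matching)):
        pods.append(Pod(f"pod-{next(ids)}", dict(template_labels)))
    # Too many replicas: remove the surplus.
    for surplus in matching[desired:]:
        pods.remove(surplus)
    return pods


ids = count(1)
pods = [Pod("db-0", {"tier": "db"})]
reconcile_replicas(pods, {"tier": "web"}, {"tier": "web"}, 3, ids)
print([p.name for p in select(pods, {"tier": "web"})])
```

Only pods matching the selector are counted or removed, so the `db-0` pod is left alone no matter how the `web` replica count changes.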

This is Part 2 in a 6-part series on Container Cluster Managers. See Part 1 (Top 5 Container Cluster Managers) and Part 3 (Flocker).

A pod definition is declarative in the sense that it consists of a desired state, which Kubernetes works to achieve behind the scenes using auto-restart, rescheduling, and other means of self-healing. For example, a typical desired state is that a given container is running. The Kubernetes architecture is a master-slave one (similar to Hadoop YARN and Apache Mesos, except that both of those also support secondary or standby masters), in which the master node schedules pods to worker nodes, synchronizes them, and stores persistent master configuration state. On each worker node, a "Kubelet" takes container manifests (in YAML) and ensures those containers are started and running.
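The "desired state" idea can be sketched as a small reconciliation function. This is only an illustration of the declarative pattern, assuming a made-up representation of state: the `reconcile` function and the "running"/"exited" strings are hypothetical, not how the Kubelet is actually implemented.

```python
def reconcile(desired, observed):
    """Return the actions needed to make `observed` match `desired`.

    desired:  set of container names that should be running
    observed: dict mapping container name -> "running" or "exited"
    (both representations are assumptions made for this sketch)
    """
    actions = []
    for name in sorted(desired):
        state = observed.get(name)
        if state is None:
            actions.append(("start", name))      # never started on this node
        elif state != "running":
            actions.append(("restart", name))    # crashed or exited: self-heal
    for name in sorted(set(observed) - desired):
        actions.append(("stop", name))           # no longer part of desired state
    return actions


print(reconcile({"web", "db"}, {"web": "exited", "old": "running"}))
# [('start', 'db'), ('restart', 'web'), ('stop', 'old')]
```

Running the function repeatedly in a loop is what gives the self-healing effect: once the observed state matches the desired state, it returns an empty list and nothing happens.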

Fig. 2: Hadoop YARN Architecture
Fig. 3: Apache Mesos Architecture

One neat trick that is enabled by the auto-restart feature is rolling updates (also a feature of Mesos): one can update the Docker image, kill off one container at a time in a load-balanced replica set, and have auto-restart bring up updated containers. Of course, this only works if the old and new containers, which run side by side during the update, do not use some shared entity inconsistently.
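The rolling update trick can be simulated in a few lines. This is a toy model under stated assumptions: each replica is represented only by its image tag, "killing" a container is modeled by setting its slot to `None`, and `rolling_update` is a hypothetical helper, not a Kubernetes or Mesos call.

```python
def rolling_update(images, new_image):
    """Replace replicas one at a time.

    Returns the updated list of images and the smallest number of replicas
    that were still serving at any point during the update.
    """
    images = list(images)
    min_serving = len(images)
    for i in range(len(images)):
        images[i] = None  # kill one container in the replica set
        serving = sum(img is not None for img in images)
        min_serving = min(min_serving, serving)
        images[i] = new_image  # auto-restart brings it back on the new image
    return images, min_serving


print(rolling_update(["app:v1"] * 3, "app:v2"))
# (['app:v2', 'app:v2', 'app:v2'], 2)
```

With three replicas, at least two are serving at every step, which is why the load balancer can hide the update from clients.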

Decking orchestrates a Docker container cluster from a decking.json file that must specify each container and its dependencies. The scheduler is supposed to determine a schedule that satisfies these dependencies, subject to optional guidance in the decking.json file. A big difference from Kubernetes right now is that Decking is intended only for starting up container clusters on a single machine (and Docker instance). Thus, at least for now, it is of relatively limited utility for large-scale deployments, which need to orchestrate containers across multiple machines. In short, it is really just a thin layer over Docker for managing container start-up. At this point it offers little in the way of error recovery or facilitation of communication between containers or groups of containers.
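Finding a start-up schedule that satisfies container dependencies is essentially a topological sort. The sketch below shows the general technique using Python's standard `graphlib` module (Python 3.9+). The `dependencies` dictionary is a hypothetical structure invented for this example; it is not the decking.json schema, and this is not Decking's actual scheduler.

```python
from graphlib import CycleError, TopologicalSorter


def start_order(dependencies):
    """Return container names in an order where dependencies start first.

    dependencies maps each container name to the names it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical cluster: web needs api, api needs db and cache.
dependencies = {
    "web": ["api"],
    "api": ["db", "cache"],
    "db": [],
    "cache": [],
}
print(start_order(dependencies))

try:
    start_order({"a": ["b"], "b": ["a"]})
except CycleError:
    print("circular dependency, no valid start order")
```

The relative order of independent containers (here `db` and `cache`) is not significant; the only guarantee is that every container appears after everything it depends on.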

All in all, container cluster managers seem to be at the very beginning of their development. Neither Kubernetes nor Decking is particularly robust, although both offer some convenience features.
