Monday, December 29, 2014

Understanding Cloud Host Pricing, Part 2

In the past few years, there has been a movement to standardize cloud compute resource measurements in order to make way for public trading of compute resources. The idea is simple, but the execution may be complicated: each company runs something like OpenStack and rents out underutilized compute resources, and these resources can then be further traded on public exchanges, enabling companies to hedge against price spikes. Along these lines, Amazon was quite early in introducing the Reserved Instance Marketplace. Public trading of standardized compute units would enable smaller organizations to monetize underutilized assets. The model is not without its challenges: compute resources have many distinguishing characteristics, and performance may vary dramatically between providers. In this post, I investigate some of the smaller cloud hosts and their prices.

Provider               | Minimum Unit ($/hr) | Memory (GB) | Instance Storage (GB) | Persistent Block Storage ($/GB/mo)
HP Helion              | 0.03                | 1           | 10                    | 0.10
IBM SoftLayer          | 0.04                | 1           | 25                    | 0.10
Oracle Cloud           | 1.8                 | -           | -                     | -
CloudSigma             | 0.0319              | 1           | 10                    | 0.14 (SSD)
DreamHost DreamCompute | 0.0264              | 2           | 25                    | -
Internap               | 0.04                | 1           | 20 (SSD)              | 0.30

Saturday, December 27, 2014

Top 5 Gotchas When Running Docker on a Mac

Running Docker on a Mac is meant to be a convenience, but the fact that Docker is a second-class citizen on the Mac shows up every now and then. Since Docker is based on Linux kernel features such as cgroups, it cannot and does not run natively on Mac OS X. Instead, Docker runs on Macs via boot2docker, a shim that boots up a whole VirtualBox VM on which the Docker daemon actually runs. Running Docker inside a VM on Macs complicates things quite a bit.
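To make the layering concrete, the usual boot2docker setup looks something like the following (a sketch of the boot2docker CLI workflow; exact output and flags vary by version):

```shell
# Create and start the VirtualBox VM that hosts the Docker daemon
boot2docker init
boot2docker up

# Point the local docker client at the daemon inside the VM.
# shellinit prints the DOCKER_HOST (and TLS) environment variables.
eval "$(boot2docker shellinit)"

# Every docker command now talks to the VM over TCP, not to the Mac itself
docker run hello-world

# The VM's IP matters: ports published with -p are bound inside the VM,
# so services are reachable at this address, not at localhost
boot2docker ip
```

That extra VM boundary is the root cause of most of the gotchas that follow.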

Friday, December 26, 2014

Understanding Cloud Host Pricing, Part 1

The pricing schemes of the top cloud infrastructure-as-a-service (IaaS) providers are rather complicated. They cannot be compared directly, since their performance characteristics vary. Moreover, the differences in the costs of instances, storage, and bandwidth may offset each other: for example, provider A's storage costs may be greater than provider B's, while B's instance costs are greater than A's. Some providers bill for instances by the minute (Azure and Google Compute Engine) whereas others bill by the hour, rounded up, such as AWS. Some providers charge for storage by the TB (Azure) whereas others charge by GB/month or sometimes GB/hour (Rackspace). Consequently, what constitutes the best deal in cloud hosting depends on your specific workloads and storage needs. In this series, we will investigate the various aspects of cloud host pricing from the major providers: Amazon AWS, Microsoft Azure, Google Compute Engine, DigitalOcean, and Rackspace.
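To make such comparisons concrete, it helps to normalize everything to a common unit such as dollars per month. The following sketch does exactly that for per-minute, per-hour, per-GB/month, and per-TB/month pricing; all rates in the example are hypothetical placeholders, not quotes from any provider:

```python
HOURS_PER_MONTH = 730  # average hours per month (8760 / 12)

def instance_cost_per_month(rate, unit):
    """Normalize an instance price to $/month for an always-on instance."""
    if unit == "hour":
        return rate * HOURS_PER_MONTH
    elif unit == "minute":
        return rate * 60 * HOURS_PER_MONTH
    raise ValueError("unknown unit: " + unit)

def storage_cost_per_month(gb, rate, unit):
    """Normalize storage pricing to $/month for a given capacity in GB."""
    if unit == "gb_month":
        return gb * rate
    elif unit == "tb_month":
        return (gb / 1024.0) * rate
    elif unit == "gb_hour":
        return gb * rate * HOURS_PER_MONTH
    raise ValueError("unknown unit: " + unit)

# Hypothetical provider A: $0.06/hr instance plus $0.10/GB/mo storage
a = instance_cost_per_month(0.06, "hour") + storage_cost_per_month(500, 0.10, "gb_month")
# Hypothetical provider B: $0.001/min instance plus $95/TB/mo storage
b = instance_cost_per_month(0.001, "minute") + storage_cost_per_month(500, 95.0, "tb_month")
```

The point of the offsetting costs above is visible here: only the whole-workload totals are comparable, not any single line item.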

Wednesday, December 17, 2014

Retry Pattern

A common design pattern in fault-tolerant distributed systems is the retry pattern. A given operation may experience a variety of failures:

  1. rare transient failures (e.g., a corrupted packet) can be recovered from immediately, so the operation should retry immediately
  2. common transient failures (e.g., a busy network) should retry after waiting for a period of time (possibly with exponential backoff)
  3. permanent failures should not retry; bail out and clean up

Of course, the final case is that the operation succeeds, and the function must do some work to handle that. This is an interesting design pattern not only for distributed systems but for fault-tolerant systems in general. For example, high-performance JavaScript engines have parsers and tokenizers that must be robust to various failures. In fact, this is one example where large systems have used multiple exit points and more complicated control flow, for which C-based programs might use gotos.
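A minimal sketch of the pattern, assuming a failure taxonomy of transient versus permanent errors (the names here are illustrative, not from any particular library):

```python
import time

class TransientError(Exception):
    """A failure worth retrying (e.g., busy network)."""

class PermanentError(Exception):
    """A failure that retrying cannot fix; bail out and clean up."""

def retry(operation, max_attempts=5, base_delay=0.1):
    """Run operation(), retrying transient failures with exponential backoff."""
    delay = base_delay
    for attempt in range(max_attempts):
        try:
            return operation()          # success: hand the result back
        except PermanentError:
            raise                       # do not retry; let the caller clean up
        except TransientError:
            if attempt == max_attempts - 1:
                raise                   # out of retries; give up
            time.sleep(delay)           # wait before the next attempt
            delay *= 2                  # exponential backoff
```

Rare transient failures are covered by setting `base_delay` near zero, so the first retry is effectively immediate.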

Wednesday, December 10, 2014

Docker Image for SML/NJ

The package repos of the various Linux distros carry very outdated versions of the SML/NJ compiler. This Docker image builds the latest official SML/NJ release.
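The gist of such an image is roughly the following sketch (the base image, version number, and download URL are illustrative assumptions; SML/NJ's installer bootstraps the compiler from a `config.tgz` bundle via `config/install.sh`):

```dockerfile
FROM ubuntu:14.04

# Build dependencies; the SML/NJ runtime is 32-bit, hence the multilib packages
RUN apt-get update && apt-get install -y \
    build-essential gcc-multilib g++-multilib wget

# Version and mirror URL are assumptions -- substitute the latest release
ENV SMLNJ_VERSION 110.77
WORKDIR /usr/local/smlnj
RUN wget http://smlnj.cs.uchicago.edu/dist/working/${SMLNJ_VERSION}/config.tgz \
    && tar xzf config.tgz \
    && config/install.sh

ENV PATH /usr/local/smlnj/bin:$PATH
CMD ["sml"]
```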

Tuesday, December 9, 2014

Apache Mesos and Hadoop YARN Scheduling

Mesos and YARN are two powerful cluster managers that can play host to a variety of distributed programming frameworks (Hadoop MapReduce, Dryad, Spark, and Storm) as well as multiple instances of the same framework (e.g., different versions of Hadoop). Both are concerned with optimizing the utilization of cluster resources, especially with respect to locality of the data distributed around the cluster. Google's paper on Omega, their own cluster scheduling system, dubs Mesos a two-level scheduler, which provides some flexibility by having a single resource manager offer resources to multiple parallel, independent schedulers. YARN is considered a monolithic scheduler, since its independent Application Masters are responsible only for job management, not scheduling. Scheduling is the essence of efficient Big Data processing. So where do these two systems differ?
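The two-level idea can be caricatured in a few lines (a toy model, not the actual Mesos API): a central resource manager makes offers, and each framework scheduler independently decides how much of each offer to accept.

```python
class ResourceManager:
    """Toy two-level scheduler: offers free resources to framework schedulers."""
    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self.frameworks = []

    def register(self, framework):
        self.frameworks.append(framework)

    def offer_round(self):
        """Offer all currently free resources to each framework in turn."""
        for fw in self.frameworks:
            accepted = fw.on_offer(self.free_cpus)  # the framework decides
            assert 0 <= accepted <= self.free_cpus
            self.free_cpus -= accepted

class GreedyFramework:
    """A framework scheduler that accepts whatever it still needs."""
    def __init__(self, demand):
        self.demand = demand
        self.allocated = 0

    def on_offer(self, offered_cpus):
        take = min(self.demand - self.allocated, offered_cpus)
        self.allocated += take
        return take

rm = ResourceManager(total_cpus=10)
spark, hadoop = GreedyFramework(demand=6), GreedyFramework(demand=6)
rm.register(spark)
rm.register(hadoop)
rm.offer_round()
```

The contrast with a monolithic scheduler is that here the placement decision lives in the frameworks; in YARN, the single resource manager makes that call and the Application Masters merely request containers.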

Monday, December 8, 2014

Alternatives to Docker: LXD and Rocket

Two recently announced alternatives to the Docker Linux container runtime, LXD and Rocket, aim to offer some interesting value propositions. Before I get to the details, let's first identify which use cases and aspects of the Docker container runtime are of interest here. Docker identifies a few major categories of use cases: continuous integration, continuous delivery, scaling distributed applications, and Platform-as-a-Service. The former two are DevOps use cases, and at this point Docker pretty much has a lock on them. Moreover, neither of the would-be competitors truly targets DevOps; that becomes obvious when you consider that LXD is intended to run in OpenStack server environments and Rocket on CoreOS/fleet (though it is not necessarily tied to CoreOS). Docker runs in workstation environments and even on top of VirtualBox via boot2docker to support DevOps workflows on Mac OS X. The more contested aspect is cloud infrastructure. Here, Docker competes with a wider range of technologies to support scaling in the cloud and providing PaaS functionality. This is the market where LXD and Rocket would operate, and it is also the area where hypervisors have reigned.

Sunday, December 7, 2014

How to search for great programmers, Part 2

Aline Lerner of TrialPay recently posted statistics from an experiment on resume review. The conclusion was that recruiters, engineers, and just about everyone else score resumes all over the place, and that resumes therefore have weak signal value. The claim was that the strongest signal in a resume was the number of typos. Although the study seems extensive, I think there are a number of weaknesses in the experimental design. One weakness was acknowledged: the ground truth is the author's own subjective evaluation of the candidates. Another is that the survey questions were somewhat misleading in the first place: the questionnaire asks "would you interview this candidate?", and yet this was compared against the ground truth of "will the candidate perform well on the job or in a technical interview?". As I alluded to in an earlier post in this series, the role of a resume is to help filter for red flags and to guide the formal interview, not to determine by itself whether a candidate is a star performer. The fact of the matter is, a resume is a self-reported synopsis of a candidate's track record, and in evaluating a candidate, track record matters, as does potential.

Saturday, December 6, 2014

What is really interesting about Quantitative Behavioral Finance


Quantitative behavioral finance has not been with us for very long. As a relatively recent development and area of discourse, it has only begun to gain a following. One very interesting aspect of this field is the use of experimental asset markets. These studies are based on experiments conducted on a small group of people (but with real money, hence a real market) to examine where rational expectations and classical game theory fail to explain human behavior. This is basically a small-scale version of prediction markets such as the Iowa Electronic Markets, Intrade, and Betfair. However, unlike prediction markets, where the ultimate objective is to predict an external event, experimental asset markets are more interested in the mechanics and patterns of the market itself. Caginalp, Vernon Smith (the 2002 Nobel Memorial Prize in Economics recipient), and David Porter have a couple of papers on experiments in this mode. Both experiments examine how financial bubbles can happen. Some of the takeaways are that excess cash and information asymmetry due to the lack of an open book may exacerbate bubbles. I think the matter of information asymmetry is very salient: despite all the effort and money banks and hedge funds have invested in improving information infrastructure, information is ultimately distributed non-uniformly to market participants. This is most obvious in the case of the retail investor, who has neither the time nor the resources to obtain and analyze all the market information.

Friday, December 5, 2014

Container Virtualization Options


It looks like the container virtualization space became a little more interesting this week. Previously, Docker was the only more or less complete standard container implementation (with a definition of images, image creation, and container start/stop management). There was Canonical's LXD, but it didn't seem to be garnering nearly as much attention and support, having been announced only a month ago. However, with the Docker and CoreOS organizations starting to encroach on each other's territory, the CoreOS community has released an early version of its own container runtime, Rocket. For its part, Docker has moved into the container cluster orchestration and management space with Docker Swarm and Docker Compose, the latter still being in the design stage.

Containers versus Virtual Machines

Container-based systems (e.g., Docker, LXC, cgroups) and virtual machines (VMware, Xen) both seek to bring the benefits of virtualization to the data center and the developer workflow, and they overlap considerably in those benefits. Although both virtualize in order to make better use of physical hardware, there are some key differences. Containers virtualize at the OS kernel level, so isolation is limited to what the kernel can enforce. Containers also share layers of file systems, courtesy of AuFS, which potentially makes better use of disk space than virtual machines, which commit the entire contents of a VM's disk to a disk image.

Thursday, December 4, 2014

Supervisor trees

In my past few posts, I have focused on fault-tolerant distributed systems as implemented through cluster managers. Apache Mesos, Kubernetes, and many others all attempt to support fault tolerance through auto-restarting and other self-healing techniques at the cluster manager level, and as such they rightly claim to be the new operating systems of the cloud. It turns out, however, that cluster managers do not have a monopoly on fault tolerance features. Long before Mesos, Kubernetes, and possibly even the University of Wisconsin's Condor (a distributed processing system with considerably more pedigree), Erlang had supervisor trees and supervisor behaviors (a kind of language interface) in its runtime, supporting large, highly fault-tolerant distributed systems decades ago.
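The core idea of a supervisor is small enough to sketch outside Erlang (a toy Python rendering, not the OTP API): a supervisor runs a child, and when the child crashes it applies a restart strategy, here one-for-one with a restart limit, escalating to its own parent when the limit is exceeded.

```python
class Supervisor:
    """Toy one-for-one supervisor: restart a crashed child up to max_restarts."""
    def __init__(self, max_restarts=3):
        self.max_restarts = max_restarts

    def run(self, child):
        """child is a zero-argument callable; returns its result or re-raises."""
        restarts = 0
        while True:
            try:
                return child()
            except Exception:
                if restarts >= self.max_restarts:
                    raise          # escalate to this supervisor's own parent
                restarts += 1      # one-for-one: restart only the failed child
```

In a real supervisor tree, the "parent" that failures escalate to is itself a supervisor, which is what lets whole subsystems be restarted as a unit.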

Wednesday, December 3, 2014

Security in Containerization Technology

One lingering worry with containerization is security. Previously, with conventional type 0 and type 1 (native, bare-metal) hypervisor technology, we greatly limited our trusted base to small hypervisors (e.g., Xen is < 150 kloc). Some were so small (the seL4 core is 7.5 kloc) that they were amenable to mechanized formal verification. OSes supporting containers, in contrast, are much larger. Even CoreOS, intended as a slimmed-down derivative of the Chrome OS Linux kernel that supports just modern bare-metal architectures running containers, is fundamentally more challenging to vet, not to mention verify, than a simple hypervisor; etcd and fleet alone add up to 44k sloc of Go. So for all the great inroads we were making in verification, the move towards containerization in the data center brings new challenges and potentially resets some of the progress the community has made in mechanically verifying the security and functional correctness of the lowest layers of software systems and infrastructure.

Tuesday, December 2, 2014

ClusterHQ Flocker

Flocker does multi-host orchestration for Docker containers. It is intended mainly as a means of containerizing and orchestrating distributed data stores and databases, although in principle it can deploy any app. Unlike some of the other solutions out there, Flocker aims to support checkpointing of stateful and stateless containers in order to migrate (running) containers across nodes. This seems like a great feature if one wanted to do work-stealing rescheduling of containers as execution profiles change and other nodes become available. Flocker provides its own NAT layer for mediating communication between containers across nodes, and it supports ZFS persistent volumes to maintain state. Flocker itself does not aim to do any particularly sophisticated scheduling (cf. Kubernetes) but instead relies on the user to supply the scheduling.

Monday, December 1, 2014

Kubernetes and Decking Container Cluster Managers

Fig. 1: Kubernetes Architecture

Kubernetes manages user-defined collections of containers called pods. Note that "pod" refers to the running container and not a static image (in Docker terminology). Besides containers, a pod can also have persistent storage attached as a volume and also define custom container health checks. Pods themselves can be organized together into "groups", a kind of "API object", which in turn can be referenced by label. There are two other main forms of API objects: replication controllers and services. The former produces a fixed number of replicas of a pod template. The latter defines internal and external ports for establishing connectivity across pods.
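Kubernetes was pre-1.0 at the time and its API schema has evolved since, but the shape of a pod definition is roughly as follows (the apiVersion shown is the later stable one, and the names and image are made up for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web            # labels let controllers and services select this pod
spec:
  containers:
  - name: frontend
    image: nginx        # illustrative image
    ports:
    - containerPort: 80
    livenessProbe:      # a custom container health check
      httpGet:
        path: /healthz
        port: 80
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data          # storage attached to the pod as a volume
    emptyDir: {}
```

A replication controller would then embed a pod template like this one and keep a fixed number of replicas of it running.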