Architecture

Abstraction Level

The ZeroVM abstraction is a C99-compliant environment with parts of the POSIX syscall API implemented. ZeroVM does not expose any non-C99 or non-POSIX API; all ZeroVM-specific machinery is handled transparently to the application.

In the best POSIX/UNIX tradition, all I/O to and from ZeroVM is modeled as files. Input data is presented to the application as STDIN, the log as STDERR, and output as STDOUT. Communication channels with peer ZeroVM instances are also presented as files. The rest of the visible file system is transient and memory-backed in the current implementation.

The standard C99 library and a major part of POSIX are available; however, some behavior deviates from what a "normal" implementation would do. For example, since ZeroVM is deterministic, time functions always return zero. We assume this is within the C99 standard; the application can interpret it as running on an infinitely fast computer. Threading is cooperative (handled automatically) and deterministic, hence all thread synchronization primitives are no-ops. Developing for ZeroVM requires the provided GNU cross-compilation toolchain.
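
To make the file-based I/O model concrete, here is a minimal sketch of an application written purely against C99 standard streams. The helper name `upcase_stream` is hypothetical and not part of ZeroVM; the assumption is only that input arrives on one stream, output leaves on another, and the log goes to a third.

```c
#include <ctype.h>
#include <stdio.h>

/* Copy `in` to `out` in upper case and report a byte count on `log`.
 * Hypothetical helper: in the model described above, the three streams
 * would be stdin, stdout and stderr.
 * Returns the number of bytes copied, or -1 on a write error. */
long upcase_stream(FILE *in, FILE *out, FILE *log)
{
    long bytes = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (fputc(toupper(c), out) == EOF)
            return -1;
        bytes++;
    }
    fprintf(log, "processed %ld bytes\n", bytes);
    return bytes;
}
```

A `main` for such a program would simply call `upcase_stream(stdin, stdout, stderr)`. Nothing in the code refers to ZeroVM, which is the point of exposing only C99 and POSIX interfaces.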

Secure Isolation

ZeroVM uses the Chromium Native Client (NaCl) project as its sandboxing technology. NaCl is widely regarded as a serious secure isolation technology. It is based on software fault isolation (SFI) techniques that isolate untrusted code from the host system, and Google has published a number of research papers on the subject. We do our best to keep all security-related code in sync with the most recent Chromium releases, so ZeroVM isolation is as secure as the Native Client feature of the Chrome browser.

In addition to NaCl, there are a few more layers of security ("defense in depth"). The first is strict determinism, which, assuming no bugs, is by itself an impenetrable line. Second, ZeroVM drops its privileges to the minimum before transferring control to the application. In non-networking mode, we could go as far as using SECCOMP. In networking mode, we use cgroups/LXC as another layer of security.
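
The general idea behind SFI can be illustrated with address masking: every address computed by untrusted code is forced into a fixed region before it is used. The sketch below shows that idea in plain C under assumed parameters (a 4 GiB region, hypothetical names `sandbox_t`, `sfi_confine`, `sfi_in_bounds`). It is not NaCl's code; a real SFI system enforces the property on machine code through a validator rather than through C functions.

```c
#include <stdint.h>

/* Hypothetical sandbox geometry: a 4 GiB region starting at `base`.
 * These names are illustrative and are not NaCl identifiers. */
typedef struct {
    uint64_t base;   /* start of the untrusted region */
} sandbox_t;

/* Confine an arbitrary 64-bit value to the sandbox: keep only the low
 * 32 bits as an offset, then add the region base. Whatever the untrusted
 * code computed, the result lies in [base, base + 4 GiB). */
uint64_t sfi_confine(const sandbox_t *sb, uint64_t untrusted_addr)
{
    uint64_t offset = untrusted_addr & UINT64_C(0xFFFFFFFF);
    return sb->base + offset;
}

/* Range check, used only to demonstrate the invariant above. */
int sfi_in_bounds(const sandbox_t *sb, uint64_t addr)
{
    return addr >= sb->base && addr - sb->base <= UINT64_C(0xFFFFFFFF);
}
```

Because the mask is applied unconditionally, no branch depends on the untrusted value, so there is no check for the untrusted code to bypass.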

Determinism

Functional programming with C99, and even native assembly? Can you believe it? We achieve it by compiling C99 into a "deterministic subset" of the target ISA. Currently this works only for x86-64 and ARM. Conceptually, it could be implemented for any ISA, including synthetic ones such as LLVM and even the JVM. Challenges arise with the parts of C99 that require "unsafe" functionality, such as time functions. When faced with such challenges, we strive to find creative and elegant solutions that do not break C99 compliance.

Implementation details aside, why do we need determinism in the first place? The developer productivity improvements are well documented, but they are not the real reason for determinism in ZeroVM. Determinism lets ZeroVM provide easy, fully automatic error recovery and failover, lets it move a VM from server to server, and even enables ACID transactional semantics at the VM level. Without determinism, it is challenging to cleanly separate infrastructure housekeeping from application functionality.
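
The link between determinism and failover can be sketched as follows: if a job's result depends only on its input, a failed run can be replayed elsewhere and must produce the same result. All names here (`det_time`, `run_job`, `replay_matches`) are hypothetical, and the FNV-1a digest is used only as a convenient deterministic computation, not as anything ZeroVM itself does.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a deterministic time(): always returns zero. */
time_t det_time(time_t *out)
{
    if (out)
        *out = 0;
    return 0;
}

/* A job whose result depends only on its input bytes and on det_time(). */
uint64_t run_job(const unsigned char *input, size_t len)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV-1a offset basis */
    size_t i;

    h ^= (uint64_t)det_time(NULL);               /* the "timestamp" is stable */
    for (i = 0; i < len; i++) {
        h ^= input[i];
        h *= UINT64_C(0x100000001b3);            /* FNV-1a prime */
    }
    return h;
}

/* Failover check: a replayed run must reproduce the original digest. */
int replay_matches(const unsigned char *input, size_t len)
{
    uint64_t first = run_job(input, len);
    uint64_t again = run_job(input, len);
    return first == again;
}
```

If `run_job` consulted a real clock instead of `det_time`, the two runs could differ and a replay could no longer be substituted for the original execution.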

Threading Model

ZeroVM itself is a single-threaded application. However, ZeroVM implements the full POSIX pthreads library (currently in alpha). The implementation is based exclusively on cooperative multitasking. Yielding is handled automatically whenever the application calls into ZeroVM through the provided syscalls. To support determinism, an application is never preempted (except when it violates security constraints or resource usage caps). The model can also be seen as a form of coroutines, implemented at the VM/POSIX level rather than at the language level.
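
The cooperative model can be illustrated with a toy round-robin scheduler. This is a sketch under simplifying assumptions, not the ZeroVM pthreads implementation: each task performs one step per turn, and returning from `task_step` plays the role of yielding at a syscall. The types and function names are hypothetical.

```c
#include <stddef.h>

#define TRACE_MAX 64

/* A toy task: a one-character name and a number of steps still to run. */
typedef struct {
    char name;
    int  steps_left;
} task_t;

/* Records the order in which task steps actually executed. */
typedef struct {
    char   trace[TRACE_MAX];
    size_t len;
} trace_t;

/* Run one step of a task. Returns nonzero while the task has work left. */
static int task_step(task_t *t, trace_t *tr)
{
    if (t->steps_left <= 0)
        return 0;
    if (tr->len + 1 < TRACE_MAX)
        tr->trace[tr->len++] = t->name;
    tr->trace[tr->len] = '\0';
    t->steps_left--;
    return t->steps_left > 0;
}

/* Visit tasks in a fixed order until none has work left. There is no
 * preemption, so the interleaving depends only on the task list. */
void run_cooperative(task_t *tasks, size_t n, trace_t *tr)
{
    int active;

    do {
        size_t i;
        active = 0;
        for (i = 0; i < n; i++)
            active |= task_step(&tasks[i], tr);
    } while (active);
}
```

With tasks `{'A', 2}` and `{'B', 3}` and a zero-initialized `trace_t`, the recorded trace is `ABABB` on every run. Because the interleaving is fixed, no locking is needed to make the result reproducible.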

Clustering

ZeroVM itself is single-threaded, and there is only one way to achieve parallelism: ZeroVM clustering. It is inspired by CSP and its better-known sibling, the actor model. It also resembles the good old UNIX/POSIX process model. All I/O, including networking, is represented as files. The easiest way to understand ZeroVM clustering is to think of it as UNIX pipelines on steroids. The ZeroVM clustering model is richer, however, and can include advanced communication patterns such as req-rep, pub-sub, and others. It can also be thought of in "Erlang-in-C" terms.

ZeroVM clustering is currently backed by ZeroMQ. ZeroVM network traffic is fully isolated from host traffic by lightweight envelopes. This prevents untrusted applications from accessing the host and other nodes in the host network, and it is a low-cost approach to overlay networks. The underlying transport that backs the pipes is fully pluggable and transparent to the application. The clustering feature is designed to be efficient across the full range of parallelism granularity, from many-core parallelism to multi-datacenter parallelism.
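
The pipeline analogy can be sketched with two stages that only ever see `FILE *` streams. In this illustration a temporary file stands in for the channel between two peers and the stages run one after the other in a single process. In a real cluster they would be separate ZeroVM instances connected by the pluggable transport. The function names are hypothetical.

```c
#include <stdio.h>

/* Stage 1: read integers from `in`, write their squares to `out`.
 * Returns the number of values processed. */
int stage_square(FILE *in, FILE *out)
{
    long v;
    int n = 0;

    while (fscanf(in, "%ld", &v) == 1) {
        fprintf(out, "%ld\n", v * v);
        n++;
    }
    return n;
}

/* Stage 2: read integers from `in`, write their sum to `out`. */
long stage_sum(FILE *in, FILE *out)
{
    long v, total = 0;

    while (fscanf(in, "%ld", &v) == 1)
        total += v;
    fprintf(out, "%ld\n", total);
    return total;
}

/* Wire the two stages together through a temporary file that stands in
 * for the channel. Returns the final sum, or -1 if the channel cannot
 * be created. */
long run_pipeline(FILE *source, FILE *sink)
{
    FILE *channel = tmpfile();
    long total;

    if (channel == NULL)
        return -1;
    stage_square(source, channel);
    rewind(channel);                 /* the reader starts at the beginning */
    total = stage_sum(channel, sink);
    fclose(channel);
    return total;
}
```

Neither stage knows what backs its streams, so the same code would work whether the channel is a local file, a pipe, or a network transport.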