Friday, April 15, 2016

Cheap Docker images with Nix

Let's talk about Docker and Nix today. Before explaining what Nix is (in case you don't know it yet) and before going into the details, I will show you a snippet, similar to a Dockerfile, that creates a Redis image equivalent to the one on Docker Hub.

The final image will be around 42mb (or 25mb) in size, compared to 177mb.

EDIT: as mentioned on HN, Alpine-based images can go as low as around 15mb in size.

If you want to try this, the first step is to install Nix.

Here's the redis.nix snippet:
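
A minimal sketch of such a redis.nix, assuming the dockerTools.buildImage interface from nixpkgs. Attribute names such as contents, and the way gosu is packaged, have changed across nixpkgs versions, so treat the details as illustrative rather than as the exact original:

```nix
# redis.nix: a sketch of a Redis image built from scratch with Nix.
{ pkgs ? import <nixpkgs> {} }:

with pkgs;

let
  # Entrypoint modelled on the one in the official Redis image:
  # when started as root, fix ownership and drop to the redis user.
  entrypoint = writeScript "entrypoint.sh" ''
    #!${stdenv.shell}
    set -e
    if [ "$1" = "redis-server" -a "$(${coreutils}/bin/id -u)" = "0" ]; then
      ${coreutils}/bin/chown -R redis .
      exec ${gosu}/bin/gosu redis "$BASH_SOURCE" "$@"
    fi
    exec "$@"
  '';
in
dockerTools.buildImage {
  name = "redis";

  # Packages whose files are made available in the image root.
  contents = [ redis ];

  # Runs at build time only: create the redis user and its data directory.
  runAsRoot = ''
    #!${stdenv.shell}
    ${dockerTools.shadowSetup}
    groupadd -r redis
    useradd -r -g redis -d /data -M redis
    mkdir -p /data
    chown redis:redis /data
  '';

  # Same role as CMD, ENTRYPOINT, EXPOSE, WORKDIR and VOLUME in a Dockerfile.
  config = {
    Cmd = [ "redis-server" ];
    Entrypoint = [ entrypoint ];
    ExposedPorts = { "6379/tcp" = {}; };
    WorkingDir = "/data";
    Volumes = { "/data" = {}; };
  };
}
```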

Build it with: nix-build redis.nix
Load it with: docker load < result

Once loaded, you can see with docker images that it takes about 42mb of space.

Fundamental differences with classic docker builds

  • We do not use any base image, as is done for most Docker images, including the Redis one from the hub. It starts from scratch. We only set up some basic shadow-related files with the shadowSetup utility, enough to add the redis user and make gosu work.
  • The Redis package is not being compiled inside Docker. It's being done by Nix, just like any other package.
  • The built image has only one layer, compared to the dozens usually produced by a readable Dockerfile. In our case, having multiple layers is useless because caching is handled by Nix, and not by Docker.

A smaller image

We can cut the size down to 25mb by avoiding the use of id from coreutils. As an example, we'll always launch redis without the entrypoint:
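
As a rough sketch of that idea (same assumptions as before about dockerTools.buildImage and the gosu package, which may differ in your nixpkgs version), the entrypoint script is dropped and gosu is invoked directly from Cmd:

```nix
# redis-small.nix: a sketch without the entrypoint script.
{ pkgs ? import <nixpkgs> {} }:

with pkgs;

dockerTools.buildImage {
  name = "redis";

  # Build-time only: chown, mkdir and friends are used here,
  # but nothing in the final image refers to them.
  runAsRoot = ''
    #!${stdenv.shell}
    ${dockerTools.shadowSetup}
    groupadd -r redis
    useradd -r -g redis -d /data -M redis
    mkdir -p /data
    chown redis:redis /data
  '';

  config = {
    # No shell script and no `id` call: gosu switches to the redis
    # user and runs the server directly from its store path.
    Cmd = [ "${gosu}/bin/gosu" "redis" "${redis}/bin/redis-server" ];
    ExposedPorts = { "6379/tcp" = {}; };
    WorkingDir = "/data";
    Volumes = { "/data" = {}; };
  };
}
```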

You might ask: but coreutils is still needed for the chown, mkdir and other commands like that!

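The secret is that those commands are only used at build time and are not required at runtime in the container. Nix detects that automatically for us: it scans the build output for references to other store paths, and only the paths actually referenced are treated as runtime dependencies.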

This means we don't need to manually remove packages after the container is built, as with other package managers! See this line in the Redis Dockerfile, for example.

Using a different redis version

Let's say we want to build a Docker image with Redis 2.8.23. First we want to write a package (or derivation in Nix land) for it, and then use that inside the image:
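
A sketch of what this could look like, assuming overrideAttrs (older nixpkgs used overrideDerivation) and the usual Redis release URL. The hash is a deliberate placeholder, and whether such an old release still builds against a current nixpkgs is not guaranteed:

```nix
# redis-2.8.23.nix: a sketch that overrides only the version of redis.
{ pkgs ? import <nixpkgs> {} }:

with pkgs;

let
  # Reuse the redis expression from nixpkgs, changing version and source only.
  redis_2_8_23 = redis.overrideAttrs (old: rec {
    version = "2.8.23";
    name = "redis-${version}";
    src = fetchurl {
      # Assumed download location for Redis release tarballs.
      url = "http://download.redis.io/releases/${name}.tar.gz";
      # Placeholder hash: the first build fails and reports the real one.
      sha256 = lib.fakeSha256;
    };
  });
in
dockerTools.buildImage {
  name = "redis";
  tag = "2.8.23";

  runAsRoot = ''
    #!${stdenv.shell}
    ${dockerTools.shadowSetup}
    groupadd -r redis
    useradd -r -g redis -d /data -M redis
    mkdir -p /data
    chown redis:redis /data
  '';

  config = {
    Cmd = [ "${gosu}/bin/gosu" "redis" "${redis_2_8_23}/bin/redis-server" ];
    ExposedPorts = { "6379/tcp" = {}; };
    WorkingDir = "/data";
    Volumes = { "/data" = {}; };
  };
}
```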

Note we also added the tag 2.8.23 to the resulting image. And that's it. The beauty is that we reuse the same redis expression from nixpkgs, but we override only the version to build.

A generic build

There's more you can do with Nix. Since Nix is a language, it's possible to write a generic function that builds a Redis image for a given package:
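
A possible shape for redis-generic.nix, as a sketch. The attribute names match the build commands used in this post; the tag parameter is my own addition for illustration, and redisDocker_3_0_7 assumes that pkgs.redis is version 3.0.7, as it was when this was written:

```nix
# redis-generic.nix: a sketch of a function from a redis package to an image.
{ pkgs ? import <nixpkgs> {} }:

with pkgs;

let
  # Takes a redis package (and a tag) and returns a Docker image.
  redisImage = { redis, tag }: dockerTools.buildImage {
    name = "redis";
    inherit tag;

    runAsRoot = ''
      #!${stdenv.shell}
      ${dockerTools.shadowSetup}
      groupadd -r redis
      useradd -r -g redis -d /data -M redis
      mkdir -p /data
      chown redis:redis /data
    '';

    config = {
      Cmd = [ "${gosu}/bin/gosu" "redis" "${redis}/bin/redis-server" ];
      ExposedPorts = { "6379/tcp" = {}; };
      WorkingDir = "/data";
      Volumes = { "/data" = {}; };
    };
  };

  # Same override as before; the hash is a placeholder to be replaced.
  redis_2_8_23 = pkgs.redis.overrideAttrs (old: rec {
    version = "2.8.23";
    name = "redis-${version}";
    src = fetchurl {
      url = "http://download.redis.io/releases/${name}.tar.gz";
      sha256 = lib.fakeSha256;
    };
  });
in
{
  # One attribute per image, selectable with `nix-build -A`.
  redisDocker_3_0_7 = redisImage { redis = pkgs.redis; tag = "3.0.7"; };
  redisDocker_2_8_23 = redisImage { redis = redis_2_8_23; tag = "2.8.23"; };
}
```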

We created a "redisImage" function that takes a "redis" parameter as input, and returns a Docker image as output.

Build it with:
  • nix-build redis-generic.nix -A redisDocker_3_0_7 
  • nix-build redis-generic.nix -A redisDocker_2_8_23

Building off a base image

One of the selling points of Docker is reusing an existing image to add more stuff on top of it.

Nix comes with a completely different set of packages compared to other distros, with its own toolchain and glibc version. That doesn't prevent us from basing a new image on an existing Debian image, for instance.

By using dockerTools.pullImage it's also possible to pull images from the Docker hub.
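
A sketch of how the generic function could accept a base image. The pullImage arguments shown here (imageName, imageDigest, sha256) follow recent nixpkgs and are an assumption, since older versions took a tag instead of a digest. Both the digest and the hash are placeholders that have to be replaced with real values:

```nix
# redis-generic.nix, extended: a sketch with an optional base image.
{ pkgs ? import <nixpkgs> {} }:

with pkgs;

let
  debianImage = dockerTools.pullImage {
    imageName = "debian";
    # Placeholders: pin the digest and hash of the image you actually want.
    imageDigest = "sha256:0000000000000000000000000000000000000000000000000000000000000000";
    sha256 = lib.fakeSha256;
  };

  redisImage = { redis, tag, fromImage ? null }: dockerTools.buildImage {
    name = "redis";
    inherit tag fromImage;

    runAsRoot = ''
      #!${stdenv.shell}
      # Only set up the shadow files when starting from scratch;
      # a base image such as Debian already ships them.
      ${lib.optionalString (fromImage == null) dockerTools.shadowSetup}
      groupadd -r redis
      useradd -r -g redis -d /data -M redis
      mkdir -p /data
      chown redis:redis /data
    '';

    config = {
      Cmd = [ "${gosu}/bin/gosu" "redis" "${redis}/bin/redis-server" ];
      ExposedPorts = { "6379/tcp" = {}; };
      WorkingDir = "/data";
      Volumes = { "/data" = {}; };
    };
  };
in
{
  redisOnDebian = redisImage {
    redis = pkgs.redis;
    tag = "debian";
    fromImage = debianImage;
  };
}
```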

Build it with: nix-build redis-generic.nix -A redisOnDebian.

Note that we changed a couple of things: we pass the base image (debianImage) to our generic redisImage function, and we only initialize shadow-utils if the base image is null.

The result is a Docker image based off latest Debian but running Redis compiled with nixpkgs toolchain and using nixpkgs glibc. It's about 150mb. It has all the layers from the base image, plus the new single layer for Redis.

That said, it's also possible to use one of the previously defined Redis images as the base image. The result of `pullImage` and `buildImage` is a .tar.gz docker image in both cases.

As you can see, it's possible to build something quite similar to docker-library using only Nix expressions. It might be an interesting project.

Be aware that things like PAM configurations, or other files written to suit Debian, may not work with Nix programs that use a different glibc.

Other random details

The code above requires nixpkgs commit 3ae4d2afe (2016-04-14) or later. That is the commit in which I finally packaged gosu, and the point at which the size of the derivations has been notably reduced.

Building the image is done without using any of the Docker commands. The way it works is as follows:
  1. Create a layer directory with all the produced contents inside. This includes the filesystem as well as the json metadata. This process will use certain build dependencies (like coreutils, shadow-utils, bash, redis, gosu, ...).
  2. Ask Nix what the runtime dependencies of the layer directory are (like redis, gosu). Such dependencies are always a subset of the build dependencies.
  3. Add such runtime dependencies to the layer directory.
  4. Pack the layer in a .tar.gz by following the Docker specification.
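
To get a feeling for step 2, here is a small sketch that asks Nix for the runtime closure of a couple of packages. It uses closureInfo from nixpkgs, which is not necessarily what dockerTools uses internally, and the choice of redis and gosu as roots is only for illustration:

```nix
# closure.nix: a sketch listing the runtime closure of redis and gosu.
{ pkgs ? import <nixpkgs> {} }:

# The result is a directory; its store-paths file contains one store
# path per line for everything the roots need at runtime.
pkgs.closureInfo {
  rootPaths = [ pkgs.redis pkgs.gosu ];
}
```

After nix-build closure.nix, the file result/store-paths lists the paths that would have to be added to the layer.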

Nix offers safer and easier caching of operations while building the image.
With Docker, great care has to be taken to use the layer cache correctly, because that caching is based solely on the RUN command string. This blog post explains it well.
This is not the case for Nix, because every output depends on a set of exact inputs. If any of the inputs changes, the output is rebuilt.

So what is Nix?

Nix is a language and a deployment tool, often used as a package manager, as a configuration builder and for system provisioning. The operating system NixOS is based on it.

The code shown above is Nix. We have used the nixpkgs repository which provides several reusable Nix expressions like redis and dockerTools.

The Nix concept is simple: write a Nix expression, build it. This is how the building process works at a high level:
  1. Read a Nix expression
  2. Evaluate it and determine the thing (called derivation) to be built.
  3. By evaluating the code, Nix is able to determine exactly the build inputs needed for such derivation.
  4. Build (or fetch from cache) all the needed inputs.
  5. Build (or fetch from the cache) the final derivation.

Nix stores all such derivations in a common nix store (usually /nix/store), each identified by a hash. Each derivation may depend on other paths in the same store, and each is stored in a directory separate from the other derivations.
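
To make the process concrete, here is a tiny sketch of a derivation. The name redis-version and the output file are made up for illustration; runCommand is a helper from nixpkgs that builds a derivation from a shell script:

```nix
# version.nix: a sketch of a minimal derivation with one build input.
{ pkgs ? import <nixpkgs> {} }:

# redis is an exact build input: if it changes, this output is rebuilt.
pkgs.runCommand "redis-version" { nativeBuildInputs = [ pkgs.redis ]; } ''
  mkdir -p $out
  # redis-server is on PATH during the build because of the input above.
  redis-server --version > $out/version.txt
''
```

Running nix-build version.nix evaluates the expression, builds or fetches redis, and then builds the final derivation into its own directory under /nix/store, with result as a symlink to it.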

I won't go deeper, as there's plenty of documentation about how Nix works and how its storage works.

I hope you enjoyed the read, and that you'll give Nix a shot.


Anonymous said...

It was really annoying to put the definition/overview of Nix at the bottom. I guess you went for some kind of writing trick, but it really makes it horrible for someone to learn what you're trying to showcase.

Luca Bruno aka Lethalman said...

Thanks for the critique. My idea was to just show some code first, so it can be compared at a high level with a Dockerfile at first sight.

Personally I often skip introductions in technical blog posts, and go straight to the middle of the article to see some code first. So I thought I'd apply the same logic directly to the post.

Colonel Panic said...

I agree with Anonymous.

Anonymous said...

MB, not mb.....

Anonymous said...

because we all thought he was talking about millibits

Anonymous said...

I'm struggling to reproduce the steps.

I installed Nix by using the install instructions that are linked to at the top of the blog. Then I took the first code snippet and put it into redis.nix after which I ran nix-build redis.nix

At this point I get the following error:

error: attribute ‘gosu’ missing, at /home//redis.nix:11:14
(use ‘--show-trace’ to show detailed location information)

Any idea why it doesn't work?

Anonymous said...

To the last poster: Lethalman mentioned that his `redis.nix` only works starting from a certain, very recent, commit. You must be on an older commit of nixpkgs.

Anonymous said...

Great post. I'm really excited about Nix. I do have a question though: isn't using Nix with Docker kind of an odd combination? For instance, what does building a Docker image using Nix give you over creating a regular Dockerfile to build a Docker image? Maybe this was just for demonstration purposes?

Charles Strahan said...

Q: "What does building a Docker image using Nix give you over creating a regular Dockerfile to build a Docker image?"


* Better abstraction (e.g. the example of a function that produces docker images)
* The Hydra build/CI server obviates the need for paying for (or administering a self hosted) docker registry, and avoids the imperative push and pull model. Because a docker image is just another Nix package, you get distributed building, caching and signing for free.
* Because Nix caches intermediate package builds, building a Docker image via Nix will likely be faster than letting Docker do it.
* Determinism. With Docker, you're not guaranteed that you'll build the same image across two machines (imagine the state of package repositories changing -- it's trivial to find different versions of packages across two builds of the same Dockerfile). With Nix, you're guaranteed that you have the same determinism that any other Nix package has (e.g. everything builds in a chroot without network access (unless you provide a hash of the result, for e.g. tarball downloads))

Ignasi Marimon-Clos i Sunyol said...

@CharlesStrahan: You had me at 'determinism'.

Also, the idea of leaving the Docker registry behind in favor of Hydra sounds appealing. Will start tinkering with all these bits.

@lethalman: Thanks for the great post.

Pádraig Brady said...

RE coreutils size, if you do want coreutils in the final image, building in "shared binary" mode can get all 100 utils for about 1MB. The only change should be to `./configure --enable-single-binary` in the build. See the coreutils-single package in Fedora for a concrete example

Yann said...

That's great. It got me thinking... what about going even further, and removing the glibc from the picture?
I mean, it's like 27MB by itself before compression

I extended your recipe with the following:
- su-exec instead of gosu
- removing various useless binaries from the redis image (only redis-server is really needed)
- statically built everything with musl

the result is there:

and the corresponding docker image is 1.2 MB:

$ docker images redis
redis 3.0.7-mini 5986d6c71430 46 years ago 1.187 MB

Luca Bruno aka Lethalman said...

Awesome Yann! Yes, it's perfectly possible to go down to that size. In the first place, I wanted to emulate the original redis image from the hub as closely as possible, in terms of operations.

I didn't think of musl, it could be very interesting to create a framework in nixpkgs around it for creating such docker images.

Gabriel Gonzalez said...

Is there a workaround to build this on OS X? At least one dependency of the nix expression is not supported on Darwin.

PierreR said...

I have had a go with:

But I ended up with a docker image > 2G (full ghc copied). What's wrong ? I am using the stable 16.03 channel (maybe this needs something more recent) ?

Luca Bruno aka Lethalman said...

Stable 16.03 does not have the multi-output support that is in the unstable channel. Also, I don't know much about the state of ghc in nix and its multi-output support.
Using the unstable channel will surely work much better.

Rafał Łasocha said...

NixOS + Docker looks like splendid combination. Thank you.

Mathieu Bruyen said...

To have the first example working I had to replace 'chown' by '${coreutils}/bin/chown' at two places, otherwise it does not find the command (thus I'm not sure it's possible to exclude coreutils from the image).

Luca Bruno aka Lethalman said...

If the entrypoint is using chown, well, it's not possible to exclude coreutils. But if you only run straight redis, it is possible.

Neználek said...

We now have a PR for coreutils --enable-single-binary:

Priyanka Rawal said...

Spot on! exactly what am looking for.


Alexey Shmalko said...

You can further decrease image size if you use docker's built-in ability to change users:

config.User = "redis";

Peter Hoeg said...

This is great! Any chance you would consider adding support for creating a rkt container?

Anonymous said...

> "Nix is able to detect that automatically for us."

How? How can Nix know which files are being read? Without more fine-grained explicit permissions, how is it possible to know what a given program might need upon runtime?
