Tuesday, September 16, 2014

Nix pill 15: nix search paths

Welcome to the 15th Nix pill. In the previous 14th pill we have introduced the "override" pattern, useful for writing variants of derivations by passing different inputs.

Assuming you followed the previous posts, I hope you are now ready to understand nixpkgs. But we have to find nixpkgs in our system first! So this is the step: introducing some options and environment variables used by nix tools.

The NIX_PATH


The NIX_PATH environment variable is very important. It's very similar to the PATH environment variable. The syntax is similar, several paths are separated by a colon ":". Nix will then search for something in those paths from left to right.

Who uses NIX_PATH? The nix expressions! Yes, NIX_PATH is not of much use by the nix tools themselves, rather it's used when writing nix expressions.

In the shell for example, when you execute the command "ping", it's being searched in the PATH directories. The first one found is the one being used.

In nix it's exactly the same, however the syntax is different. Instead of just typing "ping" you have to type <ping>. Yes, I know... you are already thinking of <nixpkgs>.
However don't stop reading here, let's keep going.

What's NIX_PATH good for? Nix expressions may refer to an "abstract" path such as <nixpkgs>, and it's possible to override it from the command line.

For ease we will use nix-instantiate --eval to do our tests. I remind you, nix-instantiate is used to evaluate nix expressions and generate the .drv files. Here we are not interested in building derivations, so evaluation is enough. It can be used for one-shot expressions.

Fake it a little


It's useless from a nix view point, but I think it's useful for your own understanding. Let's use PATH itself as NIX_PATH, and try to locate ping (or another binary if you don't have it).
$ nix-instantiate --eval -E '<ping>'
error: file `ping' was not found in the Nix search path (add it using $NIX_PATH or -I)
$ NIX_PATH=$PATH nix-instantiate --eval -E '<ping>'
/bin/ping
$ nix-instantiate -I /bin --eval -E '<ping>'
/bin/ping
Great. At first attempt nix obviously said  could not be found anywhere in the search path. Note that the -I option accepts a single directory. Paths added with -I take precedence over NIX_PATH.

The NIX_PATH also accepts a different yet very handy syntax: "somename=somepath". That is, instead of searching inside a directory for a name, we specify exactly the value of that name.
$ NIX_PATH="ping=/bin/ping" nix-instantiate --eval -E '<ping>'
/bin/ping
$ NIX_PATH="ping=/bin/foo" nix-instantiate --eval -E '<ping>'
error: file `ping' was not found in the Nix search path (add it using $NIX_PATH or -I)
Note in the second case how Nix checks whether the path exists or not.

The path to repository


You are out of curiosity, right?
$ nix-instantiate --eval -E '<nixpkgs>'
/home/nix/.nix-defexpr/channels/nixpkgs
$ echo $NIX_PATH
nixpkgs=/home/nix/.nix-defexpr/channels/nixpkgs
You may have a different path, depending on how you added channels etc.. Anyway that's the whole point. The <nixpkgs> stranger that we used in our nix expressions, is referring to a path in the filesystem specified by NIX_PATH.
You can list that directory and realize it's simply a checkout of the nixpkgs repository at a specific commit (hint: .version-suffix).
The NIX_PATH variable is exported by nix.sh, and that's the reason why I always asked you to source nix.sh at the beginning of my posts.

You may wonder: then I can also specify a different nixpkgs path to, e.g., a git checkout of nixpkgs? Yes, you can and I encourage doing that. We'll talk about this in the next pill.

Let's define a path for our repository, then! Let's say all the default.nix, graphviz.nix etc. are under /home/nix/mypkgs:
$ export NIX_PATH=mypkgs=/home/nix/mypkgs:$NIX_PATH
$ nix-instantiate --eval '<mypkgs>'
{ graphviz = <code>; graphvizCore = <code>; hello = <code>; mkDerivation = <code>; }
As expected we got the set of our packages (well except the mkDerivation utility), that's our repository.

Until now we used nix-build directly in the directory of default.nix. However nix-build generally needs a .nix to be specified to the command line:
$ nix-build /home/nix/mypkgs -A graphviz
/nix/store/dhps2c1fnsfxssdri3yxbdiqmnhzg6yr-graphviz
$ nix-build '<mypkgs>' -A graphviz
/nix/store/dhps2c1fnsfxssdri3yxbdiqmnhzg6yr-graphviz
Yes, nix-build also accepts paths with angular brackets. We first evaluate the whole repository (default.nix) and then peek the graphviz attribute.

A big word about nix-env


The nix-env command is a little different than nix-instantiate and nix-build. Whereas nix-instantiate and nix-build require a starting nix expression, nix-env does not.

You may be crippled by this concept at the beginning, you may think nix-env uses NIX_PATH to find the nixpkgs repository. But that's not it.

The nix-env command uses ~/.nix-defexpr, which is also part of NIX_PATH by default, but that's only a coincidence. If you empty NIX_PATH, nix-env will still be able to find derivations because of ~/.nix-defexpr.

So if you run nix-env -i graphviz inside your repository, it will install the nixpkgs one. Same if you set NIX_PATH to point to your repository.

In order to specify an alternative to ~/.nix-defexpr it's possible to use the -f option:
$ nix-env -f '<mypkgs>' -i graphviz
warning: there are multiple derivations named `graphviz'; using the first one
replacing old `graphviz'
installing `graphviz'
Oh why did it say there's another derivation named graphviz? Because both graphviz and graphvizCore attributes in our repository have the name "graphviz" for the derivation:
$ nix-env -f '<mypkgs>' -qaP
graphviz      graphviz
graphvizCore  graphviz
hello         hello
By default nix-env parses all derivations and use the derivation names to interpret the command line. So in this case "graphviz" matched two derivations. Alternatively, like for nix-build, one can use -A to specify an attribute name instead of a derivation name:
$ nix-env -f '<mypkgs>' -i -A graphviz
replacing old `graphviz'
installing `graphviz'
This form, other than being more precise, it's also faster because nix-env does not have to parse all the derivations.

For completeness: you may install graphvizCore with -A, since without the -A switch it's ambiguous.

In summary, it may happen when playing with nix that nix-env peeks a different derivation than nix-build. In such case you probably specified NIX_PATH, but nix-env is instead looking into ~/.nix-defexpr.

Why is nix-env having this different behavior? I don't know specifically by myself either, but the answers could be:
  • nix-env tries to be generic, thus it does not look for "nixpkgs" in NIX_PATH, rather it looks in ~/.nix-defexpr.
  • nix-env is able to merge multiple trees in ~/.nix-defexpr by looking at all the possible derivations
It may also happen to you that you cannot match a derivation name when installing, because of the derivation name vs -A switch described above. Maybe nix-env wanted to be more friendly in this case for default user setups.

It may or may not make sense for you, or it's like that for historical reasons, but that's how it works currently, unless somebody comes up with a better idea.

Conclusion


The NIX_PATH variable is the search path used by nix when using the angular brackets syntax. It's possible to refer to "abstract" paths inside nix expressions and define the "concrete" path by means of NIX_PATH, or the usual -I flag in nix tools.

We've also explained some of the uncommon nix-env behaviors for newcomers. The nix-env tool does not use NIX_PATH to search for packages, but rather for ~/.nix-defexpr. Beware of that!

In general do not abuse NIX_PATH, when possible use relative paths when writing your own nix expressions. Of course, in the case of <nixpkgs> in our repository, that's a perfectly fine usage of NIX_PATH. Instead, inside our repository itself, refer to expressions with relative paths like ./hello.nix.


Next pill


...we will finally dive into nixpkgs. Most of the techniques we have developed in this series are already in nixpkgs, like mkDerivation, callPackage, override, etc., but of course better. With time, those base utilities get enhanced by the community with more features in order to handle more and more use cases and in a more general way.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Wednesday, September 10, 2014

Nix pill 14: the override design pattern

Welcome to the 14th Nix pill. In the previous 13th pill we have introduced the callPackage pattern, used to simplify the composition of software in a repository.

The next design pattern is less necessary but useful in many cases and it's a good exercise to learn more about Nix.

About composability


Functional languages are known for being able to compose functions. In particular, you gain a lot from functions that are able to manipulate the original value into a new value having the same structure. So that in the end we're able to call multiple functions to have the desired modifications.

In Nix we mostly talk about functions that accept inputs in order to return derivations. In our world we want nice utility functions that are able to manipulate those structures. These utilities add some useful properties to the original value, and we must be able to apply more utilities on top of it.

For example let's say we have an initial derivation drv and we want it to be a drv with debugging information and also to apply some custom patches:
debugVersion (applyPatches [ ./patch1.patch ./patch2.patch ] drv)
The final result will be still the original derivation plus some changes. That's both interesting and very different from other packaging approaches, which is a consequence of using a functional language to describe packages.

Designing such utilities is not trivial in a functional language that is not statically typed, because understanding what can or cannot be composed is difficult. But we try to do the best.

The override pattern


In the pill 12 we introduced the inputs design pattern. We do not return a derivation picking dependencies directly from the repository, rather we declare the inputs and let the callers pass the necessary arguments.

In our repository we have a set of attributes that import the expressions of the packages and pass these arguments, getting back a derivation. Let's take for example the graphviz attribute:
graphviz = import ./graphviz.nix { inherit mkDerivation gd fontconfig libjpeg bzip2; };
If we wanted to produce a derivation of graphviz with a customized gd version, we would have to repeat most of the above plus specifying an alternative gd:
mygraphviz = import ./graphviz.nix {
  inherit mkDerivation fontconfig libjpeg bzip2;
  gd = customgd;
};
That's hard to maintain. Using callPackage it would be easier:
mygraphviz = callPackage ./graphviz.nix { gd = customgd; };
But we may still be diverging from the original graphviz in the repository.

We would like to avoid specifying the nix expression again, instead reuse the original graphviz attribute in the repository and add our overrides like this:
mygraphviz = graphviz.override { gd = customgd; };
The difference is obvious, as well as the advantages of this approach.

Note: that .override is not a "method" in the OO sense as you may think. Nix is a functional language. That .override is simply an attribute of a set.

The override implementation


I remind you, the graphviz attribute in the repository is the derivation returned by the function imported from graphviz.nix. We would like to add a further attribute named "override" to the returned set.

Let's start simple by first creating a function "makeOverridable" that takes a function and a set of original arguments to be passed to the function.

Contract: the wrapped function must return a set.

Let's write a lib.nix:
{
  makeOverridable = f: origArgs:
    let
      origRes = f origArgs;
    in
      origRes // { override = newArgs: f (origArgs // newArgs); };
}
So makeOverridable takes a function and a set of original arguments. It returns the original returned set, plus a new override attribute.
This override attribute is a function taking a set of new arguments, and returns the result of the original function called with the original arguments unified with the new arguments. What a mess.

Let's try it with nix-repl:
$ nix-repl
nix-repl> :l lib.nix
Added 1 variables.
nix-repl> f = { a, b }: { result = a+b; }
nix-repl> f { a = 3; b = 5; }
{ result = 8; }
nix-repl> res = makeOverridable f { a = 3; b = 5; }
nix-repl> res
{ override = «lambda»; result = 8; }
nix-repl> res.override { a = 10; }
{ result = 15; }
Note that the function f does not return the plain sum but a set, because of the contract. You didn't forget already, did you? :-)
The variable res is the result of the function call without any override. It's easy to see in the definition of makeOverridable. In addition you can see the new override attribute being a function.

Calling that .override with a set will invoke the original function with the overrides, as expected.
But: we can't override again! Because the returned set with result 15 does not have an override attribute!
That's bad, it breaks further compositions.

The solution is simple, the .override function should make the result overridable again:
rec {
  makeOverridable = f: origArgs:
    let
      origRes = f origArgs;
    in
      origRes // { override = newArgs: makeOverridable f (origArgs // newArgs); };
}
Please note the rec keyword. It's necessary so that we can refer to makeOverridable from makeOverridable itself.

Now let's try overriding twice:
nix-repl> :l lib.nix
Added 1 variables.
nix-repl> f = { a, b }: { result = a+b; }
nix-repl> res = makeOverridable f { a = 3; b = 5; }
nix-repl> res2 = res.override { a = 10; }
nix-repl> res2
{ override = «lambda»; result = 15; }
nix-repl> res2.override { b = 20; }
{ override = «lambda»; result = 30; }

Success! The result is 30, as expected because a is overridden to 10 in the first override, and b to 20.

Now it would be nice if callPackage made our derivations overridable. That was the goal of this pill after all. This is an exercise for the reader.

Conclusion


The "override" pattern simplifies the way we customize packages starting from an existing set of packages. This opens a world of possibilities about using a central repository like nixpkgs, and defining overrides on our local machine without even modifying the original package.

Dream of a custom isolated nix-shell environment for testing graphviz with a custom gd:
debugVersion (graphviz.override { gd = customgd; })
Once a new version of the overridden package comes out in the repository, the customized package will make use of it automatically.

The key in Nix is to find powerful yet simple abstractions in order to let the user customize his environment with highest consistency and lowest maintenance time, by using predefined composable components.

Next pill


...we will talk about Nix search paths. By search path I mean a place in the file system where Nix looks for expressions. You may have wondered, where does that holy <nixpkgs> come from?

Pill 15 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Thursday, September 04, 2014

Nix pill 13: the callPackage design pattern

Welcome to the 13th Nix pill. In the previous 12th pill we have introduced the first basic design pattern for organizing a repository of software. In addition we packaged graphviz to have at least another package for our little repository.

The next design pattern worth noting is what I'd like to call the callPackage pattern. This technique is extensively used in nixpkgs, it's the current standard for importing packages in a repository.

The callPackage convenience


In the previous pill, we underlined the fact that the inputs pattern is great to decouple packages from the repository, in that we can pass manually the inputs to the derivation. The derivation declares its inputs, and the caller passes the arguments.
However as with usual programming languages, we declare parameter names, and then we have to pass arguments. We do the job twice.
With package management, we often see common patterns. In the case of nixpkgs it's the following.

Some package derivation:
{ input1, input2, ... }:
...
Repository derivation:
rec {
  lib1 = import package1.nix { inherit input1 input2 ...; };
  program2 = import package1.nix { inherit inputX inputY lib1 ...; };
}
Where inputs may even be packages in the repository itself (note the rec keyword). The pattern here is clear, often inputs have the same name of the attributes in the repository itself. Our desire is to pass those inputs from the repository automatically, and in case be able to specify a particular argument (that is, override the automatically passed default argument).

To achieve this, we will define a callPackage function with the following synopsis:
{
  lib1 = callPackage package1.nix { };
  program2 = callPackage package2.nix { someoverride = overriddenDerivation; };
}
What should it do?
  • Import the given expression, which in turn returns a function.
  • Determine the name of its arguments.
  • Pass default arguments from the repository set, and let us override those arguments.

Implementing callPackage


First of all, we need a way to introspect (reflection or whatever) at runtime the argument names of a function. That's because we want to automatically pass such arguments.

Then callPackage requires access to the whole packages set, because it needs to find the packages to pass automatically.

We start off simple with nix-repl:
nix-repl> add = { a ? 3, b }: a+b
nix-repl> builtins.functionArgs add
{ a = true; b = false; }
Nix provides a builtin function to introspect the names of the arguments of a function. In addition, for each argument, it tells whether the argument has a default value or not. We don't really care about default values in our case. We are only interested in the argument names.

Now we need a set with all the values, let's call it values.  And a way to intersect the attributes of values with the function arguments:
nix-repl> values = { a = 3; b = 5; c = 10; }
nix-repl> builtins.intersectAttrs values (builtins.functionArgs add)
{ a = true; b = false; }
nix-repl> builtins.intersectAttrs (builtins.functionArgs add) values
{ a = 3; b = 5; }
Perfect, note from the example above that the intersectAttrs returns a set whose names are the intersection, and the attribute values are taken from the second set.

We're done, we have a way to get argument names from a function, and match with an existing set of attributes. This is our simple implementation of callPackage:
nix-repl> callPackage = set: f: f (builtins.intersectAttrs (builtins.functionArgs f) set)
nix-repl> callPackage values add
8
nix-repl> with values; add a b
8
Clearing up the syntax:
  • We define a callPackage variable which is a function.
  • First it accepts a set, and it returns another function accepting another parameter. In other words, let's simplify by saying it accepts two parameters.
  • The second parameter is the function to "autocall".
  • We take the argument names of the function and intersect with the set of all values.
  • Finally we call the passed function f with the resulting intersection.
In the code above, I've also shown that the callPackage call is equivalent to directly calling add a b.

We achieved what we wanted. Automatically call functions given a set of possible arguments. If an argument is not found in the set, that's nothing special. It's a function call with a missing parameter, and that's an error (unless the function has varargs ... as explained in the 5th pill).

Or not. We missed something. Being able to override some of the parameters. We may not want to always call functions with values taken from the big set.
Then we add a further parameter, which takes a set of overrides:
nix-repl> callPackage = set: f: overrides: f ((builtins.intersectAttrs (builtins.functionArgs f) set) // overrides)
nix-repl> callPackage values add { }
8
nix-repl> callPackage values add { b = 12; }
15
Apart from the increasing number of parenthesis, it should be clear that we simply do a set union between the default arguments, and the overriding set.


Use callPackage to simplify the repository


Given our brand new tool, we can simplify the repository expression (default.nix).
Let me write it down first:
let
  nixpkgs = import <nixpkgs> {};
  allPkgs = nixpkgs // pkgs;
  callPackage = path: overrides:
    let f = import path;
    in f ((builtins.intersectAttrs (builtins.functionArgs f) allPkgs) // overrides);
  pkgs = with nixpkgs; {
    mkDerivation = import ./autotools.nix nixpkgs;
    hello = callPackage ./hello.nix { };
    graphviz = callPackage ./graphviz.nix { };
    graphvizCore = callPackage ./graphviz.nix { gdSupport = false; };
  };
in pkgs
Wow, there's a lot to say here:
  • We renamed the old pkgs of the previous pill to nixpkgs. Our package set is now instead named pkgs. Sorry for the confusion.
  • We needed a way to pass pkgs to callPackage somehow. Instead of returning the set of packages directly from default.nix, we first assign it to a let variable and reuse it in callPackage.
  • For convenience, in callPackage we first import the file, instead of calling it directly. Otherwise for each package we would have to write the import.
  • Since our expressions use packages from nixpkgs, in callPackage we use allPkgs, which is the union of nixpkgs and our packages.
  • We moved mkDerivation in pkgs itself, so that it gets also passed automatically.
Note how easy is to override arguments in the case of graphviz without gd. But most importantly, how easy it was to merge two repositories: nixpkgs and our pkgs!

The reader should notice a magic thing happening. We're defining pkgs in terms of callPackage, and callPackage in terms of pkgs. That magic is possible thanks to lazy evaluation. 

Conclusion


The "callPackage" pattern has simplified a lot our repository. We're able to import packages that require some named arguments and call them automatically, given the set of all packages.

We've also introduced some useful builtin functions that allows us to introspect Nix functions and manipulate attributes. These builtin functions are not usually used when packaging software, rather to provide tools for packaging. That's why they are not documented in the nix manual.

Writing a repository in nix is an evolution of writing convenient functions for combining the packages. This demonstrates even more how nix is a generic tool to build and deploy something, and how suitable it is to create software repositories with your own conventions.

Next pill


...we will talk about the "override" design pattern. The graphvizCore seems straightforward. It starts from graphviz.nix and builds it without gd. Now I want to give you another point of view: what if we instead wanted to start from pkgs.graphviz and disable gd?

Pill 14 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Thursday, August 28, 2014

Nix pill 12: the inputs design pattern

Welcome to the 12th Nix pill. In the previous 11th pill we stopped packaging and cleaned up the system with the garbage collector.

We restart our packaging, but we will improve a different aspect. We only packaged an hello world program so far, what if we want to create a repository of multiple packages?

Repositories in Nix


Nix is a tool for build and deployment, it does not enforce any particular repository format. A repository of packages is the main usage for Nix, but not the only possibility. See it more like a consequence due to the need of organizing packages.
Nix is a language, and it is powerful enough to let you choose the format of your own repository. In this sense, it is not declarative, but functional.
There is no preset directory structure or preset packaging policy. It's all about you and Nix.

The nixpkgs repository has a certain structure, which evolved and evolves with the time. Like other languages, Nix has its own history and therefore I'd like to say that it also has its own design patterns. Especially when packaging, you often do the same task again and again except for different software. It's inevitable to identify patterns during this process. Some of these patterns get reused if the community thinks it's a good way to package the software.

Some of the patterns I'm going to show do not apply only to Nix, but to other systems of course.

The single repository pattern


Before introducing the "inputs" pattern, we can start talking about another pattern first which I'd like to call "single repository" pattern.
Systems like Debian scatter packages in several small repositories. Personally, this makes it hard to track interdependent changes and to contribute to new packages.
Systems like Gentoo instead, put package descriptions all in a single repository.

The nix reference for packages is nixpkgs, a single repository of all descriptions of all packages. I find this approach very natural and attractive for new contributions.

From now on, we will adopt this technique. The natural implementation in Nix is to create a top-level Nix expression, and one expression for each package. The top-level expression imports and combines all expressions in a giant attribute set with name -> package pairs.

But isn't that heavy? It isn't, because Nix is a lazy language, it evaluates only what's needed! And that's why nixpkgs is able to maintain such a big software repository in a giant attribute set.

Packaging graphviz


We have packaged GNU hello world, I guess you would like to package something else for creating at least a repository of two projects :-) . I chose graphviz, which uses the standard autotools build system, requires no patching and dependencies are optional.

Download graphviz from here. The graphviz.nix expression is straightforward:
let
  pkgs = import <nixpkgs> {};
  mkDerivation = import ./autotools.nix pkgs;
in mkDerivation {
  name = "graphviz";
  src = ./graphviz-2.38.0.tar.gz;
}
Build with nix-build graphviz.nix and you will get runnable binaries under result/bin. Notice how we did reuse the same autotools.nix of hello.nix. Let's create a simple png:
$ echo 'graph test { a -- b }'|result/bin/dot -Tpng -o test.png
Format: "png" not recognized. Use one of: canon cmap [...]
Oh of course... graphviz can't know about png. It built only the output formats it supports natively, without using any extra library.

I remind you, in autotools.nix there's a buildInputs variablewhich gets concatenated to baseInputs.  That would be the perfect place to add a build dependency. We created that variable exactly for this reason to be overridable from package expressions.

This 2.38 version of graphviz has several plugins to output png. For simplicity, we will use libgd.

Digression about gcc and ld wrappers


The gd, jpeg, fontconfig and bzip2 libraries (dependencies of gd) don't use pkg-config to specify which flags to pass to the compiler. Since there's no global location for libraries, we need to tell gcc and ld where to find includes and libs.

The nixpkgs provides gcc and binutils, and we are using them for our packaging. Not only, it also provides wrappers for them which allow passing extra arguments to gcc and ld, bypassing the project build systems:
  • NIX_CFLAGS_COMPILE: extra flags to gcc at compile time
  • NIX_LDFLAGS: extra flags to ld
What can we do about it? We can employ the same trick we did for PATH: automatically filling the variables from buildInputs. This is the relevant snippet of setup.sh:
for p in $baseInputs $buildInputs; do
  if [ -d $p/bin ]; then
    export PATH="$p/bin${PATH:+:}$PATH"
  fi
  if [ -d $p/include ]; then
    export NIX_CFLAGS_COMPILE="-I $p/include${NIX_CFLAGS_COMPILE:+ }$NIX_CFLAGS_COMPILE"
  fi
  if [ -d $p/lib ]; then
    export NIX_LDFLAGS="-rpath $p/lib -L $p/lib${NIX_LDFLAGS:+ }$NIX_LDFLAGS"
  fi
done
Now by adding derivations to buildInputs, will add the lib, include and bin paths automatically in setup.sh.

The -rpath flag in ld is needed because at runtime, the executable must use exactly that version of the library.
If unneeded paths are specified, the fixup phase will shrink the rpath for us!

Completing graphviz with gd


Finish the expression for graphviz with gd support (note the use of the with expression in buildInputs to avoid repeating pkgs):
let
  pkgs = import <nixpkgs> {};
  mkDerivation = import ./autotools.nix pkgs;
in mkDerivation {
  name = "graphviz";
  src = ./graphviz-2.38.0.tar.gz;
  buildInputs = with pkgs; [ gd fontconfig libjpeg bzip2 ];
}
Now you can create the png! Ignore any error from fontconfig, especially if you are in a chroot.

The repository expression


Now that we have two packages, what's a good way to put them together in a single repository? We do something like nixpkgs does. With nixpkgs, we import it and then we peek derivations by accessing the giant attribute set.
For us nixers, this a good technique, because it abstracts from the file names. We don't refer to a package by REPO/some/sub/dir/package.nix but by importedRepo.package (or pkgs.package in our examples).

Create a default.nix in the current directory:
{
  hello = import ./hello.nix; 
  graphviz = import ./graphviz.nix;
}
Ready to use! Try it with nix-repl:
$ nix-repl
nix-repl> :l default.nix
Added 2 variables.
nix-repl> hello
«derivation /nix/store/dkib02g54fpdqgpskswgp6m7bd7mgx89-hello.drv»
nix-repl> graphviz
«derivation /nix/store/zqv520v9mk13is0w980c91z7q1vkhhil-graphviz.drv»
With nix-build:
$ nix-build default.nix -A hello
[...]
$ result/bin/hello
Hello, world!
The -A argument is used to access an attribute of the set from the given .nix expression.
Important: why did we choose the default.nix? Because when a directory (by default the current directory) has a default.nix, that default.nix will be used (see import here). In fact you can run nix-build -A hello without specifying default.nix. 
For pythoners, it is similar to __init__.py.

With nix-env, to install the package in your user environment:
$ nix-env -f . -iA graphviz
[...]
$ dot -V
The -f option is used to specify the expression to use, in this case the current directory, therefore ./default.nix.
The -i stands for installation.
The -A is the same as above for nix-build.

We reproduced the very basic behavior of nixpkgs.

The inputs pattern


After a long preparation, we finally arrived. I know you have a big doubt in this moment. It's about the hello.nix and graphviz.nix. They are very, very dependent on nixpkgs:
  • First big problem: they import nixpkgs directly. In autotools.nix instead we pass nixpkgs as an argument. That's a much better approach.
  • Second problem: what if we want a variant of graphviz without libgd support?
  • Third problem: what if we want to test graphviz with a particular libgd version?
The current answer to the above questions is: change the expression to match your needs (or change the callee to match your needs).
With the inputs pattern, we choose to give another answer: let the user change the inputs of the expression (or change the caller to pass different inputs).

By inputs of an expression, we refer to the set of derivations needed to build that expression. In this case:
  • mkDerivation from autotools. Recall that mkDerivation has an implicit dependency on the toolchain.
  • libgd and its dependencies.
The src is also an input but it's pointless to change the source from the caller. For version bumps, in nixpkgs we prefer to write another expression (e.g. because patches are needed or different inputs are needed).

Goal: make package expressions independent of the repository.

How do we achieve that? The answer is simple: use functions to declare inputs for a derivation. Doing it for graphviz.nix, will make the derivation independent of the repository and customizable:
{ mkDerivation, gdSupport ? true, gd, fontconfig, libjpeg, bzip2 }:

mkDerivation {
  name = "graphviz";
  src = ./graphviz-2.38.0.tar.gz;
  buildInputs = if gdSupport then [ gd fontconfig libjpeg bzip2 ] else [];
}
I recall that "{...}: ..." is the syntax for defining functions accepting an attribute set as argument.

We made gd and its dependencies optional. If gdSupport is true (by default), we will fill buildInputs and thus graphviz will be built with gd support, otherwise it won't.
Now back to default.nix:
let
  pkgs = import <nixpkgs> {};
  mkDerivation = import ./autotools.nix pkgs;
in with pkgs; {
  hello = import ./hello.nix { inherit mkDerivation; }; 
  graphviz = import ./graphviz.nix { inherit mkDerivation gd fontconfig libjpeg bzip2; };
  graphvizCore = import ./graphviz.nix {
    inherit mkDerivation gd fontconfig libjpeg bzip2;
    gdSupport = false;
  };
}
So we factorized the import of nixpkgs and mkDerivation, and also added a variant of graphviz with gd support disabled. The result is that both hello.nix (exercise for the reader) and graphviz.nix are independent of the repository and customizable by passing specific inputs.
If you wanted to build graphviz with a specific version of gd, it would suffice to pass gd = ...;.
If you wanted to change the toolchain, you may pass a different mkDerivation function.

Clearing up the syntax:
  • In the end we return an attribute set from default.nix. With "let" we define some local variables.
  • We bring pkgs into the scope when defining the packages set, which is very convenient instead of typing everytime "pkgs".
  • We import hello.nix and graphviz.nix, which will return a function, and call it with a set of inputs to get back the derivation.
  • The "inherit x" syntax is equivalent to "x = x". So "inherit gd" here, combined to the above "with pkgs;" is equivalent to "x = pkgs.gd".
You can find the whole repository at the pill 12 gist.

Conclusion


The "inputs" pattern allows our expressions to be easily customizable through a set of arguments. These arguments could be flags, derivations, or whatelse. Our package expressions are functions, don't think there's any magic in there.

It also makes the expressions independent of the repository. Given that all the needed information is passed through arguments, it is possible to use that expression in any other context.

Next pill


...we will talk about the "callPackage" design pattern. It is tedious to specify the names of the inputs twice, once in the top-level default.nix, and once in the package expression. With callPackage, we will implicitly pass the necessary inputs from the top-level expression.

Pill 13 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Thursday, August 21, 2014

Nix pill 11: the garbage collector

Welcome to the 11th Nix pill. In the previous 10th pill we managed to obtain a self-contained environment for developing a project. The concept is that whereas nix-build is able to build a derivation in isolation, nix-shell is able to drop us in a shell with (almost) the same environment used by nix-build. This allows us to debug, modify and manually build software.

Today we stop packaging and look at a mandatory nix component, the garbage collector. When using nix tools, often derivations are built. This include both .drv files and out paths. These artifacts go in the nix store, and we never cared about deleting them until now.

How does it work


Other package managers, like dpkg, have somehow a way to remove unused software. However, nix is much more precise compared to other systems.
I bet with dpkg, rpm or anything else, you end up with having some unnecessary package installed or dangling files. With nix this does not happen.

How do we determine whether a store path is still needed? The same way programming languages with a garbage collector decide whether an object is still alive.

Programming languages with a garbage collector have an important concept in order to keep track of live objects: GC roots. A GC root is an object that is always alive (unless explicitly removed as GC root). All objects recursively referred to by a GC root are live.

Therefore, the garbage collection process starts from GC roots, and recursively mark referenced objects as live. All other objects can be collected and deleted.

In Nix there's this same concept. Instead of being objects, of course, GC roots are store paths. The implementation is very simple and transparent to the user. GC roots are stored under /nix/var/nix/gcroots. If there's a symlink to a store path, then that store path is a GC root.
Nix allows this directory to have subdirectories: it will simply recurse directories in search of symlinks to store paths.

So we have a list of GC roots. At this point, deleting dead store paths is as easy as you can imagine. We have the list of all live store paths, hence the rest of the store paths are dead.
In particular, Nix first moves dead store paths to /nix/store/trash which is an atomic operation. Afterwards, the trash is emptied.

Playing with the GC


Before playing with the GC, first run the nix garbage collector once, so that we have a cleaned up playground for our experiments:
$ nix-collect-garbage
finding garbage collector roots...
[...]
deleting unused links...
note: currently hard linking saves -0.00 MiB
1169 store paths deleted, 228.43 MiB freed
Perfect, if you run it again it won't find anything new to delete, as expected.
What's left in the nix store is everything being referenced from the GC roots.

Let's install for a moment bsd-games:
$ nix-env -iA nixpkgs.bsdgames
$ readlink -f `which fortune`
/nix/store/b3lxx3d3ggxcggvjw5n0m1ya1gcrmbyn-bsd-games-2.17/bin/fortune
$ nix-store -q --roots `which fortune`
/nix/var/nix/profiles/default-9-link
$ nix-env --list-generations
[...]
   9   2014-08-20 12:44:14   (current)
The nix-store command can be used to query the GC roots that refer to a given derivation. In this case, our current user environment does refer to bsd-games.

Now remove it, collect garbage and note that bsd-games is still in the nix store:
$ nix-env -e bsd-games
uninstalling `bsd-games-2.17'
$ nix-collect-garbage
[...]
$ ls /nix/store/b3lxx3d3ggxcggvjw5n0m1ya1gcrmbyn-bsd-games-2.17
bin  share
That's because the old generation is still in the nix store because it's a GC root. As we'll see below, all profiles and their generations are GC roots.

Removing a GC root is simple. Let's try deleting the generation that refers to bsd-games, collect garbage, and note that now bsd-games is no longer in the nix store:
$ rm /nix/var/nix/profiles/default-9-link
$ nix-env --list-generations
[...]
   8   2014-07-28 10:23:24   
  10   2014-08-20 12:47:16   (current)
$ nix-collect-garbage
[...]
$ ls /nix/store/b3lxx3d3ggxcggvjw5n0m1ya1gcrmbyn-bsd-games-2.17
ls: cannot access /nix/store/b3lxx3d3ggxcggvjw5n0m1ya1gcrmbyn-bsd-games-2.17: No such file or directory
Note: nix-env --list-generations does not rely on any particular metadata. It is able to list generations based solely on the file names under the profiles directory.

However we removed the link from /nix/var/nix/profiles, not from /nix/var/nix/gcroots. Turns out, that /nix/var/nix/gcroots/profiles is a symlink to /nix/var/nix/profiles. That is very handy. It means any profile and its generations are GC roots.

It's as simple as that, anything under /nix/var/nix/gcroots is a GC root. And anything not being garbage collected is because it's referred from one of the GC roots.

Indirect roots


I remind you that building the GNU hello world package with nix-build produces a result symlink in the current directory. Despite the collected garbage done above, the hello program is still working: therefore it has not been garbage collected. Clearly, since there's no other derivation that depends upon the GNU hello world package, it must be a GC root.

In fact, nix-build automatically adds the result symlink as a GC root. Yes, not the built derivation, but the symlink. These GC roots are added under /nix/var/nix/gcroots/auto .
$ ls -l /nix/var/nix/gcroots/auto/
total 8
drwxr-xr-x 2 nix nix 4096 Aug 20 10:24 ./
drwxr-xr-x 3 nix nix 4096 Jul 24 10:38 ../
lrwxrwxrwx 1 nix nix   16 Jul 31 10:51 xlgz5x2ppa0m72z5qfc78b8wlciwvgiz -> /home/nix/result/
Don't care about the name of the symlink. What's important is that a symlink exists that point to /home/nix/result. This is called an indirect GC root. That is, the GC root is effectively specified outside of /nix/var/nix/gcroots. Whatever result points to, it will not be garbage collected.

How do we remove the derivation then? There are two possibilities:
  • Remove the indirect GC root from /nix/var/nix/gcroots/auto.
  • Remove the result symlink.
In the first case, the derivation will be deleted from the nix store, and result becomes a dangling symlink. In the second case, the derivation is removed as well as the indirect root in /nix/var/nix/gcroots/auto .
Running nix-collect-garbage after deleting the GC root or the indirect GC root, will remove the derivation from the store.

Cleanup everything


What's the main source of software duplication in the nix store? Clearly, GC roots due to nix-build and profile generations. Doing a nix-build results in a GC root for a build that somehow will refer to a specific version of glibc, and other libraries. After an upgrade, if that build is not deleted by the user, it will not be garbage collected. Thus the old dependencies referred to by the build will not be deleted either.

Same goes for profiles. Manipulating the nix-env profile will create further generations. Old generations refer to old software, thus increasing duplication in the nix store after an upgrade.

What are the basic steps for upgrading and removing everything old, including old generations? In other words, do an upgrade similar to other systems, where they forget everything about the older state:
$ nix-channel --update
$ nix-env -u --always
$ rm /nix/var/nix/gcroots/auto/*
$ nix-collect-garbage -d
First, we download a new version of the nixpkgs channel, which holds the description of all the software. Then we upgrade our installed packages with nix-env -u. That will bring us into a fresh new generation with all updated software.

Then we remove all the indirect roots generated by nix-build: beware, this will result in dangling symlinks. You may be smarter and also remove the target of those symlinks.

Finally, the -d option of nix-collect-garbage is used to delete old generations of all profiles, then collect garbage. After this, you lose the ability to rollback to any previous generation. So make sure the new generation is working well before running the command.

Conclusion


Garbage collection in Nix is a powerful mechanism to cleanup your system. The nix-store commands allows us to know why a certain derivation is in the nix store.

Cleaning up everything down to the oldest bit of software after an upgrade seems a bit contrived, but that's the price of having multiple generations, multiple profiles, multiple versions of software, thus rollbacks etc.. The price of having many possibilities.

Next pill


...we will package another project and introduce what I call the "inputs" design pattern. We only played with a single derivation until now, however we'd like to start organizing a small repository of software. The "inputs" pattern is widely used in nixpkgs; it allows us to decouple derivations from the repository itself and increase customization opportunities.

Pill 12 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Tuesday, August 19, 2014

Nix pill 10: developing with nix-shell

Welcome to the 10th Nix pill. In the previous 9th pill we have seen one of the powerful features of nix, automatic discovery of runtime dependencies and finalized the GNU hello world package.

In return from vacation, we want to hack a little the GNU hello world program. The nix-build tool creates an isolated environment for building the derivation, we want to do the same in order to modify some source files of the project.

What's nix-shell


The nix-shell tool drops us in a shell by setting up the necessary environment variables to hack a derivation. It does not build the derivation, it only serves as a preparation so that we can run the build steps manually.
I remind you, in a nix environment you don't have access to libraries and programs unless you install them with nix-env. However installing libraries with nix-env is not good practice. We prefer to have isolated environments for development.
$ nix-shell hello.nix
[nix-shell]$ make
bash: make: command not found
[nix-shell]$ echo $baseInputs
/nix/store/jff4a6zqi0yrladx3kwy4v6844s3swpc-gnutar-1.27.1 [...]
First thing to notice, we call nix-shell on a nix expression which returns a derivation. We then enter a new bash shell, but it's really useless. We expected to have the GNU hello world build inputs available in PATH, including GNU make, but it's not the case.
But, we have the environment variables that we set in the derivation, like $baseInputs, $buildInputs, $src and so on.

That means we can source our builder.sh, and it will build the derivation. You may get an error in the installation phase, because the user may not have the permission to write to /nix/store:
[nix-shell]$ source builder.sh
...
It didn't install, but it built. Things to notice:
  • We sourced builder.sh, therefore it ran all the steps including setting up the PATH for us.
  • The working directory is no more a temp directory created by nix-build, but the current directory. Therefore, hello-2.9 has been unpacked there.
We're able to cd into hello-2.9 and type make, because now it's available.

In other words, nix-shell drops us in a shell with the same (or almost) environment used to run the builder!

A builder for nix-shell


The previous steps are a bit annoying of course, but we can improve our builder to be more nix-shell friendly.

First of all, we were able to source builder.sh because it was in our current directory, but that's not nice. We want the builder.sh that is stored in the nix store, the one that would be used by nix-build. To do so, the right way is to pass the usual environment variable through the derivation.
Note: $builder is already defined, but it's the bash executable, not our builder.sh. Our builder.sh is an argument to bash.

Second, we don't want to run the whole builder, we only want it to setup the necessary environment for manually building the project. So we'll write two files, one for setting up the environment, and the real builder.sh that runs with nix-build.

Additionally, we'll wrap the phases in functions, it may be useful, and move the set -e to the builder instead of the setup. The set -e is annoying in nix-shell.

The codebase is becoming a little long. You can find all the files in this nixpill10 gist.
Noteworthy is the setup = ./setup.sh; attribute in the derivation, which adds setup.sh to the nix store and as usual, adds a $setup environment variable in the builder.
Thanks to that, we can split builder.sh into setup.sh and builder.sh. What builder.sh does is sourcing $setup and calling the genericBuild function. Everything else is just some bash changes.

Now back to nix-shell:
$ nix-shell hello.nix
[nix-shell]$ source $setup
[nix-shell]$
Now you can run, for example, unpackPhase which unpacks $src and enters the directory. And you can run commands like ./configure, make etc. manually, or run phases with their respective functions.

It's all that straight, nix-shell builds the .drv file and its input dependencies, then drops into a shell by setting up the environment variables necessary to build the .drv, in particular those passed to the derivation function.

Conclusion


With nix-shell we're table to drop into an isolated environment for developing a project, with the necessary dependencies just like nix-build does, except we can build and debug the project manually, step by step like you would do in any other operating system.
Note that we did never install gcc, make, etc. system-wide. These tools and libraries are available per-build.


Next pill


...we will clean up the nix store. We wrote and built derivations, added stuff to nix store, but until now we never worried about cleaning up the used space in the store. It's time to collect some garbage.

Pill 11 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.

Friday, August 08, 2014

Nix pill 9: automatic runtime dependencies

Welcome to the 9th Nix pill. In the previous 8th pill we wrote a generic builder for autotools projects. We feed build dependencies, a source tarball, and we get a Nix derivation as a result.

Today we stop by the GNU hello world program to analyze build and runtime dependencies, and enhance the builder in order to avoid unnecessary runtime dependencies.

Build dependencies


Let's start analyzing build dependencies for our GNU hello world package:
$ nix-instantiate hello.nix
/nix/store/z77vn965a59irqnrrjvbspiyl2rph0jp-hello.drv
$ nix-store -q --references /nix/store/z77vn965a59irqnrrjvbspiyl2rph0jp-hello.drv
/nix/store/0q6pfasdma4as22kyaknk4kwx4h58480-hello-2.9.tar.gz
/nix/store/1zcs1y4n27lqs0gw4v038i303pb89rw6-coreutils-8.21.drv
/nix/store/2h4b30hlfw4fhqx10wwi71mpim4wr877-gnused-4.2.2.drv
/nix/store/39bgdjissw9gyi4y5j9wanf4dbjpbl07-gnutar-1.27.1.drv
/nix/store/7qa70nay0if4x291rsjr7h9lfl6pl7b1-builder.sh
/nix/store/g6a0shr58qvx2vi6815acgp9lnfh9yy8-gnugrep-2.14.drv
/nix/store/jdggv3q1sb15140qdx0apvyrps41m4lr-bash-4.2-p45.drv
/nix/store/pglhiyp1zdbmax4cglkpz98nspfgbnwr-gnumake-3.82.drv
/nix/store/q9l257jn9lndbi3r9ksnvf4dr8cwxzk7-gawk-4.1.0.drv
/nix/store/rgyrqxz1ilv90r01zxl0sq5nq0cq7v3v-binutils-2.23.1.drv
/nix/store/qzxhby795niy6wlagfpbja27dgsz43xk-gcc-wrapper-4.8.3.drv
/nix/store/sk590g7fv53m3zp0ycnxsc41snc2kdhp-gzip-1.6.drv
It has exactly the derivations referenced in the derivation function, nothing more, nothing less. Some of them might not be used at all, however given that our generic mkDerivation function always pulls such dependencies (think of it like build-essential of Debian), for every package you build from now on, you will have these packages in the nix store.

Why are we looking at .drv files? Because the hello.drv file is the representation of the build action to perform in order to build the hello out path, and as such it also contains the input derivations needed to be built before building hello.

Digression about NAR files


NAR is the Nix ARchive. First question: why not tar? Why another archiver? Because commonly used archivers are not deterministic. They add padding, they do not sort files, they add timestamps, etc.. Hence NAR, a very simple deterministic archive format being used by Nix for deployment.
NARs are also used extensively within Nix itself as we'll see below.

For the rationale and implementation details you can find more in the Dolstra's PhD Thesis.

To create NAR archives, it's possible to use nix-store --dump and nix-store --restore. Those two commands work regardless of /nix/store.

Runtime dependencies


Something is different for runtime dependencies however. Build dependencies are automatically recognized by Nix once they are used in any derivation call, but we never specify what are the runtime dependencies for a derivation.

There's really black magic involved. It's something that at first glance makes you think "no, this can't work in the long term", but at the same it works so well that a whole operating system is built on top of this magic.

In other words, Nix automatically computes all the runtime dependencies of a derivation, and it's possible thanks to the hash of the store paths.

Steps:
  1. Dump the derivation as NAR, a serialization of the derivation output. Works fine whether it's a single file or a directory.
  2. For each build dependency .drv and its relative out path, search the contents of the NAR for this out path.
  3. If found, then it's a runtime dependency.

You get really all the runtime dependencies, and that's why Nix deployments are so easy.
$ nix-instantiate hello.nix
/nix/store/z77vn965a59irqnrrjvbspiyl2rph0jp-hello.drv
$ nix-store -r /nix/store/z77vn965a59irqnrrjvbspiyl2rph0jp-hello.drv
/nix/store/a42k52zwv6idmf50r9lps1nzwq9khvpf-hello
$ nix-store -q --references /nix/store/a42k52zwv6idmf50r9lps1nzwq9khvpf-hello
/nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19
/nix/store/8jm0wksask7cpf85miyakihyfch1y21q-gcc-4.8.3
/nix/store/a42k52zwv6idmf50r9lps1nzwq9khvpf-hello
Ok glibc and gcc. Well, gcc really should not be a runtime dependency!
$ strings result/bin/hello|grep gcc
/nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19/lib:/nix/store/8jm0wksask7cpf85miyakihyfch1y21q-gcc-4.8.3/lib64
Oh Nix added gcc because its out path is mentioned in the "hello" binary. Why is that? That's the ld rpath. It's the list of directories where libraries can be found at runtime. In other distributions, this is usually not abused. But in Nix, we have to refer to particular versions of libraries, thus the rpath has an important role.

The build process adds that gcc lib path thinking it may be useful at runtime, but really it's not. How do we get rid of it? Nix authors have written another magical tool called patchelf, which is able to reduce the rpath to the paths that are really used by the binary.

Not only, even after reducing the rpath the hello binary would still depend upon gcc. Because of debugging information. For that, the well known strip can be used.

Another phase in the builder


We will add a new phase to our autotools builder. The builder has these phases already:
  1. First the environment is set up
  2. Unpack phase: we unpack the sources in the current directory (remember, Nix changes dir to a temporary directory first)
  3. Change source root to the directory that has been unpacked
  4. Configure phase: ./configure
  5. Build phase: make
  6. Install phase: make install
We add a new phase after the installation phase, which we call fixup phase. At the end of the builder.sh follows:
find $out -type f -exec patchelf --shrink-rpath '{}' \; -exec strip '{}' \; 2>/dev/null
That is, for each file we run patchelf --shrink-rpath and strip. Note that we used two new commands here, find and patchelf. These two deserve a place in baseInputs of autotools.nix as findutils and patchelf.

Rebuild hello.nix and...:
$ nix-build hello.nix
[...]
$ nix-store -q --references result
/nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19
/nix/store/md4a3zv0ipqzsybhjb8ndjhhga1dj88x-hello
...only glibc is the runtime dependency. Exactly what we wanted.

The package is self-contained, copy its closure on another machine and you will be able to run it. I remind you the very few components under the /nix/store necessary to run nix when we installed it. The hello binary will use that exact version of glibc library and interpreter, not the system one:
$ ldd result/bin/hello
 linux-vdso.so.1 (0x00007fff11294000)
 libc.so.6 => /nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19/lib/libc.so.6 (0x00007f7ab7362000)
 /nix/store/94n64qy99ja0vgbkf675nyk39g9b978n-glibc-2.19/lib/ld-linux-x86-64.so.2 (0x00007f7ab770f000)
Of course, the executable runs fine as long as everything is under the /nix/store path.

Conclusion


Short post compared to previous ones as I'm still on vacation, but I hope you enjoyed it. Nix provides tools with cool features. In particular, Nix is able to compute all runtime dependencies automatically for us. Not only shared libraries, but also referenced executables, scripts, Python libraries etc..

This makes packages self-contained, because we're sure (apart data and configuration) that copying the runtime closure on another machine is sufficient to run the program. That's why Nix has one-click install, or reliable deployment in the cloud. All with one tool.


Next pill


...we will introduce nix-shell. With nix-build we build derivations always from scratch: the source gets unpacked, configured, built and installed. But this may take a long time, think of WebKit. What if we want to apply some small changes and compile incrementally instead, yet keeping a self-contained environment similar to nix-build?

Pill 10 is available for reading here.

To be notified about the new pill, stay tuned on #NixPills, follow @lethalman or subscribe to the nixpills rss.