Introducing nix-casync, a more efficient way to store and substitute Nix store paths

I was thinking about how to improve Nix binary cache substitution and storage requirements. In locations with spotty internet connectivity, for example while travelling, upgrading my system to the latest version or downloading something via nix-shell proved challenging.

Even simple fetches can take minutes, while most of the contents should already exist somewhere on my local system.

I had some ideas about how to use the casync mechanism to only transfer/store the blocks that have changed. Now, during the NixOS Oceansprint 2021, I finally found the time to write a first implementation.

This blog post introduces nix-casync, an HTTP binary cache that uses the casync mechanism internally to store NAR files efficiently in a deduplicated fashion, and provides an outlook on how to use it to speed up substitution.

NAR (Nix Archive) and Narinfo files

Let’s have a quick look at what a Nix binary cache usually looks like:

Narinfo files

The lookup of store paths (/nix/store/${outputhash}-${name}) in the binary cache involves fetching a ${outputhash}.narinfo file for each path.

This Narinfo file contains some basic metadata about the store path in question, such as: name, checksums, references, and usually signatures.

It also specifies the path to a NAR file holding the contents of that store path.
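To make the metadata concrete, here is a minimal sketch of reading such a file. The field names (StorePath, URL, Compression, NarHash, NarSize, References, Sig) are the ones Nix writes; the store path, hashes, cache name and signature below are made-up placeholders, and `parse_narinfo` is a hypothetical helper, not part of Nix or nix-casync.

```python
# All values below are placeholders for illustration, not real cache entries.
EXAMPLE_NARINFO = """\
StorePath: /nix/store/00000000000000000000000000000000-hello-2.10
URL: nar/1111111111111111111111111111111111111111111111111111.nar.xz
Compression: xz
NarHash: sha256:2222222222222222222222222222222222222222222222222222
NarSize: 206000
References: 00000000000000000000000000000000-hello-2.10 33333333333333333333333333333333-glibc-2.33
Sig: cache.example.org-1:ZmFrZS1zaWduYXR1cmU=
"""


def parse_narinfo(text):
    """Parse 'Key: value' lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only; values such as NarHash contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "References":
            info[key] = value.split()          # space-separated store path names
        elif key == "Sig":
            info.setdefault(key, []).append(value)  # there may be several signatures
        elif key in ("NarSize", "FileSize"):
            info[key] = int(value)
        else:
            info[key] = value
    return info


info = parse_narinfo(EXAMPLE_NARINFO)
print(info["URL"], info["NarSize"], len(info["References"]))
```

The `URL` field is what points at the NAR file described in the next section.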

NAR files

NAR files are binary blobs, similar to .tar archives, but with very little metadata. They only store the file attributes that can appear in a Nix store (no timestamps, only executable bit/symlink information).

Usually, in binary caches, these NAR files are placed in a location named after the hash of their contents (nar/$narhash.nar), thus making them a content-addressed NAR storage. Optionally, those can be compressed.

Multiple Nix store paths can have the same contents. In these cases, different .narinfo files point to the same NAR file, providing some basic deduplication.

However, this only works for the simplest cases. In reality, most Nix store paths contain references to other Nix store paths, and a single changed bit will cause the whole NAR file to not get deduplicated at all.

This will propagate to everything referencing that store path, leading to very bad deduplication for world rebuilds.
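The limits of whole-file content addressing can be shown in a few lines. This is only a sketch: `nar_location` is a hypothetical helper, and it uses a hex SHA-256 digest where a real cache would use Nix's own hash encoding.

```python
import hashlib


def nar_location(nar_bytes):
    """Name a NAR after the hash of its entire contents (hex here for simplicity)."""
    return "nar/" + hashlib.sha256(nar_bytes).hexdigest() + ".nar"


nar_a = b"\x00" * 1_000_000
nar_a_copy = bytes(nar_a)          # identical contents from another store path

flipped = bytearray(nar_a)
flipped[500_000] ^= 0x01           # flip a single bit in the middle
nar_b = bytes(flipped)

print(nar_location(nar_a) == nar_location(nar_a_copy))  # same file, stored once
print(nar_location(nar_a) == nar_location(nar_b))       # different file, stored twice
```

Identical contents map to the same location, but the one-bit variant gets a completely different name, so nearly a megabyte of shared data is stored (and transferred) a second time.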

This basic deduplication also only helps reduce storage requirements - it doesn’t prevent NAR files from being downloaded again. As of now, on substitution, Nix doesn’t really recognize that it already has the same contents somewhere else in the store, and downloads them again (see Nix issue #5756).

All in all, right now, the binary cache format is pretty wasteful, both in on-disk storage and in the substitution of store paths.

How to fix this?

Basically, the content-addressing scheme should be performed on smaller chunks of data.

Guix started experimenting with hashing individual files, and content-addressing them, but this would still bust deduplication on trivial differences in individual files.

By chunking the whole stream at boundaries picked by a rolling hash algorithm, and content-addressing each resulting chunk, it should be possible to get even better deduplication rates.

This is the mechanism tools like casync use. People interested in the details might want to read Lennart’s blog post which describes the format in much more detail.
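The idea can be sketched in plain Python. This is not casync's or desync's implementation: a simple polynomial rolling hash stands in for whatever those tools use, and the window size, mask and size limits are arbitrary assumptions chosen to keep the demo small.

```python
import hashlib
import random

WINDOW = 48               # bytes covered by the rolling hash (assumed value)
BASE = 257
MOD = (1 << 61) - 1
MASK = (1 << 11) - 1      # cut when the low 11 bits are zero: ~2 KiB average
MIN_SIZE = 256            # never cut before this many bytes
MAX_SIZE = 16384          # always cut at this many bytes
POW_OUT = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def chunk(data):
    """Split data at content-defined boundaries using a rolling hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        pos = i - start                   # offset inside the current chunk
        if pos >= WINDOW:                 # slide: drop the oldest byte
            h = (h - data[i - WINDOW] * POW_OUT) % MOD
        h = (h * BASE + byte) % MOD       # add the newest byte
        size = pos + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


def chunk_id(chunk_bytes):
    """Content address of a chunk (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(chunk_bytes).hexdigest()


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(100_000))
modified = original[:10] + b"!" + original[10:]   # insert one byte near the start

old_ids = {chunk_id(c) for c in chunk(original)}
new_chunks = chunk(modified)
reused = sum(1 for c in new_chunks if chunk_id(c) in old_ids)
print(f"{reused} of {len(new_chunks)} chunks already known")
```

Because a cut point depends only on the last few bytes seen, the boundaries line up again shortly after the inserted byte, and most chunks of the modified stream are already known. With fixed-size blocks, the same insertion would shift every later block and defeat deduplication.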

Building a new binary cache

In addition to S3 and the local filesystem, Nix supports uploading to an HTTP binary cache that (essentially) implements the following interface[1]:

HTTP PUT /$nixhash.narinfo
HTTP PUT /nar/$filehash.nar

nix-casync implements an HTTP frontend providing this interface (plus the GET/HEAD bits to get files out of it).
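As an illustration of the two upload calls, the following sketch only builds the requests with Python's standard library; nothing is sent. The base URL, the hash values and the helper names are hypothetical.

```python
import urllib.request

BASE_URL = "http://localhost:9000"   # hypothetical address of a binary cache


def narinfo_put(base_url, nix_hash, narinfo_text):
    """Build the request for: HTTP PUT /$nixhash.narinfo"""
    return urllib.request.Request(
        f"{base_url}/{nix_hash}.narinfo",
        data=narinfo_text.encode(),
        method="PUT",
    )


def nar_put(base_url, file_hash, nar_bytes):
    """Build the request for: HTTP PUT /nar/$filehash.nar"""
    return urllib.request.Request(
        f"{base_url}/nar/{file_hash}.nar",
        data=nar_bytes,
        method="PUT",
    )


# Placeholder hashes and payloads, for illustration only.
req_info = narinfo_put(BASE_URL, "0" * 32, "StorePath: /nix/store/placeholder\n")
req_nar = nar_put(BASE_URL, "1" * 52, b"placeholder NAR payload")
print(req_info.get_method(), req_info.full_url)
print(req_nar.get_method(), req_nar.full_url)

# To actually upload, pass a request to urllib.request.urlopen(req).
```

Fetching uses the same paths with GET or HEAD instead of PUT.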

Internally, for each uploaded NAR file, it chunks the payload, adds a caibx[2] file to the “index store”, and puts new chunks into a “chunk store”, consisting of (zstd-compressed) chunks.

When the same NAR file is requested for download, the caibx file is read from the “index store”, and all chunks it refers to are fetched from the “chunk store” and reassembled.
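The store and reassemble steps can be modelled in memory as follows. This is a toy under several stated assumptions, not how nix-casync or desync are implemented: a plain list of chunk IDs stands in for the caibx format, zlib stands in for zstd, SHA-256 is used as the chunk ID, and fixed-size splitting replaces content-defined chunking to keep the example short.

```python
import hashlib
import zlib


class ChunkStore:
    """Maps chunk ID -> compressed chunk; each distinct chunk is stored once."""

    def __init__(self):
        self.chunks = {}

    def put(self, chunk):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in self.chunks:          # only new chunks cost space
            self.chunks[chunk_id] = zlib.compress(chunk)
        return chunk_id

    def get(self, chunk_id):
        return zlib.decompress(self.chunks[chunk_id])


def split_fixed(data, size=4096):
    """Stand-in splitter; a real store would use content-defined chunking."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def store_nar(name, nar_bytes, index_store, chunk_store):
    """Upload path: chunk the payload and record the ordered chunk IDs."""
    index_store[name] = [chunk_store.put(c) for c in split_fixed(nar_bytes)]


def load_nar(name, index_store, chunk_store):
    """Download path: look up the index and reassemble from the chunk store."""
    return b"".join(chunk_store.get(cid) for cid in index_store[name])


index_store, chunk_store = {}, ChunkStore()
nar_a = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
nar_b = b"A" * 4096 + b"B" * 4096 + b"D" * 4096   # shares two chunks with nar_a

store_nar("a.nar", nar_a, index_store, chunk_store)
store_nar("b.nar", nar_b, index_store, chunk_store)
print(len(chunk_store.chunks), "chunks stored for 6 chunks referenced")
```

Both NAR files come back byte for byte from `load_nar`, while the two shared chunks are kept only once.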

Under the hood, it uses desync, a pure Golang casync library. Kudos to @folbricht for this library!

Narinfo

Right now, .narinfo files are simply stored in another directory. Moving this to something smarter, to facilitate garbage collection, is planned.

As the above chunking mechanism still ends up serving the same NAR files, existing signatures can be preserved[3].
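Why signatures survive can be illustrated with the fields that are signed: store path, NarHash, NarSize and References. The exact string layout below is an assumption modelled on Nix's signing fingerprint and should be treated as illustrative; `fingerprint` is a hypothetical helper and all values are placeholders.

```python
def fingerprint(narinfo):
    """Build the string that gets signed (layout is an assumption, see above)."""
    return "1;{};{};{};{}".format(
        narinfo["StorePath"],
        narinfo["NarHash"],
        narinfo["NarSize"],
        ",".join(sorted(narinfo["References"])),
    )


# Placeholder metadata as an upstream cache might serve it.
upstream = {
    "StorePath": "/nix/store/00000000000000000000000000000000-hello-2.10",
    "URL": "nar/1111111111111111111111111111111111111111111111111111.nar.xz",
    "Compression": "xz",
    "NarHash": "sha256:2222222222222222222222222222222222222222222222222222",
    "NarSize": 206000,
    "References": ["/nix/store/33333333333333333333333333333333-glibc-2.33"],
}

# The same path served uncompressed from a different location.
rechunked = dict(upstream, URL="nar/4444.nar", Compression="none")

# A path whose actual contents differ.
tampered = dict(upstream, NarSize=206001)

print(fingerprint(upstream) == fingerprint(rechunked))
print(fingerprint(upstream) == fingerprint(tampered))
```

Since neither the URL nor the compression enters the signed string, a cache can store and serve the NAR however it likes, as long as the reassembled, uncompressed NAR is unchanged.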

Speeding up substitution too

The above mostly described how NAR files could be stored on disk more efficiently, but there’s no need to stop there.

What if substituting clients could also make use of these chunks, instead of having to download on-the-fly-reassembled NAR files?

This is entirely possible - the casync docs already describe what such an HTTP endpoint would look like.

To add support for that, a remote binary cache would simply need to expose the index and chunk stores, and some client-side component could assemble the NAR file locally, keeping downloaded chunks in a local cache and reusing them for future substitution.
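A client-side component along these lines might look like the following sketch. It is hypothetical: the remote stores are plain dicts, an index is a list of chunk IDs, and `substitute` is not an existing nix-casync function.

```python
import hashlib


def cid(chunk):
    """Content address of a chunk (SHA-256 assumed for this sketch)."""
    return hashlib.sha256(chunk).hexdigest()


def substitute(index, local_cache, fetch_chunk):
    """Assemble a NAR from an index, downloading only chunks not cached locally."""
    parts, fetched = [], 0
    for chunk_id in index:
        if chunk_id not in local_cache:
            local_cache[chunk_id] = fetch_chunk(chunk_id)   # network round trip
            fetched += 1
        parts.append(local_cache[chunk_id])
    return b"".join(parts), fetched


# Simulated remote: two versions of a package sharing most of their chunks.
v1_chunks = [b"libfoo code", b"libfoo data", b"version 1.0"]
v2_chunks = [b"libfoo code", b"libfoo data", b"version 1.1"]
remote_chunk_store = {cid(c): c for c in v1_chunks + v2_chunks}
remote_index_store = {
    "v1.nar": [cid(c) for c in v1_chunks],
    "v2.nar": [cid(c) for c in v2_chunks],
}

local_cache = {}
nar_v1, fetched_v1 = substitute(
    remote_index_store["v1.nar"], local_cache, remote_chunk_store.__getitem__)
nar_v2, fetched_v2 = substitute(
    remote_index_store["v2.nar"], local_cache, remote_chunk_store.__getitem__)
print(fetched_v1, fetched_v2)
```

The first substitution downloads every chunk; the upgrade only downloads the one chunk that changed and takes the rest from the local cache.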

Right now, the plan is to simply add substitution to nix-casync, with an “upgrade” to also use the index and chunk stores if the remote site supports it.

You’d just deploy another, “local” nix-casync on your laptop, configured to substitute from the “remote” nix-casync. In the future, once things have been adopted more widely, that kind of substitution could even be implemented in Nix itself.

Local discovery / substitution from the local network

Because the chunks are fully content-addressed, other distribution mechanisms for them could be imagined.

One idea is to have nix-casync announce its chunk store to other instances on the same network, so machines could use each other for substitution.

As the only things that need to be substituted are content-addressed chunks, this requires zero trust in these machines. The substituter simply needs to verify that each chunk has the expected checksum.
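The verification step is small enough to sketch in full. The names are hypothetical, a peer is modelled as a function from chunk ID to bytes, and SHA-256 is assumed as the chunk hash for illustration.

```python
import hashlib


class ChunkMismatch(Exception):
    """Raised when a peer returns data that does not hash to the requested ID."""


def fetch_verified(chunk_id, peer):
    """Fetch a chunk from an untrusted peer and accept it only if its hash matches."""
    data = peer(chunk_id)
    if hashlib.sha256(data).hexdigest() != chunk_id:
        raise ChunkMismatch(chunk_id)
    return data


wanted = b"some chunk contents"
wanted_id = hashlib.sha256(wanted).hexdigest()   # taken from a trusted index


def honest_peer(chunk_id):
    return wanted


def malicious_peer(chunk_id):
    return b"something else entirely"


print(fetch_verified(wanted_id, honest_peer) == wanted)
try:
    fetch_verified(wanted_id, malicious_peer)
except ChunkMismatch:
    print("rejected chunk from untrusted peer")
```

Trust is only needed in the index, which names the chunk IDs; whoever delivers the bytes cannot substitute different content without being detected.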

Current status

So far nix-casync implements casync-style chunking and reassembly of NAR files, and also stores Narinfo files persistently.

This already makes it a storage-efficient binary cache.

There are still a lot of things on the TODO list - have a look at the issue tracker.

If you want to help out with any of that, there’s a project Matrix channel[4].

Please reach out!


  1. Compression is left out of the interface intentionally, as the payload needs to be decompressed before chunking anyway, and HTTP transfers can be transparently compressed in-band using the Content-Encoding header.

  2. A chunk index file, describing how to assemble a blob.

  3. This even applies to (re/de)compression, as only the store path, (uncompressed) NarHash, NarSize and References make up the .narinfo signature.

  4. #nix-casync:matrix.org