A new (remote) store protocol, go-nix updates

This article describes a proposal for a new Nix remote store protocol I’ve been thinking about for a while. It also gives an update on recent improvements in go-nix, as well as an outlook on what’s next.

A new (remote) store protocol

Shortcomings in the current format

As described in a previous article, the NAR format itself isn’t a great wire format:

  • It doesn’t provide an index, so if you only want to access one file inside a big NAR file, right now you still need to download the whole archive, then seek through it until you’re at the file you initially requested.
  • While nix copy has an option to write .ls files alongside the NAR files, it feels very bolted on, and doesn’t have any trust root.
  • Even with .ls files and range requests, there’s no way to know the content hash of the chunk you want to download, so you can’t know if you already have it elsewhere locally.

While thinking about what substitution between a nix-casync binary cache and a local client should look like, I quickly realized that what I was really looking for was a generic remote store protocol, so all the improvements could also be used in Tvix and other projects.

I wanted to extend the metadata about a store path so that it includes the index (the list of files), and to let each regular file refer to a list of content-addressed chunks, making chunk substitution a much more out-of-band mechanism.

I also disliked the fact that uploaded .drv files are somewhat treated the same way as store paths, with a $drvHash.narinfo file, and a NAR file containing the literal ATerm contents.

A new proposal

My current proposal uses a Manifest structure, at Derivation (not per-output) granularity.

I brainstormed on various versions of this with a bunch of people (thanks adisbladis, andi-, edef and tazjin!).

In its current form, the manifest structure contains the following data (some of them being optional, TBD):

  • The derivation path
  • The derivation content
  • A list/map of outputs

Each output contains the following information:

  • The name of the output
  • A list of references to other store paths
  • A listing of all the elements in the output.
    • Each regular file can contain a list of chunks.
      • Each chunk is identified by its hash. It also contains some metadata on the hashing algorithm used, and its size (so we can seek into files).
  • (TBD, see further down) A list of (narinfo-style) signatures, NarHash and NarSize.
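To make the shape of the structure described above more concrete, here is a minimal Go sketch. All type names, field names and JSON keys are hypothetical (the proposal doesn’t fix them), the paths and digests are placeholders, and the TBD signature fields are left out. `ChunkAt` illustrates why storing the size of each chunk is enough to seek into a file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NOTE: every name below is an illustrative assumption, not part of the proposal.

// Chunk identifies one content-addressed piece of a regular file.
type Chunk struct {
	HashAlgo string `json:"hashAlgo"` // e.g. "sha256"
	Hash     string `json:"hash"`     // digest of the chunk contents
	Size     uint64 `json:"size"`     // length in bytes, used for seeking
}

// Entry is one element of an output's listing.
type Entry struct {
	Path   string  `json:"path"`
	Type   string  `json:"type"`             // "regular", "directory" or "symlink"
	Target string  `json:"target,omitempty"` // only for symlinks
	Chunks []Chunk `json:"chunks,omitempty"` // only for regular files
}

// Output describes a single output of the derivation.
type Output struct {
	Name       string   `json:"name"`
	References []string `json:"references"`
	Entries    []Entry  `json:"entries"`
}

// Manifest covers a whole derivation, not a single output.
type Manifest struct {
	DrvPath    string   `json:"drvPath"`
	DrvContent string   `json:"drvContent"`
	Outputs    []Output `json:"outputs"`
}

// ChunkAt returns the index of the chunk that contains byte offset off of the
// file, and the offset inside that chunk. ok is false if off is past the end.
func ChunkAt(e Entry, off uint64) (idx int, inner uint64, ok bool) {
	for i, c := range e.Chunks {
		if off < c.Size {
			return i, off, true
		}
		off -= c.Size
	}
	return 0, 0, false
}

func main() {
	m := Manifest{
		DrvPath:    "/nix/store/<hash>-example.drv", // placeholder, not a real path
		DrvContent: "Derive(...)",                   // placeholder for the ATerm contents
		Outputs: []Output{{
			Name:       "out",
			References: []string{},
			Entries: []Entry{
				{Path: "bin", Type: "directory"},
				{Path: "bin/example", Type: "regular", Chunks: []Chunk{
					{HashAlgo: "sha256", Hash: "<digest-1>", Size: 4096},
					{HashAlgo: "sha256", Hash: "<digest-2>", Size: 1200},
				}},
			},
		}},
	}
	js, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(js))

	// Which chunk holds byte 5000 of bin/example?
	idx, inner, ok := ChunkAt(m.Outputs[0].Entries[1], 5000)
	fmt.Println(idx, inner, ok)
}
```

A client that only wants part of a file can use something like `ChunkAt` to work out which chunks it needs, without touching the others.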

Actual chunk substitution happens out-of-band.

This design has a bunch of advantages:

  • Assuming there’s some sort of local chunk cache, individual chunks that are already available locally can be re-used.
  • Store paths without any regular files inside (symlinks etc.) don’t need any chunk downloads at all.
  • Substitution doesn’t care about the exact chunking mechanism used. We can start with one chunk per file, and use different chunking mechanisms as we go.
  • Because chunks are content-addressed, they can trivially be substituted from anywhere, not just the binary cache that’s asked. This allows zero-trust gossip-style substitution from local network peers, IPFS or untrusted CDNs.
  • As chunks are independent from each other, they can be requested in a much more parallel fashion, allowing a higher substitution throughput for high-latency networks, or networks with slow per-stream throughput in general.
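The advantages listed above can be sketched in a few lines of Go. This is only an illustration under assumptions: the `ChunkSource` interface, `MapSource`, and the choice of hex-encoded SHA-256 are all made up for the example. It shows a local cache being consulted first, an untrusted source being rejected because its data doesn’t match the requested hash, and chunks being fetched concurrently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// ChunkSource is a hypothetical interface for anything that can hand out
// chunks by hash: a local cache, a LAN peer, a CDN, a binary cache.
type ChunkSource interface {
	Get(hash string) ([]byte, error)
}

// MapSource is an in-memory stand-in for such a source.
type MapSource map[string][]byte

func (m MapSource) Get(hash string) ([]byte, error) {
	data, ok := m[hash]
	if !ok {
		return nil, errors.New("chunk not found")
	}
	return data, nil
}

// hashOf returns the hex-encoded SHA-256 digest of data (an assumed encoding).
func hashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// fetchVerified asks each source in turn and returns the first answer whose
// content actually hashes to the requested value. Sources need not be trusted.
func fetchVerified(hash string, sources []ChunkSource) ([]byte, error) {
	for _, s := range sources {
		data, err := s.Get(hash)
		if err != nil {
			continue // this source doesn't have the chunk, try the next one
		}
		if hashOf(data) == hash {
			return data, nil
		}
	}
	return nil, fmt.Errorf("no source returned a valid chunk for %s", hash)
}

// FetchFile fetches all chunks of one file concurrently and concatenates
// them in manifest order.
func FetchFile(hashes []string, sources []ChunkSource) ([]byte, error) {
	parts := make([][]byte, len(hashes))
	errs := make([]error, len(hashes))
	var wg sync.WaitGroup
	for i, h := range hashes {
		wg.Add(1)
		go func(i int, h string) {
			defer wg.Done()
			parts[i], errs[i] = fetchVerified(h, sources)
		}(i, h)
	}
	wg.Wait()

	var out []byte
	for i := range parts {
		if errs[i] != nil {
			return nil, errs[i]
		}
		out = append(out, parts[i]...)
	}
	return out, nil
}

func main() {
	a, b := []byte("hello "), []byte("world")
	local := MapSource{hashOf(a): a}                  // chunk already cached locally
	tampered := MapSource{hashOf(b): []byte("WORLD")} // peer serving wrong data
	cache := MapSource{hashOf(b): b}                  // the binary cache itself

	out, err := FetchFile(
		[]string{hashOf(a), hashOf(b)},
		[]ChunkSource{local, tampered, cache},
	)
	fmt.Println(string(out), err)
}
```

The order of the sources is the only policy decision here: the cheapest ones go first, and verification makes it safe to try them.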

Signature mechanisms, trust

Right now, the protocol doesn’t specify any signature mechanism, except maybe storing the existing narinfo-style signatures. However, as verifying those requires assembling (and possibly substituting) the whole NAR, we might want to come up with a better signature scheme in the longer run, one that doesn’t require substituting all chunks.

Another option would be to simply require a proper HTTPS connection to the backend serving the Manifests. As can be seen with Cachix, people seem to be fine delegating the signing part to their binary cache.

However, signatures make sense for things like Trustix, so I’m in discussions with adisbladis on this.

Remote protocol

So far, we’ve only talked about the structure used to store metadata, not about the actual methods used to query a remote binary cache. There’s a lot of potential for optimization on the query side; similar to git’s “smart protocol”, substituting clients (which evaluated locally) could signal which manifests they already have, and get back all missing Manifests in one request, which should dramatically reduce the number of round trips.
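Since nothing is specified yet, the following Go sketch is purely illustrative: `Request`, `Server` and `Resolve` are hypothetical names, manifests are represented as opaque strings, and the transport is left out entirely. It only shows the idea of a client stating what it wants and what it already has, and getting back everything that’s missing in a single response.

```go
package main

import (
	"fmt"
	"sort"
)

// Request is what a substituting client could send in one round trip.
// Both fields hold derivation paths. (Hypothetical shape.)
type Request struct {
	Want []string // manifests the client needs
	Have []string // manifests the client already holds
}

// Server stands in for a remote binary cache. Manifests are kept as opaque
// strings keyed by derivation path to keep the sketch short.
type Server struct {
	Manifests map[string]string
}

// Resolve returns every wanted manifest the client doesn't have yet and the
// server knows about, all in one response.
func (s Server) Resolve(req Request) map[string]string {
	have := make(map[string]bool, len(req.Have))
	for _, p := range req.Have {
		have[p] = true
	}
	missing := make(map[string]string)
	for _, p := range req.Want {
		if have[p] {
			continue
		}
		if m, ok := s.Manifests[p]; ok {
			missing[p] = m
		}
	}
	return missing
}

func main() {
	srv := Server{Manifests: map[string]string{
		"a.drv": "manifest-a", // placeholder names and contents
		"b.drv": "manifest-b",
		"c.drv": "manifest-c",
	}}
	resp := srv.Resolve(Request{
		Want: []string{"a.drv", "b.drv", "c.drv"},
		Have: []string{"b.drv"},
	})

	// Print the returned paths in a stable order.
	paths := make([]string, 0, len(resp))
	for p := range resp {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	fmt.Println(paths)
}
```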

We plan to experiment here a bit as we go, and as more of this gets implemented.

More go-nix work

Dedup improvements by chunking on file boundaries

Since the last blog post, NinjaTrappeur did some more analysis of nix-casync and other substitution mechanisms.

One of my takeaways from it was that even when just looking at the individual files of store paths, there’s significant deduplication potential.

This suggests that the naïve approach used in nix-casync, feeding the entire NAR file to the chunker, is a good start, but probably doesn’t properly “cut” at file boundaries. If we instead feed the individual files inside a store path to the chunker, we’d be at least as efficient as the per-file hashing approach discussed in the article, if not much better.
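As a toy illustration of per-file deduplication, the Go sketch below treats every regular file as exactly one chunk and counts how many bytes would actually need to be stored. This is deliberately not the content-defined chunking that nix-casync uses; the `StorePath` type, the `DedupStats` helper and the file contents are all made up for the example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// StorePath maps file paths inside one store path to their contents.
// (Hypothetical helper type for this sketch.)
type StorePath map[string][]byte

// DedupStats treats every regular file as a single chunk, keyed by its
// SHA-256 digest. It returns the total number of bytes across all store
// paths and the number of bytes left after deduplication.
func DedupStats(paths []StorePath) (total, unique int) {
	seen := make(map[[32]byte]bool)
	for _, sp := range paths {
		for _, content := range sp {
			total += len(content)
			sum := sha256.Sum256(content)
			if !seen[sum] {
				seen[sum] = true
				unique += len(content)
			}
		}
	}
	return total, unique
}

func main() {
	// Two versions of the same package: the binary changed, the license didn't.
	v1 := StorePath{
		"bin/tool":      []byte("binary-v1"),
		"share/LICENSE": []byte("license text"),
	}
	v2 := StorePath{
		"bin/tool":      []byte("binary-v2"),
		"share/LICENSE": []byte("license text"),
	}
	total, unique := DedupStats([]StorePath{v1, v2})
	fmt.Printf("total=%d bytes, after dedup=%d bytes\n", total, unique)
}
```

Feeding each file to a real chunker instead of hashing it whole would additionally catch shared regions inside files that differ, which is where the “if not much better” part would come from.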

One main reason why I initially fed the whole NAR file to the chunker was that the current binary cache protocol serves NAR files, and go-nix didn’t have a way to produce NAR files.

This motivated me to contribute a NAR Writer to go-nix, and to increase test coverage of both the writer and the reader (and disallow some odd corner cases we previously missed).

Now we can read in a NAR file, decompose it, chunk and mix it in a blender, and afterwards put it back together, restoring the same NAR contents byte for byte.

This is useful for things like a local translation proxy, speaking the new protocol to a remote, but still exposing .nar/.narinfo to the local Nix Daemon, or for compat interfaces in general.

Other go-nix additions

While brainstorming on these concepts, adisbladis and I have been contributing a lot of basic Nix concepts to go-nix.

Apart from the NAR Writer and improved test coverage, the following features were added:

  • A parser for .drv files, and functions to (re-)calculate derivation paths and output hashes, as well as a writer (and JSON [de]serialization)
  • An interface of a “Derivation store”, which can hold a graph of Derivations.
    • An implementation using badger for underlying storage
    • An implementation providing a view into a local path containing derivations (so parsing .drv files from /nix/store becomes possible)
    • An implementation requesting derivations via HTTP (think about serving /nix/store via a python3 -m http.server)
  • A lot of fine-tuning w.r.t. number of allocations and benchmarking in general has been done to make things fast.
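To give an idea of what a “Derivation store” interface with interchangeable backends can look like, here is a small Go sketch. It is not go-nix’s actual API: the `Derivation` fields, the `DerivationStore` interface, `MemoryStore` and `Closure` are hypothetical and heavily simplified. A badger-backed, directory-backed or HTTP-backed store would implement the same interface.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Derivation is reduced to what the sketch needs: its own path and the
// paths of the derivations it depends on. (Hypothetical, simplified type.)
type Derivation struct {
	Path      string
	InputDrvs []string
}

// DerivationStore is a hypothetical interface over a graph of derivations.
type DerivationStore interface {
	Get(path string) (*Derivation, error)
	Put(drv *Derivation) error
}

// MemoryStore keeps derivations in a map.
type MemoryStore struct {
	drvs map[string]*Derivation
}

func NewMemoryStore() *MemoryStore {
	return &MemoryStore{drvs: make(map[string]*Derivation)}
}

func (s *MemoryStore) Get(path string) (*Derivation, error) {
	drv, ok := s.drvs[path]
	if !ok {
		return nil, errors.New("derivation not found: " + path)
	}
	return drv, nil
}

func (s *MemoryStore) Put(drv *Derivation) error {
	s.drvs[drv.Path] = drv
	return nil
}

// Closure walks the graph from root and returns the sorted paths of all
// derivations reachable from it, including root itself.
func Closure(store DerivationStore, root string) ([]string, error) {
	seen := make(map[string]bool)
	var visit func(path string) error
	visit = func(path string) error {
		if seen[path] {
			return nil
		}
		seen[path] = true
		drv, err := store.Get(path)
		if err != nil {
			return err
		}
		for _, in := range drv.InputDrvs {
			if err := visit(in); err != nil {
				return err
			}
		}
		return nil
	}
	if err := visit(root); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	store := NewMemoryStore()
	// Placeholder names, not real store paths.
	store.Put(&Derivation{Path: "stdenv.drv"})
	store.Put(&Derivation{Path: "lib.drv", InputDrvs: []string{"stdenv.drv"}})
	store.Put(&Derivation{Path: "app.drv", InputDrvs: []string{"lib.drv", "stdenv.drv"}})

	closure, err := Closure(store, "app.drv")
	fmt.Println(closure, err)
}
```

Because `Closure` only depends on the interface, the same graph walk works regardless of where the derivations actually live.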

What’s next?

There are more things in the works, like an implementation of a Builder using OCI, and Nix Stores using the above-mentioned protocol.

I’m also planning to slowly move some of the other concepts from nix-casync (store path substitution, the current Nix HTTP binary cache interface) into go-nix, once interfaces have settled a bit.

Feedback and Contributions welcome!

There’s now also a Matrix channel (#go-nix:matrix.org) that’s used for communication. Feel free to join!