This article describes a proposal for a new Nix remote store protocol I’ve been thinking about for a while. It also gives an update on recent improvements in go-nix, as well as an outlook on what’s next.
As already written in a previous article, the NAR format itself isn't a great wire format:
- It doesn’t provide an index, so if you only want to access one file inside a big NAR file, right now you still need to download the whole archive, then seek through it until you’re at the file you initially requested.
- `nix copy` has an option to write `.ls` files alongside the NAR files, but it feels very bolted on, and doesn't have any trust root.
- Even with `.ls` files and range requests, there's no way to know the content hash of the chunk you want to download, so you can't tell if you already have it available elsewhere locally.
While thinking about how substitution between a nix-casync binary cache and a local client should look, I quickly realized what I was really looking for was a generic remote store protocol, so all these improvements could be used in Tvix and other projects as well.
I wanted to extend the metadata about a store path (so the index/list of files is included in the metadata), and let each regular file refer to a list of content-addressed chunks, turning chunk substitution into a much more out-of-band mechanism.
I also disliked the fact that uploaded `.drv` files are somewhat treated the same way as store paths, with a `$drvHash.narinfo` file and a NAR file containing the literal ATerm contents.
My current proposal uses a Manifest structure on a Derivation (not individual store path) level.
I brainstormed various versions of this with a bunch of people (thanks, everyone!).
In its current form, the manifest structure contains the following data (some of them being optional, TBD):
- The derivation path
- The derivation content
- A list/map of outputs
Each output contains the following information:
- The name of the output
- A list of references to other store paths
- A listing of all the elements in the output:
  - Each regular file can contain a list of chunks.
  - Each chunk is identified by its hash. It also carries some metadata on the hashing algorithm used, and its size (so we can seek into files).
- (TBD, see further down) A list of narinfo-style signatures, the NarHash and NarSize.
Actual chunk substitution happens out-of-band.
This design has a bunch of advantages:
- Assuming there’s some sort of local chunk cache, individual chunks that are already available locally can be re-used.
- Store paths without any regular files inside (symlinks etc.) don't need any chunk downloads at all.
- Substitution doesn’t care about the exact chunking mechanism used. We can start with one chunk per file, and use different chunking mechanisms as we go.
- Because chunks are content-addressed, they can trivially be substituted from anywhere, not just the binary cache being asked. This allows zero-trust, gossip-style substitution from local network peers, IPFS, or CDNs that don't need to be trusted.
- As chunks are independent from each other, they can be requested in a much more parallel fashion, allowing a higher substitution throughput for high-latency networks, or networks with slow per-stream throughput in general.
Right now, the protocol doesn't specify any signature mechanism, except maybe storing the existing narinfo-style signatures. However, as verifying those requires assembling (and possibly substituting) the whole NAR, we might want to come up with a better signature scheme in the longer run, one that doesn't require substituting all chunks.
Another option would be to simply require a proper HTTPS connection to the backend serving the Manifests. As can be seen with Cachix, people seem to be fine delegating the signing part to their binary cache.
So far, we only talked about the structure used to store metadata, not about the actual methods used to query a remote binary cache. There’s a lot of potential for optimization on the query side; similar to git’s “smart protocol”, substituting clients (which evaluated locally) could signal which manifests they already have, and get back all missing Manifests in one request, which should dramatically reduce roundtrip time.
We plan to experiment here a bit as we go, as more of this gets implemented.
Since the last blogpost, NinjaTrappeur did some more analysis on nix-casync and other substitution mechanisms.
One of my conclusions from it was that even when just taking individual files of store paths, there's significant deduplication potential.
This means the naïve approach used in nix-casync, feeding the entire NAR file to the chunker, is a good start, but it probably doesn't properly “cut” at file boundaries. If we instead feed the individual files inside a store path to the chunker, we'd be at least as efficient as the per-file hashing approach discussed in the article, if not much better.
One main reason why I initially fed the whole NAR file to the chunker was that the current binary cache protocol serves NAR files, and go-nix didn't have a way to produce them.
This motivated me to contribute a NAR Writer to go-nix, and also increase test coverage of the writer and reader (and disallow some odd corner cases we previously missed).
Now we can read in a NAR file, decompose it, chunk and mix it in a blender, and afterwards put it back together, restoring the same NAR contents byte-by-byte.
This is useful for things like a local translation proxy, speaking the new protocol to a remote, but still exposing .nar/.narinfo to the local Nix Daemon, or for compat interfaces in general.
Apart from the NAR Writer and improved test coverage, the following features were added:
- A parser for `.drv` files, and functions to (re-)calculate derivation paths and output hashes, as well as a writer (and JSON [de]serialization)
- An interface of a “Derivation store”, which can hold a graph of Derivations.
- An implementation using badger for the underlying storage
- An implementation providing a view into a local path containing derivations (parsing `.drv` files from there)
- An implementation requesting derivations via HTTP (think of a directory of `.drv` files served via `python3 -m http.server`)
- A lot of fine-tuning w.r.t. number of allocations and benchmarking in general has been done to make things fast.
There are more things in the works, like an implementation of a Builder using OCI, and Nix stores using the above-mentioned protocol.
I’m also planning to slowly move some of the other concepts from nix-casync (store path substitution, the current Nix HTTP binary cache interface) into go-nix, once interfaces have settled a bit.
There's now also a Matrix channel that's used for communication. Feel free to join!