Any interest in a new directory hashing format? #872
I wonder if https://github.com/bazelbuild/remote-apis is a more appropriate place to raise this discussion, because Buck2 currently only represents the client-side implementation of a remote build protocol. To make a new protocol work, you would need a server-side implementation supporting such a protocol as well.
I think this is a fair criticism. The remote API spec does help clarify how the digest of a proto message MUST be computed, tightening the rule a bit further.
I am a bit confused by this part.
I also don't understand this claim.
There are several discussions over in remote-apis about how to better represent trees in a future version of the protocol. Such as:
I think your problem is very interesting and it would be nice to get a concrete analysis comparing how a Radix tree would improve things over the existing format.
I'm happy to move it there if you prefer. I sort of think the broader venue might make it a tougher sell, especially for the sum-type stuff which will be confusing for those using languages that don't have it / a library-level best-attempt like
I see, the build side at Meta is either off-the-shelf or proprietary? In any event, Nix does both sides (dating back to when we didn't have a daemon or remote execution at all).
OK, glad at least that part made sense :).
Yes the "serializer should do the right thing" makes sense to me. But I would want a deserializer that is just as strict too. I am not sure more protobuf implementations support that as one step, but yes I support once can just reserialize to the canonical form, and see if that matched the input. It's not too hard.
Sure! First a disclaimer: I'm going to just assume all store objects are content-addressed for now with Nix. (We also have "input-addressed" store objects, which really philosophically muddles things. But I would like to basically deprecate them, so it is not fantastical to ignore them.) An (entirely content-addressed) Nix store is then semantically a flat map from content hashes to store objects.

Taking a step back, I think (but am not sure) that Buck 1 & 2, Bazel, Meson, CMake, going back to Make, are all in a tradition where output and input files are named at least partially according to user-chosen / user-meaningful rule/action names. Nix breaks from that tradition by "leaking" the storage model: all your "stuff" goes in the giant flat "store" directory. Our store object file names currently do have a name part, but this just muddies the philosophical difference --- it would be a more "honest" design if the store object names were just (text-encoded) hashes, completely human-unreadable.

Hope that helps!
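As a rough illustration of that picture (my own rendering with hypothetical type names, not anything from Nix's code):

```rust
use std::collections::BTreeMap;

// Content hash of a store object (hash algorithm left abstract here).
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct StoreDigest([u8; 32]);

// A store object: some file system contents (a file, symlink, or directory
// tree, spelled out further down in the thread) plus references to other
// store objects, by hash, since everything here is content-addressed.
struct StoreObject {
    contents: Vec<u8>, // stand-in for the real file system object
    references: Vec<StoreDigest>,
}

// The (entirely content-addressed) store itself: one flat map keyed by hash,
// with no user-meaningful names anywhere in the keys.
type Store = BTreeMap<StoreDigest, StoreObject>;
```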
Here's another way of looking at it: are we modeling file systems or "file system objects"? When we model file systems, we know the root objects are directories, at least on Unix (single root) and Windows (multiple roots). So one usually gets a conceptual model like:

```
struct Directory { ...FileSystemObject... }
enum FileSystemObject { ...Directory... }
```

(where my `...X...` elides the surrounding fields/variants). Where we Merklize it (the type argument to `Digest` being the thing that gets hashed), it can look like this:

```
struct Directory { ...Digest<FileSystemObject>... }
enum FileSystemObject { ...Directory... }
```

(this is how I think of Git) or like this:

```
struct Directory { ...FileSystemObject... }
enum FileSystemObject { ...Digest<Directory>... }
```

(this is how I think of Bazel RBE's format)

With Nix however, we're principally concerned with modeling not the entire store as a file system object, but modeling the contents of each file system object, which can equally be directories, or individual files, or individual symlinks. We thus don't need any mutual recursion at all. Our model is thus this (matching our docs https://nix.dev/manual/nix/2.26/store/file-system-object):

```
enum FileSystemObject {
    File { executable: Bool, content: Vec<u8> },
    Symlink { target: String },
    Directory { children: Map<String, FileSystemObject> },
}
```

Ignoring the radix trie part, the model I lean towards for a Merkle DAG is:

```
enum FileSystemObject {
    File { executable: Bool, content: Digest<Vec<u8>> },
    Symlink { target: String },
    Directory { children: Map<String, Digest<FileSystemObject>> },
}
```

This to me has two advantages:
I might add that the 2 rounds of hashing in the file case are not strictly necessary, but I think they are better than the one round of hashing git does (where they prefix the file with "blob"). I think Bazel RBE is right that the hash of the raw file needs to be part of the Merkle DAG, otherwise you break compatibility with too many blob stores for no reason.
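To make that contrast concrete, a small sketch (mine, assuming SHA-256 via the Rust `sha2` crate; a real format would use a proper canonical encoding rather than the ad hoc one here):

```rust
use sha2::{Digest, Sha256};

/// One round, git-style: the content is hashed together with a "blob" header,
/// so the hash of the raw bytes never appears anywhere.
fn git_style_blob_hash(content: &[u8]) -> [u8; 32] {
    let mut h = Sha256::new();
    h.update(format!("blob {}\0", content.len()).as_bytes());
    h.update(content);
    h.finalize().into()
}

/// Two rounds: hash the raw bytes first (so that digest stays usable as a
/// plain blob-store / CAS key), then hash a small file node embedding it.
fn two_round_file_hash(content: &[u8], executable: bool) -> [u8; 32] {
    let content_digest: [u8; 32] = Sha256::digest(content).into();
    let mut h = Sha256::new();
    h.update([executable as u8]); // ad hoc node encoding, for illustration only
    h.update(content_digest);
    h.finalize().into()
}
```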
Yes maybe I should at least take the Radix trie part of this over to that repo. It is less exotic in terms of having no "please think with sum types" culture shock, and also less Nix-specific in that it is pretty easy to imagine random access of large directories as a problem in isolation.
Oh, I forgot about
(Note I am talking more with @edef1c, and seeing a concrete benefit in more-than-hash directory children, because supporting the dirent

In other words,

```
enum FileSystemObject {
    File { executable: Bool, content: Digest<Vec<u8>> },
    Symlink { target: Digest<String> },
    Directory { children: Digest<Map<String, FileSystemObject>> },
}
```

might be best:

(Concretely, the last part is: to create a directory, you must provide enough info up front such that the

That substantially winnows down my critique of the existing format, basically saying that except for unbounded metadata and the raw symlink target, the
Yeah, I think you got a good hang of it now. So instead of using

with FileNode and DirectoryNode being pointers to other File and Directory messages in the CAS. All fields in proto3 are treated as optional, so this is effectively a list of enums, as you wanted. I guess what's missing here is the "key" portion of the
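For concreteness, a rough Rust rendering (my own sketch of the shape, not the actual .proto) of the remote-apis `Directory` message being referred to: children are partitioned into per-variant lists, each entry carrying its name, and (for files and subdirectories) a digest, inline.

```rust
// Shape of the REAPI Directory message, re-expressed as plain Rust structs.
struct Digest {
    hash: String,   // lowercase hex of the content hash
    size_bytes: i64,
}

struct FileNode {
    name: String,
    digest: Digest, // digest of the raw file contents in the CAS
    is_executable: bool,
}

struct DirectoryNode {
    name: String,
    digest: Digest, // digest of the serialized child Directory message
}

struct SymlinkNode {
    name: String,
    target: String, // stored inline, not behind a digest
}

struct Directory {
    files: Vec<FileNode>,
    directories: Vec<DirectoryNode>,
    symlinks: Vec<SymlinkNode>,
}
```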
@sluongng Yes, I am now fine with the partitioned 3 lists.

My initial problem was less with doing the partitioning into 3 lists than with having the information to do so, since I wanted the file/dir/symlink type to be "behind the digest" of the child. Now that I don't want this info to be "behind the digest", for sake of

For Nix's purposes, the type + digest, or the

The remaining criticism is that it would still be nice for the content-address to be fixed-size, however. That is where wishing the symlink target was also hashed comes into play. (The perms and other metadata can, I think, also be arbitrarily large, but we can just reject those things when parsing, which we should anyways since they represent degrees of freedom outside our model.)

(Also remaining are the serialized-format canonicity issue and the wide-directory random access probably needing a trie, but those are orthogonal.)
We Nix devs can implement the Bazel RBE protobuf Merkle format in Nix, and we probably should, because it is widespread in the ecosystem, but it has some flaws for our use-case that, IMO, make a new format worthwhile. I see now that Buck2 has the trait infra to potentially support many such formats, so I hope this is not ipso facto a ridiculous request :).
Here are the flaws and fixes:
The first obvious flaw with RBE is that it hashes protobuf. The extensibility is nice, but the non-canonicity is not (https://protobuf.dev/programming-guides/serialization-not-canonical/). This is especially important for Nix, where the content-addressing can extend to the choice of file names themselves. (How things are hashed can be exposed by determining file names based on that hash, in contrast to content-addressing just being an implementation detail.) Something like https://github.com/diem/bcs might be more appropriate.
I see that it inherits the flaw from git, which is that there is no single unambiguous hash for files and symlinks.
This flaw IMO boils down to the lack of sum types in protobuf. One would like a directory to look something like `Map<FileName, Hash<EntryValue>>`, where `EntryValue` is a file, executable, or symlink, possibly with other information (timestamps? extra perms?). But such a structure is/involves tagged unions, which are awkward in protobuf, so instead they partition the entries into per-variant collections, and then we lose the single unambiguous child-hash property.

A practical ramification of having no cheap way to go from `Map<FileName, Hash<EntryValue>>` to a directory for Nix is that it makes it a little bit harder to describe store sandboxes. We would like to cheaply and easily say "build with a store containing just these root store objects and their references", which means going from:

But with a format like Bazel's RBE, we can't just use digests for the above, but must use directory entries, which are rather bulkier (more metadata) and not fixed-size (symlink targets). This is not insurmountable, but it is less nice than working with hashes of that information.
Rust, and BCS, support sum types, again making for an easy fix to this.
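A minimal sketch of what that sum-type shape could look like (assuming the Rust `serde` and `bcs` crates; all type names here are mine, not from any existing format). Because BCS admits exactly one encoding per value, this also addresses the canonicity flaw above.

```rust
use std::collections::BTreeMap;

use serde::{Deserialize, Serialize};

// 32 bytes of whichever hash function the format would settle on.
#[derive(Serialize, Deserialize)]
struct Hash([u8; 32]);

// The sum type wished for above: each directory child is one unambiguous
// value, and a directory is just a map from name to the hash of that value.
#[derive(Serialize, Deserialize)]
enum EntryValue {
    File { executable: bool, content: Hash },
    Symlink { target: String },
    Directory { children: BTreeMap<String, Hash> },
}

// One and only one byte string per value (BTreeMap keeps keys sorted), so
// hashing the encoding is well-defined.
fn encode(entry: &EntryValue) -> Vec<u8> {
    bcs::to_bytes(entry).expect("serializing an in-memory value")
}
```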
A final thing that @edef1c has enlightened me about is using Radix trees for directories, to enable protocols which can use Merkle inclusion proofs to securely look up single directory entries without first verifying the entire directory object against its hash. This would be useful for very wide directories like our stores (more likely for whole-system network booting at "deploy time" than for the dependencies of any individual build step at "build time"). Implementing the radix trie correctly makes the format much more involved, and there is also the question of tuning the radix, but the netboot use case is quite compelling to me.
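A rough illustration (my own, not a worked-out proposal) of the shape such a radix-trie directory could take: entries live at the leaves, interior nodes hold prefix-labelled edges to subtrees by digest, so looking up or proving membership of one name only touches the nodes along that name's path rather than the whole directory.

```rust
struct Digest([u8; 32]);

enum DirTrieNode {
    // a real directory entry (file, symlink, or subdirectory), by digest
    Leaf { entry: Digest },
    // interior node: edges labelled with name fragments, subtrees by digest
    Branch { edges: Vec<(String, Digest)> },
}
```

How name fragments get split and how many edges a branch may carry is exactly the "tuning the radix" question.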
I suppose whether Meta thinks performant FUSE random access for very wide directories is worth the complexity is more of an EdenFS than Buck2 question.