February 6, 2024 42:07 E1

Fileset library and Flakes with Silvan Mosberger

Intro: Hello, and welcome to the Nix podcast. This is our first episode. Our guest is Silvan Mosberger, who is a longtime Nix contributor. He works at Tweag, and he's also the host of the "Nix Hour" podcast. Our host is Shahar "Dawn" Or.

Shahar: Welcome, Silvan. How are you?

Silvan: I've been busy doing a lot of things with Nix recently. What I'm most excited about is the file set library that I've been working on.

Shahar: File set library. Okay, I'll bite. So what is it?

Silvan: Maybe let's look at the problem first. In Nix, you want to build projects and you might have a local project, so you need to create a derivation that builds stuff from some of the files from your source directory. But you might have a lot of garbage in your source directory. And you want to filter that out; firstly, so that Nix doesn't import it into the store, but also to avoid unnecessary rebuilds. And so previously, you would have used the `builtins.path` function, or some wrapper around that, but all of them are hard to use, I think. The file set library tries to make that easier and safer.
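
For readers who have not used it, the `builtins.path` style of filtering mentioned here can be sketched roughly as follows. This is a minimal illustration, not code from the episode; the project directory `./.` and the excluded names are assumptions.

```nix
# Sketch only: `./.` and the excluded names are hypothetical.
builtins.path {
  path = ./.;
  name = "source";
  # The filter is called for each file and directory with its absolute
  # path (a string) and its type ("regular", "directory", "symlink" or
  # "unknown"). Returning false leaves the entry out of the store copy.
  filter = path: type:
    let
      base = baseNameOf path;
    in
    # Drop the Git metadata directory ...
    !(type == "directory" && base == ".git")
    # ... and any file called README.md, at any depth.
    && base != "README.md";
}
```

Even this small filter has to think about path strings and entry types, which hints at why Silvan calls the approach hard to use.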

Shahar: Why didn't I ever care about this? And who would want to care about this?

Silvan: If you have a big monorepo, then things get a bit nasty with Nix. For example, let's say you import the entire directory; then, whenever you change any file in that repo, you have to rebuild all the derivations because Nix thinks, "Oh, well, the change might influence the build. So better try it again". Even though you might have subprojects that are entirely independent of each other. The classic example is when changing the README in your repository suddenly causes a full rebuild. Compare that with, say, `fetchgit`: here, at least all the Git-ignored files are filtered out, which is already useful.

Shahar: You don't use your README as source code?

Silvan: Yeah, not yet, although I've heard some ideas. Anyway, the file set library aims to address that issue in a simple, intuitive way. The ideas underlying the project are themselves fairly basic: "file sets" are sets of files, as in set theory from mathematics. The library implements the classic set operations, like union, intersection and set difference. You can learn these concepts in like a couple of minutes and it's fairly intuitive. For example, the union of two sets is a set containing the elements that are in either set (or both); for the intersection, only the ones that both sets have in common.
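
The set operations described above map onto functions in `lib.fileset`. The following is a minimal sketch under a few assumptions: `<nixpkgs>` resolves to a Nixpkgs version that ships the library, the directory is the root of a Git working tree, and the paths `./src`, `./Makefile` and `./README.md` are hypothetical and exist.

```nix
let
  lib = (import <nixpkgs> { }).lib;
  fs = lib.fileset;

  # Union: files that are in either set (or both).
  buildInputs = fs.union ./src ./Makefile;

  # Intersection: only the files both sets have in common,
  # here the Git-tracked files that live under ./src.
  trackedSources = fs.intersection (fs.gitTracked ./.) ./src;

  # Difference: everything in the first set that is not in the second.
  withoutReadme = fs.difference ./. ./README.md;
in
# `toSource` copies only the selected files into the Nix store,
# so editing README.md no longer changes the resulting store path.
fs.toSource {
  root = ./.;
  fileset = withoutReadme;
}
```

Plain paths such as `./src` are accepted wherever a file set is expected, which keeps simple cases short.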

Shahar: And this is implemented in Nix?

Silvan: Yeah, yeah. So it's, so there are-

Shahar: Well, what is that smirk our viewers cannot see?

Silvan: Nix wasn't made for this, to be honest. Nix was made to build packages and it's only incidental that you can even implement this math stuff on top of Nix. You need to rely on certain Nix built-ins: for example, Nix can read the contents of a directory and output information about the files in there, and there is a basic operation for filtering files. But yeah, if you can, why not?

Shahar: Speaking of reading files, what does all this mean with regard to purity?

Silvan: That's a good question. Traditionally, the ways to do filtering, such as `builtins.path` or some other utilities, were violating purity somewhat. `builtins.path` has a `filter` function argument that gets the path to a file and returns true or false; it encodes the question "Should I include this file or not?". The problem with this approach is that the entire path is passed, including things like `/home/infinisil/`, so your filter results can change based on which user runs your commands. That's of course very impure.
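
To make the impurity concrete, here is a hedged sketch of a filter that looks harmless but depends on where the project happens to be checked out. The intent (skipping `tmp` directories) and the paths in the comments are made up for illustration.

```nix
builtins.path {
  path = ./.;
  name = "source";
  # Intended meaning: skip anything inside a `tmp` directory of the project.
  # `path` is absolute, e.g. "/home/infinisil/project/src/main.c", so the
  # regular expression also sees the directories *above* the project.
  # If the checkout lives at "/tmp/project", every entry matches and the
  # resulting source is empty.
  filter = path: type:
    builtins.match ".*/tmp(/.*)?" path == null;
}
```

The same expression therefore produces different store paths for different users or checkout locations, which is the problem Silvan describes.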

Shahar: It sounds like an existing Nix problem with this particular API, right?

Silvan: Actually, that's one of the things Flakes helps with in particular. With Flakes, whenever you evaluate something locally, it imports the entire directory into the store, which can be a bit dangerous because it might import too much. However, then it doesn't depend on the user's name or anything like that anymore; everything is just `/nix/store/`. So that helps with that, although Flakes are still fairly experimental.

Shahar: What exactly is experimental?

Silvan: Let's talk about this a bit later; I'll just finish the thought here. The file set library, by design, doesn't allow you to refer to things like the user's name or things like that. You simply don't have the ability to refer to the entire path of files. That makes sure that it works regardless of where the command is run, whether you use Flakes or not.
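
As an illustration of that design, `lib.fileset.fileFilter` passes its predicate a small attribute set describing each file (for example its `name` and `type`) rather than an absolute path string. A minimal sketch, assuming `<nixpkgs>` provides the library and that selecting `.nix` files is the (hypothetical) goal:

```nix
let
  lib = (import <nixpkgs> { }).lib;
  fs = lib.fileset;
in
fs.toSource {
  root = ./.;
  # The predicate only sees per-file attributes such as `file.name`;
  # there is no way to inspect "/home/<user>/..." from in here.
  fileset = fs.fileFilter (file: lib.hasSuffix ".nix" file.name) ./.;
}
```

Because the predicate cannot observe the checkout location, the selection is the same wherever the expression is evaluated.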

Shahar: Hmm.

Silvan: So file sets work out fine regarding purity.

Shahar: What are you using file sets for?

Silvan: Well, I'm not really doing a lot of packaging myself. I'm mainly working on interface changes to Nixpkgs that hopefully can help a lot of users. And so this was motivated by the existing `builtins.path`-like filtering, which was really tricky to get right. And hopefully we can deprecate that interface at some point; it's really too hard to use. I think no one should have to use that, unless they're writing a file set library, I guess.

Shahar: Well, thank you for sharing.

Silvan: Yeah. But you mentioned "experimental". Should we talk about Flakes? Should we open that can of worms?

Shahar: This is the exact correct place to open cans of worms.

Silvan: Okay. That's great. Well, Flakes, I have some opinions on them and I guess when you come across me, you might think of me as, "Oh, Silvan really dislikes Flakes," but that's not entirely true. Flakes has benefits, and it has problems. One benefit is its really nice user interface. I got to give it that. Honestly, the old commands just kind of suck and it improves on that. Maybe I need to qualify that a little more. The new CLI, which is also experimental right now, comes in two flavors: non-Flake and Flake. So right now you have three separate CLIs to use Nix: the stable, old CLI, the experimental non-Flakes CLI and the experimental Flakes CLI.

Shahar: The last update I remember reading from the Nix team is that the candidate nearest to getting stabilized is the new non-Flakes CLI.

Silvan: Exactly, yeah. So this was actually an RFC, number 136. I was a shepherd for that one. The proposal was essentially: "Let's start stabilizing the CLI first, and then we'll think about Flakes". And that is simply because I think the CLI is much easier to reason about. You don't introduce any new concept with the CLI. You just make the user interface nicer and it should continue to work with the normal Nix way of doing things. Flakes is a much bigger and more controversial topic, but it can be thought of as entirely orthogonal to the CLI. And so I've also been a bit involved in the Nix team recently; I'm not officially on the Nix team, but the meetings are fairly open so I've been joining some meetings and listening in a bit and giving some opinions as well. I've seen them try to focus on stabilizing things. And yes, they are working on the CLI. They have a tracking issue where they list all the new CLI commands and they redesign them every once in a while and say "This command doesn't really look that good. It should work a bit differently. It's not really consistent that way. And we need to make this change and that change". So that's been going on, but also stabilizing Flakes, and the first item to stabilize in Flakes is planned to be `fetchTree`.

Shahar: `fetchTree`, is that an internal thing? I don't know, I've never heard of this.

Silvan: It's internal. Flakes generates these lock files. And these lock files get generated using `fetchTree`. Thus, stabilizing `fetchTree` should lead to stabilizing the lock file format fairly directly. So I started looking into `fetchTree`, because I heard it's gonna be stable in the next release; initially, that was the notion. But then I looked at it and it was like, "I don't really know what is supposed to be stable now". `fetchTree` supports different input types: you can invoke `fetchTree` with `type = "github"` or `type = "git"` and such.
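
For orientation, calls to `builtins.fetchTree` with those two input types look roughly like this. It is a sketch under assumptions: the repository and branch are only examples, the builtin is still experimental (so the relevant experimental feature has to be enabled, and which one depends on the Nix version), and without a pinned `rev` the call only works in impure evaluation.

```nix
{
  # GitHub input type: fetched via the GitHub archive API.
  viaGithub = builtins.fetchTree {
    type = "github";
    owner = "NixOS";
    repo = "nix";
    ref = "master";
  };

  # Generic Git input type: works with any Git remote.
  viaGit = builtins.fetchTree {
    type = "git";
    url = "https://github.com/NixOS/nix.git";
    ref = "master";
  };
}
```

The attributes mirror what a flake lock file records for each input, which is why stabilizing one touches the other.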

Shahar: Input schema.

Silvan: Right, exactly. The input schema gets, I think, directly passed to `fetchTree`, so that would also stabilize the input schema. So I looked a bit at `fetchTree` and first, I found the docs are quite poor and I couldn't really understand anything there, and then I found my answer to what was supposed to be stable now: the answer is that some input types were supposed to be stable, like HTTP, Git and such. The others weren't supposed to be stable but you couldn't really see that in the documentation. I think stable interfaces should be documented and tested; if you don't have that, then it really doesn't feel like stability is guaranteed. With the file set library, I've really made sure that everything was properly tested and documented. And I can proudly say that the file set library is stable and it's gonna be released in the next NixOS release, which is next week, very soon. I'm gonna make sure that there aren't gonna be any breaking changes, and that's gonna be fairly easy because there's a big test suite that covers every single feature of the library. I really think the same thing should also be done for Flakes features. Flakes feels like an 80/20 thing where the easy 80% is done, and that's great, but there remain 20% of hard bits that no one wants to do. And there are few resources to get that done.

Shahar: How do you see us getting out of this situation?

Silvan: There are some ideas. On one hand, the Nix team, I think, so far, is of the opinion that, "We don't have a lot of resources, but Flakes needs to be stabilized. People should ideally write docs and maybe rethink the interface a bit, but if no one does that, then we're just going to stabilize it anyway." So there's this timer ticking now where `fetchTree` is planned to be stabilized in a month, no matter what the interface looks like and whether it's documented. There is some documentation in the PR now, although it's not yet merged. So that's a way to do things. Another possible way is to get the resources we need: the Nix team has been looking for funding, to pay someone to actually work on these issues. The Nix team itself mostly does triaging work: the new Nix issues that pop up every day need to be sorted, accepted, rejected, reviewed, and so on.

Shahar: We don't see the rate decreasing either.

Silvan: Yeah, exactly. Nix is getting more popular and the resources are getting more strained, so who is left to do Flakes stabilization? Or maybe the Nix team should do Flakes stabilization, but then who will do the triaging and so on? So if an organization like the NixOS Foundation hired someone to actually help out with the Nix team, to do these tasks of backporting things, making bug fixes here and there, that would really help a lot, I think. That's currently in discussion in the NixOS Foundation, I believe.

Shahar: Popular technologies, they eventually gain sponsors. They're significant enough to sponsor what the software needs, which is a team that is working on it full-time for a while, or some significant portion of full-time.

Silvan: I'm at Tweag and I work partly for a client, Antithesis, and they essentially enabled me to do all of this community work. I'm really *only* doing Nix community work. And I think it's really beautiful that I can do that and that there is money to go into these things. I'd like to help out the Nix team. I guess my C++ isn't the strongest, like barely existing, but I think my skills would help there. But also I'd be happy for other people to jump in and take a look at the issues. Really, the whole Flakes thing was a bit botched from the beginning. Maybe I'll give a brief history. Flakes started with an RFC, I think RFC 49, by Eelco himself. It got a lot of feedback. It was a huge thing, even initially. Obviously this kind of thing is fairly controversial. And then at some point, I think Eelco just decided, "Well, I'm gonna close the RFC and I'm gonna merge it as experimental". And so it's experimental, it's not stable yet: people shouldn't be able to rely on it yet, as it can change at any time. By now, we know the story: four years later, Flakes are still marked as experimental, and at the same time, almost nothing changed with them.

Shahar: What is the most breaking change to Flakes since then?

Silvan: I believe that the only thing that changed was the output schema at some point, although there might be more. Initially, I think `defaultPackage` was to be used as a single identifier, and now it's `.default`. But that had a deprecation warning, so it's not technically a breaking change. I think it still works with the old one. So yeah, really not a lot happened there. If you have an experimental feature, you need to regularly break it. Even if nothing actually changes, you need to make sure that people know that it's experimental. We should be able to make changes to it and you shouldn't start relying on it. But now we're here today and everyone is relying on Flakes. There are products built on top of Flakes. There are installers that enable Flakes by default. Tutorials and books are getting written about Flakes even though it's still marked as experimental.
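
The output schema change mentioned here can be sketched in a small `flake.nix`. This is an illustration rather than anything shown in the episode; the pinned Nixpkgs branch, the `x86_64-linux` system and the `hello` package are assumptions chosen to keep it short.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-23.11";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
    in
    {
      # Older output name, now deprecated with a warning:
      # defaultPackage.${system} = pkgs.hello;

      # Current output name: the default lives inside `packages`.
      packages.${system}.default = pkgs.hello;
    };
}
```

With the current form, `nix build` in the flake's directory builds `packages.<system>.default`.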

Shahar: It's the chasm between some features that we need and the development time that goes into Nix itself.

Silvan: Yeah. Maybe Flakes stays the way it is and people use it and that's good. But I think there are some fairly fundamental problems with Flakes as it is now. And these problems don't have to be there. So if it continues like this, I think at some point there's gonna be another thing, something that's not Flakes, maybe from somewhere completely different, that addresses these issues. Imagine you have these two products. On one hand, you have Flakes with these issues. On the other hand, you have this better-designed thing. In the long run, which is more likely to succeed? Well, the design isn't necessarily the deciding factor for success, but personally I'd pick the one that was better designed. That's what pulled me to Nix in the beginning. I started using Nix six years ago, moving from macOS to NixOS. With macOS, everything was constantly being broken. I didn't really know what went on in my system. There were services running that I couldn't stop. And it was just an opaque box. And also, package updates. That's the main thing that Nix helps with. And so that's what pulled me to Nix. I realized that Nix has a really nice design. This idea of store paths, it works very well. It makes stuff a lot cleaner. Okay, sure, the user interface sucks, but we can improve the user interface; the underlying model is gonna stay the same. And I think with Flakes, it's the other way around: the underlying model is not very nice. There are some pretty big problems there, though they can be fixed to a degree, but only the user interface looks pretty good. And once you dig a bit more into the depths, you come across the slightly ugly bits. I also want to mention some other things that are going on in the Nix community: Flakes is an interesting and controversial topic, but Nix is so much more than just Flakes. And even with Flakes, you still rely on all of Nixpkgs, on all of the packaging approaches, all of that. I have a long list of things going on here and there.
For instance, there's the Nix Hour, which I'm doing every week, where in each episode we tackle one or more technical problems related to Nix: we try to package something, we try to fix a build, we write some modules, etc.

Shahar: Sounds great. There will be a link.

Silvan: That has happened every week, since last year's NixCon. Just yesterday, we recorded the 53rd episode. It's fairly consistent and going well. Other things, let's see... One interesting thing is the Nix Formatting RFC. So two or three years ago, there was an RFC opened, I think it was number 101, which aimed to standardize Nix formatting so that we wouldn't have to debate how to format Nix code, but it didn't go anywhere. So about a year ago, I started regularly scheduling meetings with the team there. And finally, after almost a year, we now have a new RFC open, which I think looks much better and much more doable. That's RFC number 166. If you have feedback or want to take a look at it, feel free to.

Shahar: Everyone who's been around code has been around formatting. I had the privilege of trying a few formatters for a few languages. One question I have about this RFC is, I don't know if I'm looking for the term deterministic; does the proposal say "For this code, there's only one way to format it"?

Silvan: It is not too far away, but not completely. For example, a lot of people like to put new lines in between attribute declarations and group them together somehow. We don't wanna get rid of those new lines or comments, so we can't be totally deterministic in that way. I think there might be some other minor cases like that, but other than that, I think it's fairly deterministic.

Shahar: So it does preserve some white space.

Silvan: Yes. Considering formatting is a lot of bike-shedding, as we've seen in the previous RFC and in the past year in all of the meetings we had, we're pretty happy with the result. We're now four people on the team that proposed the new RFC. We all agreed on this new formatting. We also have a lot of people just generally saying, "Whatever, let's just have a formatter. Let's get this debate over with, so we don't have to talk about it anymore."

Shahar: Commas at the end of the line?

Silvan: Commas at the end of the line, yes.

Shahar: Thank you. Thank you for that.

Silvan: Yeah, that's one of the core discussion points. And it was brought up again in the new RFC as well. Most people prefer the comma at the end of the line, and it also has benefits. So anyway, I think that we are in a pretty good position for the RFC to succeed. By the way, we need one or two more shepherds for the RFC. Anyone can be a shepherd; it's good to know a bit of Nix though. The RFC process is a bit weird when a group of people works on it. We had four people work on that RFC for the entire year, but only two of those people, me and [@piegames](https://github.com/piegamesde), actually wrote most of the text. So the question is, who is the author of the RFC? [@piegames](https://github.com/piegamesde) did most of the work. So they are the author now. I've also contributed a lot of the RFC text and so I'm co-author. The others reviewed the work and joined meetings fairly often. So we call them pre-RFC reviewers. The reason we do it this way is that shepherds for an RFC, of which you need at least three, can't also be authors of that RFC. That means if you have a team of four authors that are already interested in the topic, then you still need three more people, so seven people in total, just to start the RFC process. Finding seven people in a volunteer-based community is really hard.

Shahar: All the possible candidates are already up to their necks in contributions and some other RFC and some other team.

Silvan: Yeah, if you're involved in the Nix community, there's no way to find a single thing to work on: it's always a wide range of things. At least for me, that's the case. I only talked about half the things I'm involved with right now. I've recently become involved in the RFC steering committee. I've been in the documentation team for one or two years. I'm the architecture team lead. I'm working on RFC 140, which does some stuff in Nixpkgs. All of these things need attention. I wish I could clone myself 10 times over, that might help a bit.

Shahar: Well, things will move at the pace at which people are willing to contribute and that will be influenced by the pace at which sponsors are willing to sponsor. And that will be influenced by the usage as well.

Silvan: Yeah. There's this idea of, "we don't want new users. We want new contributors." Because users just use and don't contribute back. I don't think it's that bad: it's fine for a lot of people to use it. But if the ratio gets out of balance, like if for every 10,000 users, we get just one contributor, then the 10,000 users might file a thousand bug reports every year and the one contributor can't fix that. So there needs to be some balance in the ratios.

Shahar: Well, this is open source and it's not unique to Nix and the world is still figuring this out.

Silvan: I feel like Nix has the right idea. I feel like it's only a matter of time until everyone uses Nix directly or indirectly somehow. Maybe not on their phones, since Nix is not really used for general computing, but probably your phones will be deployed using Nix or some variation of it at some point. In a hundred years, Nix will probably have changed a lot, or I mean, Nix will probably have disappeared, but something else with the same idea is gonna be there. So yeah, I'm happy. I'm looking forward to the future. To see what it brings.

Shahar: I'm sure that we'll be a part of it. Right. We're at our time box, I suppose. So thank you for sharing.

Silvan: Yeah, no problem. It was fun. And thanks for hosting that.

Shahar: You're welcome, Silvan. And I hope to have you again.

Creators and Guests

Host

Shahar "Dawn" Or

Author of the Full Time Nix podcast and open source contributor

Guest

Silvan Mosberger