EPISODE 1889 [INTRODUCTION] [0:00:00] Announcer: Modern software development is more complex than ever. Teams work across different operating systems, chip architectures, and cloud environments, each with its own dependency quirks and version mismatches. Ensuring that code runs reproducibly across these environments has become a major challenge that's made even harder by growing concerns around software supply chain security. Nix is a powerful open-source package manager that builds software in controlled declarative environments where dependencies are explicitly defined and reproducible. Its functional approach has made it a gold standard for reproducible builds, but it can also be difficult to learn and adopt. Flox is a company that builds on top of Nix with increased supply chain security and abstractions that streamline the developer experience. Michael Stahnke is the VP of Engineering at Flox, and formerly worked at companies including Caterpillar, Puppet, and CircleCI. He joins the podcast with Kevin Ball to talk about Flox, building on top of Nix, how reproducibility underpins software security, the concept of secure-by-construction, how deterministic environments are reshaping both human and AI-driven development, and much more. Kevin Ball, or Kball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent.Space. Check out the show notes to follow Kball on Twitter or LinkedIn, or visit his website, kball.llc. [INTERVIEW] [0:01:51] KB: Michael, welcome to the show. [0:01:52] MS: All right, thanks for having me. [0:01:54] KB: Yeah, I'm excited to get to dig in. So, let's maybe start a little bit with you. So, can you introduce yourself, your background, and how you got to Flox? [0:02:03] MS: Sure. Yeah, my name is Michael Stahnke.
I'm currently the VP of engineering at Flox. And I've been involved in packaging and automation for most of my career. And so, kind of what Flox does was a culmination of a lot of previous experiences. I had worked at a big enterprise, at Caterpillar, the construction company, running data centers and system administration, doing a lot of automation there. Eventually, I left that company to go work with some friends who founded a company called Puppet. [0:02:27] KB: Loved that back in the day. [0:02:29] MS: Yeah, it was pretty early on there. I really enjoyed it. Let's not have this bespoke automation. Let's have a framework for this and really leverage that at scale when you're working with thousands of servers per administrator, things like that. Really enjoyed that. Did a lot of packaging, built a lot of things. And I ended up doing a lot of porting and packaging, I guess. And packaging has been my passion for software the entire time. I love putting bits into nice orderly things that other people can consume. I founded a package repository called EPEL, which was the Extra Packages for Enterprise Linux on top of Red Hat Enterprise Linux, in 2005. It was me and six other people that basically got together and decided we were all doing this within our own companies. Why are we not sharing this? It's not differentiating work. It's not competitive work. Let's just make it open source and go do it. And that was super cool. So, that was kind of what got me into Puppet. They wanted me to repackage a bunch of stuff. And they're like, "Do you know how to build Debian packages?" I'm like, "No, but I'm pretty sure I can figure it out." And I did. And then, "Do you know how to build AIX packages?" And I was like, "Actually, yes, because I worked at Caterpillar." But things like that. And so, I did a lot of packaging and CI system building to validate all that. You couldn't run a cloud CI system. They barely existed at the time.
And if they did, they certainly didn't have AIX, HP-UX, Cisco switches, Juniper switches, things like that. We had to build all of that ourselves. And eventually, CircleCI was kind of watching what we were doing, and they called and said, "Hey, do you want to run platform engineering at CircleCI?" And I kind of answered yes eventually. And we had the right conversations, and that was awesome. One of the things I really wanted to do when I went to CircleCI was have a SaaS experience, where, instead of waiting for somebody to adopt the latest version and upgrade, and maybe they have two change windows a year where they make a change on their version of Puppet or whatever, it was just, "We can just ship." It's awesome. And we were shipping hundreds and hundreds of times a week, and that was super cool. And then eventually, it was, "Okay, I've learned a lot at this company. Let's go someplace earlier and kind of figure out how do you build a business from the ground up?" And that's what I'm doing with Flox. And, I guess, that's how I got here. [0:04:17] KB: So let's talk a little bit about what Flox does to sort of set the scene. And then we can dive deep into some of these pieces. [0:04:24] MS: Yeah. Flox is kind of an SDLC product overall. And people say, "What does that mean?" And I'll explain that for a moment. We have a couple of foundational principles we really want to have, and that's reproducibility and a secure software supply chain. But that starts with developers. You don't bolt that on at the end and do a scan and say, "Hey, what's in this thing?" That's what the SBOM is, or that's what the software supply chain is all about. We want to secure it by construction. And so that starts with an awesome developer experience, because you want developers to work a certain way and have a consistent way of working. What we really optimize for is cross-platform reproducibility.
And so with Flox, you create this thing that we call a Flox environment, which is somewhat analogous to a Docker container, but not identical in a lot of ways. But I would say, mentally, if that's kind of where you want to map it for a little bit, it'll work for a little bit. But you make an environment that you want to go develop in. And so maybe you put your tools in there, you put your language ecosystem tools, maybe you have some other stuff. And then that's reproducible. Anything that we put in that environment, we lock it for Linux and Mac on x86 and ARM. And so if you're on an M1 Mac developing all day, but your colleague's on a Linux x86 laptop, you can use the exact same environment and it will materialize with the exact same versions. And to me, that's really powerful, because it means that it's not, "Oh, when I brew install this, I get this version. And when I apt install it on this Ubuntu box, I get this other version." And a lot of times it's fine, but in a lot of cases it's not. [0:05:42] KB: I mean, you just described one of my pain points at work right now. I've got the guy who's got his homebuilt box with his custom distro and all these other things. And then, here, I'm working on a MacBook Pro because I also have to do exec stuff. Right? [0:05:54] MS: Right. That's exactly the kind of problem that - it was one of the first problems that we really wanted to set out to solve. It was like, "Well, why don't we just have a cross-platform package manager?" And so we leveraged this thing called Nix underneath. And if you're familiar with Nix, it's this giant open source project. It's one of the most active open source projects in the world, but it's also pretty complicated. And when I first saw Nix, I was like, "Oh, wow. A bunch of Haskell people got together and decided packaging just wasn't complicated enough."
And so eventually, I kind of learned more about Nix and the power of a functional programming language to deliver packaging, but also deliver a whole bunch of other things. And as we learned about it, as the Nix community kind of adopted it, it was, "Well, this is really cool." But there was a pattern that we kept seeing: at any company that tried to adopt Nix, there was kind of this single point of failure. It's like, "Well, the Nix person left the company, or the Nix person got promoted into a different department, and now we don't know what to do." And it was like, "Okay. Well, so maybe this technology is a little too complicated or a little too academic for the average business to go adopt and use." And there are some that are very successful with Nix in the raw form, but not most. And so we decided that we were going to build some units of work that were more approachable for an enterprise. And that's what Flox really is. It's units of work on top of Nix that are reproducible, that can be shared between developers, that can have optimized CI, and then have a runtime where you have a complete secure-by-construction bill of materials all the way through. [0:07:12] KB: Okay. So I want to break down each of those pieces a little bit. I am not personally super familiar with Nix. And I suspect a lot of the audience is not. Since you're building on top of that and making it more accessible, let's start with like what does Nix actually enable for folks in particular? You mentioned pieces of it, but let's like spell it out. [0:07:29] MS: Yeah. I guess I'll talk about it more from principles probably than actual implementation, just because, one, I'm not super familiar with all the implementation. I can use it very powerfully. I don't contribute to making significant changes within Nix itself. There's this thing called the Nix store that is the most important thing for all the packaging systems. It lives in /nix on a Nix-running computer.
And what you do is you put all the software there. And all the software goes in there with linkage against everything else. It's a rolling release in that every day there's new stuff merged. More often than every day. And that's one of the things that I would say businesses have trouble with, a rolling release, because that's like Gentoo, or Tumbleweed, or things like that. That's not a common thing for RHEL or Ubuntu. And so there's this rolling release going on. But you put all the software in /nix, and then what you do is you basically - you can think of that as like a package warehouse, and then you build a view on top of that when you want to use certain software and enable it. Whether you put that into your path or you put that into other environment variables. And so imagine you have this giant database of software, and it's like, "Okay. Right now, I need coreutils and I need Python. And I'm going to go grab those out of this data warehouse, and those are what I'm going to put in path." Even though there might be five different versions of Python in that data warehouse. There might be 17 different versions of coreutils or whatever. I'm going to select the ones I want, and that's what's going to be enabled in my environment right now. And that's the way that Nix works. You can have that loaded into a shell. You can have it loaded into a developer shell, things like that. And when you exit that, it's gone. You haven't polluted your environment. You haven't overridden the system version of Python or the system versions of coreutils or whatever. These are completely side by side. And I think that's glorious. [0:09:02] KB: It's giving you a system-wide form of a virtual environment, similar to what somebody in Python might be familiar with. [0:09:08] MS: Yeah. [0:09:09] KB: Okay. Interesting. [0:09:09] MS: Yeah. And the other thing is, with the way that Nix and Flox work, you kind of have a superset of what that language package manager is.
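The warehouse-and-view idea can be sketched in plain shell. This is a toy model for illustration only; the directory layout, version numbers, and stub scripts below are invented stand-ins, not actual Nix store paths:

```shell
# Toy model of the Nix store idea: an immutable "warehouse" of versioned
# packages, plus a disposable "view" that symlinks the versions you want
# into PATH, without touching any system copies.
store=$(mktemp -d)   # stands in for /nix/store
mkdir -p "$store/python-3.11/bin" "$store/python-3.12/bin"
printf '#!/bin/sh\necho 3.11\n' > "$store/python-3.11/bin/python"
printf '#!/bin/sh\necho 3.12\n' > "$store/python-3.12/bin/python"
chmod +x "$store"/python-3.*/bin/python

# The "view": select exactly one version out of the warehouse.
view=$(mktemp -d)
ln -s "$store/python-3.12/bin/python" "$view/python"

"$view/python"   # prints 3.12; 3.11 still sits untouched beside it
```

Throwing away `$view` "undoes" the selection while the warehouse stays intact, which is the property that lets Nix keep five Pythons side by side without polluting the system.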
If you're writing JavaScript, in a package.json you can specify any library that's on npmjs.org and it just works, or a GitHub repository that's got a package.json. The problem is, eventually, some of those have native bindings. Like, do they build against ImageMagick? Do they build against libxml, or whatever? And when they do, the only way you find that out is you try to install it and it tries to build, and then it fails. And then you have to go read the error message, and you're like, "Oh, that's libxml," and you go install it. [0:09:44] KB: Yep. Yep. Yep. [0:09:44] MS: Whereas with a tool like Flox, you can say, "Actually, we want all these npm libraries, and libxml, and a compiler, and make, and -" that's part of the environment to start with. Therefore, I know it works on your machine because I've already included everything that you would need to have that build succeed. I think that's an advantage. [0:10:01] KB: Yeah. Yeah. It's reminding me of - what was it? Vagrant was trying to do this back in the day as well, of just having essentially the dev environment be part of what's shipped in the code repo. This is all the pieces you need. In that case, it was: do vagrant up. I don't know what the equivalent Nix command is. But everything will happen. [0:10:20] MS: Yeah. Yeah. In a lot of cases, you have a Nix definitions file, like a default.nix or a flake.nix, and then you spawn a shell that has all those properties. Basically, a shell that has those libraries available, those tools available, things like that. [0:10:34] KB: Conceptually, this sounds beautifully simple. And yet, you said the core problem people run into is, "Oh, the Nix guy left," which to me is like, "Okay, I've played that as well." Like, "Oh, we have one sysadmin who understands how everything is laid out, and nobody else understands." What makes the sort of configuration complex?
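In Flox, an environment is described by a TOML manifest, so the native toolchain that npm would otherwise discover by failing can be declared up front next to the runtime. A sketch of what that might look like; the `pkg-path` attribute and package names follow Flox's manifest.toml format as I understand it, so treat the exact spellings as assumptions:

```toml
# manifest.toml (sketch): the npm deps stay in package.json, but the
# native libraries and build tools they silently assume are declared here.
[install]
nodejs.pkg-path = "nodejs"
libxml2.pkg-path = "libxml2"
imagemagick.pkg-path = "imagemagick"
gnumake.pkg-path = "gnumake"
gcc.pkg-path = "gcc"
```

With those in the environment, an npm install that compiles native bindings finds its compiler and headers already on hand instead of erroring out.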
What are the things that you are then having to paper over with Flox? [0:10:57] MS: Yeah. Nix fundamentally starts with a build. The very first thing you define is how do you build this project? And you're like, "I don't have a project. What does that mean?" when you first start. And you start with a build file basically, and you have to say, "Well, here's my dependencies, or here's what I need." And that's pretty foreign to the average person that is writing software in an enterprise. Usually, the first thing they're doing is, "Well, I have Python installed. And then I go and write some Python files. And then I do a Python run, or run my Python unit tests, or whatever." And then I think about delivering that in a payload of some kind, whether that's a container or whether we scp a big ZIP file out to something and run it, or whatever. Starting with the build, to most people, feels like you're starting in the middle of the SDLC and not starting at the beginning. And so that was one of the very first things we had to do, was modify the way that Nix has these opinions. So that when people want to start working, they start working, instead of being like, "Why do I have to start at step four to go back to step one?" That was one of the very first things we did, where you just do a flox init and it kind of sets up the scaffolding for all those things. And then you can do a search, like a flox search, a flox install, a flox upgrade. Kind of modeled after Homebrew semantics, versus these Nix things, which are all like nix-shell -p, and then you have a flake command with this weird URI you have to attach to it. It's a very odd syntax, is probably what I would say. And even with some of the experimental features that you have to enable under the hood to make things work, you can tell that a UX interface designer was not very involved in the creation.
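Side by side, the two vocabularies he's contrasting look roughly like this. An illustrative session, not verified output; the flox subcommands are the ones named above, and the nix-shell line is the raw Nix spelling for comparison:

```shell
# Homebrew-style Flox semantics: start at step one, not step four.
flox init                 # scaffold an environment in the current project
flox search postgresql    # find packages in the catalog
flox install postgresql redis nodejs
flox upgrade              # roll the environment forward when you choose to

# versus the raw Nix spelling of "give me a shell with these tools":
nix-shell -p postgresql redis nodejs
```

The difference isn't capability; it's that the Flox commands map onto install/search/upgrade verbs developers already know from Homebrew and apt.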
[0:12:22] KB: Yeah, it's reminding me of - often, I'm somebody who tends to straddle backend and frontend. But often, when you're interacting with one side or the other, you sort of realize that people have not done the work of translating their mental model to the mental model of the person who wants to connect with this thing. [0:12:37] MS: Right. [0:12:38] KB: And so, okay, what I'm hearing is Nix was built with a very, in some ways, low-level mental model and did not necessarily do the work of bridging that to your average developer who wants to just get started on their box. [0:12:53] MS: Yes, I would say it was a very academic-focused project in how does Linux distro assembly happen? How do you have side-by-side versions of software? How do you functionally do a lot of things where you don't have side effects within the system? And that's a really cool principle, but most people aren't trying to do that daily at work. And so we were saying, "Well, how can we leverage these principles without really making you think about all of those powers?" But in the end, you do get a complete bill of materials with every dependency that you've ever installed, because we have complete bookkeeping of what happened there. And so there's benefits there if you can work through it. And so we're trying to give you the benefits with opinionated workflows rather than please assemble all these parts yourself and figure it out. [0:13:29] KB: Absolutely. I'm excited to move into that. One more question about Nix before we do. I think one of the big innovations that Homebrew, which you mentioned, had was it was, I believe, possibly the first package manager that was all in user space. It didn't require you to have root access on your box to put things in place. It didn't give you nearly as many levers to f-up your box if you managed to do this the wrong way. Is Nix operating also in user space, or do you need to have system privileges, or how does that all work?
[0:13:55] MS: You basically need to have system privileges to install it. The same with Homebrew. Because if you're installing into /usr/local or whatever, usually you have to be able to create that. But I guess it's now /opt/homebrew in the modern Apple Silicon world. But after that, you are running in user space. And actually, you're basically only operating in this Nix store area, where everything is by default read-only, because these things shouldn't change; they're immutable. Because it's functional, you don't want side effects. If you do have side effects, they need to happen elsewhere. They might happen in your environment. They might happen in your work directory where you are making a configuration change. But in the Nix store, it's read-only. Your security vector, your attack surface, is changed. It changes the shape and size of it. [0:14:35] KB: Yeah. And reduces the volume of foot guns that you end up with. [0:14:40] MS: Right. And there are ways that you can override that with settings and all that. But by default, I would say you actually end up with a quite secure multi-user package management system that is still run in user space. [0:14:50] KB: Nice. Okay. Going into Flox a little bit, you mentioned some of the pieces here around having an opinionated kind of init and different standard workflows. But what is the mental model that you are creating in that molding from the low-level Nix model to what a developer actually expects? [0:15:08] MS: Basically, we want you to create an environment that you work within. And so if you have a project - I'm going to go build my next n-tier app. It's got a database. It's got a caching layer like Redis. It's got Next.js on the front end or whatever. I can do flox init. I can go search for, "Okay, what Postgres versions are available?" I'm going to install Postgres 16, or whatever. I'm going to go grab Redis. I'm going to grab a Redis library for Node.
I'm going to install Node, or I can install Bun, or whatever your favorite JS runtime is these days. There's too many of them. And then I'm going to go just run my Next.js thing. And so it might be that I do flox search for the different packages, flox install. Once those are installed, on the backend there's a lock file that's written with every dependency that comes in. When you grab Node, what does it depend on? What does it link against? All the way down to libc. Which is fascinating, because it's a deep recursive tree, but it also means that there is literally nothing in that closure of software that is not needed. And so that's how you can have the smallest footprint in terms of what you actually require, and also the lowest attack surface. And so that's one of the things we're super interested in from just a security and compliance point of view. You build that environment, you install the software you want, and then you might run Next.js init, or however you start a Next.js project. I don't even remember off the top of my head. I don't write a lot of Next. And you can run it with your normal npm tools if you want to and just do it on top of that, because a lot of those have really good locking on top of it. And then you could say, "Okay, we've done that. Now, I've built my application. I want to deploy it." Well, from there, you have a couple different options. And one of the reasons we built these options is because not everybody's going to adopt a whole new SDLC overnight. And so you have to kind of meet people where they are. Where do you want the advantages of this? And where are you saying you're comfortable with how you're currently operating? It might be you have a great operating environment or a great developer environment. People on Mac, on Linux, they have compatible software, exact same versions, beautiful. And then you go to CI, and maybe you press a container from that.
You just export it and you say, "Hey, I want to take what you have and put it in a container, and that's how I'm going to go run the rest of my workload." Okay. Great. Go do that. Or you could say, "Actually, I want to package the thing that I just built as its own package, because I want to run these packages out on my systems." Maybe you want to run bare metal because you want access to an AI model or something like that that's giant. You don't want to ship around a container every day. You say, "I want to package this up." And we have two packaging methods. We have one where, if you know Nix, you can write a Nix definition file and get it completely pure with everything optimized to the nth degree, complete reproducibility, all of that. A lot of people think that's really difficult. And so we built an entire thing within Flox where, if you can tell us what commands you run on the command line to build your project, whether that's npm build or whatever, you type those things, and you copy the output into a specific environment variable, and we will make a package from that. We just end up building a package for you on the backend. You publish that up to our FloxHub system, which is like the central clearinghouse of all these different environments and packages. And it's got RBAC on it and all those kinds of things. But you could have that deployed, and then you can run that environment. And coming up at KubeCon, we'll show you that you can run these environments natively through Kubernetes without even needing a container. And one of the really cool things there is I don't have to ship these layers back and forth. I don't have to ship a bunch of content. The only thing that happens is the content I need ends up where it needs to run, because that's where we materialize the environment. What we ship around are environment definitions, not the payload.
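The two exits from the same environment that he describes look roughly like this on the command line. An illustrative sketch; these subcommand names reflect the Flox CLI as I understand it, so verify them against current Flox documentation before relying on them:

```shell
# Exit 1: keep your container-based workflow, but generate the image
# from the locked environment instead of a hand-written Dockerfile.
flox containerize         # export the environment as an OCI container image

# Exit 2: skip the container entirely and ship a package.
flox build                # build the project itself as a package
flox publish              # push it to FloxHub so other machines can pull it
```

Either way, the reproducibility guarantee comes from the same lock file; only the delivery vehicle changes.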
And so we're shipping small amounts of metadata, bytes at a time, versus these layers going back and forth on a Docker image. And it's not that Docker's done a bad thing. But in your normal workflow, you have a 10 gig image, and the very first thing you do after you build it is ship it up to a registry just to pull it back down where you want to run it. Now I've done 20 gigs of round trip. [0:18:34] KB: Well, and this is where that functional mindset really gains you a lot, because you know that it's reproducible. You know that it's going to be idempotent. A couple questions I want to dive into on a couple of those pieces. First off, starting in just the dev environment, before we get to deployment, you talked about, "Okay, I have a lock file. Goes all the way down to my versions of libc." How does that interact with the multiplatform aspect of it, of, "I'm on a Mac, versus this person's on Linux"? The build chains are very different in some cases. [0:19:03] MS: They absolutely are. And so we do calculations in our catalog, which is where you go get the packages. We have built basically an inference engine on top of nixpkgs and that catalog that does a lot of the calculations for that. And I would say we know which builds are guaranteed to build. Which ones are cached on cache.nixos.org. We have all of that inventory. Then we do a resolve request, where it's like, "Okay, you're looking for this version of Go and this version of Node? Okay, can we resolve that to have those be at the same time?" And if we can't, because they're not contemporaneous with each other - say, the version of Go is from two years ago and the version of Node is from today - well, on a rolling release, you have to pick a point where you're going to land. Okay? So, how do you solve that? Well, we solve that by asking, actually, do you want to break that into two different pointers? Because we can do that. [0:19:46] KB: Oh, that's interesting. Okay.
[0:19:48] MS: And so we can break that into, "Okay, I'm going to have one closure," that is that version of Go from two years ago. And I'm going to have a second closure that is this modern version of Node today. And there's a third closure that encompasses both of those together. And that's how they move in a group. And so we have package groups and things like that. But the way that we do that across platforms is, as we calculate that resolution of, "Okay, you want this older version of Go," we go and look and say, "Is this available on Mac? If you're on Linux x86, is this available on Mac? Is this available on Mac ARM? Is this available on Mac x86? Is this available on Linux ARM and Linux x86?" And we'll do calculations based on how many of the constraints we can satisfy. And at some point, you may drop below a threshold and we may say, "Constraints are too tight. We're unable to resolve this." And we'll tell you that. Or what you can say is, "Actually, I don't care about Mac x86 anymore, because everybody in my shop is on Apple Silicon. Okay, take that one out." And maybe that relaxes the constraints enough that now we have a resolve that can happen appropriately. [0:20:43] KB: Got it. [0:20:43] MS: But there is a lot of logic in that, I will say. And that's where the bread and butter of a lot of stuff is going on. [0:20:49] KB: I kind of want to think about it, right? As you start to dive down that, and you can say, "Okay, we can resolve at the same level of Go." And Go is depending on, for example - I don't actually know what it depends on underneath, but a set of C-related libraries. [0:21:03] MS: Yeah, very few, very minimal. Yeah. [0:21:04] KB: Go is a great one then, because it's going to be easy. And actually, I love using Go because it is so easy to package and ship anywhere. What's an example that actually would have a lot of different dependencies? [0:21:13] MS: I mean, Ruby or Python are both pretty entangled.
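Both moves he describes, splitting packages into separately resolved closures and relaxing a platform constraint, show up as manifest edits. A sketch; the `pkg-group` attribute and `options.systems` key follow Flox's manifest format as I understand it, and the group names are invented for illustration:

```toml
# manifest.toml (sketch): two package groups that lock independently,
# so an old Go and a current Node don't have to share one catalog snapshot.
[install]
go.pkg-path = "go"
go.pkg-group = "legacy"        # the two-year-old Go, resolved on its own
nodejs.pkg-path = "nodejs"
nodejs.pkg-group = "current"   # today's Node, resolved separately

# Dropping Intel Macs from the list relaxes the cross-platform constraint.
[options]
systems = ["aarch64-darwin", "x86_64-linux", "aarch64-linux"]
```

Removing `x86_64-darwin` from `systems` is the "take that one out" step: the resolver then only has to satisfy the remaining three platforms.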
[0:21:17] KB: Let's use Python, because probably slightly more people are familiar with the challenges there. I have this version of Python. I know that that exists on both my Mac Apple Silicon and the Intel x86 box that I'm shipping to, or whatever it is. That depends on n different system libraries. And it's version this on this side for that version of Python, but it's version something else on the Intel side. I guess the question is, do you ship a lock file per environment? When you say you lock all the way down to the lowest level of system dependencies, how is that closed over when you've got a bunch of different environments? [0:21:53] MS: Well, we're also providing those dependencies. We're not necessarily dependent on are you running Sonoma or are you running some other version of macOS. We're dependent on - we included the libc that you needed all the way down. And so you're getting that from the Nix and the Flox catalog. [0:22:09] KB: Got it. Got it. Okay. Right. But you control the whole thing. It's not I'm looking to see what you've got and - got it. [0:22:15] MS: Yeah, in nearly every case. There are some cases on Mac specifically where you want to link into like the Mac frameworks, like MFC or whatever they call the Apple Darwin frameworks. I don't know what they actually are named anymore, because they have changed names like four times. But if you're going to write something that involves Xcode, for example, we're not going to ship you Xcode, because we can't. That's not a licensed thing that we're allowed to do. And so we're going to link out to a system Xcode if we can find it. But those are, I would say, a small minority of cases where that's really going on. And so in most cases - and there are some libraries that are just not available ever on Mac or never available on Linux. There are some utilities. Again, most of the time, we can figure that out, and we'll actually only lock it on the systems that it's valid for.
But sometimes there's an odd edge case or something. But most of the time, we're giving you the thing you want to type on the command line and all of the libraries that are linked to it underneath and all of the things that are important for that to be able to execute properly. And that means you have the exact same version of those libraries on all your platforms. [0:23:12] KB: Okay. Awesome. So now looking a little bit more at the deployment side of things, one of the things that is potentially interesting here is looking at - let's use the Postgres example that you've got, right? Locally, I say, "I want this version of Postgres. Nix, you have your packaged version. You've locked it. You send it to me, whatever." And then I say, "Okay, I want to deploy, but I actually want to use AWS's cloud Postgres," or whoever it is. What does the decomposition for deployment look like? [0:23:39] MS: You have a few different options there, which - every time I say there's a lot of options, I'm like, "Hmm. Do I need to be more opinionated on something?" We might. But what you would do is maybe take the packages of the things you want and make a runtime environment that is smaller than your development environment. By default, every environment in Flox has two modes, developer environment and runtime, where it's basically, do you enable the compilers? Do you enable the libraries and all that stuff? Or do you not? For instance, there's a package called Almonds that is written in Python. If you run Almonds in development mode, you will also have the Python that runs Almonds available in path. If you run it in runtime mode, we're not going to put Python in path, because you probably don't want to grab that Python if you're just going to type python on the command line. That's the difference of runtime mode versus developer mode.
But you might make a separate environment that only has the web content and the caching layer but doesn't have that Postgres, because you know that you're just going to have a connection string that goes out to RDS. And so that's something you could do: I'm just going to have a smaller environment that uses the exact same packages. So I still know it's reproducible, one by one, because I have the exact same packages in these environments. I've just made a smaller environment that I'm going to deploy at runtime. That's an option. Another option would just be to change the connection string. And you still have Postgres sitting there, and you're just not using it. [0:24:49] KB: All right. I think I'm understanding the picture now. Let's move a little bit into kind of modern day. Modern-day software development, there's a lot of changes going on. And I noticed that phrases like agentic coding or agentic development environments showed up in Flox's messaging. Again, we as engineers are all trying to figure out how do we best use these tools. How do you think about what makes for a good agentic development environment? And what does that actually connect to with regards to Flox? [0:25:16] MS: I guess there's like the marketing-forward answer and maybe the engineering-cynical answer. And I kind of have a little bit of belief in both. Ultimately, agents are built to model humans. And so, if humans do better with consistency, I would think that agents would as well. And so from a lot of those perspectives, it's how much of my entire workload can I keep at a deterministic level? Because I'm now running these probabilistic workloads on top of things. And that is nuts for anybody that came from computer science. All we've been seeking is determinism for years and years and years and years. And now we're like, "What if determinism isn't the answer?" My brain exploded, and I am now trying to reconcile that.
And so me coming from a background of determinism and idempotency being fundamental principles of what we've built over the years, I'm like, "Okay, can I keep 80% of my stack deterministic?" Because maybe how many variables do I need to have in play at once is really the question. And so if I have a consistent environment that I'm working in both in development and at runtime, hopefully, those agents are more successful because either they're not having to account for variance as much in different parts of the environment or they're spending all their context window on the stuff that I'm asking them to do versus the, "Oh, this didn't work. Let me run this unit test again." "Oh, let me go grab this." "Oh, you didn't install ImageMagick when you were trying to compile this Node thing." No, let's have all that ready and burn my tokens on the things that I'm actually trying to accomplish. And I think that's more how I feel about the agent coding right now. But ask me again in four months, and I bet I'll have a different answer. [0:26:44] KB: Yeah, I think the determinism and non-determinism is fascinating. And maybe we can dig in a little bit more there because I think there's a few different angles I think about it. Why have agentic workloads taken off in coding and not in other places? I think it's because humans are also nondeterministic. But we've been trying to drive non-deterministic human coders towards a deterministic output as long as the industry's been around. We've got all these tools for like how do you harness our random chaos into something useful also? [0:27:11] MS: Yeah. Well, I think most of computer programming has been like mapping a human mind onto a machine and like training your brain to think like the machine. And you get to these agents, and these agents have taken a step more toward they think a little bit more like a person, or at least they behave a little bit more like a thinking person.
And so now, you may not have to think the exact same way you did the entire time when you were working with these traditional classic computer problems. But we're also now seeing software being developed more with agents in mind. And how do you make this agentic? And all these adjectives. And in some cases, it's like, "Well, actually this is more human-friendly as well." And so you get this weird balance of are we meeting in the middle in some cases? But I still want to have the fewest number of variables in play always. And that's better for the machine, that's better for the human. However I can do that. [0:27:54] KB: Totally. Well, and I think a thing that I've definitely seen is like the better we follow software engineering best practices, the better the agents do with it also. And I can definitely see that. Thinking about putting Flox in play here, if someone is really embracing the agentic workflow, are they running Flox? Is that setting up the environment that the agents are acting in? Are you actually exposing Flox itself and the core CLI pieces to the agents, and they're setting up the environment? How are you actually using or seeing people use Flox in these environments? [0:28:30] MS: There are two patterns that have emerged right now, and I think - I don't know. I'm trying to figure out if I have a preference for one of them. I don't know that I do yet. But one is you kind of create the environment that you want. If you know you're going to build an application with Node, and Redis, and Postgres, or whatever, you kind of get all that stuff installed, and then you launch Cursor or whatever within that environment. Maybe Cursor-dot. And you open up that environment as it's been activated. And now all those tools are available to the agent, and you just say, "Hey, I want to go build this." And it finds Python on the path, or it finds Node on the path, and it just goes and uses it. It doesn't even think about it after that.
It's like it doesn't know that you're in this great reproducible environment with excellent software supply chain tracking. It's just like - I don't know. It's like which Python showed up, I'm good. And that's totally fine. There's another way where you start with nothing and you launch your IDE, or your agent, or whatever. And you flip in our MCP server and you just say, "Go build a Go project." And it's like, "Okay, I'm going to go search the Flox catalog for versions of Go available. Which one's current? Cool. I'm going to go get libraries." And we have hints and Cursor rules files, or Claude MDs, or whatever it is that you need to do, so that they know how to work with these Flox environments and use Flox commands for everything. And they're not falling back into system programming to be like, "Oh, let me go brew install this thing." It's like, "No, no, we've given you rules that we want you to do it all with Flox." And the issue with that is sometimes context windows get resized and sometimes that - it works, I would say, like 95% of the time. But there are times where you're like, "Why did it go do that?" And that's the problem with these things. [0:29:58] KB: I feel like that is the story of everything with these agents. It works most of the time. And sometimes you're like, "What?" [0:30:05] MS: Well, my favorite thing is I can ask it why. That's actually one of the cooler things you can do with an AI agent. Like, "Why did you go do that?" It'd be like, "Oh, because of this and this." You're like, "Huh, I care more about this set of trade-offs." It's like, "Oh, well, if that's what you care about, we would go do it this other way." I'm like, "Okay, let's go down that path." And it tells me I am absolutely correct. So, yeah. [0:30:21] KB: Somebody at some point said, "You're absolutely right." And they called that a yar. And so now every time I imagine Claude in this pirate voice of "yar". [0:30:32] MS: Yes. Yes.
[0:30:34] KB: One of the things that we touched on pre-show that I want to kind of bring in here was thinking about engineering efficiency. And I think one of the big things with the world that you've been in and all this package management and stuff like this, done right, it helps so much with efficiency. I've had to dive into this much more as my career progressed. The first job I had, there was a build engineer. He just made everything work. I didn't have to worry about it at all. And it was incredible. I just worked on my thing. And he handled all the build engineering, all of that. And over time, you become that person, or you become the one who has to manage the infra or do all those different things. But I think doing package management well really supports efficiency. Is that changing now as we sort of move into this agentic future? Are there other things that we need to be thinking about? [0:31:23] MS: I mean, I'm sure there are other things you need to be thinking about. I'm not going to speak in absolutes generally. But overall, if you're really good at something, I feel like the AI stuff is amplifying it. If you're really bad, the AI stuff is amplifying it. And so, it's kind of what are your practices already? Because you're getting more of them. And so, with a packaging perspective, if you go back to like the state of DevOps from like 2016, 2017, 2018, these were things I was semi-involved in. 2018 and on, I was very involved. But one of the things that leads to success the most is reducing the number of variables. So, we're back to this consistency play. We're back to this repeatability, reproducibility play. And more successful teams have fewer ways of doing something. They have fewer ways. They don't run on 17 different OSs. They run on one. They don't run on five different programming languages. They have two. Maybe they have a static type and a dynamic type language that they go standardize on, things like that.
And so, the fewer variables you have in play, the more likely you are to be successful. That actually correlates with success in terms of who's in the elite tier pretty highly. And I think that that's the same thing that would happen with AI. It's like if you can teach your agents that you have one way of doing something, now it goes and does - we have one way that we connect over to Postgres so that we're guaranteed to have failover when RDS has a hiccup or whatever. Okay. Great. You have this one way that you always use because you've written your own client library, or you imported one and overrode it, or whatever you do. It never has to think about that really again. And so now it's on to the next set of problems. And I think that's the exact same way a human would be. If you have a really good set of libraries, they're going to be like, "Well, I'm just going to include the one that the platform team gave me because I don't have to think about it anymore." And they move on to the next part of the business logic. So, I think what you're seeing with a lot of the engineering performance is, if you already had good practices, AI takes advantage of it. It's basically everything you're already doing, what if it just happens way faster? What happens? And if you're kind of fumbling and you're not sure what you're doing, that just happens faster. Whereas, if you know what outcomes you're looking for and how to measure them, that also can happen faster. I think that's my quick summary on that. Does that work? [0:33:17] KB: Yeah. No, I think that makes a lot of sense. And it is very similar to a thing that I've noticed. Yeah, it just amplifies whatever's going on. [0:33:23] MS: Right. I mean, I will say that a lot of the agentic development stuff kind of makes you think much more about specification. And what are you really trying to build? And then validation. Did I actually get the thing that I was asking for?
And I think validation's always been the hardest problem in software engineering in my opinion. QA, criminally underpaid for the entirety of its existence. And it's the hardest problem of knowing is this thing actually correct? Sure, it compiles. I can run it. But is it solving the user need is a really hard thing to know. And then specifying I'm trying to build this. But what is this? And how well defined is it? Is it okay if I use this connection string versus this type? Is it okay if I have a model that pulls in 17 different dependencies, or only one, or whatever is going on? What I found is that people that have a little bit more product management background or product engineering background are a lot more successful with those things than the people that don't at this point. And so that's been a fascinating turn as well. [0:34:13] KB: Has it changed how you at Flox are operating? [0:34:16] MS: A little. We're starting to evolve on that. I think the tools are moving faster than the humans are at this point, which is a fun thing to kind of consume. But we are definitely trying to use agents to speed up certain types of development work. And there's definitely the skepticism. I have found that your skepticism correlates directly with your seniority. The more senior you are, the more skeptical you generally are of AI. And the more junior you are, the less you are, which is fascinating to me. Because more senior people generally write less code. They're actually spending more time in architecture and all that. But they're the ones that are more skeptical of it, which is - I don't know. It's an interesting set of tradeoffs that I've observed. I will say that I've written more code in the last year than I probably have in the last five combined because of being able to use agents. And I find that the spec - I don't have too much problem specifying.
I spend more time on my workflow for how do I get specifications in a way that I think is consumable and iterable versus just the single super prompt? Come on. That's not going to work, at least not currently. Context windows are too small for that. But the way that we operate, I would say we have several areas where agents are doing a lot of development. We have several where they're doing some of the fast part of the code review, maybe the initial parts of the code review. We have some that are kind of looking at tests and seeing, "Are these flaky? Are they not?" Things like that. But not every developer is spending all day just writing prompts or anything like that. We're definitely not there. But we're trying to keep our eye out for when is that happening. Or is it actually giving us an advantage in any spot, or is it actually costing us twice as much now because it writes bad code that we have to go fix? And in some cases, it does. And so you have to kind of really - we're still learning our way through it, I guess. That's probably the simplest way to say it. [0:35:47] KB: Well, and it's interesting because your primary audience that you're selling to is developers who are also going through all of these exact challenges as they're going through. So, has this changed at all the way that you're thinking about Flox's product roadmap and what you're trying to do to serve people? [0:36:02] MS: Certainly. That's a clear answer of yes, in that one of our major partnership announcements a month and a half ago was with NVIDIA, and it was so that we can redistribute CUDA within Flox, and so that you can have a native library of this. Because generally, they didn't use to allow that redistribution properly. It was kind of a non-free piece of software and things like that. And so we have a distribution right. So that when you want to have a fully functional PyTorch environment to go do your model training or whatever, you can have CUDA in there.
That was one of the first things we jumped on. We're like, "Oh, my gosh. We can go partner with NVIDIA. We can go get CUDA available. We can make these workbench environments for CUDA, or TensorFlow, or whatever your favorite ML tools are." And that was a really exciting partnership. And I don't think we would have done that had agentic coding not been as important as it is. And then you start looking at, "Well, what about MCP servers?" Well, okay, everybody's asking - MCP server, the idea of it was made public in November. By March, I have people asking me every day, "Why do you not have an MCP server?" I've never seen something like that move that fast. And so, we have an MCP server. And we didn't have it in March. I'll say that. But we do have an MCP server. And we're still adding things to it all the time because the specification gets updated. Or you go use a different MCP, "Oh, I like the way that worked." And you go borrow from that, and you kind of implement that. And we're actually using a lot of the agents to help us write the MCP server. Because you know what knows a lot about how AI works? AI. You can start there. [0:37:22] KB: Though, it's shockingly bad at writing good Python, which you wouldn't expect. They use Python for these things. You'd expect it to be good at it. [0:37:29] MS: Well, the thing is, with the way it trains, though, it takes the entire compendium of knowledge out there. And you have to ask yourself, "Well, how much Python's out there?" A lot. How much of it is good? But it trained on all of it, because it can't tell. [0:37:41] KB: The density of quality is a little lower in Python, yeah. [0:37:45] MS: That's actually one of the reasons I like working with Go with agents more than anything is because there's not - idiomatic Go doesn't look that different, whether it's generated or whether I wrote it. The variable names might not be the same.
But outside of that, the structures and the flow through the code is usually quite similar, and I find that to be a little bit beautiful. [0:38:00] KB: I have also actually talked to multiple people on this show who have found that Go is possibly the best language with regards to agentic generation. There's multiple factors to it. And we don't need to dive into those. But yeah, it's great to hear you say that as well. It feels like I'm getting lots of nudges for my priors to be high there. [0:38:19] MS: Yeah. Yeah. I mean, and what's been fascinating though is watching it improve over time. When we first started working - Flox is written in Rust. Again, we're pretty security-minded. We want to make sure we don't have buffer overflows and all that. When I first started working with AI and Rust, it was terrible. It was just awful. And now it is pretty decent. It's pretty good a lot of times. And sometimes it's quite good. Even our core Rust team, I turned in a patch that was written by Cursor, and they were like, "Oh, we had a helper method for this and a helper method for this. Can you flip that?" But my code review was actually pretty simplistic. And there were a few things. They were more educating me about how the code worked, less that the AI did it wrong. It was more let's just have a back and forth about the behavior here. And the stuff got merged. And I think if I'd have tried that six months ago or a year ago, I don't think that would have happened at all. And so these models are moving quickly and improving. Even if it's a language you - if you don't want to write Go all day, you're still probably going to be okay. [0:39:10] KB: Yeah. You talked some about how it's already kind of influencing you. And the push to MCP has been fascinating. And if we want maybe - actually, that might be worth diving a little bit into.
I feel like writing a good MCP server is actually not trivial, because you don't want to just dump everything into the context to blow out your context window, as you say. What have you all learned through the process of this race to expose everything in MCP? [0:29:36] MS: I think we started with, "Okay, how do you find software?" If you're trying to build an environment, can we give you basically interfaces into our catalog to be like, "Okay, I'm looking for Python. What versions of Python can I browse from this catalog?" And we might have something back to 2.7 all the way up to whatever came out last week. And so there's maybe several dozen versions, and go select and say, "Okay, I want this version," which is usually fairly recent, but might not be the latest because it doesn't know the latest one even exists. But our MCP server is going to be, "Okay, to find software, here's what you need to know." It'll kind of have a built-in prompt of finding software. You run these commands, you look at it this way. Here's the output parsing. And throw the JSON flag at it so you can parse it easier, things like that. And then it might be, well, for running software, here's the instruction set. And so you don't have to load the finding software and the running software instruction set at the same time. I would rather you drop one of those out of the context window so that you're minimizing the overall fill-up. But overall, we've added tools, we've added several things. But I would say our primary engineer on this is just playing with other people's MCP servers and be like, "What do I like about this one?" And the Supabase MCP server, for example, people seem to really like. And so it's like, "Well, let's go play with that one and see why they like it. What's good about it?" Or the Postgres one, a lot of people seem to like. And then you go to others, and they're like, "Well, this one has an MCP server, but I don't think it's really doing much for me."
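The context-saving split described here - separate instruction sets for finding versus running software, loaded one at a time - can be sketched in miniature. This is a generic illustration, not the actual Flox MCP server; the task names and the `flox search --json` wording are assumptions based only on the conversation above.

```python
# Keep per-task instruction sets separate so an agent only loads the one
# it needs, instead of filling its context window with everything at once.
# All names and command strings here are illustrative.

INSTRUCTION_SETS = {
    "find_software": (
        "To find software: search the catalog (e.g. `flox search <name> "
        "--json`) and parse the JSON output to list available versions."
    ),
    "run_software": (
        "To run software: activate the environment first, then invoke the "
        "tool from the environment's PATH."
    ),
}

def load_instructions(task: str) -> str:
    """Return only the instruction set for the requested task."""
    if task not in INSTRUCTION_SETS:
        raise KeyError(f"no instruction set for task: {task}")
    return INSTRUCTION_SETS[task]

# The agent asks for one set at a time; the other stays out of context,
# which is the "minimizing the overall fill-up" idea.
prompt = load_instructions("find_software")
print(prompt)
```

The point of the design is that each instruction set is self-contained, so dropping one out of the context window never strands a half-loaded reference to the other.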
And so we play with that one for a couple minutes, and we throw it away. One of the other things that we're trying to do with Flox with MCP is, okay, if you want to run an MCP server, which some people are writing them and some people are running them, how do you restrict it? Because security has a new model with this. Before, you had transport-layer and storage-layer encryption, security, SSL, that kind of stuff. That's classic. We know that. Then you have identity security of API keys, authentication, non-human identities, which we still haven't gotten right in any way, but we're going to move past it. And then you have like morals and ethics, where it's like, "Okay. Are you allowed to use this tool? Are you allowed to use blackmail?" How do you get it to behave the way you want? We have no idea how to handle that security. And at Flox, we don't yet either. I'll just be honest. I have no idea how to tell an agent don't do blackmail, other than say don't do blackmail. But if it decides it wants to do it anyway, it might just do it. I don't know. One thing we can do is be like, "Okay, in this MCP server, in the default build, it has a network sniffer built in. Well, in our build, we'll actually ship you a build without this network sniffer. And therefore, you have one less security vector that you have to go worry about," or something like that. So you can say here's the tools that are available to the MCP server. And we're working with somebody that is running other MCP servers and distributing them, and they really like that they can have different options of I want to run this with fewer sets of tools underneath. And, in fact, it won't even be on path. So you can't just go grab it. And that's what they like. You've taken the toy away from the baby, so they can't hurt themselves with it.
[0:42:16] KB: I mean, I think that speaks to one of the really interesting things about what you guys do in terms of being able to really constrict what any particular thing has access to. And I've talked with folks who are saying, "Pretty soon, you're going to only ever want to run Cursor or any of these other things inside a Docker container, because you don't want it to be able to go and access -" right? Especially if you're using an MCP server. You don't know. Could that be hijacked? All these different potential things. Is that a use case you're seeing people start to use Nix for in terms of like how do I lock down the environment that this particular thing I'm running is going to be running in? [0:42:53] MS: You certainly can. It depends. With Nix, there's a thing called a pure activation, which basically dumps the environment before you get into this one. If you don't include coreutils, you don't get ls in this environment, that kind of thing. And so you have to be really explicit. Do you even have a shell? Things like that. It will grab a shell if you don't have one. But it's stuff at that level, where it's like - and so from there, there is no editor. You can't go call vi. Or there is no curl. You can't go out to the network or whatever. And you can even activate those environments in a way that says network turned off. We have sandboxing modes, things like that. There are ways to do it, but some of them are great approximations for security. They're there, but they're not totally bulletproof. And there's others that are excellent. And so it just depends on what you're doing exactly. But then, also, you can start to be like, "Well, can I take advantage of cgroups within the Linux kernel the way that Docker does? Can I do that with a Flox environment?" And the answer is yes. With Kubernetes, we sit on top of containerd. And we have a shim that we put in there. You can totally do that.
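The "pure activation" idea - dump the inherited environment and pass only an explicit allowlist to whatever runs inside - can be shown in miniature with a plain subprocess. This is a generic sketch of the concept, not Nix's or Flox's actual mechanism; the allowlist and variable names are made up for illustration.

```python
import os
import subprocess
import sys

# Only variables on this allowlist survive into the child process --
# everything else from the parent shell is dropped, loosely analogous
# to a pure activation where nothing leaks in implicitly.
ALLOWED = {"HOME", "LANG"}

def pure_env(extra=None):
    """Build a scrubbed environment: allowlisted vars plus explicit extras."""
    env = {k: v for k, v in os.environ.items() if k in ALLOWED}
    env.update(extra or {})
    return env

# The child sees only what we handed it -- no PATH means it can't even
# find `curl` or an editor unless we explicitly provide them.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(sorted(os.environ))"],
    env=pure_env({"APP_PORT": "8000"}),
    capture_output=True,
    text=True,
)
print(result.stdout)
```

As in the transcript, this is a decent approximation rather than a bulletproof sandbox: the child still shares the filesystem and network unless cgroups or other isolation is layered on top.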
We give you the same isolation basically that a container would because we're running via containerd without a container, which is kind of fun. [0:44:02] KB: Yeah. [0:44:02] MS: And so we have a lot of those same safety affordances that people have kind of come to trust and learn with Kubernetes. And those are all APIs that people are familiar with and used to, extension points and all that. We were like, "Well, I don't need to reinvent all of that. If there's an ecosystem that people have agreed upon, why don't we just extend the ecosystem a little bit? Put our tool sets in there. And if people like it, great." Or if they want to do hybrid things, where they have a pod that has five containers and one Flox environment. Cool. That's fine. Go do it. [0:44:29] KB: That's fascinating. With the Kubernetes, are you thinking - because, first, I was thinking about this as like, "Okay, this is useful in terms of managing a single environment." But now we're starting to talk about things like orchestration, and how you navigate deploying complex sets of interacting services and things like that. Is that a space that Flox is targeting? [0:44:52] MS: Kind of. The Flox environment can do a lot. And we're not constrained by some of the same dogma that I would say that containers and the CNCF have kind of built around Kubernetes. Everybody says, "If you're running more than one process in a container, you're doing it wrong." That's not a thing that we have in a Flox environment. You want to run multiple processes, rock on, man. Go do it. Sometimes applications need more than one thing running to be successful. And that's basically why a pod was built because now they needed a unit to go talk about all these different running processes. Well, we can just give you a Flox environment that can run multiple processes if that's what you need to do. 
Sometimes you just want your Redis and your Postgres to both start instead of having them be separate containers that you have to figure out and manage. I'm not going to say one's totally right or wrong, but we all got religious about this. And I just don't think there's a lot of value there. And so we're challenging some of the orthodoxy of running things within Kubernetes a little bit, but it's pretty cool. And we've worked with some prominent people in the CNCF, and they're like, "This is pretty cool." No one's saying this is the worst idea ever. They're just like, "We haven't looked at it from this point of view for a while." But when you get to orchestration, then it's like, "Well, do you want one copy of this environment? Do you want it in a DaemonSet that's running across all of your nodes and workers, or do you want this thing to fail over? Do you want a minimum number of pods that are deploying this environment?" Right now, we're giving orchestration primarily to Kubernetes because that's what most people are running. And I don't need to necessarily reinvent something. I'm not going to succeed if I do. See Mesos, see Nomad. A few others are successful in pockets but certainly don't have the market share that Kubernetes does. And so for us, if we're going to get into the runtime, let's at least play in that space. Now, you can also just run our environments on metal. They have a service manager. And you can just run them without an orchestration suite at all if you want to in a lot of cases. That is very simple, which means you can use strace for debugging. And I don't have to go launch a debug pod or anything like that, and that's kind of beautiful in a lot of cases. One of the things that I really like about it is we can get kind of as simple as you need or as complex as you need. If you want to get into Kubernetes and need a service mesh and all that, rock on. Or if you say, "I don't need all that complexity right now.
I just need to run a web server and a database," we can do that really easily within the environment. [0:46:59] KB: No. I think that's really important. Being able to kind of scale up and down in terms of complexity and configuration. I think that's a thing we didn't talk about as much, but I'd like to dive into just for a minute here, which is, as we talked about kind of in the beginning, Nix exposes a ton of knobs, has a ton of power, a little bit sharp around the edges, hard to work with. Flox, you're doing sort of these opinionated stacks on top of that that allow people to kind of do what they want to do very quickly. Now, if you start a project in Flox, and you get to a point where you say, "Hey, I actually need to turn some knobs. I need to go down there. I'm willing to do the learning curve. I'm willing to do that." How easy is it to incrementally extend and take advantage of all those underlying pieces in Nix? Do they play nicely with your stacks? Are you able to sort of swap things out? Or is it like all or nothing? [0:47:49] MS: It's definitely not all or nothing. We have a few kind of - we usually call them exit ramps, or extension points, or whatever that you can go back into pure Nix if you want to. For instance, if you want to define a build definition with pure Nix because you really understand the primitives for the build system, we don't expose most of those to you. There are hundreds, and they're awesome in a lot of cases. But if you really know what you're doing, I would recommend going that route. If you don't, we'll give you the fast, easy way forward, and it'll be mostly correct. But that's one. The other is if you want to use Nix-based ecosystem tools that are exposed as flakes.
And the really simple, not quite correct definition of a flake in Nix is, if you've ever used Ubuntu and they have a PPA, which is like a personal package archive, which is just this random person's little app repository for their tooling, that's kind of what a flake is. It's the definition of this one piece of software or maybe a few pieces of software that is in a separate package repository. That's not exactly correct, but as a mental model it's close enough. And so you can go out and get a flake. And now that is not - that resolver did not happen with our catalog. So we can't say, "Hey, this is going to be guaranteed to work at the same time as these other pieces of software you have and all that." You've gone through a little bit of an escape hatch to do that. In a lot of cases, it's going to work fine, especially if your own tool isn't linking back to other things in the environment. If it's standalone or whatever, you'll be fine. We offer that. Some of the other things we do expose in Flox would be very familiar to a Nix user where they might have a shell.nix. And we have our Flox manifest, which is written in TOML. And it has all your package definitions. And so all those imperative commands I talked about, search, install, upgrade, they're editing that TOML file. But you can go in there and edit it yourself and say, "Actually, when you load and activate this environment, here are the environment variables that I want to be active. Or here are the shell aliases that everybody should use." We have project up, and it just starts everything that we want it to. And pulls in a fixture for the database, or whatever, during development. You can define those aliases and have that all happen. Or you can have an MOTD-style thing when you activate the environment. It says, "Hey, your application is now serving out on port 8000," things like that. And those are all things that basically we translated and passed through from Nix into the way that we operate.
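The manifest described here might look roughly like the following sketch. The section and field names follow the general shape of a Flox `manifest.toml` (package definitions, environment variables, an activation hook, shell aliases), but treat the specifics as illustrative and check the Flox documentation for the authoritative schema:

```toml
# Declarative package set -- the imperative commands (install, upgrade)
# edit this section on your behalf.
[install]
postgresql.pkg-path = "postgresql"
redis.pkg-path = "redis"

# Environment variables active whenever the environment is activated.
[vars]
PGPORT = "5432"

# Runs on activation -- a good spot for an MOTD-style message.
[hook]
on-activate = '''
  echo "Your application is now serving out on port 8000"
'''

# Shell aliases everybody on the project shares, like a "project up".
[profile]
common = '''
  alias up="start-dev-services"
'''
```

Because the file is plain TOML under version control, hand edits and the CLI's edits land in the same place, which is what makes the "exit ramp" back to lower-level tooling incremental rather than all or nothing.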
And some of it we've built on a little more. Nix kind of assumes you're running bash. We have made it so that if you're a fish user or a Zsh user, you're still good. Those are things we've added. Yeah. [0:50:03] KB: Cool. Well, looking forward, what's next? What's coming down the pike for Flox over the next - I mean, do we dare scan out six months? Or in the current world, do we only go out a few weeks? [0:50:15] MS: I mean, I feel safe for about a few weeks, but I'll pontificate all you want. We're really trying to get into the runtime and production side of things right now. We spent, I would say, the better part of two years on this developer experience, and I think we've got it quite nice. There was a day I realized that if I didn't work at Flox anymore, I would still use it for my personal development. I just love the way that it operates and works. And that was a pretty cool day. [0:50:37] KB: That's a good milestone. [0:50:38] MS: Yeah, it was a great milestone. And then we get some prospects and customers, and they're like, "I've used Nix enough to know that I want to use Flox." I'm like, "That's a great quote. I'm putting that on the website." Things like that. But beyond that, we're looking at the runtime a lot more. And so, production environments. And Kubernetes, this is the first thing, it's launching at KubeCon North America here in 2025. We're really excited about that, and we want to see what the adoption looks like. And it may be that that steers us into investing in that area and making it have more bells, and whistles, and more feature sets. We're also really looking at agentic runtimes and trying to make sure that we give them the control points that they need. Everybody's kind of relooking at the way they develop software right now because of all these agents.
And if that's what you're looking at, it's a great time to look at Flox and be like, "Well, if I'm changing the way I'm doing this anyway, is this a good time to bring in Flox?" And so, can I give you control points? Because the code coming out of these agents, I think we all kind of need to assume it's hostile. We don't know that it is, but we don't know that it isn't. And since we don't know that it's not hostile, I'm going to assume it is. Which means that if I'm not going to have control on the development side, I need to put the control on the operator side. That's the SRE, the DevOps person, whoever's responsible for production. Could be the developers just wearing a different hat at that time. And so we need to be able to put a lot more control points on the runtime control plane. That's where I think Flox is going to be spending a lot of time, asking, "Okay, where are the safety nets? Where are the gates? Where are the checkpoints? Where are the monitors?" Because we don't know what's really happening here. [0:52:03] KB: That's fascinating. Yeah, I think we are trying to figure out how we build stability, even determinism, on top of this nondeterministic substrate. Yeah, I agree. I think having more abilities to sort of hook into what's actually going on out there. Because with the amount of code being generated, there are probably teams that are still reading it all, but they're rare. [0:52:24] MS: I do think several teams are reading it, and I think that might actually be doing them a disservice. If you're actually reading it all and trying to review it all - did you review the compiler when it wrote assembly or machine language? You probably didn't. And so did you trust it? And I don't think we're at that trust level yet. But I'm also figuring out how we will know if we get there. And so there's definitely a knowledge gap there. [0:52:45] KB: And what tools do we need to get there? Yeah, absolutely. Awesome.
We're coming to the end of our time. Is there anything we haven't talked about yet today, or that came up and we sort of moved past, that you think would be worth covering before we wrap? [0:52:59] MS: One of the key benefits of reproducibility, which I will hit on over and over again. I'm a release engineer by trade. What I love is never doing the same thing twice, never doing the same rebuild. But in a lot of people's traditional workflow, they write something on their laptop, they run some tests, then they submit it to the blessed CI system - CircleCI, GitHub Actions, whatever - and it runs CI there. Maybe they get different failures there because they're running on a Mac locally and on Linux there, or whatever. One of the things that we're really looking into and looking forward to is: if you've defined all of your inputs - what software is available, your software bill of materials, your supply chain - and you've run tests, and we know that that artifact is the exact same artifact because it's mathematically provable, why do I need to run tests again on the blessed system? If I've run tests locally and I have the artifact locally, I can say, "Well, the inputs are the same. We've hashed them all. The output's the same. We've hashed it all." I don't need to run tests again because we already have a receipt that says you ran them all and they worked. Now I can skip that entire part of the CI system. And maybe that's only a subset of tests, because maybe developers only run unit tests locally. They don't run all the integration tests, or whatever. [0:54:03] KB: I'm wondering, yeah, as you start saying this, can you map the dependency graph as well? Saying, which tests do I need to run based on not only what has changed, but also what tests I've run locally and all of that? [0:54:13] MS: Right. We are not doing that today, for what it's worth.
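The hash-and-receipt idea Michael describes can be sketched roughly like this: hash the declared build inputs, record a "receipt" when tests pass for that hash, and skip the suite when an identical build shows up again. This is a minimal illustration of the concept, not Flox's actual mechanism; all names here are made up:

```python
import hashlib
import json

def input_hash(inputs: dict) -> str:
    """Hash the declared build inputs (sources, dependencies, toolchain)."""
    canonical = json.dumps(inputs, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

# "Receipts": input hashes whose builds have already passed tests.
passed_receipts: set = set()

def run_tests() -> bool:
    # Stand-in for the real test suite.
    return True

def ci_step(inputs: dict) -> str:
    """Skip the test suite when an identical build already has a receipt."""
    h = input_hash(inputs)
    if h in passed_receipts:
        return "skipped (receipt found for " + h[:8] + ")"
    if run_tests():
        passed_receipts.add(h)
        return "ran and passed"
    return "failed"

inputs = {"src": "abc123", "deps": ["openssl-3.0.13"], "toolchain": "gcc-13"}
print(ci_step(inputs))  # first run: tests execute and a receipt is stored
print(ci_step(inputs))  # identical inputs: skipped via the receipt
```

The key property is that the hash is over canonicalized inputs, so any change to sources, dependencies, or toolchain produces a different hash and forces the tests to run again.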
But I think that's the thing that I'm starting to get pretty excited about as I get back into this determinism. I'm like, "I can maybe cut parts of my CI bill." But I can also just cut time off the wall clock, which is actually what developers are looking for. They don't want to have to go up and get a coffee every time they submit something to CI to see if it works. They want to sometimes, but not every time. [0:54:32] KB: No, we want to go and get a coffee while Cursor works away on it. And then we've already drunk our coffee. We're done. We want to ship it. [0:54:40] MS: Yeah, you've got a good point there. There are just a lot of things you can do with determinism. When you know the artifact you're dealing with is the same on this system and on that system, and we've already calculated it all, and you can start to put your tests inside that artifact and things like that, now you have proof that it's all working the same. And I think that's really exciting. I guess that's the one thing we didn't talk about: that reproducibility end of, what do you really get for reproducibility? Yeah. [0:55:03] KB: Awesome. [END]