EPISODE 1826 [INTRODUCTION] [0:00:00] ANNOUNCER: Carbon is a programming language developed by Google as a successor to C++, and it aims to provide modern safety features while maintaining high performance. It's designed to offer seamless interoperability with C++ while addressing shortcomings of C++ such as slow compilation times and lack of memory safety. Carbon also introduces features like a more readable syntax, improved generics, and automatic memory management while still allowing low-level control. Chandler Carruth is the creator of Carbon, and he leads the C++, Clang, and LLVM teams at Google, and he also worked on several pieces of Google's distributed build system. In this episode, he joins Kevin Ball to talk about Carbon and the future of the language. Kevin Ball, or Kball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action Discussion Group through Latent Space. Check out the show notes to follow Kball on Twitter or LinkedIn, or visit his website, kball.llc. [INTERVIEW] [0:01:21] KB: Chandler, welcome to the show. [0:01:23] CC: Really happy to be here. Really happy to be here. [0:01:25] KB: Yeah. I'm excited to dig in. Let's maybe start with a little bit about you, your background, and a little bit about what Carbon is and what led to it. [0:01:35] CC: Sure. Sure. I mean, I've been a software engineer at Google for a long time now. And when I started, I was pretty fresh out of university and didn't really know what I was doing, kind of got thrown into a bit of the deep end. And I was working on a C++ project with a bunch of amazing folks who were largely teaching me how to be a good software engineer. And I was incredibly frustrated because the C++ project was just holding me back. Every single step of the way working on the C++ codebase was painful. The tools were bad.
The experience was bad. And it seemed like it really - it became an overwhelming kind of problem that we needed to solve. And that kind of kick-started me diving a whole lot deeper into, first, compilers, because I thought maybe the compiler was the problem. And as we got better and better compilers, then into the programming language. I joined the Standards Committee. Really tried to figure out ways we could much more radically improve C++, the experience for all the developers, and also the quality of the actual software we're able to put out. At a certain point, we weren't actually able to achieve the kind of improvements we needed to. We kept seeing really fundamental risks facing C++, really fundamental problems that needed pretty radical changes to the programming language as a whole and kind of how the programming language worked in order to credibly address them. Today, I think the obvious and really great crystallization of this is around memory safety and the questions around whether we can use memory-unsafe programming languages going forward. But that's just the best example of the fundamental problem that we need to make radical changes to really address programmer needs. And the committee, and the standardization process, and the C++ language were not realistically going to be able to make the kinds of changes we thought were needed here to address that. And that's where we started working on Carbon. And the flip side of this is I work at Google and we have an incredibly large C++ code base. We have hundreds and hundreds of millions of lines of C++ code we've written. We depend on hundreds of millions of lines of C++ code that the rest of the industry has written in the open source community. And we can't throw that out. That's not going away. We don't have the luxury of rewriting all of that. We don't have the time or the teams to do that.
And so we needed something we could actually bring all of that code along with us, but get the kind of radical transformative change we're hoping for here. And that's kind of how Carbon started off. And really what keeps me really excited about it is to see if we can make those transformative changes to the programming languages while bringing the C++ code and existing software along with us. [0:04:07] KB: Yeah, let's talk a little bit about what that means to bring it along. Because when I was looking at the website for Carbon, there was a great line which said, "Existing modern languages already provide an excellent developer experience: Go, Swift, Kotlin, Rust, and many more. Developers that can use one of those existing languages should." Now, what in this big legacy C++ codebase keeps you from using one of these languages and connecting it through message passing or something similar? [0:04:33] CC: Sure. I mean, sometimes we do, and that's awesome. I mean, Google has almost as much Java code as it does C++ code, and that's great when we can use it. We have tons of Go code getting written. We're increasingly writing Rust code. When you can do that, when you can just have message passing or some kind of nice API boundary, some abstraction boundary between the C++ software you depend on and new code, that's fantastic. And using one of these languages is amazing. But these abstraction boundaries tend to need to be pretty robust abstraction boundaries. Message passing, which you mentioned, is a great example of one. We actually have a lot of projects to try and get finer-grained abstraction boundaries, lower-cost abstraction boundaries, but it's hard. And so the question is, what about all of the code we need to write that doesn't have that abstraction boundary, that is immersed in an incredibly complicated, often tangled mess of C++ code with no abstraction boundary in sight? We still want that code to move as well.
And that's the code that's kind of getting left behind. [0:05:32] KB: Yeah, so this reminds me a lot of TypeScript in some ways, of TypeScript being something that you can take an existing, extremely messy JavaScript code base and start just file by file as you're adding new things or you're touching something, upgrading. And then once you compile it, the program doesn't know. It's just fine. [0:05:51] CC: Yeah, absolutely. And I think TypeScript's a great thing to think about because there are also different approaches you can take here. TypeScript takes a really important approach. It basically is a superset. If you have valid JavaScript code, it's basically valid TypeScript code as well. This is super valuable for TypeScript and for languages historically. C++ did this with C. C to a certain extent did this with B, and before that, BCPL. This is a very, very valuable approach, but it has some limitations. You can't make as radical of changes with that approach. And for a long time, this was still the only real approach we had. But I think with languages like Swift and Kotlin, we're seeing a different kind of way to do this, where instead of being just a strict superset, you actually have some deep and fine-grained interoperability, but the new code gets to be written entirely in this kind of fairly new, clean space. That really relies on very advanced compiler technology that's kind of come around in the last few decades. And so that's why you see this in Kotlin for Java and Swift for Objective-C. But why not TypeScript? I think this is kind of confusing to a lot of people. It seems like if this is such a better path, why wouldn't TypeScript use it? I think the thing you have to realize is that so much of the JavaScript in the world wasn't compiled. Even though we have that compiler technology, if you haven't been compiling your JavaScript code, you don't have it available. And so they needed to reach back to the superset model.
One of the interesting things we're doing with Carbon is leveraging the fact that we do have compilers for all the C++ code. We have a rich build system there. We can take this kind of more advanced approach, which opens up more freedom. It gives us more opportunities. [0:07:34] KB: Let's maybe dive in a little bit deeper on what that looks like. I think the one that I'm slightly more familiar with is the Java-Kotlin example, but there you have the JVM and you have Java bytecode as a sort of target. And so long as you can compile down to that intermediate language, now you've got an easy form of interoperability. What does that look like in, you mentioned, I think Swift and also in the Carbon world? [0:07:58] CC: This is a great question. And I don't know that I know enough about it. I'm not a Swift expert. I don't want to speak for that community. [0:08:03] KB: We can just go in Carbon too. What does this look like for Carbon? [0:08:05] CC: Yeah, in Carbon, we can really dive in there. Superficially, it seems like we have something similar because we have LLVM. And LLVM is kind of this very nice base layer that all these languages sit on top of. And so if we compile down to LLVM, things seem happy. The interesting problem with that is that LLVM is way too low-level. The JVM actually exposes the vast majority of Java semantics. If you can get to the JVM, you can actually interoperate pretty well with Java. But LLVM doesn't expose any C++ semantics. I don't know if you've looked at the intermediate representation of LLVM for some C++ code. It looks nothing like the C++ code. It's gone, right? And so we need something other than that. And that is actually a real challenge. I think we couldn't have done this without the technology that we got in Clang, which is a complete production-quality C++ compiler, right? We use it for basically all C++ code across the entire company of Google, all the platforms, they all use Clang.
And this compiler gives us not an intermediate representation or a bytecode like the JVM bytecode or LLVM's IR, but it gives us some way of understanding every aspect of the C++ code. Basically, it's the C++ AST that Clang builds. We can't lower into that though, it's too specific to C++, but we can use that to build bridges. And so what we do is, on the Carbon side, we're going to basically look at the C++ AST, figure out how that should kind of manifest inside of Carbon, right? Synthesize anything we need in order to kind of realize that interaction with C++ on the Carbon side, and then use a combination of both Carbon and Clang, where both of these compilers kind of lower that into LLVM IR, to actually implement that interaction. Use Clang to get the fidelity, to get the language semantics, the nuances, the details, and then we lower into the common layer of LLVM for kind of execution and behavior. [0:10:06] KB: When you're compiling Carbon, is that also targeting an AST and it's a different type of thing or like - [0:10:11] CC: Oh, this is a great question. It's also a super deep topic because it isn't. So we have a pretty novel architecture for the compiler in Carbon. It looks completely different from anything in Clang, from anything in like your textbook compiler. I'm only aware of like one other modern compiler that's even going in the same direction. We actually both ended up going this direction in parallel. It's the Zig compiler. We don't have an AST at all. We have your traditional lexer that produces a stream of tokens. We have a parser that doesn't have any semantic information at all. It doesn't even know what the language constructs are. It's just basically the grammar productions and it turns the token stream into a very naive parse tree. But we don't ever build a semantic AST the way you would expect from a programming language. From that very syntactical and structural parse tree, we generate what we call a semantic IR.
It's kind of like an LLVM IR, but also the opposite, where LLVM IR basically throws away all of the semantics other than the execution model and lowers it as far as it can, as quickly as it can. We keep everything. And so it represents every nuance of semantics, exactly how all of the things work. Everything you would expect in an AST is actually in the semantic IR, but it's represented as kind of an executable IR. It's something you could evaluate or lower into LLVM's IR. And this is a really fun model. It makes it really easy to lower into LLVM IR. It's now a much more direct lowering step. It also makes it really easy to do stuff like compile-time evaluation. And compile-time evaluation and metaprogramming have ended up becoming absolutely pervasive in modern programming languages. And so having this representation that tailors itself to those needs gives us a ton of advantages. It also is a lot easier to design this in a way that's fast on modern CPUs. And so one of the big goals of all of Carbon is to have just the fastest compile times that are conceivable across the board. [0:12:11] KB: So then putting this together, if I understand, if you're wanting to integrate something, Carbon and C++, what you're going to do is you're going to use Clang to compile the C++ down to the AST level. And you take that AST over, pass it off in some way to the Carbon compiler where it will translate it to the Carbon IR in a way that can interface. You then also compile your Carbon down to the Carbon IR. That's the layer at which they meet, and then you lower that into LLVM. [0:12:39] CC: Yes. And it's more complicated though. [0:12:41] KB: Okay, keep going. [0:12:41] CC: You do all of that, but you also in parallel have to build synthetic ASTs in Clang to represent all the Carbon constructs. [0:12:49] KB: To represent the Carbon. Ah. Yeah, it's bi-directional essentially. [0:12:53] CC: Bi-directional. And so you have basically two compilers going through.
And on the Carbon side, it's all in this semantic IR. On the Clang side, it's all this AST, and they're each kind of telling the other, "I need this construct or I need that construct." These are basically like very - they're veneers, right? They always kind of dispatch back into the actual implementation on the other side. But this builds this bidirectional communication. Both of these compilers then emit a bunch of LLVM IR, which we merge together, and we get the output, right? And this lets us have almost arbitrary Carbon semantics exposed in C++, and also arbitrary C++ semantics in Carbon. And even better, it lets you bounce back and forth between them. Because you can't like stay on one side of this. If you have a Carbon type, you want to like pass it to a C++ API, that API is going to instantiate a C++ template on that Carbon type. And that template is then going to call Carbon methods. But those aren't just normal methods, they're going to be generic methods that it needs to pass C++ types to. And that's going to go back into the Carbon side to figure out how that generic treats that C++ type, and it's going to discover a callback from C++ in that type that it has to call back into on the Carbon side. It goes back and forth. [0:14:04] KB: Is there any performance cost when you cross that boundary? [0:14:08] CC: I mean, there has to be some. This is a real abstraction that we're keeping, and this is actually the thing we want to do. We want to have an abstraction between Carbon and C++, between Carbon and Clang. We want that abstraction to be fairly rigid so that we can manage the technical debt that we inherit from C++, because we are worried about that. But we don't want it to be a slow abstraction. We don't want it to be an expensive abstraction. And so we're trying to make it as inexpensive as possible. We're doing everything we can architecturally to let that abstraction be as near zero cost as it can be, but it can't be free.
[0:14:43] KB: When you are passing data across these layers, are you able to do it in a zero-copy way and just pass the pointers around, or do you have to do copies to get it into the new domain? [0:14:55] CC: No copies. It has to be zero copies. We have basically the same memory model. For all of this to work, we have to be able to realize kind of the identical ABI and memory layout on each side. And so C++ types, even when we use them in Carbon, they have to be laid out according to the C++ rules. Carbon types, even when C++ is using them, have to be laid out according to the Carbon rules. And this really does dive into like why this is such a tight coupling even though we try to maintain this abstraction. The coupling isn't loose, right? It's a deep, deep impact. And this also means we have some constraints. We don't expect to be able to perfectly expose Carbon types to C++ code that we don't compile. Because we need to tell the compiler how to do fancy layout things potentially, things that don't fit into the C++ model at all. We're going to have the best support we can for kind of pre-existing C++ binaries, but there are going to be much sharper limits there. The best experience and kind of our top priority is you can compile your code with Clang, right? Which means that we can compile your code with this kind of joint Clang-plus-Carbon compiler and actually give it custom semantics, a custom ABI, lay out everything just right so that there's no copy. We don't expect any realistic runtime overhead here. The overhead we expect is compile time, right? We want Carbon to be a really fast compile time environment, but we are going to take a compile time hit when we reach into the C++. We kind of have to. [0:16:23] KB: All right. That makes sense. Any other nuances related to this sort of interop layer and how these things are working together? [0:16:32] CC: I mean, there are so many nuances here.
I think one thing that is really useful to think about is what does it take to do this kind of interop? Because I don't think it's obvious when you're kind of looking at it from outside the programming language or the compiler world. People always like to think about really complicated things. Just think about simple things. What if you have a boring, old-fashioned abstract class in C++ with virtual functions? We're doing object-oriented polymorphism. I know it's not a trendy thing, but it still is a thing that people do. There's a lot of code out there that does this, right? [0:17:01] KB: Yeah. I mean, if you're maintaining millions and millions of lines of code, I'm sure some of them are going to be utilizing this. They were written a while ago. You don't get that overnight. [0:17:10] CC: You don't get it overnight, right? And so you've got these things in there and you need to do this. Well, this means that in order to use it, you've got an API and it's going to accept a pointer to this abstract base class. And the API documentation is going to tell you, "Well, you call this by deriving from that, overriding these virtual functions with your specific behavior, and then passing me your object." Right? That's how my API works. But think about what this means. This means that in Carbon, we have to be able to derive from a C++ base class, get the virtual functions, and override them in Carbon with virtual functions that can be called using the C++ V-table, because there's some code that never saw your Carbon type and you can't customize it. And then you have to hand that back to C++. And Clang has to do the right thing, even though it no longer knows there's a Carbon type under the hood. It just has an abstract base class pointer and a V-table, and it's like, "I don't know, I'm going to do a virtual function call. It'll get there eventually." And we have to make all of that work. And so, okay, there's a lot of compiler infrastructure needed just for this, right?
There are also language implications. This means your programming language needs to have inheritance, right? Rust doesn't have inheritance. Go's inheritance doesn't look anything like this. What language are you going to be able to model this kind of dispatch in? This is one feature. We can keep going through all of them. Every aspect of Carbon's design actually has to look at C++'s design and think about, "Okay, what features need to show up in Carbon from C++ through interop? What's the model we need to have for this? How do we make that work?" And this adds so much complexity. If you look at modern programming languages, basically none of them have overloading or variadics the way C++ does. The only modern programming language that's even tried to do variadics in recent years is Swift. They have an incredibly cut down, minimal version, and they added it in, what, Swift 5, Swift 6? They didn't add it in Swift 1. We have to have variadics, and complex, wildly fancy variadics, on day one, because half of the C++ APIs we want to interoperate with are variadic APIs. There's std::format and all of its friends, all these different APIs that use all of these C++ features. This means we have to start off with both a very specific feature set and a very large feature set to have any hope of success. And that is a real challenge. We've been working at this for a few years now, because it's so hard to build up this body of design that can actually support the interop. I think it's hard for people to see the design impact there. [0:19:50] KB: Yeah. Well, and this leads us into another area of conversation that I think will be interesting, which is about the language itself, right? There's a set of things that you have to implement in order to have C++ interop. But also, part of your goal here is to get away from some of the C++-isms and build in some new modern things. What does this language bring us? [0:20:11] CC: Great question. So there are a few things.
We've already talked about one of them actually. One of them is fast compile times. I know I harp on it a lot. It seems like it doesn't actually matter. We have like build servers and build - it matters so much. It is so painful to sit there. Okay, back up. We measure compile time for C++ code in minutes. I want you to think for a minute. [0:20:36] KB: All of you who are using Vite and get your millisecond turnarounds in the browser, seeing the UI difference. Yep. [0:20:42] CC: Let's think what is a computational activity that we do in the modern world on modern computers that takes minutes. [0:20:51] KB: I mean, look, we're in the LLM world now, so there's more and more of those going on. [0:20:57] CC: Asking like this LLM, this billion-parameter LLM, to synthesize like the work of Shakespeare or some nonsense, whatever it is that we're doing. A video. Right? Synthesizing a video from a neural network is the kind of thing that takes minutes. We're compiling text. We're compiling like a few 100,000 lines of text and we're waiting minutes for it. They have played us for fools. This is absurd. And so I get really upset about compile times. We're not even in the right order of magnitude. We shouldn't be measuring compile times in minutes or in seconds. You said we should be measuring it in milliseconds. We should be measuring it in things that are kind of interactive speed. We need to get down to that. That's the first thing we want to bring. But if we look deeper into the language design, one of the biggest things we want to bring is some path to kind of the modern requirements of programming languages, right? The big one is memory safety, but it's interesting to kind of decompose what you actually need to achieve memory safety, particularly for high performance. Either you have to sacrifice performance and have a garbage collector, reference counting, or something like that.
Or to get memory safety and all the performance, Rust has really taught us what we needed here. We need a parametric type system, and we need to be able to parameterize the type system with all of the information that kind of enforces correctness, enforces safety of lifetimes, and ownership, and all of these things, right? But that parametric type system is kind of the elephant in the room when you get into the PL space, right? Because C++'s parametric type system is C++ templates. And templates aren't super fun. They don't give you good error messages. It's hard to program with them. They don't have API boundaries. And they also have incredible code size costs and compile time costs. Every programming language that is trying to achieve this kind of guaranteed compile-time memory safety is doing it by having definition-checked generic programming facilities. Rust has generics that are checked at definition time. Swift does. They're adding ownership and the lifetime things to Swift now. If you look at the predecessors of Rust, all of them modeled this in the type system. And the nice thing about doing this without templates is they can actually avoid generating a different version of every function for every lifetime combination that it ever shows up in. We already have enough duplicate code in C++. We don't need to multiply that by 10 or more to get memory safety. But that means we have to go back to one of the most controversial features that was ever attempted to be added to C++, the original C++ concepts back before C++11. Doug Gregor, the eventual designer of Swift, tried to bring this kind of definition-checked generic programming model to C++. It hit a lot of problems, and eventually it hit just insurmountable problems. And the insurmountable problems it hit were tied to backwards compatibility issues. The entire design of the C++ standard library needed to change and evolve to leverage this new type system, right? And that was too much of a change.
It broke everything. It would have set us back another 10 years. And so the committee had to pull it out. We got concepts, but we didn't get the concepts that actually give us this feature. We got concepts that work with the existing design of C++. That's great for that design. But it doesn't give us the parametric type system that memory safety needs. The very first thing and probably the biggest single thing that we're adding in Carbon is this idea of definition-checked generics. But because we're coming from C++, we don't get to just add that. We also have to innovate. We have to build not just a definition-checked generic system, but we have to build one that includes templates because of interop. We have a generic system that can both do definition checking and allow future kind of lifetime memory safety facilities, and supports raw templates instantiated just like in C++ with all of the quirks, and oddities, and behaviors that kind of are necessary to support interop. But now you get to start moving away from those quirks. You get to incrementally say, "You know what? I actually do have an API that this generic conforms to." You can remove the template from it. We'll check it. We'll actually give you code size benefits. We'll give you ergonomic benefits, the beautiful error messages. All of that stuff will start to arrive, and you get to adopt that incrementally. And I think it's going to be absolutely amazing for library authors, for anyone writing generic code, to be able to move out of the template world and into kind of this checked generic programming world, but not leave all of your dependencies behind. And I think that's probably the single biggest - if there's a banner feature that we already have, it's going to be that one. And the one we're working on next - we just kind of announced that we're actually getting ready to start on designing the memory safety pieces on top of that.
And so over the course of this year and next year, I think you're going to start to see how we're going to layer memory safety as well. And so you both get this more powerful type system and then a more powerful safety system as well. [0:26:00] KB: Can we maybe talk about what that migration path looks like? Say somebody's built out a bunch of stuff in templates and they want to start migrating over to Carbon generics, like what does that look like? [0:26:09] CC: Sure. The first thing is, to use something like generics, you need new language constructs, right? And the whole reason we're doing Carbon is that we didn't want to have to try and fit all of these new constructs into C++. The technical debt, the complexity, it didn't make sense. So the first thing you do is you basically take your existing design as it is and you just translate it into Carbon, right? This doesn't give you magically better everything. We can detect easy cases and be like, "We can prove that this didn't need to be a template." We can do some easy cases. But we always have this fallback of like, "No, we're going to faithfully move your C++ code into Carbon." That's going to be fairly automatic. We want that to be kind of click-a-button-in-the-IDE, as automated as conceivably possible. Once you're there, you don't have the better code yet. I mean, there are new syntax and keywords. We can talk about that if you want. But once you're there, then you can start incrementally doing this. And so the first step is you can add a constraint to your template, right? And C++ concepts allow you to somewhat do this. But when you add a constraint to a template in Carbon, we're going to check it more thoroughly, because we're going to basically check, "If this weren't a template, would it fail?" It's still a template. We're going to give you that behavior, but we're going to actually validate that yes, you actually abide by that constraint.
As you add those constraints, you can then propagate them, because you have to teach everyone who's using your template to satisfy that constraint. You can propagate them slowly because it's still a template. It's okay if there's a layer that hasn't yet kind of documented that it satisfies that. As long as the template did work, you're fine. Once everyone does, you can remove the template. And now that constraint isn't just checked to be correct, it's also required. And now it can be required and we can leverage it. And so you basically have this free way of adding new features that doesn't break anything. There are no ordering dependencies or anything like that. And once you kind of get all of a particular API to use that new feature, you get to restrict it a little bit and kind of start leveraging the new semantics, right? And the important thing is this is incremental, right? There's one big bang thing, but that one doesn't do anything exciting. It's designed to be boring. And then all of the exciting pieces are small incremental changes that you can kind of do independently, API by API, almost line by line, right? And get to where you're trying to go. And this pattern, by the way, is going to repeat. [0:28:34] KB: Yeah. No, I love that, that you're thinking about what does that migration path look like? And it is the type of thing you do anytime you're navigating legacy code, right? You're like, "Okay, let me get a nice clear boundary around this. Let me move it into the new place without any functional changes and then start incrementally bringing things up to snuff." [0:28:53] CC: Piece by piece. Great analogy, right? I love actually talking about legacy code maintenance, because that's really what inspires a lot of this. How many times do you like go look at legacy code like, "Oh." What's the first thing you do? Seriously, what's the very first thing you do, some really gnarly code you find? [0:29:07] KB: Is this tested?
That's my first question. [0:29:10] CC: Okay, start writing tests. What's the second thing you do? [0:29:13] KB: The second thing - [0:29:13] CC: The first modification to the code you do? [0:29:15] KB: The first modification I would do is try to build an API boundary around it because - [0:29:18] CC: No. Before that. You reformat it. [0:29:22] KB: Yeah. Okay, fair enough, fair enough. Yep. [0:29:24] CC: Everyone goes and reformats it. They're like, "Oh, I'll run my format tool over this." Right? [0:29:30] KB: Yeah. No, that's a good point. [0:29:31] CC: Yep. That's how easy we want it to be to start adopting Carbon. Instead of running your format tool, "Oh, this is old C++ code. I'm gonna Carbonify it." Now I've got it in Carbon. It's not beautiful code yet, right? But just like the formatter tool, at least it put it into the pattern I'm used to. It gave me kind of the flavor I expect. Same thing. Now, okay, so this is just crufty Carbon code now. All right. And I have all the tools I need to start making it less crufty. [0:29:56] KB: Love it. Okay. Generic types, parameterized types, all these things. That's a big thing. What else? What else makes Carbon distinct? What are the modern language features you're bringing into it that you didn't have to bring from C++, but you want? [0:30:13] CC: So what are some big fun things outside of generics? Honestly, I think in terms of big things, if you look at this from the lens of C++, right? And what feels new coming from C++? I think generics and the safety things are going to far and away be the biggest new things. I don't know that there are many new things beyond that. We don't have any plans for other new things. Because fundamentally, the motivating thing for all of this is really memory safety, because that's the proximate problem bearing down on us, right? Generics are only even in there because we think they are a critical technical dependency of the only way we really know how to address memory safety at the highest performance level.
Beyond that, we're not trying to innovate. In some ways, this is a weird thing for a programming language. We don't have a goal to be new in that way. Instead, what we really want to do is systematically make the existing features of C++ better. Let's look at inheritance. I want to go all the way back to inheritance. I know, maybe it's old school and all of that. But C++ inheritance is kind of a nightmare, right? You have inheritance, you have virtual inheritance, you have multiple inheritance, and you have the distinct cases of multiple inheritance without virtual inheritance and multiple inheritance with virtual inheritance. I don't know if you've ever played with those parts of C++. There are dark, bizarre corners here, and people use these things, right? They use them heavily and they really rely on them, but the tools are not good. The use cases are. It's the tools that we need to make better. And so we want to try and look at these use cases and come in with a better version of the same tools. I don't think this is novel, but I do think it's going to make for a much better experience. And so we have single inheritance in Carbon, right? Because that's the one that you can do dynamic dispatch over in a way that's really easy and straightforward. But all of the uses of multiple inheritance, they are real uses. And so we want to try and recast the same core ideas. They're typically a kind of design by composition. Like, "Well, I want this type to have these four facilities." So I have this base class that I'll just inherit from, and it'll give me some methods and fields that provide those facilities. It's compositional design. There's a good way to do that in modern programming languages. They're called mixins. I mean, you can say that we're adding mixins, but we have mixins in C++. That's what multiple inheritance gives you: mixins. It's just really, really painful. What we're going to do is we're going to have mixins.
We're going to target exactly the use cases we see in C++, make sure that they work - they're easy now, they're clean, they're unsurprising. Everything is just like, "Oh, well, yeah, of course, that works wonderfully." Right? And that's going to repeat. Coroutines are another place where I think we have just so many opportunities to make the experience smoother, cleaner, more enjoyable for users. Error handling. I think we can do error handling so much better. Here's a place where we may actually remove some things. I'm not a big fan of exceptions. I am a big fan of performance. Maybe we can come up with an error-handling system that gets the best of both worlds, right? That gives you the best ergonomics we can manage, but also has the best performance. That might mean taking a little bit of some things from exceptions. It might mean taking other things from Rust-style result types and the monadic transformation that they do. There are a whole bunch of tools we can try and bring together. I don't think of these as bringing something really new to the language. Everyone can already do error handling in C++. It's just a pain. We're going to try and make it easy, super, super friendly to users, really easy to compose and use all over the place. And that, outside of memory safety and type-checked generics, is really the focus that we have: just take what's already there, but let's make it principled. Let's make it understandable. Let's make it really easy and fun to use. [0:34:01] KB: Yeah. Well, I think it makes sense, right? There are still so many hundreds of thousands of people writing C++ code because it fills a niche, despite how painful it is, despite how old and crufty the language is, despite how they have to grapple with multi-minute compile times. If you can capture all those benefits without the costs and bring the development experience up to the modern world, there's a huge opportunity there.
[0:34:25] CC: Let's dive a little deeper. Coroutines and, I don't know, virtual inheritance - all these are kind of big features. I think maybe there are more interesting things, things where it even seems like we're adding something, if you zoom way into the minutiae. Because when you're talking about what use cases we support and, to your point, why we reach for C++ despite its pain, it's for incredibly high performance, and that is often a problem solved in the minutiae, in the details. That's a place we can actually start innovating some. Here's something that seems like it should be a simple thing, the smallest thing in the world, in C++. How do you pass an input parameter to a function? In every programming language other than C++, there is one obvious answer, and it is the way you pass input parameters to the function. On day one of learning the language, the first function you write, you use the correct way to pass input parameters to the function - in every programming language, but not C++. In C++, we have const references, which seem like a really good way to pass things as inputs, right? It says, "Hey, don't mutate this." But you get reference semantics, so you don't have to dereference it everywhere. You can even pass temporaries. You can compute a new value in the argument, and that's fine. It'll pass as a const reference, no problem. But there is a problem. A const reference under the hood is a pointer, and this is very fundamental in C++. You cannot implement C++ without a pointer for a const reference parameter. You can take the address of a const reference parameter inside the function. You can return that address to the caller. It's required to match, right? So you don't have any flexibility about this. It has to be a pointer under the hood. Pointers have cost. That means every time you access a const reference parameter in C++, you're indirecting through a pointer. Now, some of the time, this is exactly what you want.
You have an enormous class, some data structure that you're passing in. You don't want to make a copy of it. You, of course, want to just give a pointer to the giant data structure you allocated somewhere and then access it. This works great, but what about integers? Or something that comes up every now and then in really high-performance settings, like floating-point numbers or SIMD vectors. You don't want a pointer to a SIMD vector when you're measuring the exact cycle count of your CPU processing the data to hit a frame-rate budget in a video game. That is not okay. It's not acceptable, right? You don't have a single way to pass an input parameter in C++. You have to think about the type of the parameter: "Well, hold on. Is this a small parameter that I don't want to have a pointer to, or is this a big parameter where I want a pointer?" That tells you whether you pass by value or you pass by const reference. If that's not bad enough - I mean, I'm already horrified, because you've just told every programmer that for the simplest thing to do in a language, passing an argument to a function, you have to think carefully and make a decision. I don't know about you, but decisions are expensive. Writing code is not that expensive. Making decisions is really expensive, and we have to do it for every function signature. But it's worse. It's not even just that bad. It's much worse than that because of generic programming. I may not know the type of that parameter. How do I know? [0:37:49] KB: Yes, yes. How do I know which one to do? [0:37:51] CC: In fact, there isn't a correct answer in C++. So here's something very fundamental, and this is something we definitely can solve. We're going to have a way of passing input parameters in Carbon. There's one way. It's also, by chance, the shortest way of passing a parameter, instead of one of the longer ways with a const reference, and so it's going to be convenient and easy.
It seems like such a small thing, but every one of these adds up in performance cost. One of the interesting things is, I think, when people start migrating from C++ to Carbon and start using Carbon, we're actually going to be able to deliver performance as well as safety, right? It's not going to be just one or the other. The performance isn't going to be because a magical new compiler optimization is unlocked. I mean, I would love it if we did that, but I don't have any magical optimizations left on the table. They're all out there doing stuff already on C++. But we do have our little things, and these little things, I think, really add up - little things like getting your input parameters right. I don't know if you saw it - I gave a talk about std::unique_ptr in C++, which is one of the most basic types. It's like, I allocated some memory on the heap, but I own it. When I'm done with it, it has to be freed, right? Simplest type in the world. It already is a pointer to memory on the heap. But if you pass a unique pointer as a parameter, you don't get a pointer parameter. You actually get a pointer to a pointer. Again, could we fix this? Possibly. But fixing this is a backwards-incompatible change that the standards committee and C++ are just not in a good position to make, right? I don't want to say that they're wrong to not make these changes. They just have a different set of priorities. Their priority is backwards compatibility, right? [0:39:33] KB: Absolutely. I mean, we see this all over the place in software, right? You see this in the new JavaScript runtimes, right? If you can ditch some backwards compatibility, you can build something that's way faster. Actually, as you were talking about having just one way to do it and how that enables performance, it had me thinking about Golang, right? Go does this very well: super simple, constrained language, very easy to write performant code. [0:39:59] CC: Yes.
You don't have to think about, "Oh, have I made the garbage collector's job hard?" It's like, "No, no, no. It's made sure that the normal thing you do works well with the -" It's a beautiful language in that way. [0:40:11] KB: Well, in the kind of world we're in now, where more and more of the actual code is being written by coding assistants, and we're the ones making the decisions and then telling them to do it, having one way to do the thing you want to do is also an incredible unlock in terms of productivity. [0:40:27] CC: Absolutely, absolutely. [0:40:29] KB: Let's maybe talk about a slightly different thing. This is an early-stage project. As I understand it, it's an open source project, and I saw the governance structure, and I thought that might be interesting to talk about briefly, especially because I saw you have two big roles. I saw leads and a painter. I hadn't seen the painter before, so I wanted to get your take on how is this being governed, and what is that painter role? [0:40:55] CC: Sure. The first thing, just to back up and set a little bit of context about the governance: I think it's important to really understand what we're trying to set ourselves up to do here. A critical thing for a lot of users of C++ is that it has a real governance model. It's not owned by one company. It's something that they can rely on, that they can participate in, that they can contribute to. There are a lot of users of C++ who want their voice to be heard and to be able to have a say in where the language goes. When working on Carbon, we really needed to make sure we were setting ourselves up for a path to success there. I've been in a lot of open source projects. In every one of them that didn't start off with a governance model and tried to add it later on, it was chaos and tragedy and pain and suffering all around. Introducing governance late, which is, of course, when you finally need it, is almost always a painful process.
We wanted to try and address that, and so we decided to front-load having a governance process and a governance system, even though Carbon is a tiny little experimental project. It probably doesn't need half the governance infrastructure we've put in place, but now is when it's easy for us to put it in place. So we've done that, and the whole thing is we're really aiming for the long term here. We want to be set up in a good place, and we are really aiming to be open source in a very fundamental way. We want this governance to be one that can evolve, that can be increasingly community-driven and not driven by any one company. We want to have a clear-cut and explicit path to get there, rather than trying to do it after the fact or, worse, not doing it because we didn't plan for it. That's what led us to the governance structure. The flip side is we do need the governance structure to move fast. It is still an early-stage project. We don't have the luxury of having a very slow-moving, deliberative governance model. The balance we struck here was to basically have three leads. We don't want the benevolent dictator for life model. Setting aside other problems with it, it creates a single point of failure. If someone's out sick, how do you make progress, right? We really don't want to get stuck there, and so we asked, "What is the least amount of overhead we can have, while not having this critical failure mode?" We actually tried, in the earliest days of the project, setting up a more committee-ish structure with eight people, and there were votes, and there were rules about how we make progress. It was slow and burdensome, and it didn't work. Even at that tiny scale, it didn't work, so we distilled it down: no, there are three leads. That's enough to rapidly get to consensus, right? It's enough for us all to develop a really strong rapport with each other so that we can iterate rapidly.
We have incredible trust. Usually, one lead is enough to make a decision. The other leads are available. They'll push back if it's the wrong decision. But we don't even have to necessarily synchronize if we're like, "No, this is an uncontroversial thing or an area we've talked about." That makes it super rapid. It even makes it better, because now it doesn't all hinge on one person making decisions. You can go on vacation, and decisions keep happening, right? It ends up working out really well. Now, to make it last long-term, one of the tricky things is you can't just have the same people be leads forever. But, again, in the early stages of a project, you can't just rotate leadership and expect to ever get to a good result. What we've basically done is we've said, in our governance model, that when the project reaches a certain level of maturity, we're going to institute term limits, and we're going to basically start rotating through the leadership role. We said that that can't arrive any later than a particular milestone. So we don't really care when it starts, but we have a bound, an upper bound, on how long before we start rotating. Then we'll probably do nice overlapping terms so that we're not churning all of the leads at the same time. Rotating through them will end up building up an emeritus group of leads, so that if someone has a family emergency or a change of job or something, we can rapidly get an emeritus lead to step back in. We're going to allow people to come back right after they leave. Also, the stakes are lower, right? People feel comfortable stepping down because, hey, they're going to get to go and recharge for a while and then come back in if they're still super motivated here. That's how the structure evolved and how it tends to work. All the decisions we make are public. We also always have to switch hats.
So the leads are often technical experts in the community, but they can't be technical experts while they're making a decision. If they have an opinion about the right technical direction, they have to go and state that publicly as part of the discussion about that decision first. Then we can consider that statement as leads and maybe make a decision that reflects it. That forces us to be open to engagement with the broader community, and it defends a little bit against the small size of the leads group because, honestly, there's not a lot we can do as leads. Mostly, we just have to participate as technical experts. Then when there's a question and all of the positions have been aired, they're all written down, everyone understands them, we have to make a call. That's actually the only thing the leads really do. Even that's a little bit too much. There are a bunch of calls where it doesn't matter. This always bothers people in programming languages, because people get really attached and everything matters. The comment introducer character matters to some people. I get it. It's near and dear to your heart. You see it every day, all day as an engineer. I understand that. But realistically, we're not going to have weighty decisions every time. That's where the painter kicks in. At any point, the leads can simply say, "We have a set of options which there are no technical reasons to choose between." Maybe we've discarded a bunch of other options for technical reasons - as leads, that's fine. But some options remain, and not just one. There are multiple options left, and there's no technical reason to pick one or the other. When that's the case, they can ask the painter, and the painter is basically just a tiebreaker there. All right, go left. Paint the bike shed blue, right? You've gotten rid of all the paints that are going to run, that are going to not weather well, the paints that cost too much. You've ruled out all of those, but you still have three different colors.
The painter's there to say blue, or whatever color is appropriate, right? For purely aesthetic things, we do still have to make decisions. That's where the painter role kicks in. [0:47:19] KB: I really like that, because it eliminates one of the most common failure modes I've seen in these sorts of consensus-based decision-making structures: when you get to the point where there really isn't a best answer, and you're just like, "Well, how do we decide?" This person has one opinion, and that person has another opinion, but there's no - it's nothing but an opinion. [0:47:41] CC: Here's the funny thing. The painter doesn't make any decisions. I was stunned by this. I mean, I put myself in as the painter. I freely admit I expected to get to make a bunch of purely aesthetic decisions here, right? Almost none. So close to zero, it staggers me. That's not the actual effect it has in practice. In practice, what this does is it gets everyone to stop arguing the aesthetics and actually state the underlying technical reason why one option is preferable to another. It does so in a way that's so much less combative, because if you try to tell people, "That's an aesthetic argument. Stop making it," that doesn't work. They're like, "No, it isn't. It matters to me. I really like blue, okay?" It doesn't work. But if instead you say, "Well, if that's an aesthetic argument, then we have a process we're going to use over here," all of a sudden now they're motivated to figure out why they actually prefer one option based on technical rationale. The amazing thing is we usually find technical rationale. We almost always find a technical reason why choice A is actually the best choice. It's not aesthetics. It's not zero, but it's really close to zero, the number of times it's like, yes, either spelling is fine. So that was actually a really interesting thing to me. I think that the painter role is effective, even when it is never used. [0:49:09] KB: Yes.
It creates a different set of structural incentives. [0:49:12] CC: A totally different set of structural incentives, and ones that feel much more friendly. [0:49:16] KB: That's delightful. [0:49:17] CC: You can be happy about being forced to find the technical underpinning, because it's not like someone's devaluing your judgment. [0:49:25] KB: Yes. That's great. Let's talk a little bit about language status. I saw that a goal for last year, 2024, was having a working toolchain up through the C++ interop that we talked about today. Did it happen? [0:49:39] CC: No. I mean, you're asking, did a software project hit all of its - [0:49:43] KB: Hit deadlines. No, I don't. Yes. [0:49:45] CC: The answer is, of course, no. No, it didn't. As per usual, our ambitions exceeded our capabilities. Everything is harder than it looks when you start. We did get a working toolchain. We saw a remarkable change in the toolchain. At the start of 2024, the toolchain was as nascent as you can imagine. We'd done most of our prototyping work in kind of an interpreter demo space, instead of in the toolchain. So the toolchain was almost non-existent. At the end of the year, one of the main developers on the toolchain actually got through, I think, 13 days of Advent of Code in Carbon. Now, most of them included patches to the compiler, which I understand is not really the ideal way to do Advent of Code, but they made it through. Technically, there was someone who did one Advent of Code challenge on the old interpreter demo in Compiler Explorer on the website, writing TypeScript code to upload the Advent of Code problem data and extract the result. I don't know how they did this. That was amazing. But we got 13 Advent of Code days done on the toolchain with a very minimal number of compiler patches. It's actually a very functional toolchain now, from a toolchain-internals perspective.
But we didn't get to the point where it's starting to be more functional for users of the language or people evaluating the language, largely because we still need the interop. We got interop started, but we didn't make much progress on it last year. By started, I mean we taught the toolchain to have Clang alongside it. [0:51:19] KB: Yes. As we got into earlier, right? There's a lot of detail and nuance that goes into that. What are things looking like for this year? I saw the next big target beyond that was 0.1. When you can do that - what does the next three, six months look like? If somebody's excited about Carbon, when can they start digging in? [0:51:41] CC: It's important to understand the sheer time scales here. There's no way that 0.1 happens this year. I wish it could. There's just way too much to do before we can really call it 0.1. If you look at our 0.1 definition, it's actually very ambitious. We want 0.1 to be close to production quality for the purpose of evaluating Carbon within a real C++ environment: I have a real C++ codebase and a real use case, and I want to try it out and see how it works in practice. There's tons that we need to do to get there. This year, our goals are twofold. One, we're going to try and take C++ interop from its nascent state, where we can load C++ and build the AST, all the way to actually doing interop specifically. We'd love to do interop without templates - so no template instantiation, but that fancy virtual inheritance and virtual dispatch stuff, moving objects back and forth across that boundary. That's our real goal for interop for this year, because that's going to get the foundations of interop in place. This includes incredibly hard stuff, right? Think about types with operator overloading, not just function overloading. One thing we want is to actually map your operator plus overload in C++. That is, A plus B on your special vector type.
We want you to be able to write A plus B with that special vector type in Carbon, and have that Carbon plus map all the way back through to find your C++ plus. This is actually still pretty ambitious. Then the second thing is we've gotten feedback already that we really need to accelerate telling a concrete story about how memory safety fits into this picture. So far, it's been fairly aspirational, and there's a lot of hand-waving. We've got to make it very concrete, very specific. We probably won't get past slideware describing the design for memory safety, how it works, how you can migrate towards it. But we at least need to get to that, and a very concrete version of that, by the end of the year. The team's also grown a little bit, and so I think we actually have the ability to pursue both of these over the course of this year. [0:53:50] KB: Awesome. Well, we're pretty close to the end of our time here. Is there anything we haven't talked about that you want to make sure folks are left with? [0:53:58] CC: We've talked about all the amazing and good parts of Carbon, but what about the bad parts? A lot of people, I think, are a little nervous about raising bad parts. But I actually think these are good to talk about, too. I can tell you the bad parts I'm hearing. But any bad parts, I think, would be great to talk about. [0:54:14] KB: Yes, dive in. What are you hearing about? [0:54:16] CC: One that I hear all the time is, "Why on earth did you change the syntax so much? What is this fn keyword nonsense? Why are there colons everywhere?" People are really frustrated by that, and a couple of things. First off, you're not alone. I'm actually really frustrated by a couple of the syntax choices myself. This isn't exactly how I had hoped Carbon would end up looking. I think people are really frustrated because it seems unmotivated. It's not that the syntax changes are necessarily bad, but it seems unmotivated. I think there's actually an important thing to address there.
First off, there are two parts to any motivation. You've got cost and benefit. The first thing, I think, people mistake is the cost. Syntax is relatively cheap. Now, I don't mean to say it's free. Some people are like, "Oh, the compiler just handles it. Syntax is free." No, humans also have to read code. Asking humans to see different keywords is a cost. But mapping from one set of keywords to another in fairly similar positions, or adding keywords to a structure that is already relatively similar, is something that humans can do fairly easily. If you look at programming languages that have done that, it didn't cause massive problems, right? The complaints about Python 2 to 3 were not about new keywords in existing structures. It was that you needed to change the structure in some way, right? While I understand that there's an emotional, immediate reaction when you see this, we don't have any evidence that the cost of this is going to be especially high for the humans. But what's the benefit? The benefit is, I think, really surprising. We keep finding places where having the simplified syntax, the simplified grammar, actually helps make our compile times faster. A lot of people are like, "Parsing is a solved problem. None of the compile time is spent there." Yes, a little bit. But also, no, these aren't as separable problems as they seem on the surface. We find that we end up using the grammatical structure of the language inside the semantic analysis that's doing type checking. That's the expensive part of the compiler. When we have the wrong structure, we have to waste time and waste resources actually doing that checking. I think it's a little bit more important than people see. So the big thing I would love to say is: we do hear you. We are aware of the cost here. It's just that when we really evaluate it and play it out for the long term, the changes do seem to make sense. We're constantly evaluating that trade-off.
We may even have the wrong trade-off. But here's a fun thing: we used to actually have a different grammar in an important way. In variable declarations, you would write var, then the type, a colon, the name, equals, and your initializer, which is a little bit closer to C++'s type, space, name, equals, initializer. When we put the colon in - and that was the first thing we did - that was to make the compile times faster. It actually significantly improves how easily this parses, having that colon in there. But when we did, it started to look like declarations in TypeScript and in Rust and in Swift and in all of these other languages. But in every single one of them, the type and the variable name are in the opposite order. The variable name comes first, then a colon, then the type. There are lots of good reasons why they all evolved that way. I was very reluctant to make this change, but we finally decided that having a colon but being different from every other language with a colon was even worse for programmers than flipping the order around. It was a hard trade-off that we made. We're going to keep making those trade-offs every time. We are going to think about it as hard as we can, and we may even change our minds. If Carbon 0.1 comes out and everyone's like, "This language is great, but you all have to get rid of this fn keyword," then we're going to get rid of it. We are going to respond to this, but we have to be careful. We can't just arbitrarily copy syntax from C++. We have to think about each and every one of these decisions. [0:58:25] KB: And it sounds like, if I'm not mistaken, those arguments are happening in public. People can go and see the ways that you're deciding those trade-offs and the justification of, "Hey, this is aligning with all these other languages that have this type of syntax," or, "Hey, this is improving our performance on the parsing step by this much, which has these impacts." [0:58:42] CC: Yes.
There's a public issue on GitHub from when we switched the order of type and variable name, where this is specifically called out as what motivated us to make that change. [0:58:53] KB: Well, and I harp on this a lot, but we are in a new era when it comes to code-generation tools. The more similar your language is to what other languages do, the more your generation tools are just going to work, and it's going to be easier to adopt, not only for programmers coming from non-C++ backgrounds, but also for programmers adopting tools trained on a wide range of languages, all of which behave one way. If yours is the odd one out, it's going to - [0:59:22] CC: It's going to be awful, right? There are lots of other complaints, but we actually touched on most of them already. I think a lot of people are worried this is vaporware. Is it ever going to show up? We talked about the timelines there. There are a lot of concerns around this interop thing: it can't work, you can't actually make this work. That's actually why interop is such an urgent goal for us, why we're prioritizing it as highly as we are. We think that we really have to show our work here. We have to basically show a working version of this for it to be compelling and believable. We understand that we're going to have to come there with the proof in the pudding. Then there's memory safety - like, why are you not thinking about memory safety? We are. I think maybe one interesting thing is a lot of people don't understand why we started with the memory-unsafe version of Carbon. It's a weird choice in this day and age to start there, and I actually agree. It is weird. The key realization for me was that all of the C++ code we need to interop with is also unsafe. It's not that we have the flexibility of choosing here. We have to start off with the primary reason we exist, which is interop with C++. We have to make sure that that works. We still have to make sure we can add memory safety to this.
If we can't have memory safety at all, none of this is going to work. All of our programming languages are going to end up having to have some kind of memory safety, so it is absolutely a necessary goal for us to get to. But if that were our only constraint, or our primary constraint - well, that's why people are using Rust. Rust is a great option already. That's why people are using other languages. They start with that constraint. The whole idea of Carbon is: what if we start with the constraint of building off of C++? That means fundamentally building off of unsafe code and unsafe code patterns. We have to support those as well as we can, and then layer memory safety on top. I mean, we still have to prove that one out, and that's one of the things we're working on. But that's the underlying motivation. It wasn't like, "Ah, we don't care about this memory safety thing. It's overblown." Totally the opposite. It's that we care about how we get that memory safety on top of existing C++ code. [1:01:19] KB: That sounds like a great final cut, so let's hit stop there. [END]