EPISODE 1659 [INTRODUCTION] [0:00:00] ANNOUNCER: Language and compiler design are fundamental aspects of computer science. High-level languages are how most developers interact with computers, so it's hard to overstate the significance of compiler engineering, or the aesthetics of language syntax. C# is a general-purpose, high-level language that was created by Anders Hejlsberg at Microsoft in 2000 and was open-sourced in 2014. Jared Parsons is the Principal Developer Lead on the C# Language Team at Microsoft, where he's worked for 20 years. He joins the show to talk about how the C# compiler is developed, the compiler as an API, language creation as an art, the experience of open-sourcing C#, and much more. This episode of Software Engineering Daily is hosted by Sean Falconer. Check the show notes for more information on Sean's work and where to find him. [INTERVIEW] [0:01:02] SF: Jared, welcome to the show. [0:01:03] JP: Hey, thanks for having me. [0:01:04] SF: Yeah, thanks for joining. Let's start with some basics. Who are you? What do you do? [0:01:09] JP: My name is Jared Parsons. I'm the C# compiler lead and a member of the C# language design team. I work at Microsoft. [0:01:16] SF: Awesome. You've been at Microsoft for quite some time. How long has it been? [0:01:21] JP: About a week ago, it became 20 years. [0:01:23] SF: Oh, wow. Congratulations. [0:01:26] JP: It's funny. For the email for my 20-year anniversary, I had to find a photo from when I started, and it turns out there weren't many digital cameras 20 years ago. I had to look really hard for one. [0:01:36] SF: Did you go directly from school to Microsoft? Was that the path? [0:01:39] JP: Yeah. I had interned over the summer, and then I graduated and just came straight to Microsoft afterwards. [0:01:44] SF: Awesome. Yeah, so C# first came out when I was doing my undergrad in computer science. At school, I mostly programmed in Java and C++ back then.
But whenever I was working as an engineer for companies locally on internships and stuff like that, they were all Microsoft shops, but it was all ASP, Visual Basic stuff back then. When C# first got announced, I was really excited, because at the time, it felt like a much better designed version of Java. It was also a giant leap forward in terms of where we were with web languages around ASP and Visual Basic. It's probably the only language I've ever read the actual technical spec for, and I even did my programming languages project focused on C#. Of course, that was all 20 years ago, when you were just starting at Microsoft. I haven't really done that much work in it outside of its first five years of existence. I want to go back and look at some of that beginning, some of your history there. What was the motivation for Microsoft to first invest in C#, and then why continue to invest in it? [0:02:47] JP: Yeah. The initial investment predates me. I actually started as an undergrad using C# 2 and then ended up moving into Microsoft, onto the .NET team, a few years later. I know a lot of our initial reason for doing it is, if you go back in time to when C# was invented, Microsoft primarily had two languages: Visual Basic, which was very much on one extreme of languages, and C++. There was no middle ground for, how do I build business applications that can be a little more low level, that can work across all these different versions of Windows, and be easily deployable? C# really filled that void for them. Very much in the beginning, it was a web and line-of-business type of language. As for why we continue investing in it, I mean, for a long time, it was a very successful line-of-business language. Most Microsoft GUIs from that time going forward, like Visual Studio, are very much C#, .NET-based applications. These days, though, one of the reasons we invest in it a lot is, frankly, just performance.
With the shift away from .NET Framework and putting a lot of our emphasis on .NET Core, which is a much more xcopy-deployable, multiple-deployments-on-one-machine type of framework, it allowed us to invest a lot more in performance. Because one of the big problems we had with .NET Framework was the burden of breaking changes. Turns out, if you have a framework that is shipped on roughly 4 billion devices, and any change you make to it is immediately shipped to 4 billion machines, the breaking changes get noticed by someone, and they push back. It really inhibits your innovation. When we moved to .NET Core, we could let the customers control their deployment: when do I move to a new version of .NET, or C#? They can absorb breaking changes a little bit better, and we can start investing in things like performance. A lot of Microsoft services today, even within Azure, are written in C#. With pretty much every release of .NET now, as we move forward, we can really focus a lot on performance and reduce the cost of goods sold for those services. That can have massive multiplier effects for the company, where it's just like, in November, we release a new version of .NET, and a lot of services get cheaper to run for us and for our customers. [0:05:01] SF: In terms of the breaking changes, how does that hinder the development of the language? [0:05:07] JP: Yeah. There are breaking changes on a couple of levels. There's how .NET does breaking changes in the runtime, how the libraries do it, how they change the semantics and the performance, stuff like that. In terms of the language, which is where I'm more centered, breaking changes are a very big deal. One of the things I drive home for the compiler team, one of my mottos, is that the number one feature of C# is compatibility. We very much want the experience of: you are not afraid to move to a new version of .NET.
You're not afraid to buy a new version of Visual Studio, because you know your code is going to keep compiling. We will not break you. We will make sure that unless you have done something absolutely extreme, it's just going to work. That is indeed our number one feature. At the same time, though, that does push an enormous burden onto us. It's something that when we're in language design, we spend an enormous amount of time thinking about. How can we design these new features, which go in and solve all these other problems, so that they do not disrupt our existing ecosystem? We have an enormous amount of discipline in how we design, how we test, how we preview, and how we ship to make sure that we keep that mentality. [0:06:19] SF: Yeah. I mean, when I think about the API world, obviously, any breaking change there is a big deal, especially if it's a widely used API. Essentially, you're signing a contract, in a way, about how this thing's going to work. Languages are the ultimate contract, because if suddenly a function that I called, that's core to whatever it is I built, works substantially differently or goes away, then it's hard to know what's going on there, and it's also completely detrimental to the product that I've built. There's also a new learning curve that will be experienced by the organization to learn whatever those changes are that are no longer backwards compatible. [0:07:00] JP: Yeah, it is very much a big problem. The good thing, though, is these days we actually have a few tools to help us out. If you shift back 10 years in how .NET worked, whenever you got a new version of .NET, everything was latest. You get the latest version of .NET. You get the latest version of the language. That meant on the breaking change bar, there was no lever for us. Any decision we make to change behavior affects everyone when they move forward.
About seven-ish years ago, with the move to the .NET SDK, what we did was this: C# has always internally understood what version of the language it's operating on. It has a version tick under the hood. Whenever you move forward in .NET, if you say, "Hey, I want to move from .NET 2 to .NET 3 to .NET 5," and so forth, under the hood, we just move the language version along with that new version of .NET. That's very deliberate. You might get a new version of our product, or a new version of our SDK, and that changes nothing. Then you go into your code and say, okay, it's time for me to move to the new version of .NET, and I'm going to switch something in my code. Under the hood, we move that language version forward. It just ticks with the version of .NET. That has been an enormous lever for us, to basically say, okay, we found a particular behavior, like how switch expressions work. A very good example is in .NET 7, we changed very much how type inference and overload resolution around lambdas work. That is an enormous breaking change in terms of how very corner cases of overload resolution work. We knew that would break customers. There was no doubt. It's like, we're going to make some changes. A lot of things are going to get better. We're going to break some people. Because we had this lever at our disposal now, we just said, these new rules don't take effect until you're on .NET 7, or higher. That let us have an experience where we could do these breaking changes that were overall good and moved the product, language, and ecosystem forward, but didn't impact any of our existing customers until they chose to take the journey with us. Then we had a lot of guides for, if you're one of the people who hit one of these corners, here's how you get yourself out of that corner. It kept that model we want of, don't be afraid to get a new SDK. Don't be afraid to get a new Visual Studio. The code's going to keep working.
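The lever Jared describes shows up in an ordinary project file: the C# language version defaults to the one paired with your target framework, and you can pin it explicitly to keep older rules on a newer SDK. A minimal sketch; the specific versions here are illustrative, not from the transcript:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Moving TargetFramework forward also ticks the C# language version. -->
    <TargetFramework>net8.0</TargetFramework>
    <!-- Optional: pin the language version to keep older rules on a newer SDK. -->
    <LangVersion>10.0</LangVersion>
  </PropertyGroup>
</Project>
```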
The breaks we introduce will happen when you move forward to .NET. There's a lot of emphasis around that and help to get customers over that hump. [0:09:20] SF: What's the cost, though, to the team at Microsoft? One, I imagine there's a certain amount of bloat that comes with trying to maintain backwards compatibility, or maintain older code that's there for certain versions of the language. Also, how do bug fixes work? If there's a bug with an older version that doesn't exist in the newer version, is that something that you actually spend time fixing? Or is the workaround for somebody that they'll just need to progress with the language? [0:09:50] JP: Most of the time, we do bug fixes that affect all versions of the language. People come to us and say, "I have a bug." We're like, "Yup. That goes all the way back to here, and we'll fix it." Thankfully, a lot of it is, my code doesn't compile. Fixing "my code doesn't compile" is, except in extreme cases, not a real and meaningful compat change to your customers. There's no fear with fixing those. It's like, okay, yeah, this didn't use to compile. You're right. We have a spec violation. Let's go fix that. What becomes more devious is when you find bugs for code that does compile, but compiles incorrectly. Then we have to make a very conscious decision: do we make this change and affect anyone who's using the compiler? Or do we say, we want to tie this change to a new version of the language? You must actually move forward to a new version of .NET before you see this bug fix. A lot of that just really comes down to experience and gut feeling. We look at the scenario and we say, how many customers do we think are going to hit this? If it's a very, very tiny amount, just make it a blanket change and move forward.
If we're like, "That's a pretty big scenario, we know a lot of customers hit that," then we're much more likely to say, let's tie that behavior to a new version of the language. That does introduce a cost. We do have an enormous amount of test code, which is just: what happens when you compile this code with this version of the language and that version of the language? If you're familiar with xUnit, it's very common in our code to see theories where it's, same code, this version of the language, that version of the language, what does it do? That's very common in our code base. Compat is an enormous burden. I'd say we go to very great lengths to make sure we're doing it right. One of my favorites is, depending on the day, between four to eight times a day, we change the C# compiler that builds Visual Studio. We rebuild the entire product. Basically, as we merge pull requests on GitHub, within about four hours, we've produced a new compiler. Then about eight hours later, that compiler has built Visual Studio. That is an enormous code base that has been developed over the entire lifetime of C#, and it has just a beautiful set of different styles and different eras of C# that we can then run our new compiler against. One of my jokes is, you haven't actually become a member of the C# compiler team until your code can get all the way through our rigorous GitHub pull request process, but then break the Visual Studio build. That's when you've arrived and become a real member of our team. [0:12:21] SF: Yeah. Well, you know you're having impact when you break stuff, right? [0:12:24] JP: Yeah. It's like, okay, wow. This is a learning opportunity for all of us. [0:12:29] SF: Yeah. For some of these products at Microsoft, C# is essentially the language of choice. How are those decisions made? When there's a net new project, a new software project, how is the language chosen? What's that decision-making process? [0:12:45] JP: Oh, man.
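The "theories" pattern Jared mentions can be sketched with xUnit plus the public Roslyn APIs (the Microsoft.CodeAnalysis.CSharp package). This is an illustrative sketch, not the team's actual test code; the class and method names are made up:

```csharp
using Microsoft.CodeAnalysis.CSharp;
using Xunit;

public class LangVersionTheoryExample
{
    // Hypothetical theory: parse the same snippet under two different
    // language versions and check how each one treats it.
    [Theory]
    [InlineData(LanguageVersion.CSharp7)]
    [InlineData(LanguageVersion.CSharp12)]
    public void SameCodeTwoVersions(LanguageVersion version)
    {
        var options = CSharpParseOptions.Default.WithLanguageVersion(version);
        var tree = CSharpSyntaxTree.ParseText("class C { void M() { } }", options);

        // This snippet is valid under both versions, so no diagnostics.
        Assert.Empty(tree.GetDiagnostics());
    }
}
```

Real tests of this shape would assert different diagnostics per version for code that only compiles under the newer rules.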
I work in DevDiv. That's very much the developer tool section of Microsoft. We have a huge investment in .NET. Pretty much when we do new tools, unless you're on the C++ team, it's going to be C#, or a .NET-based language. Outside of DevDiv, I think the way it's approached is there's a number of languages which are approved for new projects at Microsoft. If you want to do C#, that's fine. These days, Rust, Go, things like that. We have a number of languages, or Python, where, okay, there are boundaries they fall in. I think most of that comes down to just, what organization are you a part of? What is their big investment in? Because especially in the big organizations, it's not little single tiny projects. It is a collection of teams working toward bigger goals, and there's all kinds of existing libraries, services, and stuff to build on. Most likely, I expect most teams end up following suit from where they are. [0:13:40] SF: Okay. Going back to the actual compiler, can you walk me through the phases of the compilation process? What does it actually look like to compile something in C#? [0:13:51] JP: Sure. Most people's entry point to the actual command-line compiler is MSBuild. You either run MSBuild, dotnet build, or build from Visual Studio. Build is actually, I often tell people, a complicated process of which the compiler is this little tiny piece. We take source files and references. Under the hood, we actually have a server that we start when you start a build; essentially, every compilation event goes into this compiler server that we run in the background, and it handles all the requests. We do that for a lot of performance reasons. The phases of the compiler are probably not too surprising for anyone who's taken a basic compilers class. We have our initial parsing phases, where we get our parse trees.
We then build up our initial semantic model, where we go through all of your top-level symbols: let's bind the classes and their base types, look at all the method signatures, make sure these are all real names and we have symbols for them. All of that is actually done in parallel. For parsing, for instance, we basically parallel parse every file in your code base, then we start running through initial binding, getting those bound trees. Then we do what's called compiling method bodies: just going method by method, evaluating the syntax, binding, and emitting the IL. In many ways, it's not too different from anything you would expect. What I think is really interesting about our compiler is we think of the batch compiler as a secondary feature. The C# compiler is very much developed as a library. If you go and pull apart the actual csc.exe, it's about a thousand lines of code. It's not that much. All it is, basically, is grabbing arguments and calling into our library. That library, if you look at it, has a full-fidelity syntax tree. You can ask the library to parse a file, and you'll get back a beautiful syntax tree. If you take that syntax tree and ToString it, you'll get a byte-for-byte identical string to the one you initially parsed. We're very rigorous about that. You will have what we call a semantic model, where you can basically say, "Hey, I have this node in the tree. Can you tell me what symbol that is? Can you take that name and give me a symbol? Let me see if that's the String type from .NET, or some other type over here." That API is very much used by things like Visual Studio. The entire Visual Studio experience is built on top of the compiler as an API. That is very much the most interesting part of our compiler. From that perspective, it doesn't really have phases. Yes, the IDE is largely going to do parsing on top of this.
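The workflow Jared describes, parse a file, round-trip it with full fidelity, then ask the semantic model questions, can be sketched against the public Roslyn APIs (the Microsoft.CodeAnalysis.CSharp NuGet package). The source snippet here is made up for illustration:

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class RoslynSketch
{
    static void Main()
    {
        var source = "class C { int M() => 1 + 2; }";
        var tree = CSharpSyntaxTree.ParseText(source);

        // Full fidelity: turning the tree back into text is byte-for-byte identical.
        Console.WriteLine(tree.GetRoot().ToFullString() == source); // True

        // Build a compilation so we can ask the semantic model questions.
        var compilation = CSharpCompilation.Create(
            "demo",
            syntaxTrees: new[] { tree },
            references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        // "I have this node in the tree. Can you tell me what symbol that is?"
        var method = tree.GetRoot().DescendantNodes().OfType<MethodDeclarationSyntax>().First();
        var symbol = model.GetDeclaredSymbol(method);
        Console.WriteLine(symbol.ReturnType.ToDisplayString()); // int
    }
}
```

This is the same API surface the IDE layers on: syntax trees for colorization and formatting, the semantic model for symbols and IntelliSense.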
But then, they're going to grab random semantic models and start asking questions, because they have to do things like colorize files. They need to know, what type of token is this, what category is it? They want to know, "Hey, is that thing a class? Is it a struct?" They ask us all kinds of questions. When they're generating IntelliSense, they want to say, "Hey, what are the members for this symbol, so I can generate an IntelliSense list?" So really, our first job is actually thinking of ourselves as a library and thinking, how can we help the IDE team build a very fluent and beautiful experience for C#? The batch compiler is, in some ways, boring. It's very little code. We don't touch it as much after we emit the IL. We spend significantly more time talking with the IDE team about, how can we make our APIs expressive enough that you can build all the cool things you want to build in Visual Studio and VS Code? There are also all kinds of other tools around Microsoft where people host the compiler and basically use it to run all kinds of little tools and analysis. Lastly, the compiler is also a plugin system these days. We allow customers to insert two types of plugins into the compiler. The first one is what we call an analyzer. Very often, people have domain-specific rules for their code. If you have your own serialization framework, you will probably have rules about what types are allowed to be serialized, what type of constructors you might have, what type of members you cannot have if you're going to be serialized. Those are errors we're never going to put in the C# compiler. We're not adding diagnostics for your custom framework. But via analyzers, you can add your own warnings. An analyzer is a plugin that goes in the compiler. As we're compiling, we start throwing symbols at these analyzers, saying, here's a new class, here's a new member.
Those plugins can then start generating domain-specific warnings about that. We have a lot of different tools out there which have these plugins. Like, [inaudible 0:18:22] has a plugin, an analyzer, which will tell you when you've written your Facts incorrectly, when you've written your Theories incorrectly. It'll help point out all these little bugs in your code. That is a very powerful tool that people have taken enormous advantage of. Lastly, the other thing we added about three releases ago was a very simple model where we allow people to generate code as a plugin to the compiler. This is something that libraries very commonly do now. Libraries will have a generator paired with them that plugs into the compiler and will, ahead of time, generate code that supports their library. A really good example of this is Regex in .NET. Typically, you new up a Regex object and you pass in a string. At runtime, the Regex has to parse it, it has to generate all the CFGs and all the models under the hood, and it has to do this very quickly and produce code that will then match those things very quickly. Well, with the power of source generators now, what we actually do is just run that code at compile time. You basically define a method, you mark it as "let the Regex generator run here," and you put your Regex string on there. At compile time, the author of the Regex library has a generator that looks at the string, parses it out, generates all the code at compile time, and adds it into your program. Then at runtime, all that work has gone away. When you match a Regex, it's just straight into the code for matching a Regex. It's an enormous performance boost for items like that. [0:19:55] SF: Yeah. Definitely. You talked about this design choice around making the compiler more like an API, essentially, and the impact that has around Visual Studio.
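The Regex generator Jared describes shipped in .NET 7 as the [GeneratedRegex] attribute: you declare a partial method, and the source generator fills in the matching code at compile time. A minimal sketch, with a made-up pattern:

```csharp
using System;
using System.Text.RegularExpressions;

public static partial class Patterns
{
    // The generator parses this pattern at compile time and emits the
    // matching code into the other half of this partial method, so no
    // regex parsing or codegen happens at runtime.
    [GeneratedRegex(@"^\d{4}-\d{2}-\d{2}$")]
    public static partial Regex IsoDate();
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(Patterns.IsoDate().IsMatch("2024-11-12")); // True
        Console.WriteLine(Patterns.IsoDate().IsMatch("not a date")); // False
    }
}
```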
Was that design choice, when it first came about, primarily influenced by the relationship between Visual Studio and the compiler and trying to make a great dev experience? [0:20:19] JP: A hundred percent. If you go back in time, the compiler you use today for C# is written in C# and generally referred to as Roslyn. The original compiler was written in C++. If you looked at how Visual Studio was constructed, we had a compiler team that went off and built the compiler, and we had an IDE team that went off and built the Visual Studio experience. It was possible to have these two separate teams, each of which had their own parser. They each had their separate binders and a couple of other things. For C# 1.0, that was fine. C# 1.0 was a very simple language, comparatively speaking. It's just classes, properties, methods, a little bit of overload resolution logic. Two teams could write their own little binders. No problem. Well, in C# 2, you got generics. That means you get type inference. Things get a little bit harder now. The IDE team had to write more and more code. Then around C# 3, we get LINQ, language-integrated query. That is a very computationally expensive process. It's very hard to write all the binding rules for LINQ. This is where the IDE team started to hit a bit of a breaking point. Like, "Hey, we can't keep up with you all. We have to both replicate all the craziness you all are doing in the language and build a beautiful experience around it." You can see, at the time, the code bases started to merge. They started to share a parser. They started to share a lot of the binding operations, and they started trying to get closer and closer as one code base. That was how things went along for about seven to 10 years.
The problem is, you end up trying to retrofit this compiler, which was a batch compiler with very specific memory allocation patterns, just, hey, let's batch compile as fast as possible. Then you had this IDE team, which is like, "Hey, we exist in Visual Studio." Objects are very long-lived. Also, at the same time, Visual Studio was moving from a tool that was written in C++ to one that is very much written in C#. We introduced WPF as the main UI for Visual Studio. You have the IDE team getting dragged in the other direction: "Hey, we're getting pushed to more and more managed code, more C#, but our engine is this C++ code base that was really never designed to be called this way." You had this big tension within our own team. Then also, when you looked around Microsoft, we found the exact same tension. There were many people around Microsoft who were like, we want to analyze our C#, and we want to do things with it. We want to format it. We want to look at it for bugs. Probably the most famous tool for this was StyleCop. The StyleCop folks did not come by and use the C# compiler. They couldn't. It was a C++ code base that was not re-hostable. They rewrote, basically, a C# parser and analysis engine in C#. That was not unique. We found something like 10 to 20 different C# parsers and binders in the company, of varying degrees of accuracy. There was a tipping point that got hit around the Visual Studio 2010 release, or a little bit before that, where Visual Studio had gone so far into managed code, and so far into C#, that the C# compiler was having problems keeping up with Visual Studio and getting the experiences we wanted out of it. That's when they said, "It's time to rethink how we're approaching this problem. We need to think of the compiler as a service to the IDE and to the batch compiler.
We want to rethink this and we want to restart this process." That's when a portion of the team forked off for a number of years to re-engineer the compiler as a library and then rebuild our IDE services and batch services on top of it. It was a pretty big bet back then. It took about five to six years to come to fruition, but it was definitely very much an inflection point in the development of the language. [0:24:05] SF: Yeah, it makes a lot of sense that you'd end up with these siloed parsers all over the place, because there's lots of places in different types of products, like dev tools and stuff like that, where it's valuable to be able to parse the language, or show some representation of the language, or even do something like IntelliSense as part of the dev tool. Then you end up, basically, recreating certain parts of the parser. The design choice makes a ton of sense. I can really see why you essentially moved in that direction. Actually, a project I worked on when I was in graduate school was this project called TagZ. I tried to build it into Visual Studio, where the idea was, it's a new take on commenting and also being able to jump through code, where you would tag different parts of the code. You could tag something as a bug and have some references and stuff like that. We would have to parse the code, essentially, to build this tag hierarchy structure and stuff like that as a plugin. It was a lot of work to try to do it in Visual Studio and C# at the time, because we ended up, basically, having to solve that problem of, how do we actually parse the code, get something intelligent, and pull out the pieces that we need? It would have been great to basically hit an API endpoint to do most of the work for us. [0:25:16] JP: Yeah, and that's something we see over and over and over again. What's really cool is we initially designed this for, hey, we want to make Visual Studio great, help ourselves.
There were some internal teams that we knew were trying to do these things. But to see the things people have built on top of it is like, when you put an API out there and it's nice, what do people build on top of it? There's this tool called SharpLab.io, which is a website where you can just start typing C#, and it will compile it for you. It will show you the IL, it will show you the parse trees, and it'll even let you run code. We didn't build this. This very nice guy, Ashmind, built it. It's one of these things where they built it, and it's now one of the most important tools for the C# compiler team. We're in C# language design meetings using this tool that someone built on top of our APIs. We're like, "Hey, wait a minute. How does that rule work in overload resolution?" We just go SharpLab it, put it up in language design, and start showing it. It's really interesting to see all the different things people build once we give them the tools to build them. [0:26:13] SF: That's cool. What's the process of implementing a new feature in the compiler, and how is that different, maybe, from implementing a feature in a classic application, like a SaaS app, or a consumer application, or something like that? [0:26:29] JP: The way we approach this is we do pretty much all of our features in a feature branch. We just throw a new branch up on GitHub and we start working in there. Our process is very much our standard dev process. We've tried a couple of different things over the years, but what we have found works best for us is we follow the exact same rules for building a feature that we follow for building a bug fix. Every change you make into a feature branch is going to have two people sign off on it. You're going to be passing all of the tests. The only leeway we give feature development is we acknowledge that you can't do a feature in one PR.
You have to break it up into discrete commits. We have this very specific format for comments. We call it a prototype comment: "I didn't solve generics here. I'll come back later." We allow those types of things, because we found it was too burdensome to file issues for the hundreds of things we know we have to fix in the feature. Outside of that, we spend a lot of time getting the compiler work online, getting our APIs in order. As we're getting to the end of that, we actually start bringing in the C# IDE team. Because one thing about C#, and this is something I believe about all languages, is languages that just have compilers are toys. Languages are defined by the ecosystem of tools that support them. If we come out with a feature in C# that doesn't work in Visual Studio, we will get zero feedback. No one is going to help us understand if we've done the right things or the wrong things. When we ship a feature, we try to ship the experience. As we get far enough in our feature branch that we're done with the feature, we bring in the IDE team, and they help us round out the corners. We make sure the base experience is in place. Maybe not all the bells and whistles, maybe not new refactorings or new code fixes, but you can use this feature in Visual Studio. We then put it back into main, and then we basically start moving it through the Visual Studio preview process. Customers will get a new Visual Studio preview. If you ever want to try out a new C# feature, you just go and set your language version to preview, and suddenly it'll just start lighting up in Visual Studio. You don't have to do any new installs, no side packages. It's just right there for you to play with. That's the basic process by which we ship a feature. Features will stay in preview anywhere from a month to nine months, depending on when in the annual cycle we ship something.
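Setting the language version to preview, as Jared describes, is a one-line project-file change. A sketch (the target framework here is illustrative):

```xml
<PropertyGroup>
  <TargetFramework>net9.0</TargetFramework>
  <!-- Preview language features light up once LangVersion is set to preview. -->
  <LangVersion>preview</LangVersion>
</PropertyGroup>
```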
Then as we move to a new version of .NET, which usually ships around November, we will graduate features out of preview, and they'll just become a part of the experience. In terms of how it differs from other features and shipping, I think a lot of it is just that C# impacts all C# developers. When we change the compiler, we're not impacting a subset of customers who use our product; our subset of customers is all C# developers, and so our impact can be huge. We are very deliberate with the order in which we ship features. Some features we go into and we're like, we know what the answers are for this feature. This feature is just about us committing to get the work done. For those features, quality is our only concern, and we have a lot of processes by which we feel pretty good about quality. Customers will always surprise us and give us strange things, but we feel good about that. We have other features which we go into and we're like, we think we know what we're doing. We think we know what the right answer to this is, but we always find these inflection points in features, where it depends on whether more customers think left or right about this problem. Features like that, we are very deliberate in shipping very early in the release. Because we want to give people as much time as possible to play with them, give us feedback, and tell us if we made the right left-and-right decisions here, so that we can adjust course before we actually hit RTM and fix them. The other thing which might be different from other products is, when we ship something, we're going to be maintaining it forever. These are forever decisions. That is daunting at times. We do have some leeway. When we ship a feature, I often tell customers, for the first six months, I can change anything I want. I can make breaking changes; they're bugs. They're just, oh, we made a slight design mistake here. We're going to fix that.
Once a feature has been in the wild for nine months to a year, at some point, it becomes permanent. It is a very daunting thing to think about when we ship: “Did we make the right decision? Because we have to maintain this forever.” [0:30:39] SF: What about the actual release cycles? How has that changed over time? [0:30:44] JP: When I first started at Microsoft, our release cycles were somewhere between two to five years. Depending on which product cycle you were on, there were some longer releases and shorter releases. I remember back then, when you'd go into release, when you'd get to that ship date, we also didn't have a lot of update mechanisms. It's like, we have security updates and reliability updates only. No feature patches in the middle. When you're getting to that ship or no-ship decision, you're like, “Oh, God. If I can't convince my manager to ship my feature, customers might not see it for four years.” The intensity in those discussions would be quite high, because these were real stakes. As we've moved to the .NET Core model, though, things have gotten better and better over time, and now with .NET Core, we ship annually. We ship our main release pretty much every November of .NET Core. In terms of Visual Studio, Visual Studio ships about four times a year. What we do is basically, as soon as possible after we've [inaudible 0:31:41] the new version of .NET, we immediately start working on the next one. We try to very much get our features into Visual Studio as fast as possible to preview them for the next version of .NET. We're shipping all the time. I remember, it's so funny, because I have a lot of people who report to me who are fairly new. I remember one of them getting very upset like, “Oh, if my feature doesn't make it here, it's going to be one year before customers see it.” I'm like, “No, it's going to be three months. You get to ship yours in preview in a few weeks.
I used to have to wait five years when I missed my ship window.” It's the context of different eras. [0:32:16] SF: Yeah, absolutely. I mean, I'm sure if you went back to the old days of working on operating systems, the release cycles were really massive. It's a whole other world. You mentioned this, having this six-month window, where you might be able to make whatever changes that you want. Is there an example of a feature, or a choice that was made that turned out to be the wrong choice, a mistake that you had to actually walk back? [0:32:43] JP: I'm trying to think. I mean, we have these all the time. They're usually in more of a side channel that impacts very few people. I mean, I'll give you a good example. We just released collection expressions, which is a feature in C# 12 and .NET 8. It's basically a syntax that serves two very good purposes for C#. One, it is a very concise syntax for creating collections. Someone who is more familiar with Python would be very happy with this type of syntax. It's very minimalistic for creating these types. Also, one of the things we designed into it is that those collections are created in the most efficient way possible. We will do all kinds of memory tricks. We will use the fastest runtime APIs we can take advantage of. We've designed the feature such that as the runtime gets cool new ways of creating collections, we're free to just keep moving forward and making them faster and faster and faster. We realized after shipping, we had a number of features we knew would build on top of that in .NET 9. We're doing this thing called params Span<T>. As we started getting into that work in .NET 9, we realized we made a few decisions we regretted in there. For one, when we're deciding how to build one of these collections, we have these things called builder types that we look to to build them.
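[Editor's note: a minimal sketch of the collection expression syntax described above, using standard C# 12; not code from the show:]

```csharp
using System;
using System.Collections.Generic;

// Collection expressions (C# 12): one concise syntax across target types.
int[] numbers = [1, 2, 3];
List<string> names = ["Ada", "Grace"];

// A spread element folds an existing collection in; the compiler emits
// the most efficient construction it can for each target type.
int[] combined = [.. numbers, 4, 5];

Console.WriteLine(string.Join(",", combined)); // 1,2,3,4,5
```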
Decisions like, can the Add method on the builder be an extension method or not? We decided that was a bad decision. That was causing a lot of complexity in the compiler. There's a couple other esoteric decisions around how we treat the builders, where we look for them and what the relationships are. We basically decided, after getting some customer feedback, starting to build new features on top of them and getting some bug reports, that those weren't the right decisions and we need to go in the other direction. We spent about a month or two after shipping resolving the best way to address those problems, and those changes will come out in about 17.10. They'll actually start shipping in preview a couple weeks from now. [0:34:34] SF: Okay. Then, how much does what's normal, or expected in a particular era of programming languages impact decisions that you make about language design? 20 years ago, a lot of languages used a lot of braces, lots of using statements, for example. Then as different types of applications have been built, there's been a focus on minimal languages now with Python and Go, and so forth. Most programmers are building for the web today, not the desktop, which I think also probably influences the style of the languages that are used. How much does that impact the decisions in the different eras of C#? [0:35:11] JP: It very much impacts them. You almost have the perfect setup for this answer, too, which is great. We actually look at other languages and ecosystems a lot. Languages are a competitive space. I mean, I say that in a fun, happy way, but it's like, we absolutely look at other languages and ecosystems and say, do new programmers have a better time going to that language, or do they have a better time coming to our language? We do absolutely look at things like that.
A great example was .NET 7, where at the start of .NET 7, we took a hello world web application, just a standard, put up a web page, basic thing. What we did was we basically said, let's put Node, Go and .NET side by side. Let's look at them from the standpoint of someone who wants to start programming, who wants to find out, what language and framework am I investing in? We were not nearly as good. Node and Go were so much better than us, in terms of minimalistic style, having a single file, having a lack of ceremony associated with them. We made a very deliberate effort across the .NET runtime, the ASP.NET surface area, and the C# language to say, we need to be in a better competitive space there. We want to be attractive to new programmers. We took an entire release of the language. We made invasive changes to, like I mentioned, overload resolution and how lambdas work, which are very scary things for us, because people get very deeply ingrained in how those things work, in ways that they don't understand. They're relying on all these subtle rules that we're about to change. The ASP.NET team changed a lot of their API surface area to be more flexible with this minimalistic API approach. We actually called it minimal APIs, or minimum APIs. I mix up the name sometimes, but it very much was a big theme for .NET 7 to get us over that hump and put us in a better competitive space. That was, like I said, a very big theme of it. Then even since then, we've had all these little follow-up things to say, okay, we missed this area. We can make this area better, and we just keep pushing and pushing on that. [0:37:12] SF: How do you balance staying relevant and also attracting new developers to the language, while not pissing off the base of people that have been using the language for years? [0:37:25] JP: It can be tough.
One thing we tell people, and we do actually get a lot of feedback sometimes of, “Hey, you invented a new way of doing things.” One of the pushbacks we get is, there's three different ways of checking for null now. We always tell people, that's the cost of innovation. We very much made the decision to put pattern matching into the language, because we felt it was such a powerful tool in the web space for dealing with requests and dealing with data. But it does, as a side effect, mean that you introduce new ways to do comparisons. If you want to have a slightly different way of checking for null, you can do it. We do get pushback on that. Customers are like, “Ah, you're making the language more confusing. You're making it harder.” Our feedback to them is, “You don't have to move forward. C# will keep compiling your old code and your old patterns forever. If you don't want to take the journey with us, there's nothing stopping you from using the patterns and tools that are familiar to you. We're not breaking them.” We are telling people, if you want to embrace the new way of doing things, this is how we move forward. Often, some of the carrots are better code generation, better support, better integration with some of our tooling. It is a balance, and we do our best to strike it. We're somewhat successful at it, I would say. [0:38:35] SF: Has the growth and adoption of something like JSON on the web influenced decisions that you made around the language? One of the reasons, I think, JavaScript and frameworks like Node caught on is there's just such a simple mapping between JSON and the language; it's just a direct mapping onto the object structure. In languages that are type-based and have more complex object structures, it's a bigger lift, essentially. You end up parsing the JSON using a parsing library and then trying to map that back into a data structure, or something, and it's harder to work with.
[0:39:06] JP: It's definitely influenced us. There have even been pushes over the years to integrate those types. If you look at Visual Basic, Visual Basic added XML literals into their code back in 2005, I believe. Very much trying to remove that lift and shift. It's like, don't parse it. The XML is right here, and you can do exactly what you want with it. We have a structured API around it. We've gotten pressure to do things like that in C#. Thus far, we've resisted on the idea that we don't want to be tied to a data format that might, if the web shifts to a new format, obsolete that work in our language. Instead, what we do is work really closely with the people who are writing these JSON serializers, who are writing the APIs that do that mapping between JSON and .NET objects, and ensure that we have the features in place so that they can be successful. I mentioned source generation earlier, how we have the ability to generate source at compile time. We work very closely with the System.Text.Json team, so that they had enough flexibility in their generators to have very efficient mappings for parsing JSON into C#: minimal allocations, things like that. In terms of actually putting JSON into the language, it's much more working with the teams who are doing that to do what we can to make them successful. [0:40:22] SF: What are some of the hard problems in compiler design? Overload resolution, I would think, would be something difficult? [0:40:29] JP: I often tell people, overload resolution is easy. Implicit conversions are easy. If you have a language with both, it's hard mode. Basically, impossible mode. If you have implicit conversions and overload resolution, it just becomes impossible mode. That is definitely one of the harder problems. C# is very much a language that has implicit conversions and overload resolution and type inference.
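[Editor's note: a tiny sketch of why that combination is hard. With implicit conversions in play, overload resolution must rank candidates by which conversion is "better". Standard C#; the Print overloads are a hypothetical example, not from the show:]

```csharp
using System;

class Demo
{
    static void Print(long x) => Console.WriteLine("long");
    static void Print(double x) => Console.WriteLine("double");

    static void Main()
    {
        // An int converts implicitly to both long and double, so the
        // compiler must decide which conversion is "better". C#'s rules
        // prefer int -> long (long still converts implicitly to double,
        // but not vice versa), so this prints "long".
        Print(42);
    }
}
```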
That is a very devilish problem, particularly in the way we historically have inherited a few of those things. We spend an enormous amount of our time, when we design new features and want them to be very fluid, considering, how does this impact overload resolution? How does that change type inference? One of the jokes I have on the team is that our normal review rule for the compiler is two compiler devs have to sign off on every change that touches the compiler. If you touch overload resolution, all the devs have to sign off, all the principals, because it's a very gnarly space. Outside of that, I would think one of the hard problems is just the aesthetics of the language. I've been the compiler lead for seven years now. I've worked on the compiler teams for probably the last 15 years. I often tell people, I know how to make a compiler. I know how to educate people in becoming a compiler dev. We're very good at taking in junior devs, introducing them to our code base, teaching them how to be a compiler engineer. It's an engineering problem. We're very good at teaching people the engineering skills. People are like, how do you become a good language designer? My answer is, you sit in the room for three or four years and watch people do it, and hopefully, you can become good, too. It's such an aesthetic thing. Getting the aesthetics of the language right is an art. It's very hard, and there are not that many people who are very good at it. Even myself, I've been here 20 years. I've done most of my work there. I submit a lot of designs that go into the C# language, particularly our performance areas. Even me, I understand that my skills do not include the pure aesthetics of the language. I often joke, I will submit some designs sometimes where the aesthetics will be deliberately terrible.
They're like, “Why'd you do that?” I'm like, “We all know we're not taking my aesthetics.” What I've done here is define the rules for this thing, and a shape for what the future will look like. You guys are eventually going to pick the shape. I'm just here to help you understand what the rules need to be. There's a couple of shapes that will be fine by me. It's understanding that. There's definitely such an aesthetic to it. Some people have a very natural gift for it. Some people can be taught, but it is definitely something that's much more of an art sometimes. [0:42:53] SF: Yeah. I mean, it's like probably any form of design. If you're doing graphics, you can learn how to use a graphics tool. You can take a course. You'll probably be better and more competent than maybe I would be. I think the art side of it, the aesthetic piece, I don't know how you necessarily get there, whether it's an innate thing, or some training that probably happens outside of engineering anyway. [0:43:17] JP: Yeah. I've gotten to the point now where I'm very comfortable saying, I think that's a bad aesthetic for the language. But when I'm like, well, I'm not sure if that's good, like good, is that great? We'll let other people decide. [0:43:27] SF: Then, how has the move to open source impacted what you do? [0:43:32] JP: Well, enormously, on so many different levels. I mean, there's the base engineering problems. When we went open source, initially on CodePlex, but GitHub was really the place we became an open-source tool, you're shifting a culture that has been non-open-source for its entire life. Getting that engineering system, just building in GitHub, building in the open, outside our internal machinery, is in itself a huge challenge. I mean, Roslyn, when we started that compiler, there was a lot of thought that by the time we finished it, Microsoft would be in a place that was more amenable to having open-source projects.
They tried to develop it with, we're eventually going to open source this, in mind. Even then, I was a part of the engineering team that took us to GitHub, and it was months and months of work to just undo things. Your build system has all these subtle dependencies on internal systems. Ripping those out, finding code which is not open-source friendly that you have to get out of your system. It's like, no, can't ship that library. That's not an open-source friendly library. We have to lift and shift over here. Just your build systems: moving from a place where you have these beautiful internal build systems that can talk to all these services, to saying, “Hey, we're on GitHub now. We can't talk to any of that.” This is all building in the open. We have to now find replacements for all these tools, these libraries, these services. It was an enormous engineering shift. I would say, we didn't really figure out the engineering shift for a couple of years, until we got good at building out in the open and dealing with these dual builds. We're building in the open, but we also build internally for shipping purposes. [0:45:07] SF: Has it made the product better? [0:45:09] JP: Oh, absolutely. One of the things we take a lot of pride in now is you can just go to our repo, clone and build. Clone, build, F5. I tell people, you clone Roslyn, open it in Visual Studio, hit F5. A new version of Visual Studio pops up with your C# compiler, and you can just start playing and having fun. There was zero chance that would have been true before we went open source. When you joined a team at Microsoft, the first thing you did, the first commit you made in a new team repository, is you'd find the readme file on how to build the code. What services you must set up, what security groups you've got to be in. Then you'd find the errors in it from what has changed since the last person joined the team.
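[Editor's note: the clone-and-build flow described above maps to roughly the following, assuming the public dotnet/roslyn repository; exact script names may vary by branch:]

```shell
# Clone the open-source C# compiler and build it locally.
git clone https://github.com/dotnet/roslyn.git
cd roslyn
./build.sh        # Build.cmd on Windows; then open Roslyn.sln and press F5
```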
You'd send the commit to say, “Here's the new way to build our repo.” That type of system works when you're working with 10 people, who can come sit at your desk and be like, “Oh, yeah. We forgot that you joined this group.” That does not scale to open source. In open source, if you want contributions and you want people to participate, the entry point to building and running your code has to be as small as possible. When we went to the open, we very deliberately put a lot of emphasis on, how do we make this easy? How do we make it the simplest possible? Clone, open Visual Studio, go. It is definitely a better product in that respect. [0:46:21] SF: Yeah, it becomes a forcing function probably for actually writing good documentation, too, because you can't just tap your buddy on the shoulder and ask the question. [0:46:30] JP: Yeah, it absolutely is. I remember when we first went open source, for a while, people would start sending PRs like, “This PR is awesome. I have this new language feature and I want to merge it.” We're like, “Stop. What are you doing? Why are you doing this? This is not how you do a language feature. You have to go to the language design committee, get it approved. You can't do a feature in a PR. It's way too big. We have to do the IDE work. We have to do EnC, Edit and Continue. Stop.” They're like, “We were never told this.” It was just this knowledge we had in our heads of, you can't do that. We're like, “You know, you're right. We've never told people how to do this.” We actually sat down and put a document out: “Hey, you want to submit a language feature? Here's the process. This is what we do. Please, also, don't start with this. Please send some bug fixes first, so we know who you are and that you're worth the investment.” Because one thing we had to really communicate to people is, there's no such thing as a free PR. We have to review your PR. We have to maintain the code.
There's an enormous resource investment on our part to even get your language feature in. We don't want to commit to doing that for just someone who says they have a good idea. We want to see people who've done bug fixes, maybe some refactorings that we like. Okay, you're going to be here for the long haul. You've displayed a minimum amount of competence here. We feel like investing in you is not as big of a risk. We feel like it's probably going to pay off. [0:47:45] SF: You've been working on C# for 20 years at Microsoft, as we said at the head of the show. What keeps you interested and engaged? [0:47:54] JP: I don't know. I tell people that I've worked here for 20 years and I've had about five different jobs. I originally started in a different part of Visual Studio. Then I moved to – I actually started on the Visual Basic team. I worked on the compiler for a while. Then I went off and worked in research for a little while on systems, taking C# in a systems-level direction, doing permissions, async and await, a bunch of other things. Then coming back, I came back to the .NET team about seven or eight years ago and took over as the C# compiler lead. I've had a number of different roles over the years. I think the thing that keeps me interested is, there's just so many different problems to solve. Yes, my main job is usually the C# compiler lead, but I also do things like help out with our infrastructure, because we have an enormous infrastructure for .NET. The scale of testing we have, the scale of our builds, our problems, our security challenges. We're always constantly changing what we're working on. A couple of years ago, it was Office programming, and we were having to learn about that to help solve those problems. We had to learn about ML for a while to help solve some ML problems. We had a big performance push, which helped me learn a lot about how performance works. What matters?
How can I help these guys with C#, so that they can write better, more performant code? Now, we're starting to learn about things like AI. The nice part about moving around in the language space is I get introduced to all these different problems. I can get these little tidbits of fun that I can pull into my day, learn a little bit, and move on to the next little tidbit. Also, the .NET team is just fun. I often joke to people, people will come to DevDiv, but they don't leave. Because in DevDiv, we have fun problems. We have a really good culture. A lot of my outside-of-work friends I met at Microsoft on our teams. We had enough fun working together that we started hanging out afterwards, too. It's just a really fun team to be on. [0:49:47] SF: Yeah. I was thinking, working on a language that's been around for this long and been relevant as well. It's evolving as the industry evolves, too, like some of these things that we've talked about, different eras that align with what is relevant in the industry at any given point. That's changing all the time as well, so there's always new problems to solve. [0:50:04] JP: Yeah, absolutely. That is definitely part of the appeal to it, is it is always changing. The environment around us is always changing. It's not the same old job every day. It's like a new facet to your job every year or so. [0:50:16] SF: Awesome. As we start to wrap up, what's next for C#, and is there anything else you'd like to share? [0:50:23] JP: What's next for C# is, I think, we're looking forward to .NET 9 upcoming. We've got a number of performance features we're working on. A big area of performance for C# is what we call ref structs and Span<T>. Those are our basic performance primitives. In this release, we have a number of features where we have been establishing the building blocks for these features over the last few releases.
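[Editor's note: a minimal sketch of what those primitives look like in practice, using standard C#; not code from the show:]

```csharp
using System;

class SpanDemo
{
    static void Main()
    {
        // Span<T> is a ref struct: a stack-only view over contiguous memory.
        // Here we slice a stack-allocated buffer with zero heap allocations.
        Span<int> buffer = stackalloc int[] { 10, 20, 30, 40 };
        Span<int> middle = buffer.Slice(1, 2); // views {20, 30}
        middle[0] = 99;                        // writes through to the buffer
        Console.WriteLine(buffer[1]);          // prints 99
    }
}
```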
In this upcoming release, we're probably getting params Span<T>, which will give us a zero-allocation-overhead variable argument calling format in C#, which is really exciting. We're also likely changing the language so we can start passing these ref structs as generic arguments and have them implement interfaces, and make them a lot easier to program with in our own lower-level parts of the framework. On the productivity side of C#, we have this project we've been incubating for a couple of years now. It's called either shapes, or roles, depending on when you started paying attention to it, but it will likely be released under the moniker of implicit and explicit extensions. Like I said, this has been in incubation for about two years now, and we're likely going to be shipping the first slice of that puzzle in .NET 9. I'm really excited about that work upcoming. [0:51:34] SF: Awesome. Well, Jared, thanks so much for being here. You've had such interesting insight into the history of everything that's happened with C#, almost from the beginning. I feel like I could talk for another hour on this. I want to thank you so much for taking the time and being here. Cheers. [0:51:50] JP: Thanks so much for having me. It's been a great time. [END]