EPISODE 1689

[INTRO]

[0:00:00] ANNOUNCER: The US government recently released a report calling on the technical community to proactively reduce the attack surface area of software infrastructure. The report emphasized memory safety vulnerabilities, which affect how memory can be accessed, written, allocated, or deallocated. The report cites this class of vulnerability as a common theme in some of the most infamous cyber events, such as the Morris Worm of 1988, the Heartbleed vulnerability in 2014, and the BLASTPASS exploit of 2023.

Herb Sutter works at Microsoft and chairs the ISO C++ Standards Committee. He joins the show to talk about C++ safety. This episode of Software Engineering Daily is hosted by Jordi Mon Companys. Check the show notes for more information on Jordi's work and where to find him.

[EPISODE]

[0:01:01] JMC: Hi, Herb. Welcome to Software Engineering Daily.

[0:01:02] HS: Thanks for having me.

[0:01:05] JMC: So, why don't you introduce yourself?

[0:01:07] HS: I'm Herb Sutter. I've done software for a living for a long time now, a lot of it in C++, and I've done a lot of writing on C++ and concurrency topics. I chair the C++ Committee, which mainly is an administrative role, as the chief cat herder and bottle washer, and pick when and where the meetings are going to be. It's a lot of a job. So yes, it's a fun world and I've enjoyed working on a lot of projects in that world for many years.

[0:01:32] JMC: So, I guess the question is that the environment has changed a bit, at least the mood around parts of the industry with regards to memory-safe languages, and we can talk about that term in particular. But it affects C and C++ and others, because organizations like NIST, the US NSA, and CISA, and in general the recommendations that come from the American government, but also from other governments here in Europe, et cetera, are pushing, and increasingly so, I would argue, as the language has become more and more explicit, to nudge the industry to move away from C++.

So, in that context, could you actually try to describe what this is about? How you see it from the C++ community?

[0:02:15] HS: Sure. I wrote an essay about this recently, just a few weeks ago, and I've been talking about it quite a bit for the past, at least a year and a half, and a couple of years, where this has really been a pain point. This discussion has been going back to the nineties, right? But especially in the last couple of years, it's because our software is under attack. Your software is under attack. My software is under attack and no matter what language it's written in. Of course, one of the things attackers do is they go after the slowest animal in the herd. That's what predators do.

So, there has been an especial focus on vulnerabilities in languages that give you fewer memory safety guarantees, notably C and C++. But make no mistake. All software is being attacked. So, as you see things get hardened, and we shore up one part of our software ecosystem, attackers move to another part. For instance, in the last couple of years, the amount of malware that attackers have been using has been going down. Definitely, still there. We still have to do something about it. But they've been shifting to supply chain attacks, to exposing secrets, to endpoints that haven't been secured because they use the default authentication, things like that.

It's really, really important as native languages, C and C++, not to have our heads in the sand and say, "Oh, well, we've been hearing this for years. All is well." No, it's not. We have work to do. But it's also important not to go to the other extreme, and think that, "Oh, if we just magically wave a wand and make all the world's software suddenly convert overnight to memory-safe languages", which would be great if it could be done. It's not technically feasible. But even if we could do that, we're not going to make most of the attacks go away, even the existing attacks already happening. And even for the ones that we do make go away, the attackers will simply attack something else. We need to really do a lot of work in the software industry all up. That includes C and C++. Do not get me wrong. I'm not trying to minimize that at all.

[0:04:19] JMC: This question is going to be obvious to you and any C and C++ practitioners and programmers. But could you explain why it is a feature of C++ that the language is not going to take away the ability to allocate memory directly, and keeps giving that possibility to the programmers, right? In other words, C++ is not going to become a memory-safe language anytime soon, or ever. So, what are the reasons why that is actually a decision from the standards committee?

[0:04:47] HS: Yes. There's a nuance there. First of all, you're exactly right that C++ is about performance and control. The idea is that I can have great abstractions, more than C. I can write classes and templates and generic code and things like that. But with very few exceptions, I don't pay performance, size, space, and time overhead unless I use a feature. When I do use a feature, it costs me as much as if I had written it by hand at a lower level.

Having said that, the default in C++ is to trust the programmer. We're going to give you all these sharp knives and we trust that you are a well-trained chef and will know when and how to use them. By the way, here's some band-aids in the drawer for the times that all chefs make mistakes, sometimes. One of the problems with that metaphor is that when you're under attack, that calculus could change. In particular, one of the things that we've learned is, as we've already been adding safer features to C++, it is a problem that they're not the default, right? Those sharp knives that we absolutely will always have are littered all over the counters and all over the floor, where you accidentally step on them and lacerate yourself.

What changing the default to me means is putting them in the drawer. So, we're not going to give up any knife we have. We want all the power and control we want. But there are many places where we can make that be safe by default, have the drawer closed by default, and we opt in to using the sharp tool. Now, that is a place where C++ and C need to change. Because the default has been performance by default; safety, you have to work for. But there's no inherent tension necessarily in providing a mode of C++, and that is something that the standards committee is working on with profiles, for example: to have a mode of C++ where we actually have mostly already well-known type safety rules that we've been teaching for a long time. Please use unique_ptr and not malloc. We have that. But today, they both compile by default. With a profile enabled, you would only get the safe one by default. And if you wanted to use the lower-level tool, you could still do it. But you would have to opt out and ask for it. Which is actually an approach that works well in memory-safe languages like Rust, and C# and others, where you opt out to get unsafety because you want unsafe code. You really do sometimes, for performance and control, but it's going to be work to change the default.

So, we're not going to take away any of the sharp tools, but it would be nice to put them in a drawer and change the default. That's a good way to think about the job ahead of us in the near term.
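To make the unique pointer versus malloc point concrete, here is a minimal sketch that is not from the episode. The function names sum_with_malloc and sum_with_unique_ptr are hypothetical, chosen only for illustration. Both versions compile today, which is the point being made about defaults; the sketch assumes nothing about how a future profile would actually flag the first one.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Lower-level style: manual allocation. The programmer must free exactly once.
int sum_with_malloc(int a, int b) {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p == nullptr) return 0;  // allocation can fail
    *p = a + b;
    int result = *p;
    std::free(p);  // forgetting this leaks; doing it twice is undefined behavior
    return result;
}

// Safer style: std::unique_ptr owns the allocation and releases it automatically.
int sum_with_unique_ptr(int a, int b) {
    auto p = std::make_unique<int>(a + b);
    assert(p != nullptr);  // make_unique throws on failure rather than returning null
    return *p;  // the memory is released when p goes out of scope
}
```

Both functions return the same value for the same inputs. The difference is that the second has no manual cleanup step to forget.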

[0:07:25] JMC: Yes. I think the way I would put it is that you're changing a bit the default development experience of the language in a way. An opinionated way of offering the default. But those are my words. So, give us a sense of what the article contains. What motivated you to write it? Did you write it yourself, and just what's in it? What are the main ideas?

[0:07:43] HS: Yes. I had lots of good reviewers. So, thank you, [inaudible 0:07:47 Name], all people from across the industry, not just C++ folks. But yes, all the words are my own, so all the errors are mine, and I'm not pretending to speak for anybody. I'm not speaking for the ISO committee. I'm not speaking for my employer. I'm not speaking for my reviewers. Many of them may disagree with parts of what I wrote.

But the reason I wrote it is because it's important to do two things: to acknowledge that there is a problem, and to show how we can actually make progress on C and C++. I hope that throughout, it's very clear that yes, we have a problem. We need to have more memory safety by default, especially, as you already, I think, summarized, type, bounds, initialization, and lifetime safety. We need to make improvements on those four particular ones in the near term, and then others too: concurrency safety, overflow, things like that. Some of which relate to bounds and some of the other first four.

So yes, there's work to do, and I wanted to shine a light on that, and also give some reason for why there's hope and things that we can do and should do. But I also wanted to write it because there's a lot of misinformation or misunderstanding out there. For example, you see a 70% number a lot, where, as I say in the article, I completely agree that it's a repeatable result, that 70% of memory safety vulnerabilities in C and C++ code wouldn't have existed in equivalently written, say, Rust code, or C# code, or another memory-safe language.

But that's a memory safety problem. So, while fully acknowledging that, and I've used that number in talks myself and said, "Look, we've got to do something about this and get parity with other languages," the goal isn't perfection. It's parity. To bring those bug counts more down to the noise and in line with other languages. But please, let's not over-index on memory safety, because nobody who's serious, I believe, really thinks that if we converted all the world's software to memory-safe languages overnight, waved a magic wand, that we would have 70% fewer vulnerabilities. Because that 70% is of memory safety vulnerabilities, but those are only three of the top 10 weakness, kind of CWE weakness categories.

They're still a minority of all the other ones out there. The reason I want to emphasize that is not - and this is important - not to be defensive about C++. This isn't whataboutism, where it's like, "Oh, yes, I have problems. But what about them? They let their kids stay out all night." That's not it. Whataboutism is changing the subject. That's not what this is. What I'm trying to do is to say, "Absolutely, this is a problem that C and C++ have to look at, that our software is under attack. Look at all these other attack vectors that we also need to address."

So, it's not, "Hey, look at these other things instead of C and C++." It's in addition to the work we have to do in our part of the world. Have you seen what nation-states are doing? And the evil that's being done in the world? I mean, we haven't used the word cyber war very much in the present tense. Usually, it's like, what if there's ever a cyber war? What we are experiencing these days is, to me, already an active hot cyber war, including with criminal organizations and nation-states involved. So, having said that, because I'm not a geopolitical analyst, that's my personal impression, but the house is on fire and we need to put out the fire in all parts of it, not just in the one room that's labeled type and memory safety. But absolutely, we need to be putting fire extinguishers in that room, too. So, let's look at the whole problem because attackers will move to different attack areas, and we need to harden all of the software we have.

[0:11:19] JMC: Oh, you're absolutely right. We're recording this in a week at the beginning of which the XZ vulnerability was disclosed. This was clearly the work of a group, or at least a brilliant individual, that was sponsored and had years of preparation to convince a burnt-out maintainer of this, in my view, arcane compression algorithm, widely used, by the way. Arcane in the sense that it's been there forever, and changes have not been applied to it, because it doesn't need that much. But this person, this group of people, took it over. But anyway, so you're completely right. I interviewed Mikko Hyppönen here years ago. He's an expert hacker from Finland, and we both agreed that we are technically at war, like nation-states are attacking each other. It's asymmetric. Some nation-states are attacking others more than the other way around. But it's constant.

[0:12:13] HS: They're not attacking just each other. They're attacking our companies, our software, our infrastructure. So, many of the folks listening to this podcast likely work for tech companies. Your tech company is under attack by those same nation-states right now, whether you know it or not. They simply are. If you build software at a third-party ISV, you work on some popular software, unless it's something like a game, and even there, I would start getting concerned. But the games generally want full performance and are less concerned about safety. So, some domains are not as concerned yet. Maybe they should be, but hopefully, less so than if you're writing, say, Acrobat. But if you're writing some, not to pick on Adobe, but if you're writing some widely used piece of software, just assume you're under attack.

I remember having this discussion with a large ISV, got to be 15 years ago, 20 years ago now, where safety was already a concern. I remember them saying, "Well, but our software isn't being attacked. I know you guys have to harden Windows, blah, blah, blah." I'm like, "Just wait until there's an attack on Reader." And, of course, it wasn't very long after that. If you're popular, you will be attacked. So, this affects all of us. Even if you're not in the software world, unfortunately, your life depends on software. Your bank does. Your power station does. All of our infrastructure does. So, we need to do something to harden it. That's a clear call to action for all of us as an industry.

[0:13:36] JMC: So, going back to the article, which by the way, we haven't mentioned, it's called C++ safety, in context. It can be found on Herb's blog. Just type C++ safety in context, and it will probably be the first result. You mentioned that around 90% of the reduction in vulnerabilities related to type, bounds, initialization, lifetime, et cetera, can be tackled with a few specific measures. I wonder if you could elaborate on those. Which ones are you referring to and so forth?

[0:14:04] HS: Yes. One of the reasons the article was as long as it was, was because the last 40% of it or so was one big appendix on, and here's a detailed list of things we can do. The majority of which were things we already know and teach. But going back to earlier in the conversation, they are off by default today. So today, you might or might not have a static analyzer that tells you about that guideline. Or you might or might not remember that guideline from when you read it in the book. Or you might need to download a third-party tool like gsl::span to get bounds checking for ranges instead of using pointer arithmetic.

We need to do better at packaging those up so that they're easy to acquire, easy to adopt, and easy to turn on by default, ideally at build time, so that it's not some separate post-build, post-check-in tool, but it's running all the time right on the developer's machine, pre-check-in. I remember seeing just a few months ago, I can't remember right off the top of my head the name of the security person I was talking to. But the point he was making is that modern C++ code that uses all the guidance, which isn't far from everything, right? But modern C++ code that uses all the guidance and the tools is approximately equivalently safe to Rust. Yes, there's some things each do better, but to a rough approximation, equivalently safe. But one of the big things you get from Rust, which makes all the difference, is theirs are on by default, and run as part of compilation before check-in. That really, really matters. That's an absolute advantage.

So, that's the kind of reason why I say that we already have a bunch of guidance. There are still a few holes to fill, particularly about bounds checking, and enforcing bounds checking. But we largely have the guidance. But now it's enforced through third-party static analysis tools that you might turn on, or third-party libraries you might have, or a compiler setting. You have to remember the 10 compiler switches to set. We need to bundle them better and make them easy to turn on by default, and to opt in to a safe mode of C++. So, that's where I think we have an opportunity, but also a bunch of work ahead of us to build that. But I'm optimistic because it's mostly about assembling things we have, and packaging and delivering them in a better way that makes it easier for people to turn on by default. And none of that requires breaking backward compatibility, which is a hugely important thing to us.
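As an illustration of bounds checking versus pointer arithmetic, here is a minimal sketch that is not from the episode. Because gsl::span is a third-party library, the sketch swaps in std::vector::at from the standard library, which throws std::out_of_range on a bad index. The names read_unchecked, read_checked, and try_read are hypothetical helpers invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Unchecked style: pointer arithmetic. Nothing stops index from running past
// the end of the buffer, and reading out of bounds is undefined behavior.
int read_unchecked(const int* data, std::size_t index) {
    assert(data != nullptr);
    return *(data + index);
}

// Checked style: at() throws std::out_of_range when index >= data.size().
int read_checked(const std::vector<int>& data, std::size_t index) {
    return data.at(index);
}

// Hypothetical helper: returns true and fills out on success, false when the
// checked read rejects the index.
bool try_read(const std::vector<int>& data, std::size_t index, int& out) {
    try {
        out = read_checked(data, index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a three-element vector, try_read succeeds for index 1 and fails cleanly for index 3, whereas read_unchecked with index 3 would simply read whatever is past the end of the buffer.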

[0:16:29] JMC: Yes, exactly. I know that most programming languages, if not all, strive for backwards compatibility. I'm not sure if they actually achieve it. But that's one of the tenets of C++. Another one that actually probably makes the language unique is the way in which it's managed, right? Through the ISO Standards Committee.

So, I know you don't speak on behalf of it. But could you give us an idea of how the language evolves through these committees and working groups? And how can the standards committee help implement this? Because turning things on by default is something that you do in a product, but you can't do that with downloading the different libraries from the language itself, in this case, C++. So, what do you guys do to enforce, or nudge everyone to use, these measures by default? And also, if you can, give us a sense of how the committee is working, because the language is released every three years, but I have the impression that things have accelerated. The pace of innovation, I guess, in a way, has accelerated lately in the past few years in the C++ community. That's my impression.

[0:17:35] HS: Yes. So, let me tease apart two parts. Basically, how does the committee work? We have a bunch of subgroups that are interested in specific areas and work on specific domains and proposals. But those feed into a language evolution and a standard library evolution group, the main design groups that are responsible for the main design of the language and its standard library. Both of those are part of the standard. But part is in the compiler, and part is C++ code that you get: std::vector, std::array, things like that. Then, that gets approved into the standard. There's more steps. But that's an overview.

One of the domain-specific groups that I created not long ago was the safety and security group. There had been an advisory group that we promoted into a full group and they've been working on this. In particular, I mentioned profiles, which is the current direction that Bjarne Stroustrup and others are promoting, and that subgroup, SG23, on safety, has adopted as their plan of record. There's been a lot of confusion about, well, what are profiles? Profiles are just a way to label a group of rules.

If you know warning families, like you say /Wall. /Wall is a common term for all common warnings. That's an easy way to opt into hundreds of different warnings on C++ compilers. Think of that in terms of safety rules. That is the kind of idea that's behind profiles. So, just say, I want this translation unit, or this piece of code, to be compiled with type safety rules on. That means that pointer arithmetic may not compile, or whatever rules we put into that safety profile, which can also evolve over time and get stricter and stricter.

So, it's a mechanism to enable people to opt into those rules. That's the general direction that's currently being explored and it's something that requires some development because there are concrete implementations of it. There's experimentation going on, in terms of how to deliver those profiles. But the actual rules, as I've mentioned before, are mostly well-known. There are a few gaps we need to fill, like around bounds checking. But the rules themselves are generally already well-known. So now, this is about labeling and packaging them so that, just like you can say /Wall, I want all common warnings, you could say something like /Wsafe, I want all the safety by default, while also having a finer granularity where some domains want different subsets of the rules, or care more about some rules than others.

For example, life-critical software probably cares about all the safety rules because memory safety contributes both to software safety, which is about making software safe against unintended harm to humans, the environment, and so forth, and also to software security, against intrusion, protecting secrets. So, memory safety enables both software safety and software security. But some domains may only care about software safety.

For life safety purposes, we may only care, in this particular application, about avoiding unintended harm to humans. We may not actually care if the program crashes, or if the attacker tries to exploit a vulnerability, because it's just not on the Internet, or the attack surface just isn't there. That's one of the reasons why profiles is plural, because some domains may want subsets of safety. But I think there's also clearly a need in the industry for having a general one, maybe all of them by default, and then I can start opting out, which is what, in fact, memory-safe languages like C# and Rust do.

[0:21:08] JMC: The example, the parallelism that I'm going to describe here is definitely not the best one. But for those of you familiar with SPDX, the SBOM standard, they have also implemented profiles. This is typical of languages, descriptions of the world, if you wish, that are incredibly comprehensive, like the standard C++ library, or the SPDX SBOM spec. They need to narrow down into domains, I guess, in a way, the spec or the functionality of a language. I think there's some parallelism there. But in any event, you've already hinted at the future of the safety measures that the committee is discussing and proposing. Are there any of those, apart from profiles, that you see being put forward that will enhance safety in the next year or six months? There's a release, by the way, this year. Correct me if I'm wrong, right?

[0:21:59] HS: Actually, C++26 is our next standard and that is about 12 months away from being feature finalized. Yes. We're on a three-year cycle. '23, '26, '29 are the current big milestones. But yes, so it's always ongoing. Evolution is always ongoing, as you mentioned. So again, profiles is about, okay, what can the C++ standard do to help with memory safety narrowly, which is the part that we can do something about in C++ code, including the C subset. I would like, again, to encourage us to remember other safeties beyond that because those scare me too. Again, not to change the subject away from, yes, we have work to do in C++. But please, let's not also forget doing all the things about what attacks like the XZ attack are doing: attacking the supply chain, attacking insecurely stored credentials, attacking things that - Log4j is a great example - the software written in safe languages is being attacked too.

So, really, in the call to action, one of the first things I emphasize is: use your sanitizers. Some people may be surprised to learn that Rust has sanitizers. That does not imply that they have a weak language. They have a great, strong memory-safe language, but you still need to use your sanitizers. If you're using Go, use Go's sanitizers. That's a safe language with sanitizers. Use the tools that are available. Fuzzers are important. Sanitizing inputs is something we teach people in all languages. Do store your secrets securely. It's something that's totally language-agnostic.
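As a small illustration of sanitizing inputs, here is a minimal sketch that is not from the episode. It validates an untrusted string before treating it as a TCP port number. The function name parse_port and the accepted range of 1 to 65535 are assumptions made for this example only.

```cpp
#include <cassert>
#include <string>

// Hypothetical validator: accepts only plain decimal strings in 1..65535.
// Returns true and sets port on success; leaves port untouched on failure.
bool parse_port(const std::string& text, int& port) {
    // Reject empty or overlong input up front; five digits cannot overflow int.
    if (text.empty() || text.size() > 5) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;  // reject signs, spaces, letters
        value = value * 10 + (c - '0');
    }
    if (value < 1 || value > 65535) return false;
    port = value;
    return true;
}
```

The same allow-list idea, accept only what is known to be valid and reject everything else, applies in any language.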

A good metric, just to bubble up and focus on one specific example. The good metric of the things that we need to work on and focus on is really hitting the top 25 vulnerabilities, the CWEs, the common weakness areas that MITRE puts together. And there's some qualitative parts of the process of selecting what those are. But these are experts who've done a lot of thinking and said here are the top 25 areas. Of the top 10, only three are about memory safety. So yes, absolutely. Let's work on those. Two of those three are about bounds safety. So, absolutely, let's work on those in C++ and C, which are the languages where those things are weakest of the popular programming languages.

But the other 7 of the top 10, my goodness, let's prioritize those as well. Those apply to all your Go code and Rust code and C# code as well. Let's make sure that we don't just harden one door, because you don't have a problem with securing a house. The well-known illustration, which a security person could probably say much more smoothly than I can, but I'll give it a go, is you're not going to armor plate your front door and harden it, and focus all your money and effort on making it three-inch thick steel, and still leave the window beside it unguarded. I mean, you're going to put at least bars on the windows. It only makes sense to harden your front door as much as it takes to make it not the weakest link anymore, not the weakest quiet entry point. And if you can get a quiet entry point on the other side of the house, you need to harden it too. You need to bring all those up to par.

That's why I really emphasized in the article to beware the trap of perfection. Nobody should be trying to make a formally provable memory-safe language. I mean, if that happens to come out as an incidental result, wonderful. But that shouldn't be a goal, especially not at the cost of other engineering realities like backward compatibility, because none of the other safe languages are 100% safe either. Even besides that, you can write a bug in any language. I mean, C# still has use-after-dispose issues as well. Fewer than C++, but it still has them. Use before initialized, still a thing. Less than C++, but it still has them.

So, the goal should be parity. We want to get to a level of safety. But beware the perfect being the enemy of the good. Because instead of striving to get that last 2%, or whatever, you should be taking all that money and putting it into securing the other doors and windows. So, let's make sure that we don't focus on just one thing, and then declare victory on our aircraft carrier and say, "Mission accomplished. We've accomplished one mission." There are multiple missions that need accomplishing, because attackers keep moving to the next slowest animal in the herd.

[0:26:18] JMC: Yes. That CVE system that you mentioned has a scoring method, several. But I think the most accepted one is CVSS, which I never remember what it stands for. But it's a scoring system to score the level of threat, I guess, of each one of the CVEs, and it would be great to attack all the top 10, the top - I work for a company that is really concerned. It's called Chainguard. It's really concerned about the state of the National Vulnerability Database. This is a different topic, but the leaders of my company were summoned to Congress just this very week to talk about cross-industry, cross-company collaboration to improve the current state of vulnerability management by the NVD.

I wonder, this is my question, if the programming communities have ever reached out to each other, in this case the C++ Standards Committee to others, although they're not managed the same way. But still, there are steering committees here and there to do this collaboratively. Has such a thing occurred? Have the government or any of the agencies that I mentioned in the beginning, maybe in the US, maybe in Europe, reached out? Are there, I guess, any grassroots movements happening here, or any top-down involvement?

[0:27:30] HS: More needs to be done, but some is happening. So, the C++ committee members have written responses to the government inquiries and requests for data, and requests for submissions. The C++ Standards Committee membership already overlaps with, say, other industry bodies like MISRA, which also, in their case mostly for the automotive industry, publish standards that are more about life safety, but many of those issues are memory safety issues, right? With software memory safety as a means of getting the software safety and life safety that this C++ process requires.

[0:28:03] JMC: So, Herb, I think it should be highlighted. You've already made it clear that the actual bottom line, the actual final goal of the article that you wrote, which can be found on your blog, is a call to action to collaborate on these issues and tackle them together, more so than actually highlighting the context of safety in C++, giving it a different color than the outright negation of any safety at all in C++, which is the meat of it. But the final bottom line is a call to action, which I actually support, and I wish you the best for it. So, thanks so much for being with us. If anyone wants to reach out to you about the article or any other aspect, where can they find you?

[0:28:44] HS: They can find me via my website. Yes, herbsutter.com. You'll find the article there. There might be one or two since then, but just search for safety or scroll down a little and you'll no doubt find it. 

[0:28:55] JMC: Well, thanks for joining us today. I wish you the best in the next release, in three years' time, and in all the work that goes in between, and thanks for being with us, Herb.

[0:29:05] HS: Thank you for having me.

[END]