EPISODE 1602

[INTRODUCTION]

[0:00:00] ANNOUNCER: One of the most famous software exploits in recent years was the SolarWinds attack in 2020. In this attack, Russian hackers inserted malicious code into the SolarWinds Orion system, allowing them to infiltrate the systems of numerous corporations and government agencies, including the US executive branch, military and intelligence services.

This was an example of a software supply chain attack, which exploits interdependencies within software ecosystems. Software supply chain security is a growing issue and is particularly important for companies that rely on large numbers of open-source dependencies.

Michael Lieberman is the co-founder and CTO of Kusari and has an extensive background in software security from his time at Citibank, MUFG and Bridgewater. He's also active in the open-source and security communities, including the Open Source Security Foundation and the Cloud Native Computing Foundation. Michael joins the show today to talk about challenges and strategies in software supply chain security.

Gregor Vand is a security-focused technologist and is the founder and CTO of Mailpass. Previously, Gregor was a CTO across cybersecurity, cyber insurance and general software engineering companies. He has been based in Asia Pacific for almost a decade and can be found via his profile at vand.hk.

[INTERVIEW]

[0:01:36] GV: Hi, Michael. Welcome to Software Engineering Daily.

[0:01:39] ML: Hi. Thanks for having me.

[0:01:41] GV: Michael, you're the co-founder and CTO of Kusari. Can you give a brief introduction of what Kusari does?

[0:01:50] ML: Sure. Kusari is a software supply chain security company, which pretty much just means we're involved in trying to help protect both the production and consumption of software. And we are very heavily involved in the open-source space right now. We do a lot of work within both the CNCF and the Open SSF.

[0:02:11] GV: Fantastic. I'm curious, how did you find yourself co-founding Kusari? And what was the pain point that made it clear to you that this needed to be solved for developers? I think a lot of engineers out there would also be quite interested to hear what your journey has been to get to this place. I believe you have a security engineering background. Could you maybe speak a bit more about how you got here as well?

[0:02:35] ML: Sure. Yeah, of course. My career started off actually focusing a little bit on the automation space, which then eventually became more of a DevOps sort of role. And then, sort of midway through my career, I ended up at a hedge fund called Bridgewater Associates, where, even though I was doing DevOps there, the focus was very much on security. And so, before it was really coined as a term, I became sort of DevSecOps.

Especially over there, the big focus was not just around developing software around investments and helping out with the automation of that whole process, but really around how do you secure that? Because the intellectual property was the code. The worry was some nation-state actor or an internal threat leaking that code.

And that's actually where my supply chain security journey started. Because the concern there was somebody injecting something into the source code, somebody injecting something into something upstream, that could then be used to leak investment signals and those sorts of things to the outside world.
And so, when I was there, we had built out a bunch of custom software to help protect against those sorts of threats. Then I ended up at Mitsubishi UFJ Financial Group, MUFG. And over there, I was doing similar sorts of things, where the big impetus was how do we get a bank ready for Kubernetes, ready for the modern world, while still protecting against these threats? How do we increase the velocity of the features and the software that we're actually building for our customers and clients while still remaining secure?

And so, obviously, third-party risk management and software supply chain security was a big thing there. And while I was there, I started getting a bit more involved in the open-source space, which was interesting. A bank doing open source. But it was early on, when there was this big push to do more in the open-source space. And that's how I got involved with the CNCF.

And then that led me into – eventually, I ended up at Citi, helping co-lead some of the supply chain security efforts over there. And that's when I really got involved in a lot of the open source. Because the estimate is that between 70 and 95% of all source code is actually open source. And so, if all of this source code is actually open source, you can't really protect your company's supply chain if you're not protecting the open-source supply chain.

And so, that's when I got involved with CNCF TAG Security, which is the security Technical Advisory Group for the CNCF. Think of it as cloud-native security – things like zero trust, service mesh, container security and that sort of stuff.

When I was there, I helped contribute to the CNCF Supply Chain Security Best Practices Guide, as well as leading up what was referred to as the Secure Software Factory, which was about how you apply build security to stuff like containers, using all the latest and greatest to really secure the build from attack – which is one of the core pieces of something like the SolarWinds attack that happened a few years ago.

And then, throughout that, I eventually ended up starting to work with the Open SSF a little bit, and doing a lot of work on that front. One of the other folks who I had been working with since I was at Bridgewater, one of the co-founders, Tim – he and I were co-leading some of that stuff over at Citi. And you kept seeing the same sorts of things. And then we were also working with somebody else who was over at BoxBoat, IBM at the time, Parth. And the three of us were like, "Hey, why don't we just start our own thing?" And so we decided to start a company focused on not just solving it for one company, but trying to solve it for the broader community and industry as a whole.

[0:06:37] GV: That's really interesting. Would you say that a lot of the foundations, the practices of, say, DevSecOps, have come largely from the financial industry? Or has it just so happened that that's an industry that benefited early on from taking it more seriously?

[0:06:54] ML: I think that's definitely part of it. I think a lot of the big tech companies are probably really where it started. But the thing with financial companies is that they're super worried, obviously, about cybersecurity risks, both from the perspective of the literal cybersecurity risk of, "Hey, somebody could be stealing money, stealing intellectual property, stealing customer data."
But also, from the very real regulatory risk of, "Hey, if it turns out you were not doing the right sorts of things, not only do you get hit with the reputational hit, the hit due to lawsuits or whatever, but, also, now you have to deal with fines coming from the government as well."

And so, a big thing that's been really pushed on a lot of developers, a lot of DevOps folks, is how do we make sure that we're still doing all of the right things we can to deliver features to our customers and provide good services while really making sure that we're protecting everything we need to protect? Because it's a huge risk.

[0:07:57] GV: Exactly. I think at the moment we're just now seeing that come through to so many other industries. And most companies are now getting quite concerned about it, which is why every developer out there needs to be concerned with it. My understanding is that Kusari is commercializing the well-known open-source tool GUAC. It's G-U-A-C for those listening. Can you tell me a bit more about what that tool actually is? And what does GUAC stand for?

[0:08:21] ML: Sure. Yeah. I'll start with what GUAC stands for. Like they always say, the hardest problem in computer science is naming things, right? What had happened was, actually, Kusari owes a lot of its success to open source. A lot of our work within the open-source space has led us to partner with great folks we would not normally have been able to partner with.

Let me just start off by saying GUAC stands for the Graph for Understanding Artifact Composition. It's a backronym. Because there's another supply chain security framework called SLSA that we do a lot of work with. And so, at the time, we were thinking about different names. We were like, "SLSA. GUAC and SLSA go well together." That's kind of the name there.

And then SLSA originally came out of Google. Google had contributed it to the Open SSF. And SLSA is a supply chain security framework that supplies a lot of provenance metadata. And so, GUAC came in as the ingesting of that metadata.

And so, really, a lot of this started off from my knowing some folks within the open-source community, like Brandon Lum, who at the time was a co-chair of CNCF TAG Security. And he was working at Google. And we both were talking a lot about the need for understanding the supply chain through a graph.

And in fact, there's a bunch of research work on this and a bunch of articles on this. There is actually a really good article from a man by the name of Jacques Chester, who wrote up this idea around what he called the universal asset graph. Where, hey, if you really want to understand your software supply chain, you really need to think about it as a graph. X depends on Y. And Y depends on Z. And X is installed on this server. And all that sort of stuff.

And so, a lot of other folks had also looked at similar sorts of things. If you look at the Bazel build process, or Nix and NixOS, the package manager and operating system, they view a lot of that same stuff as a directed acyclic graph. Where if I'm about to install a piece of software, you're looking at, "Okay, what does it depend on? And do I have that package already? Or do I have to build that package?" That sort of thing.

And so, we were thinking about that same sort of problem.
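(As an aside, here is a minimal sketch of the dependency-graph idea Michael describes, using Python and the networkx library. The package names are made up, and GUAC itself stores much richer relationships; this only illustrates the "supply chain as a DAG" framing.)

    # Tiny illustrative dependency graph: X depends on Y, Y depends on Z.
    import networkx as nx

    deps = nx.DiGraph()
    deps.add_edge("my-service", "requests")    # direct dependency (hypothetical)
    deps.add_edge("requests", "urllib3")       # transitive dependency
    deps.add_edge("my-service", "pyyaml")      # another direct dependency

    # "What does my service ultimately pull in?"
    print(nx.descendants(deps, "my-service"))  # {'requests', 'urllib3', 'pyyaml'}

    # "If urllib3 gets a CVE, which of my artifacts are affected?"
    print(nx.ancestors(deps, "urllib3"))       # {'requests', 'my-service'}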
Us – Kusari, which had just been founded at that point – as well as some of the folks over at the Google Open Source Security Team, like Brandon, were starting to talk through this idea of, "Why don't we build that graph a little bit? Why don't we go and build it, but not from the perspective of 'I have it all figured out because I already know all this information.' Why don't we start taking all of this great supply chain security metadata that's out there and start to pull it into this graph?"

And so, for folks who are not super familiar, there's this idea of what's referred to as a software bill of materials, which is just this idea of, "Hey, let's look at what the dependencies of your piece of software are, how it was built, and so on." You list all the dependencies within that piece of software, along with potentially even transitive dependencies as well, and begin to build out this bigger picture of what that software is made up of. Because you want to be able to know in the future, if it turns out there's a vulnerability in a particular dependency, you can figure that out.

And there's a bunch of other information out there as well. Things like SLSA attestations, which are information about the provenance of the software, how it was built. And there's all sorts of other information out there, like security scans, license information and all this great stuff that a lot of companies really want to better understand.

And what often happens today is they look at it just purely as a moment-in-time scan. And, really, the thing that we saw with something like GUAC was, "Hey, what if we created this database that we could keep updating with additional information? And not just additional information, but information coming from different sources."

And so, we can even understand where the information might contradict itself and all that great stuff. So that somebody who is looking at their software supply chain, looking at what software they're using within their organization, within their project, can really understand what's actually going on in there. And so, that's kind of how GUAC got started.
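(Again purely as an illustration: below is a heavily trimmed, roughly CycloneDX-shaped SBOM for a hypothetical service, written as a Python dict. Real SBOMs carry far more, such as hashes, licenses, suppliers and transitive dependencies, so treat the field selection here as an assumption for the sketch rather than the full spec.)

    # A minimal, CycloneDX-flavored SBOM fragment (illustrative only).
    sbom = {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": [
            {
                "type": "library",
                "name": "requests",
                "version": "2.31.0",
                "purl": "pkg:pypi/requests@2.31.0",  # package URL identifier
            },
        ],
    }

    # The kind of question an SBOM lets you answer long after release:
    def uses_package(doc: dict, name: str) -> bool:
        return any(c.get("name") == name for c in doc.get("components", []))

    print(uses_package(sbom, "requests"))  # True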
[0:12:44] GV: Very interesting. And SLSA, which, again, for listeners, that's S-L-S-A. And I didn't even know the GUAC and SLSA bit. That's fun. SLSA has different levels. Can you speak a bit to that? And how does GUAC interact with those levels, or not?

[0:13:00] ML: Sure. Yeah. SLSA, for folks who aren't aware, stands for Supply-chain Levels for Software Artifacts. And really, the idea behind SLSA is we obviously want to secure artifacts. We want to know that a particular artifact hasn't been injected with malware and all that sort of stuff.

And really, a lot of it came from this idea of, "Well, we don't even know where software came from." And so, there was this need for provenance. And provenance just means where something came from. And so, there was this idea – it came from Google originally, but was contributed to the Open SSF – of building out security levels based on what we know about the artifact and what we can trust about the artifact.

There were a bunch of different things in there originally. We've simplified it for version 1.0 to focus purely on the build right now, though we are expanding out. And for folks who aren't aware, I'm also a member of the SLSA Steering Committee. I'm very heavily involved there as well.

But what we're doing in 1.0 really is about build provenance. And what does that mean? It's about, "Hey, what sorts of things can I do to feel more confident that this piece of code, as well as these dependencies, ran these commands to generate this output artifact?" Right?

Because there are a lot of attacks out there, like typosquatting and repo hijacking, where you don't even know. Where you download a particular package, you think you're downloading, let's just say, requests, the popular Python package, and you end up downloading something else inadvertently because you had a typo in there, that sort of thing.

SLSA is focused on that provenance piece. And so, today there are three levels. Level one is just more or less: record the data about the artifact that you built. That pretty much means record the output hash, do a checksum of the artifact that you built, as well as as much information as you can include in there, like about the build process and about the dependencies that are in there. That's level one. And, really, it's not much different than just logging the stuff. But in this case, SLSA has what's referred to as an attestation format, which is just a JSON document with a couple of fields in there that, because they're well-known, make tooling to ingest it a lot easier.

Level two is then taking that information and essentially associating it with an identity through a signature, right? In the case of level two, it could just be as simple as signing that attestation so that you know, assuming somebody hasn't stolen your key, that, yes, you're the one who signed this artifact.

If somebody else claims that they've built your package, and they don't have your signature on the attestation related to it that has that output hash, you should be suspicious. You should be like, "Hey, what happened here? I thought you were generating SLSA attestations? And this doesn't look to be signed with your well-known key."

Then level three comes in. With level two, there's a big risk of your key potentially getting stolen. There are also a lot of risks in level two like somebody who hijacks a build using the signing secret to sign whatever they want. Level three does a lot more, where it really focuses on making sure that the signing secrets are only used by what's referred to as the trusted build control plane. Think of it as, if you're using, let's say, GitHub Actions – or actually, let me use an even more common one, Jenkins – the idea would be you don't let the actual Jenkins build, the builder itself, sign it. You make the Jenkins orchestration system actually do the signing. And so, that's kind of where level three goes. And there are a lot of other things being talked about for the future around reproducible builds, hermetic builds, and utilizing single-use signing secrets.

For folks who are familiar with Sigstore, for example, there's a lot of work on that front. And actually, some of it's already built out that allows you to sign with single-use certificates based on your GitHub identity. You can sign a build that can be traced to the specific GitHub build that you actually ran. And assuming you trust GitHub, and that GitHub hasn't been compromised, you can tie back all of that build information directly to the build that happened in GitHub, while recognizing that that signing certificate was only ever used to sign that one build.
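(To make the attestation format a little more concrete, here is a simplified sketch in Python. The statement layout loosely follows the in-toto/SLSA provenance format, but the fields are abbreviated and the repository, builder ID and artifact are invented, so treat it as an illustration rather than the exact schema.)

    import hashlib
    import json

    # Level 1: record what you built -- at minimum, a digest of the output.
    artifact_bytes = b"...contents of the built artifact..."
    digest = hashlib.sha256(artifact_bytes).hexdigest()

    # A trimmed-down, in-toto-style provenance statement (illustrative).
    attestation = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": "my-service.tar.gz", "digest": {"sha256": digest}}],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "externalParameters": {"repo": "https://github.com/example/my-service"},
            },
            "runDetails": {
                # Levels 2/3: a signature ties this to an identity, ideally one
                # held by the build control plane rather than the build job.
                "builder": {"id": "https://example.com/ci/trusted-builder"},
            },
        },
    }

    print(json.dumps(attestation, indent=2))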
[0:17:50] GV: Yeah. I mean, before we maybe touch more on how a developer would actually start getting up and running with GUAC, it does seem that software supply chain security has been overlooked really until now, perhaps. But that's me coming perhaps more from the pure security side as opposed to the software supply chain side. You've been involved in that space for longer. How do you see that?

[0:18:14] ML: Yeah. I think, for a while, people have used the term third-party risk management, and a few other things – dependency management, vulnerability management. And I think what's happened is supply chain security is about securing the production and consumption of software, which is a big thing, right? There's a whole lot of stuff in there.

And for a really long time, I think people viewed each of the different things in there as their own disparate pieces. They viewed third-party risk management, or dependency management for pulling in dependencies, build security management, endpoint security – all those different things – as being their own piece.

But the problem is you can't solve it unless you really look at it in that holistic way, right? Because all sorts of things can go wrong at each stage of the process. If we start from, let's say, the developer, you could have an unapproved developer pushing code to a code repository.

And traditionally, that's just been, "Well, we just have IAM for our services." But let's say they did get in there. They were able to push something in there. Something went wrong with IAM. Well, now they're in your network. Similar to a zero-trust thing: "Hey, they're now in your network and they've now pushed bad code." And your build system is going to build that code regardless. And, potentially, you're going to package it all up. A single break somewhere in that system has now led to you producing a malicious artifact.

In the case of the SolarWinds attack, the build process had gotten compromised, where a DLL was swapped at the last second, and that led to every single artifact that was being built being malicious. And it was very hard to detect, very hard to see. And it was like, "Well, it wasn't the packaging that was the problem. It wasn't the source code." It was this whole thing that you had to look at holistically.

And so, when we look at the end-to-end thing, there are so many things that can go wrong. An unapproved developer can get into your source code repository. Your source code repository itself can be hijacked and have malicious code injected. When you're about to run a build, your build can pull from the wrong location, an unapproved source code location.

When you're pulling in dependencies, you could pull in malicious dependencies. Your build system itself – going back to the SolarWinds thing – can itself be hijacked. And then when you go to actually package up and push packages, your package repository can potentially get packages pushed from places that aren't the build system.
And then, finally, when you go to, let's say, download those packages and deploy them into your environment, those same things can happen. You could be pulling from the wrong location again. You could be pulling bad packages and all that sort of stuff. And it's very, very difficult to solve the problem if you're just solving each individual slice. You have to think about it holistically, which is where I think supply chain security really came in.

[0:21:18] GV: I mean, I think perhaps one of the examples of that that might resonate with the listeners – I'm quite familiar with the JavaScript and TypeScript community, and there was that case where, for example, the name of an NPM package had not been taken, I believe. And so, you're able to come and suddenly insert a package under a commonly used name. I believe it was FS perhaps. But I could be getting that wrong. But is that kind of what we're talking about here? The idea that you may just be using something and haven't thought twice about it. And then, suddenly, unless someone has come and protected that, you can end up pulling in code that you actually have no idea who it's from.

[0:21:57] ML: Yeah. And I think, actually, in the Node, TypeScript, JavaScript community, that's a fairly common problem today. Because it's very easy to just pull down arbitrary dependencies. And some of those arbitrary dependencies are transitive dependencies. You rely on something. And that relies on something. And that relies on something.

And then somewhere down the line, you're relying on something that you think is a good package, but it's not, for whatever reason that may be. It could be that it was hijacked. And we've seen folks steal credentials. We've also seen protestware, where folks who've, for one reason or another, been upset decided to inject malicious software into their own package or otherwise hijack their own package. And there's a lot of that sort of stuff.

And so, I think the thing there, when it comes to supply chain security, is it starts with knowing what you have. Because if you don't know what you have in the first place, which is very common, and you don't know where the things came from, then it's really hard to actually secure the thing in the first place. It's really about, first, knowing. And then saying, "Okay, now that I know where it came from, do I trust the places where it came from?"

[0:23:08] GV: Exactly. I think it'd be good to get into a little bit how a developer – let's just say with no real security experience – would get started. I assume GUAC is a good place that they could start in this sense. And so, how would they get up and running, so to speak, with GUAC?

[0:23:24] ML: Yeah. GUAC – and the website is GUAC.sh. There are docs that are linked in there. We have both a Helm chart as well as a Docker Compose. And we've actually been developing some YouTube videos to help walk people through it. But it's actually super simple to get up and running.

GUAC is still in beta right now. We're building it up to be more production-ready in the next couple of months. But, really, if you have Docker and Docker Compose, you can just run the Docker Compose, which spins up essentially a backend. In this case, an in-memory backend. But we have a bunch of persistent storage options that are actively being developed.
But it spins up a backend as well as a GraphQL API.

And really, the core of GUAC is this GraphQL API. Once again, your supply chain is pretty much a directed acyclic graph. Hey, why not use an API built for graphs? So, GraphQL. And the GraphQL allows us to do a lot of really, really nice things. It allows us to have multiple different storage backends and that sort of thing. It allows us to also create lots of nice integrations. But, really, that's all it takes to get set up. And then you can start to point it at things. We have a bunch of different ingestors that can be configured via the command line, as well as a bunch of other things that are being developed as well. The ingestors will pull down metadata from places like an S3 bucket or a Google Cloud bucket. It can also pull from HTTP as well as Git and some other places.

For example, GitHub releases – we can pull down the SBOMs from there. Also, local files. If you want to just test it out, you could pull from your local files. And you can pull in all sorts of the metadata that we support. Right now, we support things like SBOMs in both CycloneDX and SPDX formats. We support SLSA attestations. If you have a SLSA attestation, we can pull that in as well. And we can pull in VEX statements. VEX statements, for folks who are not super familiar, are essentially statements about whether or not a particular piece of software is actually vulnerable to a vulnerability, right?

There are a lot of vulnerabilities that go out there and it's like, "Well, actually, we're not using this particular feature of our dependency that is vulnerable. We actually can't be exploited." We are able to ingest that as well. As well as pulling data from a lot of different open-source databases, including OSV, which is an open-source vulnerability database, and deps.dev, which is a database that has a bunch of information about dependencies.

And so, really, that's how easy it is to get set up. And then there's also, right now, an experimental visualizer that lets you actually see, in an actual visual graph format, things like: I have this package, it relies on this package, and that package is vulnerable to some vulnerability. And here, if you have a common base image, this is where you would update to fix that vulnerability.

[0:26:22] GV: Very cool. And given the open-source fundamentals of this, I think you've mentioned you've got some people playing around and making some cool tools. For example, I think you mentioned a VS Code extension, which isn't in production perhaps. But I think that was a really nice example of where this could start to get integrated a bit more seamlessly, if you like, into, say, the overall developer process.

[0:26:48] ML: Yeah. Yeah. One of the big things that we want to come out of GUAC is we want to put the data in the hands of the folks who need it, in the format that they need it. For example, for developers – I've been a developer who's worked at a lot of big banks and a lot of other places where the worst is when you get the tap on the shoulder the day before a go-live and it's like, "Actually, you're using a bad library somewhere. It's super vulnerable." And you go and you ask people, "Well, how was I supposed to know that?" "Oh, yeah. There was some scan that was run, but you never got the information – you never got that forwarded to you. So, there was no way you could know."
That's kind of the gap we're trying to fill. We try to provide that to developers.

Because, hey, as a developer, if you told me on day one, "Oh, no. That library – don't use that library. Use this other one. Because that's a much better library, and it's approved by your organization." Great. We're looking to do that. We're working to try and get it integrated with various artifact repositories so we can actually pull additional information, while also potentially being used to take action – like, should we block this from being downloaded, that sort of thing – and working with the build systems to integrate there.

And, actually, an interesting one: some of the folks over at Microsoft Azure recently showed up to our community meeting and showed off this amazing tool that they call GUAC-AI-mole – like guacamole – where they integrated GUAC with ChatGPT, so that for folks who don't want to write GraphQL, you can write a prompt where you might say something like, "Hey, do I have a Log4Shell vulnerability somewhere in my supply chain?" ChatGPT will then transform that into GraphQL and run it against GUAC. And then that same response will be forwarded back to ChatGPT to turn it into human-readable output: "Yes, you have Log4Shell because of a transitive dependency over here." We found that really, really cool as well.

And so, yeah, there are a lot of integrations. And given that it is open-source – and it actually just recently, as of this week, got accepted as an incubating project into the Open SSF and is being transferred over to the Open SSF – we have a lot of folks building a lot of cool features for it. A lot of great integrations as well.

[0:29:04] GV: That's really interesting. And this isn't an episode at all about AI or GPT, but I think it is pertinent to call out that this is where things like GPT can really start to help, depending on where they get integrated. Because, say, for the average developer, just getting up and running with yet another API – yeah, it can be fast, but there's sometimes a limit to that. Being able to humanize the inputs and the outputs is a huge bit here. And anything that can add to the adoption rate is only positive, I imagine.

[0:29:38] ML: For us, I think the thing is there are going to be folks who are going to learn the GraphQL. They're going to know all that sort of stuff. There are going to be folks who are going to interact with the various other tools and APIs that we have for GUAC. But then there are also going to be folks – I'm just going to throw it out there, like your CISO – where maybe the last time the CISO really wrote code was a few years ago. And they just want to be able to write the very quick prompt that says, "Okay, how pwned am I?" Being able to get that answer in a good way really quickly is super useful.

And, yeah, I think AI can definitely help out a lot there. And then, also, one of the things that we're also looking at with GUAC is to try and help secure the supply chain of AI and AI models. Because there was actually a really interesting talk that I saw recently at Open SSF Day in Europe. And over there, it turned out a lot of the very common open-source AI software and AI models have a very bad security posture. And it seems like they're very much at risk of potentially containing malicious software or being hijacked in some way as well.

[0:30:45] GV: Yeah. That is almost a whole episode to itself probably.

[0:30:48] ML: Yep.
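(For readers who do want to talk to GUAC's GraphQL API directly, here is a rough sketch in Python. The endpoint, port and query shape are assumptions based on a default local Docker Compose setup and an evolving schema, so check the GUAC docs rather than copying this verbatim.)

    import requests  # pip install requests

    # Assumed default endpoint for a local GUAC GraphQL server.
    GUAC_GRAPHQL = "http://localhost:8080/query"

    # Illustrative query only -- GUAC's real schema evolves, so use GraphQL
    # introspection or the documentation for the current field names.
    query = """
    query ($name: String!) {
      packages(pkgSpec: { name: $name }) {
        type
        namespaces {
          names { name }
        }
      }
    }
    """

    resp = requests.post(
        GUAC_GRAPHQL,
        json={"query": query, "variables": {"name": "log4j-core"}},
        timeout=10,
    )
    resp.raise_for_status()
    print(resp.json())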
[0:30:50] GV: I'm actually curious, looking at the other side, where let's say I'm a developer and I want to create packages. What kind of considerations should I now be taking into account, knowing that things like GUAC are coming along and going to start taking a much deeper look at how I've put together my package, and the attestations, et cetera? Is there any advice for those that are more on the production side?

[0:31:15] ML: Yeah, I think so. Because the thing that we've been telling folks is, if you're a producer of software, your software is part of somebody else's supply chain. What can you do to help them secure it?

And so, there are a lot of different things coming out of a lot of different places, especially in open source, like the Linux Foundation, like the Open SSF. Actually, there's a really great thing happening right now called Security Slam, which is being put on by Sonatype and the Google Open Source Security Team, but anybody's allowed to participate. It's pretty much just a set of practices to start increasing the security posture of your software – in this case, open-source software. But I think the practices that are in there can be applicable whether you're closed-source, open-source, or anything in between.

And, really, I think the thing is to start looking at the documentation that's coming out of the Linux Foundation, like the CNCF whitepapers, the Open SSF best practices, Scorecard, et cetera.

Just as an example, some really cool practices that are out there: there's a thing called Open SSF Scorecard, which is essentially a GitHub Action that can run on your GitHub and scans your source code repo for all sorts of different things that may be suspicious, like a GitHub Action that is poorly formed and could potentially be hijacked. And there are a lot of other things in there, like do you have a valid license? All that sort of stuff that you would expect, or you'd hope, would be in there.

And there's a thing called Security Insights, which is coming out of the Open SSF and just hit 1.0, which is kind of like a file where, if you're, let's say, an open-source provider, you can put in a bunch of metadata about who the maintainers of a project are and where the vulnerability disclosure policy lives. Because sometimes just knowing where that stuff lives is the first step. And then folks can start to poke around with it.

I think it's about starting to address those things and starting to do things like generating an SBOM. Even if it's not a completely accurate SBOM, having something is better than nothing. Because even just the practice of it helps increase security posture.

And in fact, one of the things that came out recently from one of the folks at Sonatype about the Security Slam was that folks who were doing the right sorts of things – generating SBOMs, generating provenance for their software, signing their software – they found that their software tended to be more secure, that vulnerabilities were being addressed at a faster rate. I think just getting into that hygiene is a really good thing.

And then from the GUAC end, we want to ingest all that great data to help, obviously, both the consumers and the producers of the software. We want to make sure that the consumers have all the information they need so that they can trust your software. That they feel like you're doing the right things from a security standpoint. That you're not going to somehow get them pwned. You're not going to get them fined by the government. You're not going to get them sued because your software led to a breach.
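(As a concrete starting point for producers, here is a small sketch of the "generate an SBOM, even an imperfect one" habit, using Python to shell out to the syft CLI. Syft is assumed to be installed, and flags can vary by version, so treat the command line as an example rather than a recipe.)

    import json
    import subprocess

    # Generate a CycloneDX SBOM for the current project directory with syft.
    # (Tool choice and flags are assumptions; any SBOM generator would do.)
    result = subprocess.run(
        ["syft", "dir:.", "-o", "cyclonedx-json"],
        capture_output=True, text=True, check=True,
    )

    sbom = json.loads(result.stdout)
    print(f"Recorded {len(sbom.get('components', []))} components")

    # Publishing this alongside each release is the "something is better
    # than nothing" hygiene described above.
    with open("sbom.cdx.json", "w", encoding="utf-8") as f:
        json.dump(sbom, f, indent=2)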
[0:34:16] GV: And you've touched quite a few times on Open SSF. And, obviously, congrats on the incubating project there. That's actually how we met – me going to the Singapore chapter. I believe you're part of the East Coast or the New York chapter. Is that a good way to describe it?

Perhaps for those not super familiar with Open SSF, just briefly, what is it really? It sounds like it has helped a lot through your journey. How has it helped?

[0:34:42] ML: Sure. Open SSF stands for the Open Source Security Foundation, and it's part of the Linux Foundation – a child foundation underneath the Linux Foundation umbrella. It is very much focused on open-source security. That includes things like open-source supply chain security, open-source application security, all that great stuff.

And it is very much focused on developing things like specifications, as well as tools, to help the security posture of open source. To go back to some of the stuff I was talking about earlier: the vast majority of source code, even if you have a closed-source application, is actually open source, because you rely on the open source. You want to make sure that that open-source software is secure.

And so, there's a whole bunch of initiatives in there as well. There's one called Alpha-Omega, which is focused on securing the most critical of open-source projects. Some of the things I talked about before, like the Scorecard tool, are under Open SSF. And so, Open SSF has a lot of different things and a lot of different working groups as well.

And for myself personally, it's a really great set of folks – a lot of volunteers who are building great software and having a lot of the great conversations that then go on to other things. In fact, some of the work within the Open SSF has been cited by NIST in some of their documentation for supply chain security. It's been really good on that front.

And then, also, from a business standpoint, it's been great for us. Because, hey, by doing our work in the open-source space and collaborating with the Open SSF – we're a small company, there are only seven of us, including our intern – we're able to collaborate with folks like Google, Microsoft, IBM, Red Hat, VMware. All these giant companies that, if we were more closed-off, we wouldn't actually be able to collaborate with on some of these larger initiatives.

[0:36:42] GV: Yeah. I mean, I think what I've found with Open SSF is it's one of these places where you realize you've got a friend in the space – where you're a developer and you're trying to explain to your company, your employer, or even your team just how important open source is. You rely so much on other people's work, effectively, but there are obviously inherent risks with that. I think they're doing a great job. And, obviously, glad that we managed to meet through that.

I mean, just looking ahead at GUAC, and Kusari as well, what are you building towards? What's the next, let's say, three to six months? What are you working on?

[0:37:17] ML: Yeah. We're building out a hosted SaaS service for GUAC. Once again, obviously, GUAC right now can be deployed by anybody. It's open-source.
But as with all things, not everybody wants to hire a whole team to, let's say, manage a massive graph database and do all that. Whereas, hey, we're the experts in that. We've built out a SaaS based around running GUAC and making it super-efficient, making it super easy to use for the end user who doesn't want to have a team of DevOps, DevSecOps folks run it for them. That's really the start. But we're also obviously building on top of it a bunch of nice, more proprietary features around compliance and some other nice little things that maybe are a little harder to do in the open-source space. That's really what we're focused on in the short term.

Slightly longer term, one of the things that we are looking to also drive in the open source is this Open SSF project called the Security Toolbelt. It's going through a bunch of different names – it was originally called the Sterling Toolchain – but it's going through some evolutions. And that is really around trying to build the definition of a secure supply chain architecture.

It's really about what a secure SDLC looks like, and then how different tools can plug in. Because we view GUAC as a database – a knowledge graph – that can then be used in all the other aspects of your SDLC. It could be used to help inform your endpoint protection about what's allowed to be downloaded. It could help inform your build to make sure, "Hey, have I done all the right things for a build?" And then when you're ready to deploy: do I have a valid SBOM? Do I have a valid SLSA attestation I trust? And all that great stuff.

And so, we're looking at this architecture helping drive GUAC to essentially be that database that everybody uses to better understand their software supply chain, and then all the great integrations that come along with that.

[0:39:24] GV: That's really exciting. Very cool. And where can people find you? And I believe you also have a book coming out. What's the focus of that?

[0:39:32] ML: Yeah. The book is Securing the Software Supply Chain, from Manning. I am a co-author along with Brandon Lum on that book. And the book is very much focused on a lot of what I actually talked about today. The example we use throughout the book is of a bank – a bank that is trying to secure its supply chain, to secure its SDLC.

And so, we look at a lot of the different tools – and I haven't even mentioned so many of the tools that are involved in the software supply chain security space. But we look at stuff like how you threat model your SDLC to better understand where the risks are, to understand what you need to protect. And then how do you start to protect it through all sorts of tools and technologies? Not just GUAC and SLSA, but tools that are coming out of, for example, the CNCF. There's a tool called in-toto. And in-toto is a specification around – it's essentially a way of specifying the policy for how you create, build and produce software, ensuring that the steps that are part of that policy are actually followed.

And then there's another one called TUF, also known as The Update Framework, which is a specification around how you ensure that updates to software are deployed securely. How do you prevent somebody from tricking you into thinking there's been no update?

And so, there's a lot of stuff in the book that goes into a lot of that.
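(To give a flavor of the policy-and-verification idea behind tools like in-toto, without using in-toto's actual API or formats, here is a toy Python sketch: a policy records which build step is allowed to produce an artifact and what its digest should be, and a consumer checks both before trusting it. The step name is invented, and the digest shown is the SHA-256 of an empty file so the final check passes.)

    import hashlib

    # Toy "policy": which build step may produce the artifact, and its digest.
    policy = {
        "artifact": "my-service.tar.gz",
        "allowed_builder": "ci/trusted-builder",  # hypothetical identity
        "expected_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    }

    def verify(artifact_bytes: bytes, claimed_builder: str) -> bool:
        """Check the artifact against the policy before using it."""
        digest = hashlib.sha256(artifact_bytes).hexdigest()
        if claimed_builder != policy["allowed_builder"]:
            return False  # produced by a step the policy does not allow
        if digest != policy["expected_sha256"]:
            return False  # contents differ from what the policy recorded
        return True

    print(verify(b"", "ci/trusted-builder"))  # True: empty input matches the digest above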
We go into a lot of stuff around policy and thinking about how you essentially continually validate that, yes, no new vulnerabilities have come up, no new vulnerabilities have been discovered. And if something has been discovered in your supply chain, how do you remediate it? That sort of thing.

And so, we really dive into every stage of the SDLC, from planning all the way out to deployment and maintenance of the software. And we look deeply at what sorts of things you need to do at each stage. That's coming from Manning. Right now, it's in what's referred to as the Manning Early Access Program.

[0:41:32] GV: That's fantastic. I mean, all I can say for developers is always, yeah, think about it from start to finish – not just the security at the end of the process. I'll certainly be reading that book. And I'm sure a lot of the listeners will as well.

Michael, it's been such a pleasure to speak to you today. I think even I've learned a lot more today. And I'm sure a lot of the listeners out there have as well. I just want to say thanks so much. And hope to speak to you again soon.

[0:41:59] ML: Yeah, of course. And thank you for having me.

[END]