EPISODE 1611 [EPISODE] [0:00:00] ANNOUNCER: Containers make it possible to standardize the deployment of software to any compute environment. However, managing and orchestrating containers at scale is a major challenge. Kubernetes was originally created by Google and solves the problem of scaling container deployment. Ben Elder is a senior software engineer at Google and is an elected member of the Kubernetes Steering Committee. Ben joins the show to talk about why Kubernetes became the standard for container orchestration, Kubernetes control theory, how he runs his home infrastructure, and more. This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His bestselling book, Architecting for Scale, is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments. Lee is the host of the podcast Modern Digital Business, produced for people looking to build and grow their digital business. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com. [INTERVIEW] [0:01:21] LA: Ben Elder is a senior software engineer at Google, and an elected member of the Kubernetes Steering Committee, and he is my guest today. Ben, welcome to Software Engineering Daily. [0:01:31] BE: Thank you. [0:01:32] LA: It's great to have you here. Kubernetes has really taken over the world of container orchestration. But it really isn't the only option. Why has Kubernetes done so well, compared to things like Docker Swarm, or AWS ECS, or any of the other container orchestration services out there? [0:01:51] BE: I think it's a complicated topic. I think some of it is because we have an open, portable standard across providers. Some of it's about community building, and some of it's about the design and approach to it.
The way that Kubernetes is a distributed system where you declare the state that you want, and controllers reconcile towards what you want, builds a resilient approach to doing these things. But I think it's also about the community building, the people, and so many other aspects. [0:02:18] LA: I think that's a great point. We'll talk a little bit later on - well, I want to talk about the APIs and the mechanism it uses for APIs and how that works. But let's talk a little bit about the community aspect first. What is it that keeps Kubernetes going today? Is it the community? Is it Google? Is it a combination of both? What is it that's driving the adoption of Kubernetes? [0:02:42] BE: Well, I think that some of these things do come down to the technology - build something useful, and people will come. But there's a lot to the community building, and that includes people from the CNCF, from Google, from other vendors, and independent folks participating in the community. Within the Kubernetes project, we have whole groups dedicated to aspects of community, like the contributor comms group that works on sending out communications through social media platforms about what we're working on, or the docs team that works on the blog that we host ourselves within the project. There's a lot of work that goes into that sort of thing. We host our own discussion forums, and we participate in others. It's really important to have all of that because it's about ecosystem. I think, all of the technology and so on aside, the biggest thing that Kubernetes has at this point is the whole ecosystem around it, and that comes from community. [0:03:36] LA: An ecosystem is not only plugins and things like that, but it's also companies that have built their entire corporate strategy around supporting people who are using Kubernetes environments. So, there's a lot of support from third-party companies with products and offerings that are compatible with Kubernetes.
[0:03:56] BE: Yes, and that's a good point. I mean, one of the things that I've been close to that we've worked on has been the conformance program, which is about making sure that we actually do have compatible offerings from so many vendors, and that they meet a certain expectation of standardization. [0:04:10] LA: Well, when I think of some of the greatest foundational value that comes from Kubernetes, and what it is that Kubernetes provides, it's kind of a long and complex list. But one of the big, overriding things, if you will, that I think of when I think of the value of Kubernetes is that it's really an enabler of infrastructure as code for infrastructure management. Now, the cloud certainly was a big part of that, and Kubernetes is an independent dimension from the cloud. But Kubernetes on its own provides the ability to build an infrastructure as code offering that can help companies, whether they're in the cloud or not, achieve code-based infrastructure. Is that a valid statement? Or do you agree with that statement, that that's a foundational aspect? [0:05:00] BE: Yes, I think that's totally fair. I think that one of the things it does is ensure that you have a set of common APIs available that you can use for core infrastructure concepts, like your compute workloads, aspects of networking, and storage, that you can target, be it bare metal, or some on-premise solution, all the way out to all the different cloud providers, and pretty much anywhere that you can get compute, storage, and network infrastructure. [0:05:25] LA: It's more than the ability to get it. It's the ability to manage and control it. The real value of IAC is the manageability of the infrastructure that you're building, and the consistency of the infrastructure you're building, and the ability to treat it like you would any other piece of code that's managed, and deployed, and viewed in a standard way.
That's something that Kubernetes supports and actually encourages, I should say. Is that correct? [0:05:55] BE: Yes. I mean, I guess what I was getting at is that having APIs gives you room to build IAC, but it certainly had it in mind from the beginning. An even more particular flavor: Kubernetes really pushes the declarative approach to infrastructure config, and that's kind of interesting, because you can use imperative code to emit declarative config, but it's much harder to go the other way. It creates room to build this IAC ecosystem around, "Okay, you're going to have these APIs to do IAC for on-prem, and cloud, and so on." The whole system works that way, and everything that we run for ourselves - kind of meta - in the project, which is one of the things I participate in, uses pretty much every form of IAC you can imagine, from Bash to Terraform to Kubernetes YAML, depending on the scope of what we're running. [0:06:47] LA: You mentioned declarative versus imperative. I know many of our listeners understand the differences between those. But do you want to talk a little bit about that, and what the difference is, and why declarative is better? [0:06:57] BE: Well, I should say, I think declarative fits for this use case. There are plenty of reasons you might want to do imperative. So, declarative versus imperative: with declarative, you're spelling out, as data, as config, "This is what I want the world to look like," and with imperative, you're writing code, trying to drive things towards that state yourself. I think the really powerful thing about declarative is that it allows us to centralize that controller logic, to take a control theory approach to getting things to the state that you desire, and to implement that, with all of its quirks, once, as a common controller - for something like Kubernetes services, or the kubelet that runs pods.
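The reconcile pattern Ben describes can be sketched as a toy control loop. To be clear, this is an illustrative sketch, not Kubernetes code: the `State` type and the replica counting are made up for the example, and a real controller would watch API objects and act on a cluster instead of mutating a struct.

```go
package main

import "fmt"

// State is a hypothetical, illustrative stand-in for a declared spec;
// real controllers compare API objects against observed cluster state.
type State struct {
	Replicas int
}

// reconcile makes one small move from actual toward desired and reports
// whether they now match. Real controllers tolerate transient failures
// between attempts, which is where the resiliency comes from.
func reconcile(desired State, actual *State) bool {
	if actual.Replicas < desired.Replicas {
		actual.Replicas++ // "create" one replica
	} else if actual.Replicas > desired.Replicas {
		actual.Replicas-- // "delete" one replica
	}
	return *actual == desired
}

func main() {
	desired := State{Replicas: 3}
	actual := State{Replicas: 0}

	// The control loop: keep driving actual toward desired.
	// Eventually consistent - each iteration narrows the gap.
	for step := 1; ; step++ {
		if reconcile(desired, &actual) {
			fmt.Printf("converged after %d steps: %+v\n", step, actual)
			break
		}
	}
}
```

The point of the pattern is that callers only ever declare the end state; the loop absorbs ordering problems and transient failures by simply trying again.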
Then, all of these other tools can just tell Kubernetes, "This is the state we want," and those controllers can continue to drive it. It's an eventually consistent system. With large, complex environments, if you try to do everything sequentially, imperatively, you can run into problems with ordering, or dependencies between things, or transient flakiness. By moving that logic out into a daemon - a controller that continues to drive towards that state - you'll eventually get to what you need, with the resiliency that comes with that. I think that's a really powerful way of running infrastructure, and it's much harder to get right with a purely imperative approach. [0:08:23] LA: Arguably - I think you mentioned this - using purely imperative techniques, it's almost impossible to get the resilience of a declarative system. It's a little bit easier the other way around. It's so much simpler to say, these are the set of systems I want, this is what I want it to look like when it's done, and let the system figure out the optimal way to get there. [0:08:49] BE: Right. And it allows us to hand that off to common upstream APIs, and in many cases, to common central implementations that people can work on together, to get to a really reliable state where we continue to squash bugs and handle quirks of these systems. [0:09:05] LA: So, how does Kubernetes work with other IAC tools, or the IAC infrastructures like Terraform, et cetera? How does it work with them to create a system as a whole? They're not competitive. They're tools that work together. Is that correct? [0:09:19] BE: Yes, absolutely. As I said, within the project itself, we use a combination of all these tools - whatever seems effective for what we're trying to do. For example, if you're trying to bring up a cluster itself, you might want to use Bash or Terraform or something.
But once you have the cluster up, maybe you can do it with pure Kubernetes YAML. Maybe you want to have it combined with the rest of your stack - there are Terraform providers and other tools. We have a whole ecosystem of tools built since Kubernetes was created to let you do this the other way, and start from Kubernetes-style APIs and move them out to all the rest of your application. We call it KRM, the Kubernetes Resource Model. Kubernetes has built a platform where, if you want to extend it to support all these external APIs and infrastructure, we have common tooling for that, and you can build APIs in Kubernetes. But we also see people going the other way, building on tools like Terraform that already existed, that were already targeting all of these external things, and adding Kubernetes to them. There's a really, really rich API and tooling ecosystem in both directions, and it's a combination of Kubernetes itself providing good APIs, Kubernetes providing extension mechanisms for people to build their own APIs, and just the ecosystem around it. [0:10:39] LA: That's really the difference between Kubernetes and, for instance, ECS, just to throw out one of the examples. It truly is an open environment that has community support, and lots of third-party integrations that work with it and make your work better. [0:10:56] BE: Well, and it's something where, depending on exactly what you're trying to do, you may be able to define a common way to run some application and have it just work on cloud, on-prem, on another cloud, because of that neutral API that isn't specific to one vendor. [0:11:15] LA: Are there classes of applications - size, or type, or whatever - where you think that Kubernetes is the wrong answer and another tool is better? [0:11:27] BE: Oh, yes. I've actually written about this before.
When I was newer to these things, I was excited to run Kubernetes everywhere for everything, and my home environment was all Kubernetes, and super complicated. I've since realized that I'm not running very many workloads at home; they're really simple things, and they run just fine as services on one appropriately sized box. I'm not that worried about availability. This isn't a business, and it's not worth my time to overcomplicate it. If you're running a very small number of things, and you know how to run them well - I mean, you can get a surprisingly large VM. I think, within this community, probably a fair number of folks have heard about how Stack Overflow runs things, and how they just have a handful of big machines that they run things on. If you're doing that, and you don't have too many services, you might find that you don't need this whole complex system. I think where Kubernetes really shines is where you want to leverage the ecosystem and its tooling, or if you're running lots of workloads, or a lot of services that are talking to each other, that sort of thing. But if you've got one service, you might not use it. I mean, even the Kubernetes project itself, we haven't quite gotten to the point where literally everything we run is Kubernetes all the way down, because, same thing, it's operated by volunteers, and we don't have the resources to run an ops team. I think Kubernetes also really shines for larger organizations - and we might get to this in some other discussion later, this is a little bit of a tangent - but Kubernetes lets you sort of build your own platform, and if you're not at the point where you need a whole platform, if you've got a few small services, it's probably overkill. [0:13:13] LA: Is there a sweet spot? [0:13:14] BE: Yes. I think once you're starting to expand to some handful of services, and you're starting to think about how you need to manage them, that's when I'd say you want to be there.
If you're running one or two things, and you think you can get by just kind of running them by hand, you don't need it. So, in my home lab, I've just got a couple of Debian packages on a Debian box, and they run as systemd units, and I almost never touch it, and it's just not running anything big. Or, if you can get a managed service that solves your needs, it may be a good fit. Again, we use some of those for running Kubernetes, because they're just the right fit, because of the scale, the amount of people we have, and the workloads running. If we need blob storage, we're not going to spin up a blob storage implementation when it's just easier for us to use one off the shelf, and it's not that important to us, because there's already something that has a de facto standard, and we can swap vendors and so on as we need. But when you're really actually starting to run lots of things, and you're starting to think about, "Okay, how are we rolling things out? How are we running things together? How are we managing our infrastructure that isn't just one box in a corner?" Then you want to start thinking about it. But if you're a startup and you're just running on one machine, and that's working fine for you, you probably don't need the overhead yet, and you'll be fine. But when you start thinking about how you're going to do ops and automation, and you want to leverage all those IAC tools, or you start to scale to more services, then I think you're going to start to find Kubernetes valuable. [0:14:44] LA: So, is Kubernetes by itself more of a straightforward infrastructure play? Or is it more of a platform? I know that's kind of a loaded question because it depends on the definition, so feel free to define those terms however you feel appropriate. [0:15:01] BE: I think a lot of these things are relative to the part of the stack that you work with. But I'm not sure who originally said it.
But someone smarter than me once said that Kubernetes is a platform for building platforms, and I really liked that way of putting it. I think Kubernetes itself doesn't give you a full, stereotypical platform-as-a-service experience. It isn't quite that level of opinionated and batteries-included, where you point it at a repo and you're good. You're going to do more than that. But what it does give you is a portable set of APIs for managing your infrastructure. When you want a platform as a service, you want a little bit more than that. You want more opinions around how you're going to do things like application rollouts. Kubernetes provides a lot of the base needs for that, but the ecosystem around it - or what you choose to build yourself, in some cases - is what will provide the rest that gets you all the way to a full PaaS-style thing. So, I think Kubernetes allows you to build your own PaaS, anywhere. [0:16:11] LA: Yes. So, it's playing in the PaaS area, but itself is more of an infrastructure as a service. [0:16:17] BE: Right. It, in itself, provides all the foundational APIs you need for, "I'm going to run workloads. I need to have network for them. I need to have storage. And this is how the workloads need to scale across nodes," and things like that. But when you start talking about things like blue/green deploys, or something like that, then that's going to be your toolkit of choice from the ecosystem building on top. Maybe, in some cases, companies are going to build something themselves. In some cases, they're going to look to other CNCF projects or ecosystem tools, projects like Knative. [0:16:50] LA: So, let's start talking about the Cloud Native Computing Foundation. We've mentioned that briefly. Google played a major part in the starting and creation of the Cloud Native Computing Foundation. Kubernetes, obviously, was the foundational product or foundational technology that was part of that.
Do you want to talk a little bit about how the Cloud Native Computing Foundation started and how Kubernetes fits into all of that? [0:17:15] BE: Yes. So, I actually kind of missed this time period, but obviously, I've been involved throughout, and I've talked to a lot of people about what happened here. To give a little bit of context, just to frame where I'm coming from: I first worked on Kubernetes as a Google Summer of Code project. Google has a program where they have students work on open-source projects over the summer, under a mentor. Usually, that's just open-source projects that Google wants to support, and the mentors are coming from somewhere else. But with Kubernetes being early, it wound up being a project heavily backed by Google, and I actually worked with a Googler on it. Then, I finished school, and I've since been working on Kubernetes in some form full-time. But that time window where the CNCF happened, I was back in school, and then I came back. So, my retrospective on this, then, is that the people involved in launching Kubernetes early on wanted to make sure that it got moved out to a vendor-neutral foundation. When I first contributed, it was actually under the Google Cloud Platform GitHub org, and that wasn't the right place for underscoring that this is about APIs that are independent of any particular vendor. So, they were looking for a home for it, and folks felt like the existing groups weren't quite the right fit, and wanted a space to host it that was centered around what would become the cloud native space and this ecosystem. So, the CNCF provides a space for cloud native - take what you will of the term. I think people have different definitions, but to me, it's about, in a way, the space around Kubernetes. It's about having your applications built like they're going to run on cloud and scale and so on, but not necessarily actually on cloud.
It might be on-premise, but it's about that approach to how you run things, and the whole ecosystem of open-source tools there. [0:19:02] LA: I find it interesting that with the cloud-native term, if you ask anyone what it means to be cloud-native, and especially if you ask them through the CNCF, the definitions you get never include running in the cloud. They always include methodologies and philosophies and structures and techniques, and they don't necessarily talk directly about being in the cloud. [0:19:24] BE: Right. I think it's more about approaches that came out of the hyperscaling that you see in cloud. But for a lot of users, they're not necessarily actually going to hyperscale, and they're not necessarily leveraging cloud vendors. But we can share the approaches to how you run applications in this very API-driven way, in this way where you're not thinking about individual machines as much as possible. Another way that I've heard people put it, for Kubernetes itself, is that it's like the operating system for the data center. I think that's another good way of putting it. Cloud-native is about thinking about things in that way. It doesn't necessarily involve the cloud - a lot of it does - but it's about the way that you approach things, that you should take the approaches that were developed against cloud. Really, they're useful all the way from an individual edge installation, to your small or large on-prem, and up to the cloud. [0:20:26] LA: Yes. I always talk about dynamic infrastructure as API-driven infrastructure. That's really what we got from the cloud, but like you say, it doesn't take a cloud to make that happen. [0:20:37] BE: A cloud is someone's data center, right? [0:20:40] LA: Yes, exactly. [0:20:42] BE: So, maybe you're sort of running a small one for yourself. [0:20:45] LA: Yes. Other people ask me, what's the difference between private and public cloud?
And I say, the broadness of the applications running on it is the only difference. That is important when it comes to availability - instant availability of resources, and things like that. It's got value in those areas. But other than that, the actual mechanisms are exactly the same. It's no different. [0:21:12] BE: Yes. In some cases, we see people - it's not necessarily a dichotomy. We see some people that are using both. One of the most memorable demos at KubeCon, the CNCF's conference, for me, involved CERN showing running some huge-scale data compute with Large Hadron Collider data from the data centers that they operate, and then expanding out and temporarily spinning up a huge amount of cloud resources running the exact same workload. [0:21:41] LA: Yes. So, what are some of the alternatives to what the CNCF does? It seems you have a very specific philosophy of how you do things. What other alternatives are there? And again, why is the CNCF, why is the cloud native model, so preferred, or so popular? Choose whatever adjective you want there. [0:22:02] BE: I think it kind of depends on what you're actually trying to do. But there are obviously lots of vendor-provided solutions, and there have been various other projects - it's not like Kubernetes is the first thing anybody's ever built for running infrastructure that runs on cloud and on-prem. So, there are a lot of projects that exist. I probably haven't even counted all of them. I think that, like I said, it's complicated how we got here. But I think if you look at it today, the CNCF and the Kubernetes space give you a really energetic, standard ecosystem to build around that you can use. But for certain things, you might find that the scope of what you're doing is small, and there is maybe some more niche solution that targets your needs. Or maybe you're still happy running something that you were using before all this.
I never like to try to tell people, "Oh, this is a silver bullet, and everybody needs to use this tool for this thing." I think that one of the really awesome things about open source is that we have such a low barrier to entry to try out these things. You can spin up something on your laptop and try it out. You don't have to pay anybody anything, and you can see if it looks like a fit for you. If it's not, then, I mean, hopefully we get some feedback from you about what it is that we're not solving well enough yet. But we know that we're not solving everything perfectly for everyone. So, like I said, even for Kubernetes itself, when it comes down to it, sometimes we have tradeoffs with the amount of resources we have from people who are going to run the applications that the project itself uses, and it just makes more sense to use a managed offering. Sometimes Kubernetes has tools to help build around those common things, too. I believe there are open-source projects out there for things like, "I want an S3-compatible API, and I'm going to run it myself," and then you can think about that itself being some common API between on-prem and cloud. But that API didn't have anything to do with us; it was built out by Amazon, in that case. [0:24:08] LA: Right. So, you used a term - Kubernetes is often called the OS for the data center. I've heard that too, and that makes a lot of sense. I've also heard it being called the standard API for application infrastructure. There's value that comes from Kubernetes from the standpoint of being cloud-agnostic, right? You can build applications, and the infrastructure that works with applications, while having less knowledge about the specific infrastructure you're running on - whether it's on-premise, whether it's cloud, whether it's whatever. That's really helped promote the ability to make applications hybrid cloud, or polycloud, or any of the other different models available there. That has made that feasible and viable.
Is that a fair statement, and what would you add to that? [0:25:00] BE: I think that's fair. I would probably point out that anytime somebody says, "Oh, this is the standard," well, of course, someone heard that and just started a new standard. But as I have been saying, I think it's useful that it has become such a big standard, even if it isn't perfect for you. It's a place where you're going to find offerings with pretty much any infrastructure vendor, one way or another, and you're going to find integrations with all these things. We have a place where people can come together and help us evolve the APIs. One of the interesting things about Kubernetes is that it's not just itself the standard API. It's not just a platform for building platforms and a set of standard APIs. It's also a set of ways to build extension APIs. A good chunk of the ecosystem is projects like Istio, where they're creating more APIs using Kubernetes extension mechanisms to create standards for things like, here's how you can do service mesh across clusters and vendors. So, Kubernetes itself is also guilty of creating a space for more API standards to build on. I think that's useful. But having one that's really popular and has a lot of tooling around it means that - for example, maybe you actually really know a lot about the cloud you're running on. And yet, you still might want to use Kubernetes, even aside from any of the other pros and cons, just because of all of the things that people have already built for you around it. [0:26:32] LA: Yes. You're right, there's value to using a platform that's popular, even one that's not a perfect fit for what you're doing, because of the ecosystem that it provides. That's what you're saying, isn't it? [0:26:43] BE: Yes. I mean, it's really hard to be a perfect fit for everyone.
But being a good starting place for everyone, and having room for people to customize what they need, or to come back to the project and say, "Hey, this isn't working, because it's missing this functionality," and getting that signal from everyone, and then kind of building out - okay, now we have a standard API that covers a bit more. The momentum is really useful, I think, and that's kind of the power of open source in general. [0:27:11] LA: So now, you're a member of the Steering Committee for Kubernetes, but you're also a senior engineer at Google. How much involvement does Google have in the day-to-day operation of Kubernetes and the CNCF in general? And how much of it is very much hands-off? I'm talking about how much from outside of Google actually goes into the standards. [0:27:35] BE: Oh, yes. So, at the CNCF level itself, there are a lot of companies and folks that have representation in the meetings for running that, but it's its own independent foundation, under the Linux Foundation, with its own leadership and things. They obviously take input from all of the many vendors that are funding them, not just Google. But they're still running their own foundation with their own governance - for things like, we have the technical oversight committee at the CNCF level, with representatives from all over, who are determining when a project makes sense for inclusion into the organization. Then, as for Kubernetes itself - I mean, again, Google is a major contributor. I've gotten to spend almost six and a half years now contributing to the project, in one way or another, as an engineer at Google. But it's a multi-company project, and there are people from all over. Another important thing is that it's not all just big companies that have people working on it. There are a lot of independent contributors, people who are interested and want to participate in open source, people who do contracting in the space and are contributing their feedback and working on the project.
So, it's a really big space and a really big project, and there are a lot of contributors from all over. [0:29:00] LA: So, let's talk about modern applications in general. What's next in the development of modern app infrastructures beyond Kubernetes? What's the next big thing that's coming up, or is already there? [0:29:17] BE: It's always hard to predict the future. But I think we're going to continue to see more work and more projects and more ideas around how we provide the open toolkits for extending to other APIs - projects like Crossplane, or projects for how you build your PaaS on top. A lot of people don't really want to use Kubernetes directly. It's not quite the right abstraction level for them, and it can be frustrating to be thrown into using these APIs that are a lower level than you're thinking about. So, I think there'll continue to be a lot of work on how we can build out projects, like Knative, that provide a bit higher-level abstraction for running your workloads. Also, having just come from KubeCon - I mean, there's a lot of money, excitement, and energy around all of the AI and ML workloads people are trying to run. It's not something you can ignore. I think the project has already been evolving in a direction to expand and improve APIs for things like integrating accelerators - things like graphics cards - for your workloads. I think we're going to see a lot more around that space. Probably, projects for serving inference workloads are going to be a really popular one. [0:30:44] LA: So, when I - the people I talk to, and the problems they worry about in application modernization, it seems like there are four main things that people talk about. They talk about security. They talk about AI. They talk about data and the complexities of data management. And they talk about scaling. Now, Kubernetes really is a tool that helps a lot with the scaling aspect of things.
Either what does it do, or more generally, how does the CNCF help fit with these other three things - security, we talked a little bit about AI, but also data management? [0:31:17] BE: Yes. So, I think the compute part is something that we've really got in Kubernetes, and networking, storage, accelerators, and security controls have had a lot of API evolution. Some of these are subprojects within the org, or external projects - things like Gateway API for building the future of how you manage networking. But for the AI and ML, for the security, for the storage, some of those things wind up being part of the Kubernetes umbrella, and some of them wind up being under the CNCF, and they're all being supported. There are people working on various approaches to standardizing storage. And we have a bunch of groups doing security work. The Kubernetes project itself has a special interest group for security, and also the security response committee. So, we have things like a bug bounty, funded through the CNCF and their partners, for the project itself. But we also have reports about where there are foot guns, and talks about how we can improve that and what folks should do. For AI and ML, I think we've had some projects in the CNCF space, the broader ecosystem, like Kubeflow. But I think that if we go and look at the technical oversight committee's backlog right now, there are probably at least a few projects looking to join in that space, and there'll be more to come there. [0:32:48] LA: Right. Yes, I fully expect that to be a growth area for the CNCF, in the short to medium term anyway. So, AI will be big. What else will be big for the CNCF, besides those things I mentioned? [0:33:01] BE: I mean, I do think that some of the early topics, like the storage, the networking, and the compute things, are still evolving. We just had Gateway API go GA, and it covers a lot of the things that Kubernetes already had APIs for to some extent, but people were looking for how we can do better.
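As a rough illustration of the Gateway API Ben mentions, a route can be declared as a standalone resource that attaches to a shared Gateway. This is a hedged sketch: the route name, hostname, and backend Service here are hypothetical, made up for the example.

```yaml
# Hypothetical HTTPRoute using the GA Gateway API group
# (gateway.networking.k8s.io/v1). Names and hostnames are illustrative.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route
spec:
  parentRefs:
    - name: example-gateway    # the Gateway this route attaches to
  hostnames:
    - "app.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: example-service  # a Service in the same namespace
          port: 8080
```

Compared with Ingress, the matching and backend semantics here are part of the specified API rather than left to annotations, which is a big part of the portability improvement being discussed.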
For folks who are listening who have used Kubernetes, hopefully they'll know that the Ingress API in Kubernetes is a bit under-specified and doesn't pan out as portably as you would actually hope. So, we've had a lot of energy behind, okay, we need to have common networking APIs that cover more of this, and are better standardized. There's been a lot of work on that. Storage has been getting more mature for a while now with the transition towards CSI and having a standard way to implement drivers and storage interfaces, and to cover more than what Kubernetes had built in. The storage APIs we had were pretty much like, here's how you can connect up some storage, but operations like snapshots for backups just weren't in scope. But with these newer APIs that we've been evolving, CSI does have those operations; it does have managing those sorts of things built in, expanding the standardization and API coverage. I think there's going to be a lot of discussions about how we continue to do that and leverage Kubernetes extension mechanisms to build more common APIs. [0:34:37] LA: So, that's where we're going, or where it looks like we're going. Where do you want to see it go? You personally. Now, not you Google, you Kubernetes, or you CNCF, but you? [0:34:49] BE: Well, one of the things I've been really focused on is sustainability. I think one of the things is, as a project matures and is more widely adopted, and the core has most of what people are looking for, and less energy is being expended on the core, it can get more boring to folks and it gets a little bit trickier to staff and fund. When you're coming down a little bit from the peak of really fast development in the core project, we have to make sure that we adapt to that, and that things keep moving forward, and that we strike the right balance between keeping the system stable and reliable for people, without big surprises or changes, while still moving forward. 
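As an aside for readers: the snapshot support Ben describes is exposed through the CSI snapshot APIs as a `VolumeSnapshot` resource. A minimal sketch of requesting a point-in-time copy of a claim (the claim and class names here are hypothetical placeholders, and the cluster must have a CSI driver with snapshot support installed):

```yaml
# Request a snapshot of an existing PersistentVolumeClaim via the
# CSI external-snapshotter APIs. "data-pvc" and "csi-snapclass" are
# placeholder names for illustration.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: data-pvc
```

The snapshot can later serve as the `dataSource` of a new claim, which is how backup/restore workflows are typically built on top of this API.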
I mean, security is one of those tricky ones, where if we have kind of a bad default from a security perspective, we also have a commitment to stability. I think there's going to be more conversations about how we can continue to move forward on these things. I want to see a sustainable space. I also want to see more of how we can look at all this work people are doing for all these extension APIs and things, and how we can bring some of it back and say, "This should be a standard API." I talked a little bit earlier about how we have this conformance program. So, if you want to be a certified conformant vendor, you have to run some tests that we maintain, and make sure that your cluster matches all the expected behavior, and has all these APIs. But that only applies for core built-in APIs. I'm interested in what efforts we can do to expand the common expectations as we continue to iterate on things like storage and network APIs. I'm really excited about some of the work we're doing that seems a little more boring, but it's trying to take some of the rough edges off. A couple of my teammates have been contributing on an effort to use a language called CEL to configure validation rules for your APIs, to take away, for a lot of users, the foot gun of running a blocking webhook on your API, where you have the choice between a non-blocking webhook, so things that are invalid can get in, or a blocking one. But if you have a bug in that webhook and it goes down, then you take down your whole API, because it's waiting for the response from that webhook. Instead, this gives people a safer, more scoped, built-in way to specify simple rules about what is valid. I think there's a lot of work like that, that doesn't sound as exciting, but will just help take away some of the rough edges and make things easier to use. I hope that a lot more of that continues. 
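To illustrate the validation-rule work Ben mentions: CRD authors can embed CEL (Common Expression Language) expressions directly in the schema via `x-kubernetes-validations`, so the API server enforces the rule in-process with no webhook in the request path. A minimal sketch of a schema excerpt, with hypothetical field names:

```yaml
# Excerpt of a CustomResourceDefinition's openAPIV3Schema. The CEL rule
# below runs inside the API server on every create/update, replacing
# what would otherwise need a validating webhook. Field names are
# illustrative.
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        replicas:
          type: integer
        maxReplicas:
          type: integer
      x-kubernetes-validations:
        - rule: "self.replicas <= self.maxReplicas"
          message: "replicas must not exceed maxReplicas"
```

Because the rule is evaluated in-process, there is no external component that can go down and block the API, which is exactly the failure mode of blocking webhooks described above.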
I hear from people about how hard it can be to run and use, depending on what you're trying to do, and I want to keep seeing us make that easier to do. [0:37:42] LA: Yes. Like you say, arguably, if you're building a large complex application built with thousands of containers, there's nothing on par with Kubernetes to make it work. But if you have a small system, a home system with a very small number of containers involved, Kubernetes is overkill. To me, it would be great to make Kubernetes less and less of an overkill for those smaller environments. And that sounds like one of the things you're talking about. [0:38:09] BE: Yes, I mean, in some ways, being able to have common APIs for all these things people are looking for, and build an extensible system, means that it's always going to be complicated. But there are ways that we can take some of the sharp edges off, or we can improve upgrades, or just make it easier for people to use. A lot of that is happening right now, I think, as people are looking to differentiate in the space. But I want to help continue to drive the core and make sure that some of these improvements and better ways of doing things make it back into the project and are available to everyone. [0:38:46] LA: Yes. Well, thank you. This has been a great conversation, Ben. I very much appreciate your time and energy on this. [0:38:52] BE: Thanks for having me. I mean, this is what I'm really passionate about. Like I said, I started working on this as a college student, just contributing to open source, and I've come back and I spend all my time around this. The community is really important to me, and I've really enjoyed getting to work in this space. I think it's something special. [0:39:08] LA: Great. We look forward to hearing and talking to you more, and maybe you come back again someday. 
Ben Elder is a senior software engineer at Google working on Kubernetes, and he's also an elected member of the Kubernetes Steering Committee, and he's been my guest today. Ben, thank you for joining me on Software Engineering Daily. [0:39:26] BE: Thank you. [END]