EPISODE 1649

[INTRODUCTION]

[0:00:00] ANNOUNCER: Waymo is an autonomous driving company that had its start as the Google self-driving car project. The company has logged 7.1 million driverless miles and recently announced it will expand service to Phoenix, Arizona. David Margines is the Director of Product Management at Waymo and he joins the podcast to talk about Waymo today, the sensing technologies underpinning their cars, the huge impact of AI on their system in recent years, and more. This episode is hosted by Tyson Kunovsky. Tyson is the Co-Founder and CEO of AutoCloud, an infrastructure as code platform. He is originally from South Africa and has a background in software engineering and cloud development. When he's not busy designing new GitOps workflows, he enjoys skiing, riding motorcycles, and reading sci-fi books. Check the show notes for more information on Tyson's work and where to find him.

[INTERVIEW]

[0:01:03] TK: David, you're currently a Director of Product Management at Waymo, where you're working to build a fleet of self-driving vehicles. On the day-to-day, you and your team are working on problems across multiple domains, like perception, mapping, and commercial data infrastructure. Tell us a little bit about your technical background. How did you get started in tech and how did you get to this point in your career?

[0:01:25] DM: Yeah, thanks for that. I looked at some of the other guests that you've had on your podcast and I think I'm a bit of an anomalous guest in that regard. I don't have a formal background in computer science. I was actually going to be a lawyer. I went to Yale for undergrad. My major was called ethics, politics, and economics. But along the way, I got interested in technology. I was a founding member of a startup that built software and solutions in the telecommunication space. As I worked with customers to understand their needs, I realized that what I really loved was the technical aspects of the job. That company was eventually part of an acquisition by SoftBank. I did a few more startups, but really wanted to have the impact that a much larger company offered. I found my way to Google, where I worked on automation products in the ads product area. Then, eventually, I was just really attracted to the mission that Waymo was pursuing, and so found my way there. Once I got to Waymo, just continuing on the technical background, I found myself surrounded by PMs who were even more technical than the PMs that I worked with at Google. A lot of them became role models for me in terms of my own technical background. I made up for that lack of formal technical training by working with them and my engineering counterparts, dedicating myself to learning and playing catch-up, and now I lead a few technical spaces as a result. I'm happy to tell that story, given that I don't think it's held me back at all.

[0:02:52] TK: Awesome. Thanks so much for sharing that. I've had the opportunity a few times now to ride in an AV. It feels like this absolutely magical experience that doesn't even feel real. I'm sure it's been quite a journey to get to the place that Waymo is at today. I was wondering if you could give us a brief history of how the quest to create AVs began and the current state of AVs at Waymo today.

[0:03:16] DM: Yeah, absolutely. Starting in 2009, Google formed the Google Self-Driving Car Project. The history actually goes back even before that.
Some of our early employees even go back to things like the DARPA Grand Challenge, which was a challenge to develop automated vehicles that worked in a variety of environments. The Google Self-Driving Car Project continued from 2009 until it was spun off from Google as Waymo and became an Alphabet subsidiary in 2016. We've hit a couple of pretty incredible milestones along the way. In 2015, we had the first fully autonomous ride with a member of the public on public roads. That was in Austin, Texas. In 2017, Waymo became the first company to put a fully autonomous fleet on U.S. roads, without a human behind the wheel. In 2020, we launched the first fully autonomous public commercial service in Phoenix. For three or four years, we've been operating and continuously growing our service in Phoenix. For that period, anyone in Metro Phoenix could hail a fully autonomous ride with our Waymo One service with no waitlist. We serve a 220-square-mile service area there. It's about four or five times larger than San Francisco. It includes downtown Phoenix, Sky Harbor International Airport, and many neighborhoods across the Metro Phoenix area. For people who live in Phoenix, autonomous vehicles have been a reality for some time. Then, we go to San Francisco and the Bay Area. That's where our history began. It was the second city in which we deployed. Similar to Phoenix, members of the public can also hail our vehicles and go all around the city and county of San Francisco. Across those two cities, we are driving tens of thousands of paid rides to members of the public every single week. Beyond that, we are driving autonomously at small scale in Los Angeles. We have the Waymo tour that goes to various parts of the city and offers people the ability to ride for a short period to introduce the public to the service. Then we have previously announced that Austin is a city where we are planning to deploy next. All in all, a huge amount of progress over the last 15 years or so, since we were first founded. Along the way, we've developed safety methodologies to evaluate our system performance. Before taking any incremental step, like the steps that I just described, to expand the use of our service, we perform a rigorous review of safety readiness and make sure that the software is fully validated for the deployment that it's going to be in.

[0:05:41] TK: It sounds like AV technology has come a long way in a relatively short amount of time. I understand that there are five levels of autonomous vehicles. For those that don't know, can you briefly explain what these five levels of AVs are and maybe, also, give some examples of different companies doing work with them?

[0:05:59] DM: Yeah, absolutely. These levels are defined by the Society of Automotive Engineers. There are five levels of automation, or six if you count level zero, which is no automation. That goes all the way up to level five, which is fully autonomous operation everywhere. Along the way, I think one level that most people would be familiar with would be level two, which is advanced driving assistance. These are things like lane-keep assist. Waymo is a level four driver by SAE terms, which means that no human is required to perform the driving function. There's a lot of, I think, nuance to these terms. We like to think about it in a much simpler and more binary way. You either do need to have a human with a driver's license who's awake and sober and ready to take over behind the wheel, or you don't.
Basically, everything L3 and below, you need to have that human who's ready to take over. With L4 and above, you don't need to have that. Waymo is really just focused on L4 and above, because we think that that is the safest way to get people and goods from point A to point B, whatever the weather, traffic conditions, outside environment, etc., today. You can't buy fully autonomous vehicles today, but you can certainly ride in one with Waymo.

[0:07:16] TK: To build an AV, there are so many different domain spaces and teams. You have teams like Perception, which you lead, that rely on sensors like radar, LiDAR, ultrasonics, and others to interpret the world around the vehicle in tens-of-millisecond ticks, and then use that data with their ML models to paint a picture of what's happening. You have teams like Behavior that need to figure out how to codify specific behaviors into the vehicle systems, given certain circumstances. Then there are other teams responsible for machine learning, mapping, and data infrastructure, not to mention all the other disciplines. Can you give us a rundown of the different domain spaces within the overall AV software and hardware space?

[0:07:56] DM: Yeah, absolutely. Maybe I'll bucket this into a couple of different groups. Let me first start with what you need to operate fully autonomously. First, you need a vehicle platform. We have a vehicle team that works with partners to define the platforms that we're going to operate on. On top of that vehicle, you need your sensor suite, and that requires hardware in the form of sensors. You mentioned some of them, LiDAR, radar, cameras, etc. Interpreting the outputs from those sensors would be your Perception system, which takes those outputs and paints a picture of the environment, so that we understand the world around us. We also have, as you mentioned, mapping. We have a mapping team that logs perception data and then turns that into a prior understanding of the world that other vehicles can use to give them the benefit of having that understanding. The sensing hardware has to connect to a centralized computer within the vehicle. We have just an amazing team that designs the computer within the car. Its compute performance and reliability are really just mind-boggling, and we have them to thank for that. You need to log data, you need to be able to communicate back and forth from the vehicle into the cloud, so there's telematics, etc. Then rounding out the onboard picture, you need a planning team that predicts the behavior of all of the objects and agents that our perception system sees, and then plans a path to safely make progress through that environment, pick up and drop off passengers, etc. Then once you've planned the path that you're going to take, you have a motion control team that executes that plan and actually actuates the vehicle's hardware, so the steering wheel, brakes, gas, etc. That was bucket one. Now, you have the driver, but you need to evaluate its performance. There's a large and critical group of organizations that's tasked with doing just that. I'm not going to be able to hit them all, but maybe just to name a few. One would be simulation. That helps us understand how events from the real world would play out if we changed inputs in our software, or lets us generate fully synthetic ones. We could change inputs in our software, in the environment, in other AVs' behavior, etc. We have a systems engineering team that validates and verifies our performance across a variety of methodologies.
We have the safety research team. It helps us define what it means to drive safely, developing metrics for safe driving, establishing thresholds, and things like that. Data science and data analyst teams help paint a clear picture of our performance, and infrastructure teams empower our engineers to build and deploy software at scale. Now you've got a driver, and you have the ability to measure its performance. Then thirdly, none of this would be possible without the operations and other orgs, which keep our cars in the field with the right software, operate all of our software validation missions, support our riders, etc. It is a very diverse and large organization in which every team has to own its piece of the picture, and has to successfully develop in concert with all those other teams, so we can continue to develop and hill climb on our performance as we go.

[0:10:44] TK: Given what you've just said, this sounds like one of the world's most complex problems to project manage, because there are just so many different teams and requirements for everything, from hardware, to software and operations, to ML models that depend on other upstream ML models. Your teams are so diverse in their domains, but creating an AV requires all this cross-functional collaboration. From a product perspective, how do you get everyone to come together and collaborate successfully?

[0:11:11] DM: Yeah, it's a massive challenge, and it's a great question. I would say that at the core, it starts with a clear vision. Storytelling is really important. Painting a picture of the future that we want to see, whether that's at an extremely broad level, answering questions like, what will Waymo look like at the end of 2024? Or at an extremely granular level, like how we want to be able to take turns of a particular style. Whatever the story breadth is, we tell that story, and we paint that picture in a very clear and compelling way. That story then gets cascaded out to increasing levels of granularity depending on the teams that need to work on it. To give a concrete example, maybe, at the end of 2022, we knew that we wanted to set very ambitious goals in terms of operating in the weather in the cities that we are in. What does that take? To give some examples, it takes a deep understanding of the weather in our cities. What does the environment actually look like? How do we describe that and quantify it? It takes sensing that can see through rain. It takes hardware that can keep sensors clean. It takes software filtering that isn't adversely affected by maybe droplets up to a certain size. It takes behavior prediction that understands how other drivers react to weather, maybe new prediction model inputs for pedestrians, etc. It also takes enough data, gathering enough data to validate that we're good enough in those types of weather and to prove it to ourselves. Each team needs to internalize that bigger picture, that broad story, and then tell the story back for their piece of the puzzle. “Here's what we need to do in order to be able to accomplish that larger picture.” Now, you've got this broad story. You've cascaded it out to, making up numbers, maybe 15 teams that need to work on it. Maybe for 13 out of the 15 teams, that work is tractable. They look at it, they look at the path ahead and what they need to accomplish, and they decide that they're able to achieve it. Maybe for some of them, there isn't a clear path. You have to remember that what we're working on is absolutely pushing the limits of robotics today.
We can't look at a recipe and just follow it step by step. We all have to be inventors. We need to ensure that we bring forth the diversity of thought from many, many different people with many different backgrounds to ensure that we're starting from the most complete picture we can. We start with intuition about the world, with healthy skepticism about our own observations, and we ensure that we're bringing to bear a wide range of those observations. Then we collect data. We try to prove hypotheses, right or wrong. Collaboration, as you pointed out earlier, across these teams is a big challenge. But the most important thing, I think, is figuring out who's accountable for what. One of my favorite things about Waymo is how motivated everyone is to make autonomous vehicles a reality. Safety is urgent for all of us, and that means that accepting the accountability for a solution in this space is a real responsibility. People take it seriously. Once you figure out who's accountable, they're empowered to develop a plan, to coordinate with different teams to execute on it. Or, if they see something that's standing in their way, they're accountable to raise their hand and say, “We need help.” Then we need to rally around that team and figure out how to help them, so that we can all move toward the solution and make sure that we as a team are successful. Hopefully, that paints a little clearer picture of what collaboration looks like, and how we keep everybody marching to the beat of the same drum.

[0:14:22] TK: The amount of collaboration, intentionality, and thought that goes into making all this work is just staggering. As you mentioned, you personally work across multiple domains, like perception, mapping, and commercial data. I was wondering, can you give us an example of a feature that you've built that cuts across all the different domains that you work on?

[0:14:43] DM: Yeah, that's a really good question. I have a good example. There's a feature in particular that I really love, which is that as Waymo vehicles drive around, we are measuring and recording the duration of the yellow lights at intersections that we see. Doing this isn't something that's required. If you dig into this at all, you'll see that there's a model that's published. I think it's published by NHTSA, the National Highway Traffic Safety Administration, which provides guidance to traffic engineers who are programming these lights in real life, and it's based on speed, maybe the geometry of the intersection, etc. What we found is that the model isn't always followed. For one reason or another, sometimes yellow light durations are longer, or shorter, than the model suggests they should be. We always have the model that we can use until we have a good understanding of the real value. But once we do have an understanding of what the real duration is, that can help us make the difference between deciding to go, or deciding to stop, when the light turns yellow. That's an input that we get from perception, that we log, and then we encode into our maps, and then our planning system can reason about it. So that when we are driving towards an intersection and the light turns yellow, we can make a much more accurate prediction of when the light is going to turn red, and therefore, whether we should stop sooner, if the duration is going to be short, so that we don't have a red light violation, or whether we can continue through and have a smoother ride if it's going to be longer.
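As an illustration of that stop-or-go decision, here is a minimal sketch assuming a mapped yellow-light duration prior and simple constant-deceleration kinematics. The names, thresholds, and structure are hypothetical; this is not Waymo's planner logic.

```python
from dataclasses import dataclass

@dataclass
class YellowLightPrior:
    """Mapped prior for one signal: how long its yellow phase lasts."""
    duration_s: float   # observed duration from logged perception, or a model-based default
    source: str         # "observed" or "published_model"

def should_stop(distance_to_stop_line_m: float,
                speed_mps: float,
                time_since_yellow_s: float,
                prior: YellowLightPrior,
                comfortable_decel_mps2: float = 3.0) -> bool:
    """Return True if the vehicle should stop for this yellow light."""
    # Predicted time remaining until the light turns red.
    time_to_red_s = max(prior.duration_s - time_since_yellow_s, 0.0)

    # Will we reach the stop line before the predicted red?
    time_to_reach_line_s = distance_to_stop_line_m / max(speed_mps, 0.1)
    can_clear_before_red = time_to_reach_line_s <= time_to_red_s

    # Can we brake comfortably before the line? (braking distance = v^2 / 2a)
    braking_distance_m = speed_mps ** 2 / (2.0 * comfortable_decel_mps2)
    can_stop_comfortably = braking_distance_m <= distance_to_stop_line_m

    if can_clear_before_red:
        return False                 # proceed through the intersection
    return can_stop_comfortably      # stop if a comfortable stop is possible

# Example: a short 3-second yellow observed in the map favors stopping early.
prior = YellowLightPrior(duration_s=3.0, source="observed")
print(should_stop(distance_to_stop_line_m=40.0, speed_mps=15.0,
                  time_since_yellow_s=0.5, prior=prior))  # True
```

In the workflow described above, the observed duration comes from perception, is logged and encoded into the map, and the published model serves as the fallback when no observation exists yet.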
I think that's just a great example of how different parts of our system work together to produce a solution that's ultimately better for our users and for the safety of our system.

[0:16:16] TK: Makes sense. David, let's dive in and talk about software in a little bit more depth. Talk to me a little bit about the software development lifecycle process for, say, the perception team. How do folks build, test, and release, and what does that overall process look like?

[0:16:32] DM: It's a pretty amazing process, just given the complexity that we talked about a little bit earlier. Frankly, it's quite amazing to watch and participate in, and a privilege to be a part of. Our updates come out like clockwork, maybe not on a second-by-second tick, tick, tick, but certainly, we have a release cycle that we try to stick to very diligently. We have a whole team that works on just improving that clock over time and making sure that we can bring out updates ever more rapidly. Each update undergoes rigorous testing at the unit level. If you're developing new code, you have the ability to test that across a highly nuanced and specific set of scenarios, and it also gets tested as it gets incorporated into the main branch as we test the Waymo driver's capabilities and performance overall. That process unlocks new features and performance improvements for the driver across our entire fleet, from better parking, or multi-point turns, to handling heavier levels of fog and rain. We've spent years refining a comprehensive set of methodologies to assess safety across our technology and operations, and ultimately, guide the deployment and safe operations of the Waymo driver. We call that our safety framework. We actually released it. It's available on our website, waymo.com/safety. We released it about three years ago, about the same time that we opened up our fully autonomous service to the public in Phoenix. We did that so that the communities that we serve could learn more about our deployment processes and how we validate our software and make sure that we have gone through the diligence that is required and expected before we put our software on public roads. There's a huge range of methodologies that go into that. Some of the examples would be the robustness of our hardware, both the compute and our sensor stack, the driver's behavior, as well as our back-end operations. Not only do we validate it in simulation, but we also put the driver through rigorous testing that includes real-world driving, as well as closed-course testing.

[0:18:33] TK: I want to talk more about safety, testing, and simulation. Before we get there, and I know this is a huge question, and obviously, we don't need to talk about anything proprietary to Waymo, but in general, what does the overall AV tech stack look like? What sorts of languages, frameworks, orchestrated machine learning workflows, and technologies go into making self-driving vehicles a reality?

[0:18:56] DM: A lot of this is special sauce, and maybe I'll try to answer in terms of generalities, but also give a couple of examples. I think that Waymo has always been at the cutting edge of ML, and especially in an applied manner. AV technology, and I would say the Waymo driver in particular, is probably the most mature manifestation of AI in the physical world today. I don't think that that's an overstatement. AI is really what made our recent progress possible.
I think the most exciting thing about it is that our driver is truly generalizing. Not just having specific inputs for specific situations, but really, truly learning how to drive in the way that we think about ourselves learning how to drive. Waymo was among the first companies to apply ML in our software stack; in particular, that was made possible through a collaboration with Google in the early days of Waymo. AI has long been a part of our stack, but its role has really grown massively in recent years as the field has made progress on some of these fundamental challenges. In essence, what we're doing is we have a stack that is ML-primary, and it learns from the tens of millions of miles that we drive, both by taking into account evaluation of our own behavior and by observing the behavior of drivers in the world, as well as other road users, like pedestrians, bicyclists, etc. It also learns from the outcomes of tens of millions of simulated miles, and we leverage robust ML techniques for this. I think one of the examples that I really like for explaining how this might look is that anyone who's been to San Francisco and knows anything about it will know that it is an environment that has a lot of very steep hills. You can't just, or I would say, it is extremely challenging to just, program a model that allows you to know how hard to press the gas pedal when you are going up a hill, or how hard to press the brake, or what that braking profile would look like going down a hill. Instead, our ML models have been trained to observe and learn many of the nuances, either through our autonomous specialists who are driving vehicles around, or just by learning over time and being given some inputs about what that profile of acceleration and braking should generally look like. Through our experience driving in San Francisco, the driver has learned overall how people drive, how our autonomous specialists have driven, and how it should drive. As a result, we've been able to develop a model that, while driving up or down steep hills, feels, in my opinion and I think in the opinion of many of our users, much more confident and smooth than even many of the human drivers there. I think that's a great example of something that's extremely hard to give heuristic inputs about how to be successful in. But once you've built the right model and trained it with the right inputs, the result becomes very natural, and it's something that it's able to develop for itself over time.

[0:21:47] TK: I can definitely attest to that. I was pleasantly surprised by the acceleration profiles going up and down the hills in San Francisco when I rode in one of your cars there. I guess I have the same question for you, but on the hardware side of things. There's a significant amount of hardware that goes into making an AV function, everything from CPUs and GPUs to a wide array of sensors to gather all the data that a vehicle needs. Without getting into anything proprietary, in general, what are some of the most interesting pieces of hardware that you work with?

[0:22:17] DM: Yeah, absolutely. I'll talk maybe a little bit about our sensing suite, which is, I think, some of the coolest hardware around, and certainly, I love how we apply it as well. For perception, there are three main groups of visual sensors: LiDAR, camera, and radar. Each sensor complements the others.
They provide our vehicles with a very rich view of the world that makes our cars safer and more capable, including, and especially, in conditions that are anything but a clear and sunny day. For things like driving at night, tumultuous weather, or driving directly into the sun, being able to leverage each of those sensors when they are at their best makes us a safer driver overall. The integrated sensor system is designed to see up to three football fields away, or in metric, three soccer pitches away, or let's just say, several hundred meters. They each have their own capabilities. LiDAR, for instance, can see shapes in three dimensions, can detect stationary objects, precisely measure distance, and it's an active sensor, so it brings its own light source. It isn't affected by things like the dark. Cameras are a passive sensor, so they rely on external light, and they excel at detecting detail, observing colors, reading text, or seeing very, very far away. Radars have their own strengths. They can track both static and moving objects, and they're highly effective in inclement weather. That was the visual sensors. Beyond that, we also have an array of microphones on the vehicle that are designed to sense audio, like sirens. Like our ears, they can localize the sound. They can figure out whether it's approaching or receding, estimate directionality, how far away it is, etc. That becomes tremendously important, especially in urban canyons, where maybe you hear a siren long before you can actually see it and make decisions about how to react to that, so you can avoid any interactions with emergency vehicles. That all gets married together through our custom-designed compute. That's tightly integrated with our sensors and vehicles, and designed for reliability and performance. It enables the processing of this very large and diverse set of sensor data, and allows us to make real-time driving decisions with the extremely low latency that you would need for driving, especially on high-speed roads, or on freeways. With each generation, we've been able to drive down the cost of our hardware, which is very exciting, because you can imagine that with all of these sensors, things could get costly. At the same time, we've been able to upgrade our sensor capability and our compute power. Right now, we're in our fifth generation of hardware, and we've been able to simplify the design and manufacturing process, so that it's essentially production-ready. As a result, it delivers more capability and performance, but at half the cost of some of the previous generations.

[0:25:04] TK: It almost feels like you're building a space vehicle, but for land, given all the complexity and sensors and hardware and software. I'm curious to talk a little bit more about data and ML orchestration. Overall, how do data pipelines and automation work in the AV industry? You're continuously gathering this massive amount of data, so how do you figure out what to gather, and then how to ship that to the correct places and process it to get the insights that you need?

[0:25:32] DM: Yeah, absolutely. Those are both complex challenges. Let me first talk about data, and then I'll talk about automation. On data, the most interesting thing, I think, is how our approach has changed over time. We have a couple of dozen cameras, five LiDARs on the vehicles, a number of radars, we've got our microphones, etc. As you can imagine, we are generating and logging, potentially, a tremendous amount of data.
When we were only driving a tiny amount years ago, every single byte was precious. We logged and stored everything. We had so much to work on that there were insights to be learned, even from driving that humans might consider mundane. But then over time, our driving performance improved, and the number of miles that we're driving also increased considerably. We found that collecting the right data was far more important than collecting more data. You can't just throw exabytes of spaghetti at the wall and see what sticks. You have to be a little bit more circumspect about it. What we found was that starting the conversations about what data we would eventually need really opened up a lot of fascinating capabilities about how to identify that data in a more real-time fashion and be a bit parsimonious about what data we would actually want to keep around. What that gives you is a path to far more efficiency, because now you're having to deal with a lot less data, but much higher quality data. I think that makes all the difference in terms of becoming a lot more efficient and being able to move much faster. Now, you've collected the right data. How do you automate? How do you make things efficient? Waymo is a large organization, and ensuring that all of our employees, I think in particular software engineers, can work efficiently is incredibly important for an organization of our size. If there's any waste or tedium that they have to go through, it just slows everything down. We try to employ automation wherever we can. I'll talk about maybe a few different levels. At the broadest level, we utilize our simulation capabilities to automatically test new software over, as I mentioned before, billions of driving miles. Generating these pipelines and aligning on the data sets to use requires a massive investment in time and resources, but it's worth it, both because of the insights that it produces and, just as importantly, how efficiently it can produce them as compared to teams just combing through data bit by bit. That's the broad level. Zooming in, one of the things that Waymo does, I think, incredibly well is define granular problem areas with incredible specificity that we try to improve on. As a hypothetical example, maybe you'd want to improve on multi-point turns in narrow environments. Now, you have maybe a small, efficient, highly focused team that's working in that area, but how can you make them work even more efficiently? Well, there's opportunities for them to automate all kinds of processes that they need to work on, from identifying instances of their problem space, to quickly visualizing hundreds of those instances at a time and picking the interesting ones that they want to work on, to re-simulating those events on new software to see how much better new code works. These are areas that we're constantly investing in, because it helps us move faster, more efficiently, more successfully. These investments really pay off in terms of the team's throughput.

[0:28:49] TK: That's an amazingly complex data problem. It's really interesting to think about the intentionality that one has to have dealing with, to your point, exabytes of data, deciding what to gather, and then how to make that data available in an efficient fashion for everybody, so as not to slow them down. If we dive a little bit deeper into data for a moment, from a data science and ML perspective, what kinds of insights do you look to gather from your data, and then, how do you subsequently use those insights effectively?
Perhaps even a real-world example here from perception might be insightful.

[0:29:22] DM: Yeah. There's no end to the variety of insights that we can gather from our data. One of the things I really, really love about Waymo, especially now that we're driving so much, is the huge variety of situations that the world presents to us, and then thinking about how we can improve the Waymo driver to interact with, or avoid, them. Maybe I'll try to tackle the insights question at three different levels; that'll give you a flavor of how we think about this. Maybe starting at the most granular level. The first level might be insights from situations that are extremely long-tail, or rare. Sadly, some of these long-tail situations are collisions that humans have caused, which our cars have witnessed. They took place maybe in the same intersection, or near us. Others are just wild, or reckless behavior. A specific example would be from years ago. I remember we saw a driver who was traveling 65 miles an hour above the speed limit in a 35-mile-an-hour zone. He was going 100 miles an hour in a 35-mile-an-hour zone. That's just insane, right? It sparked a lot of interesting conversations about how to incorporate that type of outlier data into our models, right? Specifically, about behavior prediction, right? Typically, when you're driving on a 35-mile-an-hour road, you don't think that you're going to have 100-mile-an-hour speeders. When you do, you need to be able to drive safely around them. Maybe zooming out a bit, the second level of insights that we look for is when we can start quantifying data. In particular, developing nomenclature and terminology for situations. One example that I love is the challenge of describing intersections. We all know what an intersection is. We know that there's a lot of variety among them, but creating a shared language for that variety becomes a breakthrough in and of itself. One methodology that one of our engineers developed is giving each intersection a fingerprint, based on the number of lanes and directions the intersection has. Essentially, it becomes a hash. Now, all of a sudden, you can group and analyze data with a lot more density and richness, whereas previously each intersection was unique. Now, you can group this data, right? You can analyze it as a group. If you think about that from an ML perspective, as you decide the inputs that you would want to use in a model, you want to have granularity where you need it, but you want to have richness where you can. Creating nomenclatures and ways to quantify and group these things can be an insight in and of itself. Then zooming out a little bit further, maybe the final level is the set of insights that you'd use to describe driving performance. As I alluded to earlier, the variety here is really endless. But the important point, and I think this insight goes beyond just the AV space and is broadly applicable, is picking metrics that provide one, or all, of the following three characteristics. One would be discovery, right? Is the metric helping you find situations that are novel, or interesting, that you would want to work on, or improve on? Two, improvement. Is the metric letting you track improvement in driving performance, or system-level performance, over time? Three, thresholds. Is the metric telling you that your performance in a particular area meets some threshold, or goal, that you've set?
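As a rough sketch of the intersection fingerprint idea described above, the example below hashes a canonicalized description of an intersection, here just lane counts per approach direction, so that structurally identical intersections group together for analysis. The field names and canonicalization scheme are illustrative assumptions, not the actual methodology.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Approach:
    """One road arm entering the intersection."""
    heading_deg: int     # approach direction, rounded (e.g., to the nearest 45 degrees)
    inbound_lanes: int
    outbound_lanes: int

def intersection_fingerprint(approaches: list[Approach]) -> str:
    """Order-independent fingerprint: identical lane structure, identical hash."""
    canonical = sorted(
        (a.heading_deg % 360, a.inbound_lanes, a.outbound_lanes) for a in approaches
    )
    key = ";".join(f"{h}:{i}:{o}" for h, i, o in canonical)
    return hashlib.sha1(key.encode()).hexdigest()[:12]

# Two four-way intersections with the same lane structure share a fingerprint,
# even if the map data enumerates their approaches in a different order.
a = [Approach(0, 2, 2), Approach(90, 1, 1), Approach(180, 2, 2), Approach(270, 1, 1)]
b = [Approach(270, 1, 1), Approach(0, 2, 2), Approach(90, 1, 1), Approach(180, 2, 2)]
assert intersection_fingerprint(a) == intersection_fingerprint(b)
```

Grouping by a fingerprint like this is what turns a set of previously unique intersections into categories dense enough to compare and to feed as model inputs, as described in the answer.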
I think by bringing those three things together, you can decide whether the insight that you are developing is worth tracking, worth developing against, and worth developing an ML model to improve on over time.

[0:32:45] TK: The more I hear you speak, and you've previously mentioned this a couple of times, the more apparent the importance of the simulation process becomes to me. How does the simulation process work from a technical perspective? AVs sit at this really interesting intersection of ML and safety criticality. Because safety is so important, you can't just ship new features like you would with a software platform and hope for the best, right? Before new features are released to the real world, as you mentioned, they're first simulated for thousands of hours and tens of thousands of miles. What does that process look like?

[0:33:19] DM: Yeah, absolutely. I think you've hit some of the broad points on simulation. We regard it as a tremendously powerful tool to help extend our real-world driving. It plays an important role in testing new software. Outside of the testing that we do in the real world, we do a lot of testing in the virtual world. I've mentioned a number, you know, tens of billions of miles that we've driven in simulation at this point. The way that we use it is to validate new software and to also create rare scenarios to give our cars more experience. If you think about the lifetime of human driving, a human might drive, say, 300,000 miles over their life. That's something that the Waymo driver now experiences on the order of maybe a couple of days. We're doing many human lifetimes of driving. Situations that a person has never experienced in their life might become a little bit more routine, or need to become more routine, for the Waymo driver, and simulation is the environment in which we can create that and test it. We have a whole suite of simulation tools with various capabilities. Some allow us to test our sensors and different sensor inputs on a very modular level, while others allow us to test the whole system and service performance end-to-end. Maybe giving a tangible example, simulation allows us to take the sensor inputs, or other inputs from our driving, and change some of those inputs to see what would have happened. Let's say that you have an event where maybe one of our vehicles missed a lane change, even though a human might have been able to make it. First, you can rerun that event from the logs that you gathered, and you can run it in a simulation with no inputs changed and validate that you get the same result. But then, you can also change maybe your planner logic about how we plan and stage lane changes, or maybe your behavior prediction about the vehicles around us, which might have impacted that lane change. That lets you figure out how you can improve on that particular event. Then to ensure that you haven't overfit, you would also want to rerun the new code that you've developed against a much broader set of lane changes and look at broader metrics to make sure that overall, for situations like this, you're making things better. You can also inject changes into the event outside of just our driving code, right? For instance, maybe you want to change a driver near us to be more aggressive, or change the weather that you're driving in, and see what happens.
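A minimal sketch of that re-simulation workflow, replaying a logged event unchanged, then with a candidate planner change, then under counterfactual perturbations, might look as follows. The scenario format, the toy simulator, and the configuration keys are all illustrative assumptions, not Waymo's simulation tooling.

```python
from dataclasses import dataclass, replace

@dataclass
class Scenario:
    """A logged driving event: identifier plus knobs we can counterfactually change."""
    log_id: str
    agent_aggressiveness: float = 1.0   # 1.0 = as recorded in the log
    rain_rate_mm_h: float = 0.0         # 0.0 = as recorded in the log

@dataclass
class SimResult:
    completed_lane_change: bool
    min_gap_m: float                    # closest approach to surrounding agents

def resimulate(scenario: Scenario, planner_config: dict) -> SimResult:
    """Toy stand-in for a real simulator, just so the example runs.
    Assumed deterministic for a fixed scenario and planner configuration."""
    required_gap = planner_config.get("lane_change_gap_threshold_m", 8.0)
    effective_gap = 10.0 / scenario.agent_aggressiveness - 0.1 * scenario.rain_rate_mm_h
    return SimResult(completed_lane_change=effective_gap >= required_gap,
                     min_gap_m=effective_gap)

def evaluate_change(scenario: Scenario, baseline_config: dict, candidate_config: dict) -> dict:
    # 1. Replay the logged event unchanged: should reproduce the logged outcome.
    baseline = resimulate(scenario, baseline_config)

    # 2. Replay the same event with the candidate planner change.
    candidate = resimulate(scenario, candidate_config)

    # 3. Counterfactual: same candidate code, but a more aggressive nearby driver and rain.
    stressed = resimulate(replace(scenario, agent_aggressiveness=1.5, rain_rate_mm_h=10.0),
                          candidate_config)

    return {
        "fixes_logged_event": candidate.completed_lane_change and not baseline.completed_lane_change,
        "stressed_min_gap_m": stressed.min_gap_m,
    }

# Example: a logged missed lane change, re-simulated with a less conservative gap threshold.
report = evaluate_change(
    Scenario(log_id="missed_lane_change_0421"),
    baseline_config={"lane_change_gap_threshold_m": 12.0},
    candidate_config={"lane_change_gap_threshold_m": 8.0},
)
print(report)   # fixes_logged_event: True; stressed_min_gap_m is roughly 5.67
```

As the answer goes on to describe, a real candidate change would then be rerun against a much broader set of similar events and release-level metrics before being folded into a release.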
This really gives you the opportunity to test and validate the code that you're writing and get some near real-time feedback on it, while also testing and validating its performance quite robustly. Then as part of our software validation process, which we talked about earlier, your code would be incorporated as part of a release and then tested against a very broad set of driving to make sure that overall, the new release is better than the old one.

[0:36:01] TK: David, it's been fascinating to learn about the work that you and Waymo are doing. I could keep asking questions for hours here. This is such a deep and wonderful domain to learn about, but we're coming up on the end of our time together. I have one last AV-specific question for you, which is that in your mind, what are some of the biggest challenges that still remain to be solved in the overall AV space, as you strive for true level-five autonomy?

[0:36:24] DM: Yeah. We've come a long way over the past decade-plus from the research project at Google, to being the first company in the world to operate a fully autonomous ride-hailing service. You may have also seen that just yesterday, we publicly announced that we started driving fully autonomously on freeways. We're moving quickly. We're safely accelerating how we scale across multiple geographies and we've solved many of the core challenges. Maybe to directly address your question, I think that there's an opportunity to make even faster progress. At Waymo, safety is urgent, because every human-driven mile that we are replacing with a Waymo driver-driven mile is a safer one, right? We are preventing safety issues on the road with each one. We want to accelerate that over time. What are the major challenges left? I think a relatively obvious one, which I think we didn't talk about before, is snow. We set forth to tackle many kinds of weather over the last number of years. We've been very successful in that. Snow presents new challenges, and so I think that that is going to be a pretty fascinating one. Last year, we went to New York City to start collecting data around that. This winter, we're driving around in Buffalo to gather insights about what the challenges look like there. Another example would be some of the more challenging driving environments that we'll encounter when we go international. I've spent time in India, say. I cannot wait for the day when the Waymo driver learns to navigate in environments like that. It will represent a level of maturity and sophistication in our system that I think will be just truly mind-blowing. That would be very exciting to tackle.

[0:37:56] TK: I was going to say, if you've ever had the pleasure of trying to cross the street in Delhi, or Hanoi, you'll know that those situations and those environments can be quite hairy. David, if folks are interested in learning more about you, or the work that Waymo's doing, what's a good way to connect?

[0:38:12] DM: Yeah, happy to. I'm reachable on LinkedIn. Feel free to shoot me a message. If you want to find out more about Waymo, our website has all kinds of great resources. Maybe I'll point your listeners to two. One is our careers page, waymo.com/careers. That's the best way to find out about opportunities to join our team. Two is our safety page at waymo.com/safety. The safety page in particular has just incredible content that describes our approaches and some of our results in huge levels of detail.
I would really encourage folks who are interested to just go and take a gander around there, download some papers, and see what piques their interest.

[0:38:45] TK: David, thank you so much for coming on Software Engineering Daily and talking to us about your experiences building AVs at Waymo.

[0:38:53] DM: Thanks a lot, Tyson. This has been a lot of fun.

[END]