EPISODE 1678

[INTRO]

[0:00:00] ANNOUNCER: Autonomous vehicle engineering is a huge challenge and requires the integration of many different technologies. A self-driving car needs data from multiple sensors, ML models to process that data, engineering to couple software and mechanical systems, and much more. Ian Williams is a Senior Staff Software Engineer at Cruise. Before that, he worked at Google, Lyft, and eBay. He joins the show to talk about the diverse engineering challenges and strategies associated with building self-driving cars.

This episode is hosted by Tyson Kunovsky. Tyson is the co-founder and CEO of AutoCloud, an infrastructure-as-code platform. He is originally from South Africa and has a background in software engineering and cloud development. When he's not busy designing new GitOps workflows, he enjoys skiing, riding motorcycles, and reading sci-fi books. Check the show notes for more information on Tyson's work and where to find him.

[EPISODE]

[0:01:07] TK: So, Ian, you're currently a senior staff software engineer on the perception team at Cruise, where you're helping to build out a fleet of self-driving vehicles. Prior to that, you were an engineer for companies like Lyft and Google. Tell us a little bit about your software engineering background. How did you get started with programming? And how did you get to this point in your career?

[0:01:27] IW: It's been a little bit of a roundabout way. I guess I'll start from the very beginning. My mom was an employee at Microsoft when I was growing up, so I would always go visit her at work, and I thought, "This is what I want to do." Then, I studied mathematics in university and was rejected twice from the computer science school, which was a great lesson in never giving up. Then I worked for a few years at eBay, and then at Google, at Lyft, and finally made my way here to Cruise. Obviously, there will be some things that I can't get into that are proprietary. I'm here representing myself as a machine learning practitioner, and my views aren't official Cruise policy. But I'm happy to get into candid conversation about interesting topics in this technical space.

[0:02:12] TK: Thanks so much for sharing that. I guess, being a software engineer on the perception team, you're probably having to solve a lot of ML-related problems. What's your ML background? What got you interested in ML in the first place? Did this start with your work at Google? I understand you were working on speech-to-text, or was it somewhere else?

[0:02:30] IW: Yes. So, I think that studying math gave me a little bit of a background. And then a lot of coincidence. At my first job out of college, I worked closely with some people who were interested in machine learning, and we toyed around with some ways that we could use it on problems we were solving. Back then, there were some kind of fun, small, open-source tools - something called Weka, which some of the listeners might know. It was an open-source machine learning toolkit from the University of Waikato in New Zealand. I remember setting up a poll on Facebook, where I asked friends to rate some statuses as either good or bad. Then, I tried to make an early sentiment classifier just using the goodness or badness of certain words. This was back in 2011. Then, from there, it blossomed. I worked at eBay, where we did predictive delivery estimates, trying to guess when a package would arrive. Of course, eBay doesn't control shipping packages. That's the sellers themselves.
Then, at Google, I worked on speech recognition, as you mentioned, and it's just continued from there.

[0:03:32] TK: Can you give us a rundown of what you actually do on a day-to-day basis, and what perception teams at companies like Cruise, Waymo, Aurora, and others typically work on?

[0:03:41] IW: Oh, man. Well, I wear a lot of hats, so my day-to-day really depends on the day. I guess, let me start with what any perception team anywhere in the industry works on, which is ultimately taking computerized, binary sensor readings of the world and trying to make some sense of them in the way that humans do. So, it's taking a very unstructured set of information, whether that's from a sensor like a LiDAR, or a radar, a camera, or what have you, and turning it into something that has some meaning in the way that humans interpret it. So, this is an object. Maybe we could try to tell you what kind of object it is. We can tell you where the barriers are. We could try to do localization to figure out where in space we are. But essentially, perception is about taking what looks like random bits of information and extracting from it some useful information.

[0:04:37] TK: From chaos to order. That makes sense. I guess the thing that I'm most curious to learn about is the end-to-end lifecycle of machine learning for perception problems. I imagine this has to start with data, and you're obviously dealing with a lot of data here. If I had to guess, you're probably having to gather all kinds of data from different sensors at ticks of tens of milliseconds, and when you think about the logistics and costs associated with simply storing and moving all that data across a fleet of cars, it must be completely prohibitive to gather everything. So, how are you optimizing for this problem on the front end? And how are you being intentional about what you gather?

[0:05:15] IW: That's a great question, and one of the many challenges that arises in the self-driving problem, and in fact, in any sort of autonomous or machine learning task. One thing I want to say upfront is that I can't say anything specific about what Cruise does, or talk about technical details that other folks wouldn't be aware of. But I will talk, of course, about industry trends and things that, coming from a knowledgeable person, the listeners maybe wouldn't know about and would find interesting. So, there is a huge amount of data that gets generated when you have a large number of vehicles out there that have a huge number of sensors and are producing a whole bunch of logs and analyses as they're going. If you even just think about the size of images, radar data, LiDAR data, if you think about the number of sensors on those vehicles, and you consider the frequency at which they're capturing data around them in the world, and you multiply that by the amount of time they're out on the road, and so on - you can certainly store all this data. But actually, there are a lot of costs associated with transporting it over the wire. There are a lot of costs associated with processing it. Storing that data is not the same as having it in the format that's useful for you in all the different applications that you might run it on. You have to also consider that there are derived data formats. So, if you capture a raw camera signal, of course, this is not what most of your algorithms work on. They work on features extracted from that image. The logic that is applied to do this may change over time as you make improvements.
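
As a rough illustration of that last point - a hypothetical sketch, not anything from Cruise's actual pipeline - derived features are tied to the version of the extraction logic that produced them, so logs written under an older version have to be re-run through the current extractor before newer models can use them:

```python
import numpy as np

# Hypothetical sketch: a versioned feature extractor for raw camera frames.
# Features written under v1 are not comparable to v2 features, so archived raw
# frames must be reprocessed whenever the extraction logic changes.
FEATURE_VERSION = 2

def extract_features(frame: np.ndarray, version: int = FEATURE_VERSION) -> np.ndarray:
    """Turn a raw HxWx3 uint8 frame into a small feature vector."""
    gray = frame.mean(axis=2)  # crude luminance channel
    if version == 1:
        # v1: global statistics only
        return np.array([gray.mean(), gray.std()])
    # v2: add a coarse 4x4 grid of local means on top of the global statistics
    h, w = gray.shape
    blocks = gray[: h - h % 4, : w - w % 4].reshape(4, h // 4, 4, w // 4).mean(axis=(1, 3))
    return np.concatenate([[gray.mean(), gray.std()], blocks.ravel()])

# Reprocessing an archived log means paying the extraction cost again for every frame.
archived_frames = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8) for _ in range(3)]
refreshed_features = [extract_features(f) for f in archived_frames]
```

Multiply that reprocessing cost by sensor count, fleet size, and years of logs, and keeping derived data fresh stops being a trivial expense.
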
So, if you store data from years ago, over many different kinds of cameras, and then you want to try and use it at a future point, you're going to have to reprocess all of it and re-extract features, and some of these algorithms are not cheap. So, having that data is one thing, but actually getting it to the right place at the right time and making it available for use can be quite expensive. As you said, you do need to have a solution for this on the front end. You need, A, some prior knowledge about what data you care about. Some of this is intuition: you know that there are certain kinds of challenges that you want your vehicle - you want your models - to perform better on, and so you might prioritize gathering that data. You could wait for it to happen randomly in the world and set up some kind of trigger on your logs that says, "Okay, I care about this and I want to keep it." Maybe you could schedule someone to go drive in a particular area where you know that it's going to happen, and you could increase the odds of that kind of event taking place. But ultimately, you do need some kind of strategy. It can't be just to capture everything, and then process all of it using all kinds of expensive algorithms and hope that you land on the needle in the haystack.

[0:07:48] TK: Yes. Definitely sounds like a tradeoff between expense and being able to get what you need from a data and insights perspective. I'm curious about your models. I'm sure there are a lot of different models and a lot of different contexts in which your data is used. You probably also have models that depend on other models. So, how do you make sure that the system behaves in a safe and predictable way? Talk to me about the intersection of machine learning and safety criticality.

[0:08:17] IW: Oh, boy. I think this is one of the fundamental problems that humanity is going to have to take on in this era of increasing reliance on artificial intelligence and machine learning for all kinds of applications. One of the benefits of, I guess we could call them, legacy systems that are rule-based is that they are entirely deterministic and predictable. Given a certain set of inputs, you know exactly what the outputs will be, and you don't need to check every set of inputs and test their outputs, because you know the set of rules that describes the behavior. So it's easy, by inspection, to say that you will always get some outcome that you desire. When you introduce a machine learning model, you don't have this property anymore, and it gets even harder when you have models that depend on models in a kind of network of models. Now, the question is, how do you scalably evaluate that the output space of all these models is going to do what you want? If you think of a machine learning model - or rather, a neural network - as a general function approximator, which is attempting to map a learned set of inputs and outputs in a nice, consistent, smooth way, then you essentially hope that at each point on this curve, you're getting the function you want. But that's not a guarantee, and there is potential for very weird things to pop up in there. The only way that we know of right now to know 100% for a fact would be to inspect the entire space in a huge number of dimensions. We obviously can't do that. In fact, it's not even clear exactly how you might do that for some of these problems. So, you rely on statistical approaches. This is out of my area of expertise.
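
To make the "statistical approaches" idea slightly more concrete - a generic, hypothetical sketch, not a description of how any AV company actually validates models - one common pattern is to sample perturbed inputs around logged cases and estimate how often a learned function violates a simple invariant:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x: np.ndarray) -> float:
    # Stand-in for a learned regressor (e.g. some quantity a downstream consumer relies on).
    return float(np.tanh(x).sum())

def is_acceptable(y: float) -> bool:
    # Stand-in safety invariant: the output stays inside a plausible range.
    return -10.0 < y < 10.0

def estimate_violation_rate(seed_inputs, noise_scale=0.1, samples_per_seed=1_000) -> float:
    """Crude Monte Carlo estimate of how often the invariant fails near known inputs."""
    violations = total = 0
    for x0 in seed_inputs:
        for _ in range(samples_per_seed):
            x = x0 + rng.normal(scale=noise_scale, size=x0.shape)
            violations += not is_acceptable(model(x))
            total += 1
    return violations / total  # a real analysis would also need confidence bounds

seeds = [rng.normal(size=8) for _ in range(5)]
print(f"Estimated violation rate: {estimate_violation_rate(seeds):.4f}")
```

In practice this is far more elaborate - composed systems, scenario proxies, confidence intervals - but the shape of the argument is statistical rather than exhaustive.
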
There are people who look at the entire system composed together, look at it in a whole bunch of different environments, come up with proxies for events that are likely to be problematic elsewhere, and construct a statistical method of knowing that the likelihood of a certain thing is rare enough. In this industry - and in fact, not just self-driving, but any safety-critical industry - there's a huge body of well-known practice for how to do this. The aviation industry is a great example of a place that has learned a lot of these lessons over time. So, I think there are still many lessons to come about the part of it that you mentioned, the intersection with machine learning. But the short answer is that in today's world, it requires a huge amount of work. It takes a lot of people who know the details of their models very well spending a lot of time evaluating them for safety and making sure that nobody's going to be put at risk.

[0:10:51] TK: That's actually a great segue. My next question for you is: in general, what are some of the biggest bottlenecks when it comes to using the data for your models? Are they in, as you've talked about, collecting, storing, and transforming the data, or in something else?

[0:11:06] IW: Another great question. Well, there's one I touched on which I can expand on a little bit, which is getting data into the format you need at the right point in time. To give you an example of how this can be non-trivial: I gave an example earlier of, say, processing an image to extract some features. This may not be that hard. It might be some convolutions over some pixels, and then you pull out some patterns in some feature space. But it could be much more challenging if you think about a cascade of models where some are acting on upstream models. This means that if you want to know what kind of inputs a model is going to see in a given situation, you need to replay all of the models that are upstream of it, which means you need to run inference on a huge number of cascaded models. For your model, maybe you only care about image space, but some of those other models may take some other kinds of data as inputs. So suddenly, just the task of regenerating a particular input to a model ends up being a full-scale replay of a huge number of models using a whole bunch of different data. Now, what happens when different versions of these models change over time, different algorithms come in, and maybe some data gets compressed in different ways? Suddenly, you find that all of these become quite compute-expensive. This is a big one. There's actually another thing I would like to touch on too, which is: suppose you're willing to throw money at it, or you have some clever ways to deal with getting data into the right format to use. There's still the question of, which data? And that's the crux of machine learning. It's very exciting to talk about novel architectures, and every now and then one comes out that shifts the paradigm for the way that we think about machine learning. But ultimately, it's always a data game. Without data, without good data, without the right data in the right proportions, there is no machine learning. So, you can capture all the data you want, but you can't just naively throw that data at the model. You have to know the distribution of the data going into it. In this way, you make an assumption about which properties of that data matter.
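
To illustrate the cascaded-replay point above - purely a toy sketch with made-up model names, not any real stack - regenerating the input of one downstream model means re-running everything upstream of it in dependency order:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Made-up dependency graph: node -> the upstream models it consumes.
DEPS = {
    "camera_features": set(),
    "lidar_features": set(),
    "detector": {"camera_features", "lidar_features"},
    "tracker": {"detector"},
}

def run_model(name: str, inputs: list) -> dict:
    # Stand-in for actual (expensive) inference.
    return {"model": name, "n_inputs": len(inputs)}

def upstream_closure(target: str) -> set:
    """All models the target depends on, directly or transitively."""
    needed, stack = set(), [target]
    while stack:
        for dep in DEPS[stack.pop()]:
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed

def replay_for(target: str, raw_log: dict) -> dict:
    """Regenerate the target's input by replaying its whole upstream cascade."""
    needed = upstream_closure(target) | {target}
    outputs = {}
    for node in TopologicalSorter(DEPS).static_order():
        if node in needed:
            inputs = [outputs[d] for d in DEPS[node]] or [raw_log]
            outputs[node] = run_model(node, inputs)  # inference cost paid per node, per log
    return outputs

print(replay_for("tracker", raw_log={"frame": 0}))
```

Multiply that per-log cost by model-version combinations and years of archived drives, and "just replay it" becomes a serious compute bill, which is the point being made here.
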
So, maybe you can take some collection of images and LiDAR and some other metadata about it, and you can put some dimensions on that. You can say, maybe this is at night. This has some number of people in the region, or it has people at night wearing dark clothing. But how do you know which of these actually matters and how much each should be weighted? You need to have a set of assumptions going into that upfront. So, you can see how this will fall apart when you have an incredibly rich world. Then, there are ways of generalizing this. You can take embeddings - you can run models over these things which produce some kind of high-dimensional representation - and you can try to correlate those with certain areas where you saw behavior that you didn't want to see, and you could hypothesize that including that data might be useful. These are all active research areas and likely going to be the future. But this is one of the fundamental challenges.

[0:14:07] TK: So, a little bit more complicated than building a web app, it sounds like.

[0:14:09] IW: Just a little bit.

[0:14:11] TK: A little bit. Really interesting. Thinking beyond data for a moment: you've got a system, it sounds like, to decide how to gather data, what to gather, and how to optimize your models. In general, what kind of orchestrated machine learning workflows do you need to build a self-driving car system from the ground up? I'm curious if you can speak to the non-proprietary information that you can share about languages, frameworks, and the software systems that you're using to do this.

[0:14:38] IW: Yes. That's another great question. Let me just start with any machine learning workflow. So, as I mentioned, you need data. You have to figure out where you're going to get it from. One of the things we didn't touch on yet - and maybe I should have mentioned this as a bottleneck earlier - is that you need labels. Labeling is expensive. In certain cases, depending on the task at hand, it may be easier or harder. If you're very lucky, you could do some kind of auto-labeling, where you have some other algorithm that can tell you the answer. Let's take a trivial example: you have a word and you want to know which language it belongs to. You could just try to match it against a dictionary and assign a label to it. Of course, if you have an image and you want to know where the people in it are, state-of-the-art models can do a pretty good job of this, but you maybe can't rely on just that as training data for a safety-critical system. You need somebody to look at it and check, at least in certain cases. So, frameworks: you're going to need somewhere to get the data. You're going to need some kind of labeling tool. If you have a sophisticated task, then likely you'll have to build your own kind of labeling tools. There's unlikely to be something off the shelf, though I'm sure there are some companies that are trying to solve this. Then, of course, you need to process this data and put it into the right format to train your model. And as I mentioned, this can be quite sophisticated, so you'll need something that can do this. There are lots of open-source and closed-source options, but there are well-known tools out there to process data at scale. You have things like Spark. You have things like Apache Beam. There are many out there. Then, of course, you need some kind of model training framework.
Popular ones are things like PyTorch and TensorFlow. Then, in the case of a machine learning system where you're going to deploy that model, you need some set of tools which are going to take that model from a PyTorch or a TensorFlow format and put it into something that is appropriate for the architecture and the operating environment in which it will ultimately be running. Then, of course, you will need some kind of analysis tools for it, both in the training environment and in the execution environment, which could be quite different from the training environment. So, you've got your bread-and-butter tools like SQL, which can be the interface to all kinds of data processing tools that translate your query into whatever distributed environment is running in the background. If you're working in an embedded environment, where latency and execution guarantees matter, you may need to work with something like C++ or C, depending on what you're using. I'm sure there are some other operating systems out there that don't require that. But if you're using a real-time operating system, these are all things that you'll need to be familiar with. It's really a cross-section of almost every technology out there that gets applied. At the end of the day, it depends on which pieces you work most closely with.

[0:17:20] TK: A ton of complexity, and different frameworks and languages and components that all have to come together. I'm sure there are a lot of other interesting proprietary things that we can't get into. I have the same question for you on the hardware side of things. In general, what kind of hardware, sensors, or components do you need to facilitate all the data-gathering requirements that perception has?

[0:17:41] IW: Yes, great question. Before I get into that, I actually want to step back to your previous question. There's something that I didn't mention which I think is quite interesting, and this is true for any kind of autonomous software. You need a way to test it, of course, and I alluded to that a little bit when I talked about putting everything together in your final environment and how you do that. So, especially in the safety-critical situation, you can't just throw it out there and see what happens. You have to have a pretty good sense of what's going to happen well before it gets there. So, simulation is a big one. You need some kind of simulation tool. There are many different kinds of them, and as you have a more sophisticated operation, you'll find that you have different types of simulations that you need to do. I don't think I can get into all those differences. But I know that, for example, there are open-source tools - CARLA is a popular one. And in fact, if you take online courses about self-driving cars and other autonomous things, CARLA will come up as a place that you can play with this. Returning to your original question about the hardware side of things, I think here is where everyone does this differently, and I can't get into too many details. But I can say, for the self-driving problem, of course, we have sensors, right? So, I mentioned LiDAR, radar, camera. There are things like ultrasonics, and then various types of each of these - there are different kinds of LiDARs and cameras, some of them optimized for different kinds of tasks. Of course, you need these on the car. Then you need a lot of compute, right? You need GPUs, you need CPUs, you need redundancy.
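
Stepping back to the deployment point a moment ago - taking a model out of its training framework's format and into something the target environment can run - here is a minimal, generic sketch using PyTorch's standard ONNX export. The model itself, its shapes, and the file name are made up purely for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model trained in PyTorch; names and shapes are invented.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),  # e.g. four hypothetical object classes
).eval()

# Export to ONNX, a common interchange format that vendor-specific toolchains
# can then compile for whatever embedded target is actually in the vehicle.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "detector.onnx",
    input_names=["image"],
    output_names=["logits"],
    dynamic_axes={"image": {0: "batch"}},  # batch size can vary for offline evaluation
)
```

The interchange file is only the first step; the downstream compilation and optimization for the real target is where much of the engineering effort goes.
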
So, if one of these things fails, can the car get into a mode where it will be able to achieve a minimal risk condition, or some kind of a safely stopped state? These are the things that you need on the car itself. When we talk about off the car, well, you need a huge number of computers in the cloud, and these are all things for your data warehousing, for your indexing, for crunching features off of the data that you've stored. Of course, you need a training environment to train your models on. I talked about testing and simulation - you need somewhere to run all that. So, a huge amount of general-purpose compute in the cloud, both GPUs and CPUs, and lots of storage. Then, on the car itself, you have the specialized hardware.

[0:19:52] TK: What are some of the biggest challenges specific to perception and hardware, then? I guess there are probably all these different kinds of near-field and far-field issues, not to mention when unexpected things happen. How do you identify and deal with all these challenges from both the hardware and also the software perspective?

[0:20:10] IW: Well, two big things come to mind. There's, of course, the raw challenge that we talked about at the beginning, which is creating order out of chaos. You have these noisy, spurious data readings from all kinds of different sensors that need to be calibrated, and then you need to run sophisticated algorithms over them, and they have all kinds of edge cases. That's just to give you even a simple answer, like whether this image is of a ball or not. That is a whole big, hairy ball of challenges. But there's another one that I want to talk about, and it's almost philosophical, in a way. I guess it's an interesting design problem, which is: perception is an entry point to the rest of whatever system is taking place. So, it is only through the interface that perception establishes with the rest of the system that the system can make decisions about what to do. So, you have to find the right interface. Obviously, if you wanted to pass the buck, you would just pass all the raw data back and say, "Here's the world. Do the right thing." That's not helpful. You have to remove some information in order to put it into a structured format that's useful. Figuring out that right level is the challenge. To give you the most naive example, you could just pass something like an occupancy grid, which is just a 2D or 3D grid that says, for each of the cells, whether it's occupied or not. This is something that a Roomba probably does, so that it doesn't collide with things. It doesn't care if the thing in front of it is a squishy object, a hard object, a person's foot, a cat - it doesn't matter. It's just going to try not to hit it. Then, maybe the next level of sophistication is that you separate the static objects around you from the things that can move. Now, you look at a room and you see the furniture and the appliances, and you recognize that those are things that aren't going to move. Maybe you see that there are people and animals - those might move. Then, if we scale up from there, you can see how this gets more and more sophisticated. So now, we're looking at objects on the road. We see some of them are bicycles. Some of them are cars. Some of them are people. Some of them are people who are directing traffic. Some of them are law enforcement officers. Some of those cars are actually ambulances, and all of these kinds of nuances matter. In some way, you discover these things through trial and error, like in any other industry.
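
As a concrete version of that "most naive" interface - a toy sketch, not any production representation - an occupancy grid just marks which cells contain something, with no notion of what it is:

```python
import numpy as np

# Toy 2D occupancy grid: the world is divided into fixed-size cells, and each
# cell only records "something is here" or "nothing is here" - no notion of
# what the object is, exactly as in the Roomba example.
CELL_SIZE_M = 0.2           # 20 cm cells (arbitrary, illustrative)
GRID_SHAPE = (200, 200)     # 40 m x 40 m area centered on the robot

grid = np.zeros(GRID_SHAPE, dtype=bool)

def mark_occupied(grid: np.ndarray, points_xy: np.ndarray) -> None:
    """Mark cells containing sensor returns (points in meters, robot at the grid center)."""
    cells = np.floor(points_xy / CELL_SIZE_M).astype(int) + np.array(GRID_SHAPE) // 2
    inside = ((cells >= 0) & (cells < np.array(GRID_SHAPE))).all(axis=1)
    grid[cells[inside, 0], cells[inside, 1]] = True

# A few fake returns: something ~2 m ahead and something ~5 m to the side.
mark_occupied(grid, np.array([[2.0, 0.0], [2.1, 0.1], [0.0, 5.0]]))
print(int(grid.sum()), "occupied cells")  # the consumer only knows these cells are blocked
```

Everything richer - static versus movable, bicycle versus ambulance - is structure layered on top of, or in place of, a representation like this.
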
You discover what matters and what doesn't. But then they break down, right? So, I would say the challenging thing is figuring out exactly what that right interface is, what the right level of information to pass back is. And returning to the technical challenge: let's say you've got that interface laid out, and now you want to implement it. In the space of perception, you have your near field, you have your far field, and you have things that are coming out of occlusion - meaning that they've always existed, there's object permanence, but they're now becoming visible to you, and then maybe they go invisible again. A bicycle passes you, then it goes out of view behind a bus, and then it comes back in. So, there are all kinds of ways that you can break down this perception problem. Different sensors have different strengths and weaknesses for these. For the very near field, ultrasonics are really useful. If you're talking about one inch away from the vehicle - say you're passing through a very tight space and you don't want to collide - then an ultra-near-field sensor can give you that information very well. If you're talking about the very far field, then some of the kinds of devices out there, used naively, would struggle. So, when you think about a LiDAR, it's really producing a 3D sphere of points. What happens when you get far away? Well, the distance between these points gets bigger and bigger. Of course, if you're close up, then the inter-point spacing is quite small. But if you're talking about a few hundred meters down the road, you might have six inches, a foot, two feet, three. It really depends on the cost and the design of the LiDAR that you're using. It may not be appropriate anymore to detect objects, and certainly not to classify them. Is that a semi-truck? Or is it a large boulder in the road? Cameras can help you with this. Radar can help you with this. So, for each of these areas, you need to think about what the sensor coverage is and what the right kind of blending of models is. This is where you get into models that are looking at each sensor, and then some that are looking at fusions of sensors, and then some that maybe take outputs from different models and try to figure out, "Okay, the camera saw this thing, and the LiDAR saw that thing. But I happen to know that when we're talking about faraway objects that have these kinds of properties, I want to trust one more than the other." So, there's a lot to get into there.

[0:24:34] TK: That is a fascinating bucket of challenges. I feel like there's an entire other realm of challenges here, which are the challenges that you face running models in real time on hardware that you can fit into a car. We have all these exciting trends that everybody's talking about these days with AI and ML, like LLMs and ad ranking. But the people that are working on these problems don't have to worry as much when it comes to compute, because their models are running on these big clusters of cloud compute, and they just don't have to care as much about optimizing for latency. They can probably scale up compute as needed. In their cases, the training environment and the execution environment are similar. But in your case, they are very different, right? Talk to me about the challenges of running models - real-time models - that can fit into a car.

[0:25:22] IW: Yes. That's a big aspect that I think sometimes goes underappreciated.
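
To put rough numbers on the LiDAR point-spacing point from a moment ago - purely illustrative figures, assuming a hypothetical 0.2-degree angular step rather than any particular sensor - the gap between neighboring returns grows roughly linearly with range:

```python
import math

# Back-of-the-envelope: arc length between neighboring returns ~ range * angular step.
ANGULAR_STEP_DEG = 0.2  # hypothetical horizontal resolution, purely illustrative
step_rad = math.radians(ANGULAR_STEP_DEG)

for range_m in (10, 50, 100, 200, 300):
    spacing_m = range_m * step_rad
    print(f"{range_m:>4} m range -> ~{spacing_m:.2f} m between neighboring points")
```

At a few centimeters of spacing nearby, even small objects produce many returns; at close to a meter of spacing a few hundred meters out, a large object may yield only a handful of points, which is why classification at range leans on cameras and radar.
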
Even when I talk to other machine learning practitioners and people who are very knowledgeable about the space, it's not something that you necessarily think about right away. Certainly, I can see this difference having worked on speech recognition, where, while we did do some recognition on device - which required shipping a kind of cheap model to run on the phone - most of the time, the task is performed by a server. We all know our "Alexa" or "Okay, Google" - these are trigger words that open a connection with a remote server, which then does the much harder job of figuring out that open-vocabulary speech that you just spoke. But those actually have a huge benefit, which is that you train them and you ship them, and if the number of parameters in the model is a certain size, or there's some latency in how long it takes to run, there are some concerns there, and this is an important area to engineer on. But it's not going to kill somebody if it goes wrong. When you think about deploying - there are many aspects of this that are hard. One is that the hardware is different. In some cases, the hardware itself might admit different representations of numbers. You may need to quantize or use mixed precision. These techniques can affect the inputs and outputs of the models. If you've just done a very extensive safety analysis of your model in one format and now you've changed it, you might find surprises. Hardware differences are one big aspect of it. But really, the most important problem is the latency and resource constraint. You've got a whole bunch of different tasks running on these machines, running on this car, and they're fighting for bandwidth. There's something that's figuring out how much GPU time and CPU time to give to each of these, given the environment and the operating constraints that I'm in. This is just a challenge that requires a lot of people and energy and focus to solve. But it also means that it has a huge impact on the architectures of the models that you decide to use from the beginning. There are state-of-the-art camera detection models that can even now give you a written description of what's in an image. I mentioned earlier that we have this challenge of needing to boil down an environment into the properties that you think matter, and that there are some techniques out there to try and capture embeddings that maybe don't collapse it into just some, you know, N-dimensional vector saying, "It has people," or "It's at night," or whatever, and instead give you this kind of more abstract, richer information. Well, those are great, but can we take all these techniques and fit them into this highly constrained, small operating environment? We can't use the latest and greatest, enormous new architecture out there - and I won't name specific ones, because I probably shouldn't get into that level of detail. But we have to pick things that in some ways are not as good as we would like them to be, because we are dealing with very real constraints on the processing and compute power that we have, and the type of hardware.

[0:28:18] TK: So, we've spent a lot of time talking about challenges specific to models, to data, to hardware with perception, and the work that you do with data. But what are some of the other teams and functions in the overall AV space? And what sort of problems do they work on to help you put together this AV that can function at these varying levels of autonomy?

[0:28:37] IW: Yes.
Well, it takes a village, or maybe many, many villages of people with very specific talents. We've talked a lot about perception. Of course, there's an operations staff, as you might imagine. There are trained safety drivers. There are people working on the other aspects that we haven't touched on at all, such as behaviors, which refers to what the car should do: we've perceived the environment around us, and now what do we do? There's a huge aspect of this, which again is shared by all kinds of autonomous challenges, which is prediction of what other agents are going to do. So, we've observed them up to time T, and we want to make some decision about what our vehicle is going to do at time T plus one, T plus two, and so on. That depends on anticipating what all the other agents are going to do at time T plus one, T plus two. So, if somebody jumps in front of your car, you'd like to know ahead of time that they might be doing that. There's a huge amount of research and work that goes into that space, and it has its own major cruxes that are going to be existential in order to make autonomous machine-learning applications work successfully.

[0:29:48] TK: So, Ian, given all this depth and complexity - the different nuances of training and creating models, the hardware, the software, machine learning orchestration - what gets you most excited about the work that you do?

[0:30:01] IW: I think there's no substitute for getting to use something that you work on. So, just sitting in an autonomous vehicle as it drives around feels magical. It feels like something that shouldn't be, and yet it is. There's another really amazing thing, besides sitting in one vehicle and experiencing it, which is watching a fleet of vehicles leave a certain area and disperse out into an urban environment. It's surreal. It's so foretelling of what the future will look like. And yet, you're surrounded by all the things that aren't quite there yet. So, standing on this boundary of what will be and what is today and looking at that, it just gives you tingles. That really is the most exciting thing. Then, of course, I just want to say that there are a ton of unbelievably talented, smart, humble, thoughtful people in this space who share this excitement, and getting to work with them and learn from them is a huge pleasure.

[0:31:03] TK: For somebody who's interested in getting into the AV space, what kind of skills do you need to learn? And what kind of advice do you have for folks that are looking to break in?

[0:31:12] IW: There are so many different aspects to the problem, as we've just talked about. So, I think that if the question is just how do you get involved - well, there are all kinds of ways to do that, wherever you are in life. We hire safety drivers. We hire operations people. There are technicians. There are people who started doing vehicle maintenance, and then found their way into operations, and then found their way into product management and thinking about what the right way is for the AV to interact with the world in certain ways. But we're talking about software engineering. So, I just wanted to quickly say that I think there's a way for everyone to get involved. But from the software side, I mentioned that dealing with data is a huge problem. There are folks who are experts in data warehousing, in online data processing, and in data pipelines, who are extremely necessary - in fact, in all machine learning problems, but of course in ours as well.
Then, if you want to get into the specialties, of course, some of the biggest challenges - and maybe the competitive advantage that these companies really need - come down to finding experts in computer vision and in sensor processing: LiDAR, radar, and so on. So, one avenue would be to pick one of those modalities and become an expert in it. Another would be to just look at general machine learning frameworks and architectures and become effective at those. But really, as a cross-section, there are people who do operating system performance. I mentioned that you need to have all of your different tasks get scheduled for the right amount of time to maintain a safe operating environment. So, there's really every part of software engineering that you could think of, for the most part. We even have web tools that are used to administer and diagnose the vehicles, to help fleet dispatchers understand the state of all the vehicles out there. So, it's a cross-section of everything.

[0:33:07] TK: Ian, if folks are interested in learning more about the work that you or Cruise are doing, what's a good way to connect?

[0:33:13] IW: Well, with me personally, you can find me on LinkedIn. That's an easy way. If you want to know what Cruise is doing, or, I guess, the AV industry in general, I can plug a few places. There's a self-driving car subreddit. That's an interesting place to follow along. Of course, you can look at the releases that Cruise, Waymo, Nuro, Aurora, and many other companies make all the time.

[0:33:35] TK: Ian Williams, thank you so much for coming on Software Engineering Daily and talking to us about your experiences working on perception problems for AVs at Cruise.

[0:33:44] IW: Thanks so much. It's been a pleasure.

[END]

SED 1678 Transcript (c) 2024 Software Engineering Daily