[00:00:00] LA: Chris, welcome to Software Engineering Daily. [00:00:02] CD: Thanks, Lee. Happy to be here. [00:00:04] LA: Great. Thank you. So Cox Automotive is – it's in a very well-defined industry vertical, right? Automotive industry’s a very well-defined vertical. What's interesting going on in automotive right now? [00:00:21] CD: Sure, yes. I'll get us started there. I mean, there's three different places that we play, and I'll just comment on each of them. First is on the retail side, things that you and I experience as car drivers and then car shoppers. There are new business models that are emerging and being experimented with. Maybe the most common is Tesla going direct to consumer with e-commerce and the ability to purchase a vehicle online, end-to-end, and have it delivered to you or delivered to a store that's not a franchise. It's a store instead, more of that Apple kind of model. So that's direct to consumer. You have Carvana and other dealers like that playing with the same types of ideas in the used car space of selling vehicles online, all the way through the transaction. Then there's different kinds of subscription kind of ownership models that are being experimented with. Maybe you don't buy a vehicle or even lease a vehicle, but you pay for a vehicle as a service. So Audi and Porsche are experimenting with that. That's on the retail side. On the wholesale side of the business, this is where dealers sell vehicles to one another to populate their used lots, for example. There's been a dramatic shift over the last few years really driven by the pandemic. Until just a few years ago, believe it or not, most of the vehicles that you would find at a used car dealership were purchased at auction with a live auctioneer calling. A used vehicle manager went and made those purchases in person. We've seen a dramatic digitization or shift to the digital channels. The digital channels were there. 
We've seen a dramatic adoption of those channels, and those have created other kinds of challenges that maybe we can get into that we're trying to apply technology to. Then the third area is – in the industry, we call it the ACES future. ACES stands for Autonomous, Connected, Electric, and Shared. This is the transformation in vehicles and transportation over the next 20 years that people are anticipating, and there's a lot going on there as well. [00:02:38] LA: Yes. It sounds like the industry itself is changing a lot. So Cox Automotive has to change to go along with that, which is familiar territory for a lot of our listeners because lots of industries have gone through changes, either because of the pandemic or because of other industry changes. So given that this is an industry in transformation, what is the vision of Cox Automotive? [00:03:06] CD: Yes. Well, I think I'll answer that by maybe backing up a little bit in terms of our history. So we grew both organically and through a set of kind of strategic acquisitions to bring together capabilities that power kind of the end-to-end life cycle of the vehicle. So you mentioned some of those shopper journeys at the top. Those are on Autotrader and Kelley Blue Book. Those are the places that you and I would interact with Cox Automotive. On the other side of that transaction or the other side of that pane of glass from the consumer shopper is a team at a dealership who are managing their dealership or a dealership group. We’re providing an integrated software suite into that dealership that powers it. When we acquired the various products and so forth and pulled them together, it was not an integrated software set, for sure. It was a set of silos, right? [00:04:08] LA: You mean all your acquisitions weren't doing things the exact same ways? [00:04:12] CD: Surprise. Right, exactly. There were certainly different practices and so forth. 
So we're building that integrated set of software to power the dealership and the retailer, and help the dealer transform into that digital kind of future that we're talking about where a lot more of the transactions will be digitized. A lot more of the – the transaction will be digitized end-to-end. So that's one area of the vision. On the wholesale side, I would say, actually, it's about the convergence of the wholesale and retail marketplaces over time. That's something that we're starting to see, where is it really any different to take a vehicle to a wholesale channel versus to also merchandise that vehicle in various retail channels? So driving sort of disruption and change into that part of the market is also a part of our vision long term and our aspiration. [00:05:13] LA: So your vision's really all about integration and digitalization. Really, what it boils down to is you're trying – you're digitalizing parts of the industry which are not yet digital but are moving digital. But also, you've got all of these separate disjoint businesses that you've come upon either by merging or acquisitions, etc. Now, you have to integrate all those together into some common framework then. [00:05:44] CD: Yes. That's right. I would say also we're enabling our customers, the dealers and the dealer groups, to transform their businesses, right? So it's powering their digital transformation as well. [00:06:01] LA: Let's talk about the technology involved in doing that. So you have integration and digitalization with both a consumer and a business-based customer in mind. What are the parts of the technical aspect of your architecture that you're changing, that you're growing, that you’re – what are you doing to address these needs within your product architecture itself? [00:06:29] CD: Yes. I'll tell you where we are at our current point of the journey. Then we may want to back up because there was sort of a journey to get there. 
What we're really focused on today is composability. It's sort of the notion that we're – instead of building these stove-piped products that don't integrate well with one another, we're extracting kind of the core capabilities and rendering them either as APIs or as sets of embeddable and composable UIs, and then stitching those products or new experiences together. That's really our vision: it's a platform for the industry of digital building blocks that can be stitched together and reconfigured in different ways. So that's a major part of our focus now. Now, we can only really pursue that vision at this point because of a lot of the underlying transformation work that we've done over the past five or six years. I mentioned that we had grown through acquisition. Six years ago, if you had looked at us, we were a collection of 400 scrum teams, if you could count them as scrum teams, because we didn't all operate the same way, right? We used just dozens of different technology stacks and had different measures for what good engineering looked like. We had different words we used for roles and responsibilities. There's just a tremendous [inaudible 00:07:58]. A lot of the transformation to get to where we are today is behind us. But it's been about creating consistency in our delivery model, our operating model, our underlying engineering platform and practices. So that's been the journey that we've been on. [00:08:18] LA: So a lot of companies. Yes. I'm working with lots of companies that are going through that transformation now. You're also – I've talked to companies who have tried making that transition and have had various levels of success with it, meaning anywhere from a decent success to no success at all and have backed off of it. So you've been successful at that transformation. What is it that made you successful? What did you do that allowed you to perform the transformation that occurred over the last six years? [00:08:52] CD: Yes. 
Well, I think there – it came in multiple steps. At every step, there was sort of a little bit more that we gained. I think the starting point for us was about getting to just a common delivery model. For us, that was, again, we had hundreds of scrum teams, lots of different ways. We looked at different options for how to sort of bring them together so that they could operate as kind of one unit and build integrated products together. We decided to go with sort of a lightweight version of Scaled Agile Framework. So SAFe was our – if listeners are familiar with that. It's a little bit heavy. We didn't use all of the tools and techniques in Scaled Agile. But what it did do is give us a common kind of structure, set of roles and responsibilities, a set of cadences and ceremonies that every one of our engineering teams held in common. So that was like a starting point, just getting an operating model that worked for us. The next part for us in the journey was really about making some technology choices. I think with these, it's always about how do you balance the needs of an enterprise with the desire for having those autonomous two-pizza teams, right? So you want to have two-pizza teams that are unencumbered and can really pursue the needs that their customer has. But you have needs from the enterprise like security and reliability. We need to give them the right set of guard rails. [00:10:39] LA: Schedules. [00:10:40] CD: Yes. Schedules, exactly. Dependencies internally. So we didn't have teams that could just deliver on their own anymore. Now, we have to deliver in an integrated kind of way. That's what the market's expecting, and that was sort of our opportunity to drive the kind of transformation we wanted to drive. So we looked at a lot of different options from a technology perspective. Sort of from like highly opinionated types of platforms like Pivotal to various cloud platforms. 
We decided to go with AWS, and we sort of did an all-in with AWS where we got our CFO and our executive suite aligned on that step of the journey with full buy-in and alignment and sort of committed ourselves to exiting 50 data centers. Believe it or not, we were running in 50 different data centers and moving a lot of those workloads into AWS. As we did, we used the [inaudible 00:11:40] type of framework that you mentioned before. Each one of the workloads sort of had a different profile and plan for how it was going to move up into the cloud. [00:11:53] LA: So you moved all your workloads to AWS now. Does that mean your data centers are gone, or you're almost gone, or you're in the process? [00:12:05] CD: We're down to a very small set of strategic data centers. We've moved everything that we started out with the intent to move. We'll say it that way. So there was a business case, and there was a plan, and there was a financial model and so forth that we landed on. It made certain assumptions about which ones we would move and how we would re-architect them and so forth. Yes, we finished that journey. Now, there's still stuff that we're lifting up because there's benefit. But it wasn't necessarily in the original case for us. [00:12:41] LA: Did you do a lot of lift and shift into the cloud? Or did you actually transform as you migrated there? [00:12:46] CD: There was – we sort of did all of the things, for sure. In most cases, we would lift up into the cloud and then iterate from there to decompose and drive it towards a more cloud native architecture. Now, in some cases, we couldn't do that because there was a component, a legacy piece that was just – it wasn't going to operate in the cloud successfully or wasn't going to operate in the cloud in a cost-effective kind of manner. So we had to re-architect that piece as we moved. We did each of the different ways. We definitely had an example of each. 
What we've – now that we're all, for the most part, operating in the cloud, it’s really about optimizing in the cloud. For us, that's a couple things. One is just cost optimization. Using the fact that you can engineer your way into better cost profiles is kind of an exciting and interesting engineering challenge that you don't have in an on-prem type of world. You can get real-time, you know, positive feedback and cause for celebration of those kind of things. [00:14:04] LA: How effective were you at managing your costs as you moved to the cloud? Was moving to the cloud an acceptance that it was going to be more expensive but yet better managed because you didn't have to deal with the data centers and all the other advantages that come with being cloud-centric? Or were you able to construct things in such a way that you're actually cost-effective in the cloud and perhaps even seeing cost savings in the cloud? [00:14:32] CD: Yes. I'll answer that a couple of ways. So one is that certainly as a percentage of revenue, which is one way to think about this, our overall hosting costs have remained flat as we've shifted. Have remained flat or have improved as we've shifted. That's in aggregate across the entire fleet. That's something we're proud of because there are some [inaudible 00:14:59] there, for sure. There's definitely – you take something up into the cloud, and you know that you're going to have an optimization journey. Especially with our later workloads more recently, like in the last year or so, we really were smart about how we modeled that. We knew when we get this up there, it's going to be running hot from a cost perspective, and we know the changes that we're going to be able to make once we get it there to optimize those costs. Then we've got a pretty good – well, I'll say like a rigorous kind of view of our AWS resources and the costs that they're generating and ways of surfacing to our teams on a monthly basis, what we see as cost opportunities and that sort of thing. 
That comes from our – we have a central kind of cloud business office that provides that kind of visibility to our teams. Then it's an engineering challenge, turns it into an engineering challenge for them. [00:15:58] LA: I was going to say. So you've optimized yourself into the realm where the cloud hasn't been a cost disadvantage and, in fact, you're at least cost-neutral if not cost-advantageous at this point. [00:16:14] CD: Yes. I would say that. Another thing I would point out is – so for example, during the pandemic, that was an interesting test for us because I mentioned a big part of our business is these auctions, these B2B auctions which are physical facilities where people come, and there are auctioneers. You're bidding on vehicles in the lanes, as well as digital bidders who are participating in the in-lane traffic but digitally, which has a whole set of technology challenges you could imagine. But when the lockdown happened, we shut all of those down, and we actually watched – we could see it in the elasticity in our cost with AWS. It was a great story to be able to tell to our executive team, the return on the investment of moving these things into AWS. The other thing, and this is something that I think is lost and is kind of hard to model, is you might be paying similar dollars or maybe just a little bit more for the cloud resources. But you're in a much more resilient posture. So we've got the redundancy here. If we're using the right set of services from AWS to power this workload, we're in a much better posture, a more secure, more reliable posture. So you're not always paying for the same thing, right? [00:17:39] LA: Right, yes. It's really hard to – yes. When we talk about cost optimizations, we have to realize that we're not comparing apples to apples. You’re comparing apples to oranges because what you end up with is a system that is more resilient because if a server goes bad, it's easy to replace it. 
It's easy to change things up, much easier than in your own data center. It's hard to quantify when it comes to the cost calculations. But one of the things you can quantify are things like the reduced need for hot standbys and things like that, either for resiliency or for scaling. So you're able to – you've built things in such a way so that you can avoid the hot standby problem of your own data centers and only launch new resources when you actually need them. Is that correct in the new architecture, absolutely? [00:18:32] CD: In an ideal world, that's the way you would do it, for sure. [00:18:35] LA: Yes. But in a – that is the way that you're working towards doing it or doing it right now? [00:18:40] CD: Yes, it is. I mean, it's been a journey from a resiliency perspective. In fact, this is something that about two and a half years ago we started on a set of resiliency investments. Every set of workloads has its own unique kind of requirements from a resiliency perspective, and so is the right model to be active-active in multiple regions or to be sort of active with a passive ability to fail over? You always have some sort of legacy in your architecture that you're also having to work through, right? We don't get to rebuild everything in a highly cloud native way so that we've got all of the options on the table. So it does require some. [00:19:35] LA: It always becomes a business decision of what you rewrite, what you change, what you update, and whether or not everything is worthwhile to put that level of redundancy in or that level of effort into. Yes. [00:19:48] CD: Yes, exactly. So one of the tools we use to talk to our business about that is the notion of resiliency tiers, right? There's a cost to perfection, as we like to say. So we could design to achieve a level of resiliency, but it will cost, and let's talk about those trade-offs, and what does the product really require. Yes. 
The business will always say, “Well, I want mine to be diamond.” But until you turn that into business terms and what is the cost of that level of resiliency. So that's been a tool that we've definitely used. [00:20:25] LA: McDonald's burger may be okay. [00:20:27] CD: Exactly. [00:20:28] LA: That makes sense. Yes. I know I'd – one of the things I talk about in my book is service tiers which is a little bit different take. But I like the idea of resiliency tiers. What the service tiers are are the idea of the importance to the availability of various components within your system. It’s some parts of your system which are much more important than other parts of your system. A back-end reporting logging system has a different level of availability requirements than your consumer login front-end system. There's variations there. So I talk about service tiers to describe that and the interconnection between them. Resiliency here sounds like something that's similar to that but for a slightly different purpose. [00:21:14] CD: I think it's similar to what you're laying out there, Lee. One of the ways that we use it, for example, not too long ago, it's public information. AWS had a major service disruption in one of the regions, and we were able to – of course, we had some impact from that, and we've been in the process of reflecting on, okay, what really happened. What was the impact to our end customers? Did our resiliency tier framework do what we wanted it to, which is to say the things that we had engineered to provide what we call a diamond level or a platinum level of performance, did they exhibit the fault tolerance that they needed to the resiliency that could be expected? Then was that good enough for the business, right? Were our customers impacted in the way that we had expected them to be or not impacted in the way that we expected them to be? 
So, yes, there were a number of things that are like a silver-tier engineered workload, and they had many minutes of offline time. But it didn't result in a customer impact, and that's why it was a silver tier and not a platinum tier type of workload. So that was success, right? That was a good thing. So we're increasingly trying to get to that world where we're measuring against the expectation that we've placed on it. The expectation is based on its role in the enterprise or in our customers’ workflows or that kind of thing, exactly what you talked about. So they may be very similar, and we just use the word resiliency tier because it's the mindset we're trying to drive. [00:23:05] LA: Cool, great. The sort of transformation you've gone through is not only a technology transformation, but it's also an engineering culture transformation. Now, I know you made heavy use of the Well-Architected Framework, the AWS Well-Architected. I think you told me at one point in time that that's part of what helped you with the cultural transformation. Could you talk about that some more and talk about that for listeners? [00:23:35] CD: Sure. Yes, absolutely. Yes. The cultural transformation has been really important. I like to think about it as when you drive a cultural change, there's a combination of like celebrations and expectations, right? You could think of that as like carrots and sticks, right? Celebrations and expectations may be a better way to describe it. But talk about celebrations, it's things like what you talk about, what you reward. If we're trying to drive cost efficiency and a team has had a huge success. We talk about that. On the expectation side, one tool we've used is, like you said, AWS, the Well-Architected Framework. Maybe for listeners who aren't aware, that's a set of six pillars. 
There's six pillars in the framework, and each one of them is like a set of principles and best practices for designing and building and running well-architected, well-engineered workloads, whether in the cloud or on-prem. They can really apply either place. When we started using it, it was like a set of PDFs that AWS published, and it's a set of yes or no questions. Since then, we've actually partnered a lot with AWS on how they've elaborated the framework. Now, it's a tool that teams can go into. This is how it works in our environment now today. Teams can go in. They're addressing a particular set of resources or accounts, and they can go through kind of a yes or no question survey about their security practices and reliability practices and operational excellence practices and so forth. It captures the information. We scrape it out through an API and load it into Snowflake. We bring other dimensions in like cost-efficiency stuff from other tools and marry it up with our agile team structure. So the teams can go and look at, okay, how am I doing against all of these expectations in Well-Architected, and where do I need to invest? Am I good on security but light on performance efficiency, and I need to invest there? The other thing it's done for us is it's become kind of a language that we share with our business partners to talk about like technical debt and not just use – I mean, the fastest way you can get a business partner's eyes to glaze over, right, is to talk in terms of technical debt. But things like the reliability of the service or our operational effectiveness in delivering software, these are words that the business uses all the time in their context. So we use that. It's been a really powerful way of setting a set of expectations, having a set of objective measures, and allowing it to ease certain conversations that you need to have with product or business partners. So that's been our journey with that that you're asking about. 
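[Editor's sketch: the per-team, per-pillar view Chris describes above could look roughly like the following. This is a minimal illustration, not Cox Automotive's implementation. The team names, the sample answers, and the function names `pillar_scores` and `weakest_pillar` are all hypothetical, and the answers are hard-coded here rather than pulled through an API or queried from Snowflake.]

```python
from collections import defaultdict

# Hypothetical survey rows: (team, pillar, answered_yes).
# Assumption: in the setup described in the interview these would come from
# the review tool via an API and a warehouse table; here they are hard-coded
# so the sketch runs on its own.
ANSWERS = [
    ("team-a", "security", True),
    ("team-a", "security", True),
    ("team-a", "reliability", True),
    ("team-a", "reliability", False),
    ("team-a", "performance_efficiency", False),
    ("team-a", "performance_efficiency", False),
    ("team-b", "security", False),
    ("team-b", "security", True),
    ("team-b", "reliability", True),
    ("team-b", "reliability", True),
]


def pillar_scores(answers):
    """Return {team: {pillar: fraction of questions answered yes}}."""
    yes = defaultdict(lambda: defaultdict(int))
    total = defaultdict(lambda: defaultdict(int))
    for team, pillar, answered_yes in answers:
        total[team][pillar] += 1
        if answered_yes:
            yes[team][pillar] += 1
    return {
        team: {pillar: yes[team][pillar] / count for pillar, count in pillars.items()}
        for team, pillars in total.items()
    }


def weakest_pillar(scores_for_team):
    """Return the lowest-scoring pillar; ties are broken alphabetically."""
    return min(sorted(scores_for_team), key=lambda pillar: scores_for_team[pillar])


if __name__ == "__main__":
    for team, scores in pillar_scores(ANSWERS).items():
        print(team, scores, "-> invest in:", weakest_pillar(scores))
```

[With a view like this, the question "am I good on security but light on performance efficiency?" becomes a lookup per team. A fraction of yes answers is only one possible scoring choice; a real rollup would likely weight questions by risk.]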
[00:26:50] LA: Well, yes, yes. The term technical debt is one of those overloaded terms that has a lot of inconsistent meaning, unfortunately. But it's a great premise, and it does great when correctly used, or used reasonably, I should say. It can be very helpful for an organization to determine what it needs to do, as well as its health, as well as communicate that to other people. But the problem is there's no good definition. One of the things I've always thought about. This is way off-topic, I'm sure. But one of the things I've always thought about is, is the term technical debt hindering our ability to use technical debt as a tool of communication? I think the answer is probably. I think the jury is still out. I'm wondering if there's some sort of term around the word complexity rather than technical debt that might be better suited for the communication of what we're trying to accomplish to non-technical individuals on the health of our systems. Anyway, that's just kind of an aside, but I'd love to hear your comments on that if you have any thoughts on that. [00:28:04] CD: Sure. I mean, I think that the thing about technical debt, I guess, as a term, and even just the way that you have conversations about it with business partners is that they're not technologists. They're not engineers. They just expect us to build it right the first time kind of, right? Just make it work. So at times, it can be frustrating, unless you sort of lay the groundwork for having a conversation to say, “No, wait a minute. These are all business decisions. They're all business decisions.” If we cut a corner now to deliver sooner, we're – and using that traditional set of words. We're taking on technical debt. It has to be paid back. The more that we can, I think, express those in business terms with the business, the better off we’ll be. So I don't know. I don't know whether I – I don't know if I would necessarily say like complexity is a better metaphor. 
But I think that the way that we serve our business partners is by like, “I can't teach you everything about technology. So I'll take on the posture of talking about our situation in business terms or like put the burden on us to talk about it in those terms.” Like I said, Well-Architected has just been a tool for us that helps in those conversations. I'll say another thing that's been valuable for us is the fact that it is a third-party tool. We looked at different ways of kind of measuring ourselves and benchmarking and so forth and internal spins on things. But the ability to say, hey, this is the way AWS recommends that we evaluate and score the health of our engineered workloads is – that's been – that gets traction internally, for sure. [00:30:11] LA: Cool. Yes, yes. I'm a big fan of the [inaudible 00:30:14] ticket framework and what it can accomplish for organizations like yours that really completely embrace that model. That's the whole key with it is you have to embrace the model in order to see the value. It's so easy to look at the framework and say, “Well, there's a lot of stuff here. I don't know whether it's really that important or not.” But once you realize it is, then it's a great framework for dealing with it. [00:30:40] CD: One thing we're trying to do now is really automate through instrumentation and so forth our understanding of the answers to those questions. So some of them you have to make a little bit more precise. But as you do that or put some more expectations around how you will – like for example, if there's an expectation that you have monitoring available for a workload, right? Well, if we can put a little bit more guard rails around that and say this is the specific way that you need to set up your monitor, and here's how you need to tag the monitor so that we can see it, then we can answer this question for you. 
So we've – kind of the next iteration or maybe a deeper version of Well-Architected, which is more appropriate at sort of a workload or group of workloads kind of at the product level. At a component level, we've started using a tool that we call PRR, Performance – or sorry, Production Readiness Review. I think we may have stolen that acronym from Google or somebody. But the PRR we're trying to automate, and it is a set of questions. Some of them are similar to what we ask in Well-Architected, and so we won't ask those anymore because we'll have detailed data on the ground. But we want to be able to in the long-term through observability of the components that are out there know their operational health or their readiness for production, looking at them in a non-prod environment. So that's where we're headed. We're still on that journey, for sure. [00:32:20] LA: So let's end by going back to one of the first things we brought up that we really didn't get into a lot yet but I'd love to get your perspective on. We talked about how your industry is changing. Automotive industry is changing faster than many or most other industries it seems like. The rate at which the industry is electrifying is incredible. The rate at which it's accepting AI and the use of AI as a technology to change the way the industry works. I'm not talking just driverless cars here. I'm talking lots of other uses for AI has been really dramatic. So that has to have an effect in your business. Your industry is modernizing. Your business has to modernize to keep up with it or lead it. You either follow, you don't follow, or you lead. I'm assuming you want to be a leader in that. How is that affecting your business, and what do you see the real direction of the industry going? [00:33:31] CD: Yes. Well, that is a big question. So first of all, absolutely, we want to lead it. We want to drive the transformation. 
For us, specifically, I think you're asking about where do technologies like AI and those advances fit in, and how are they going to really disrupt? I mentioned before that in the wholesale space, just a few years ago, people would come to lanes, and they would want to touch and see the vehicles. They would bid with live auctioneers, that kind of thing. It still happens. We're wanting to see more digitization of that space. But the key problem that we have in that space, or sort of a barrier to digitization, is having a transparent understanding of the condition of the vehicle that you're bidding on. These aren't new vehicles. These are used vehicles. Every one is a special snowflake because it has a history. It may have different kinds of damage that aren't necessarily visible from a screen if you're bidding in that context. So to drive transparency into that, we've actually been investing for a number of years and have patents and so forth on advanced kind of imagery capability. We call them fixed imaging tunnels, which you could imagine as: you take a vehicle. You drive it through this tunnel that has lighting and camera arrays that take a bunch of high resolution images. They then wrap those images onto a 3D model of the vehicle. Then through computer vision and machine learning, they identify damage to the vehicle, either cosmetic damage, scratches, and dings, and that kind of thing. We also have undercarriage imaging, and they can detect where there might be leaks or other kinds of repairs that have been done and so forth and then translate that into a condition report with estimates on the cost of the repairs. That's one of the visions that we have that will drive just a dramatic amount of transparency to the market. It makes the market more liquid. We see applications for that kind of technology not just in a wholesale marketplace setting but certainly in consumer types of settings and other settings as well. 
So that's been one of the major areas where we're driving investment, and it's really cool stuff. [00:36:23] LA: An exciting space, definitely. [00:36:26] CD: Absolutely. [00:36:26] LA: Chris Dillon is the VP of Architecture and Engineering Enablement at Cox Automotive, and he's been my guest today. Chris, thank you for joining me on Software Engineering Daily. [00:36:37] CD: Lee, thanks a lot. Appreciate it. [END]