EPISODE 1841 [INTRODUCTION] [0:00:00] ANNOUNCER: Synopsys is a leading electronic design automation company, specializing in silicon design and verification, as well as software integrity and security. Their tools are foundational to the creation of modern chips and embedded software, powering everything from smartphones to cars. Chip design is a deeply complex process, often taking months or years and requiring the coordination of thousands of engineers. Now, advances in AI are beginning to transform the field by reducing manual effort, accelerating timelines and unlocking new design possibilities. Thomas Anderson is the Vice President of AI and Machine Learning at Synopsys, where he has spent over 15 years. He joins the show to talk with Kevin Ball about the evolving role of AI in hardware design, the challenges of training models on tacit, undocumented chip engineering knowledge, the emergence of domain-specific LLMs, and where this fast-moving field is going next. Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc. [INTERVIEW] [0:01:34] KB: Thomas, welcome to the show. [0:01:36] TA: Hey, very nice to be here, Kevin. It's a pleasure. [0:01:39] KB: I'm excited to get to know a little bit more about you and what you're doing at Synopsys. But let's maybe start with you. Can you give us just a quick background of who you are and what brings you here today? [0:01:49] TA: Yeah. I'm Thomas Anderson. I have been a long-time, we call them Synopsoids, so a Synopsys employee. I've been there for more than 15 years. I'm running, essentially, the AI and machine learning team in Synopsys, developing automation with AI for our software products. [0:02:09] KB: I love this topic. I'm completely geeking out on all of machine learning and AI these last couple years. Before we dive into that, let's maybe do a quick overview of EDA and what Synopsys does in general, since we have folks across a wide range of software engineering backgrounds. [0:02:25] TA: Yeah, totally. Because, I think, obviously, not everybody is familiar with what Synopsys does. Even though we are one of the 20 largest software companies, we're obviously a B2B business, and therefore, not everybody is as familiar with us. We essentially provide what we call silicon-to-software solutions. We provide offerings to help our customers design and build their chips, as well as implement and test their software. Essentially, we cover the entire spectrum, from what's called electronic design automation software, so anything that's required to build the little chips in your phone and in your TV and in your car, all the way to the IP that goes into those chips. And also, from a software side, obviously, it's not just hardware. You have a lot of software that runs on top of this whole stack. We provide solutions for application security testing and software integrity. That's our main offering. Essentially, if you think about it, next time you pick up your iPhone, or your Google Pixel, or whatever is your preferred choice, think about it: that little chip in there, as well as part of the software that runs on it, was essentially created and designed with Synopsys software. [0:03:39] KB: Yeah.
It was fun for me doing the research to prepare for this, because I have lived most of my life in the consumer software space, which starts at an abstraction layer where I just assume all of your stuff is already working. [0:03:52] TA: Well, that's where the magic happens. See, a lot of people understand software, but I think not many people are familiar with hardware, or how hardware is designed. The assumption is, yeah, that thing just works somehow. If you think about it, building a chip has so much complexity to it. It's not just the logic side that you have to do, the equivalent of the software, something that runs on the hardware and computes something. There's also all the physical aspects, right? They need to be manufactured in fabs at very, very small nodes, and that's super, super challenging, pretty much across the spectrum. [0:04:29] KB: All right, so let's now talk a little bit more specifically about what you're doing. How are you applying machine learning and AI to this very deep software stack that Synopsys offers? [0:04:41] TA: Well, essentially, if you think about it, the software we're building, our users are expert designers, or engineers, right? It's not software like Microsoft Word, where everybody, essentially, is familiar with it and just types something into it. It's very, very complex software. For our customers, designing a chip from start to finish can take 12 to 18 months. That sounds like an exorbitant amount of time when you think about it, compared to software design. In software, I mean, of course, I don't want to oversimplify it, but you write something, you compile it, and you can get something out relatively quickly. Chip design, I remember talking to an executive at Google, actually, and they were fascinated by this. We were working, essentially, on some reinforcement learning together, and I remember he was telling me, he said, "It's just fascinating how complicated it is to get those chips out the door." It's a very, very long process. The tool chain, essentially the EDA, electronic design automation, software that we provide, is also highly complex, and the manuals are complex. The way you optimize your design requires a lot of human interaction. There is a very, very high desire, of course, to reduce the human manpower that is required to get a chip out the door. Today, I would say, chips can take thousands of people. Really, it's no exaggeration at all. It can take thousands of people to get this thing done. If you want to do it faster, if you want to do it with fewer people, well, we need automation. Of course, AI is the path to it. But because it's complex, it's not as easy, right? It's not as easy as, say, self-driving cars. [0:06:26] KB: I love that you use that as the baseline. Not as easy as self-driving cars, which they've been working on for what? [0:06:32] TA: I know. [0:06:32] KB: 20 or 30 years, and we keep saying, "Oh, it's going to be here. It's going to be here." [0:06:36] TA: That's exactly why I said that, because it's a funny joke. Elon Musk has been saying this for how many years? They're going to have it next year, next year, next year. I think he stopped talking about when exactly it's going to happen. In fact, other companies, of course, I would argue have better systems, right? Regardless, we're still not there, right? I think Mercedes has a level three certification. To get to level five, I'm sure, will take a long time. Now, think about it.
Driving a car, you can argue, is a very simple task. As long as I can hear, I can see, and I can move my hands and my legs, I can operate a car, right? Relatively simple rules. At least in theory, it is something that, I would argue, a machine can do better than a human. The interesting thing there, in my opinion, is that the simple things that we as humans just know actually take time to teach to a machine. Then when it gets to more complex things, like, let's say you're driving on the road, and there is a situation where you have to do some emergency handling to avoid some object, or some kind of scenario, there's no doubt in my mind that the machine can do this better, because if I train it with all these possible scenarios, it can calculate exactly what action it needs to take. Whereas I'm just in panic mode. I don't know what to do. I've never done it before. I'm only familiar with what I've done before, but not new things, right, that I've never experienced. That's interesting. Regardless, I would still say that, compared to operating complex software, of course, driving a car is a much easier task. That tells you a little bit how complicated what we're trying to achieve is. [0:08:18] KB: Yes. Well, and I think one of the things that is often underappreciated outside of hardware is all those different layers that you have to go through, the different layers of abstraction. You talked about starting at the electronic layer. You're simulating physics down there to understand how this is going to work and function. Then you have to go multiple layers up before you even get to the logical layer, which is where many software engineers start their thinking. As you started looking at applying machine learning to this, where did you start? [0:08:46] TA: Yeah, that's a very good point. Where we started is, essentially, I would say, we were looking at, let's say, tedious, or high-toil human tasks, things that humans have to do, but that are repetitive. It's not super creative. It's something that I think, if I automate this, everybody would be happy not having to deal with it. A good example, and maybe that brings us to the first application that we developed, is DSO.AI, which stands for Design Space Optimization. As an example, let's say, when you optimize your chip and you implement it, one of the important things that you have to do is you have to tune your software to meet your performance requirements, right? Every chip that gets out the door has different requirements. Some are tuned for highest performance, some are for lowest power, some are for a trade-off in between. Our tools, of course, optimize for all these things, but typically, users have to spend a lot of time running our tools over and over again to tune for certain performance parameters and trade-offs. That's one of the earliest things that we thought we could automate, and that's where this DSO.AI technology came out of. Essentially, rather than the human running lots of experiments and trying all kinds of options, these could be technology options, tool options, flow choices, and so on. Instead of a human doing this and just running lots of experiments and seeing what gives you a good result, you can have a machine do it. A machine is much better at it. That's a typical application of what I would call reinforcement learning-based AI optimization. In this case, I have a complex workflow.
I intuitively know based on past experience what I should tune and how I should tune it, but I don't know precisely. Plus, every chip is a little different. If I have a machine, I can run lots of experiments. I can take sample points. I can make decisions. Just like, for example, what was done for AlphaGo. You can think of it as a very large decision tree, where I make choices. Do I go this way, or this way? What I want to do is I want to have a technology where I can prune this decision tree and get to a good result. I don't want to say the best result, but a very good result, relatively quickly. That's what we're doing with DSO.AI. That has been, I would say, a runaway success for us, because it has really helped this process of performance tuning of the implementation of the chips. [0:11:16] KB: To make sure I understand then, what was happening before is you had a workflow that a human could go through and they could say, "Okay, I have all of these different knobs and levers I can turn. Let me try turning this one, run the simulation, wait for however long it takes." Which is not fast. These are complex simulations. I don't know what it is exactly for you all, but I'm guessing it's hours, or days. [0:11:36] TA: Yes. It can be weeks even. [0:11:38] KB: Can be weeks. Turn the knobs, run the test. Oh, that didn't get quite what I want. Or that got me partway. Turn another knob, run a test. What you've done here is you've trained a model that tries to reproduce some of those same intuitions of which knobs are going to be valuable. It can pick what to turn and run those in a loop on its own. [0:11:57] TA: Exactly. On top of it, you can learn from chip to chip, because a lot of the chip designs out there are evolutionary. They may be redesigns, or maybe 30% of the logic is new functionality on the chip, but there's a lot of similarities. You can then, essentially, also build a behavioral database, where you know, for this type of chip, how it will respond to my knobs. The second time I do it, when I do the next version of this chip, I have a starting point already, which is similar to what I said, a human has some intuition. Here, the beauty is I essentially have a centralized brain that has all this information. It's not distributed in people's heads. People may leave the company. They may move to a different project and then you lose this knowledge. If you have a system that captures it, it becomes way more powerful. [0:12:52] KB: Can we dive one step deeper and talk about building this first version of DSO.AI? What types of models were you using under the covers? What does that tool chain end up looking like? [0:13:02] TA: I don't know if that goes too deep for all the listeners' experience, but I'll try to keep it high level. We started focusing on what's called the synthesis and place-and-route flow. That is a very complex part. Essentially, that's the part where you have a description in a language, similar to, say, a C language, but it's called RTL, or Verilog. You have a description of what the circuit should do. You translate this into Boolean gates and then you implement those Boolean gates, essentially, down towards the transistor level on a physical layout: how they're interconnected, how they talk to each other and how the electrons move between the individual gates. That part of the process is one of the most complex ones with very long run times, and that's where you have a lot of knobs to tune.
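To make the knob-tuning loop described above a bit more concrete, here is a minimal sketch of a design-space search over tool settings. Everything in it is an illustrative assumption rather than anything Synopsys ships: the knob names, the toy scoring function, and the blind random sampling. DSO.AI, as described in the conversation, uses reinforcement learning and learns across runs and across chips rather than sampling blindly.

```python
import random

# Hypothetical knob space for a synthesis / place-and-route run.
# These names and ranges are made up for illustration only.
KNOBS = {
    "effort_level":    ["low", "medium", "high"],
    "target_density":  [0.60, 0.70, 0.80],
    "clock_margin_ps": [0, 20, 50],
}

def run_flow(config):
    """Stand-in for a full implementation run (hours to weeks in reality).
    Returns a single quality-of-results score from a toy formula."""
    score = {"low": 0.0, "medium": 0.5, "high": 1.0}[config["effort_level"]]
    score += 1.0 - abs(config["target_density"] - 0.70)  # pretend 0.70 is ideal
    score -= config["clock_margin_ps"] / 100.0           # extra margin costs speed
    return score + random.gauss(0, 0.05)                 # real runs are noisy

def sample_config():
    return {name: random.choice(values) for name, values in KNOBS.items()}

# Tiny search loop: try configurations, keep the best one seen so far.
best_config, best_score = None, float("-inf")
for _ in range(20):
    config = sample_config()
    score = run_flow(config)
    if score > best_score:
        best_config, best_score = config, score

print(best_config, round(best_score, 3))
```

The point of the learning-based approach is precisely to avoid this kind of blind sampling when every call to the real flow costs days of compute.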
I think, because I have a lot of background in this space and I've been in this space for quite some time, I started thinking this could be a very good example of automation. We tried various different algorithms. We settled, essentially, on some form of reinforcement learning, ultimately. Like I said before, we got some inspiration from AlphaGo, because that was a prime example of how, essentially, they were able to solve a very, very complex problem without the computational expense of it. I don't know if you remember the story even from the 1990s, when IBM built this Deep Blue supercomputer to beat Kasparov, the world chess champion at the time. They did beat him. The system that they had would not have been scalable, right? It was, essentially, sort of, not completely, but to a large extent brute force, computing all the possible choices. If you think about the way a human operates, a human doesn't play chess by calculating all possible combinations ahead of time. A human has cognitive abilities and he knows, "Ah, I know this scenario and I know what to do here." Anyway, so this was our inspiration and we used this technology to optimize the workflow. I still remember, in the beginning, it was primarily about productivity. Meaning, I can get to the same performance that a human designer was able to get by tuning knobs, but the story was we can do the same performance, but we can do it faster and obviously, with fewer people, with automation. The interesting thing that we found is that we were actually pretty much always also able to get better performance. Our customers, they got excited not just about the productivity, but also about the aspect of getting higher frequency on the chips, or lower power. That's how we started this. This really was an incubation within the company at a time, I would say 2017, roughly, where ML and AI were already pretty hyped up. But nobody really had any idea how to make it work in our domain. That's why I think this was a good first application for us to develop. [0:16:01] KB: Yeah. Well, just to summarize a little bit of what you're saying there, because I think it is a helpful mental model. The process sounds a lot like a recurring theme for me, like a compiler, where you're starting from this high-level language, Verilog, you're compiling it down, gradually lowering it into logical computations, then into gates, then to a physical layout. This first implementation, it sounds like, was looking at that physical layout step where you have a lot of different knobs and levers, and applying a reinforcement learning approach to pattern match and say, "Oh, for this, I want to tune this way, that, I want to tune that way." You were able to achieve not only the speed up, but the performance. [0:16:38] TA: Yeah. At a high level, that is a good understanding. That's correct. Yes. [0:16:42] KB: Let's maybe then look a little bit at what came next. I immediately, if I'm thinking about those layers of compiler steps, I'm wondering, oh, do you look at some of those other stages as you're lowering things down? Or where did you go? [0:16:55] TA: Yeah, exactly. That's what we did as a next step. The reason, I would say, why we developed DSO is because myself and other members of that initial team knew this space really well. But there's many other areas where you can apply it. We have verification tools where, essentially, you run tests and try to achieve certain coverage of your design with those tests.
You have test equipment where, ultimately, you need test patterns to test the accuracy of your chip. Make sure it actually is functionally correct. That is done on test equipment, and there's software that needs to essentially come up with the fewest test patterns necessary to reach a certain coverage. Because the more test patterns you have, the more time you have to spend on each and every chip to test it on the test equipment. This is what's called test pattern generation. Then there's many other areas. Even on the analog design side, there is this whole transistor optimization. When I move a design from one node to another, I need to come up, essentially, with a new circuit optimization, or when I start a new design. These are all very good applications, also, for this space optimization, because they're essentially complex workflows. When I look at it as a user, to me, it's a black box. Of course, like I said, you have some intuition, so it's a little bit more than a black box. I have to tune knobs to get certain results. All these types of applications are really ripe for a system that, essentially, sits around them and then optimizes the workflow. That's what we did. When you hear the words Synopsys.ai, that's essentially the expansion of the DSO idea to other domains. We now have verification space optimization for verification. We have test space optimization for the test pattern space. We have analog space optimization for the analog space. Now, one of the more recent things is 3D space optimization. Because when you design three-dimensional chips, so traditionally chips are two-dimensional, when you do three-dimensional chips, you have a lot of design choices to make in how you design the interconnect. You have to watch out for thermal issues between the different dies that sit on top of each other. There's lots of design choices that you have to make, where you want to have a system that explores that for you, rather than doing it manually. [0:19:18] KB: A few different questions this sparks for me. One is, to what extent are you able to do this in isolated slices? Are all of these different places where you're inserting an optimization automation completely decoupled from each other, in the sense that you can put this in here and not worry about its impacts upstream, or downstream? [0:19:38] TA: Yeah, that's a good question. I think the examples that I brought up, like verification, digital implementation, test, and so on, we wrap around, essentially, slices in there. That is correct. You can theoretically wrap it around a bigger space. The issue, I think, was that, at least at this point in time, and I think this will change especially once we go to Gen AI, but the issue right now is the reason why this is broken up into these steps is because there is still a human that owns each piece. There's a human that says, "I'm the verification guy." There's another human that says, "I do the synthesis, the logic synthesis, the compile step." Then there's a person that does the physical implementation and there's a person that does the sign-off testing. These are all different engineers and they hand pieces over to each other. It's a little bit the victim of an established workflow, or tool chain, that was set up simply because of the complexity; all these steps had to be broken up into, hey, there's multiple people doing these pieces. Like an assembly line, where there's people sitting there.
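The test-pattern-generation problem mentioned above has a classic combinatorial core: pick the smallest set of patterns whose combined fault coverage meets the target. Below is a minimal sketch of the standard greedy set-cover heuristic for that core problem; the pattern and fault names are invented, and real ATPG (automatic test pattern generation) tools are far more sophisticated than this.

```python
def greedy_pattern_selection(pattern_coverage, fault_universe):
    """Pick test patterns until all faults are covered, always taking the
    pattern that detects the most still-uncovered faults (greedy set cover).
    `pattern_coverage` maps a pattern name to the set of faults it detects."""
    uncovered = set(fault_universe)
    selected = []
    while uncovered:
        # Pattern that knocks out the most remaining faults.
        best = max(pattern_coverage,
                   key=lambda p: len(pattern_coverage[p] & uncovered))
        gained = pattern_coverage[best] & uncovered
        if not gained:  # remaining faults are undetectable by any pattern
            break
        selected.append(best)
        uncovered -= gained
    return selected, uncovered

# Toy example: three candidate patterns, six faults to cover.
patterns = {
    "p1": {"f1", "f2", "f3"},
    "p2": {"f3", "f4"},
    "p3": {"f4", "f5", "f6"},
}
chosen, missed = greedy_pattern_selection(
    patterns, {"f1", "f2", "f3", "f4", "f5", "f6"})
print(chosen, missed)  # e.g. ['p1', 'p3'] and an empty set
```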
Ultimately, I think as we reduce the complexity of each of these steps further, I can totally envision that maybe you have an optimization that goes around bigger pieces, rather than those individual slices. [0:20:56] KB: As you were applying this to these different slices, were you able to more or less reuse the same approaches and tool chain, or were there places where you had to innovate a little bit further? [0:21:07] TA: Very good question. Yeah, there's reuse, but there's also new technology required, because not all the problem spaces are exactly the same. As an example, in the analog design space, where I'm operating at a transistor level, the response function of the system is quite different. The nice thing there, actually, is that I can take more sample points, because usually you optimize relatively small pieces of the circuit and I can run lots of simulations really fast, so I can take a lot of sample points. That requires slightly different algorithms, or at least you can benefit from the fact that I can take more sample points. Yeah, it's not all the same. There is reuse, but it's not 100% identical what we're using across all these different applications. [0:21:52] KB: Another question related to this. One of the big areas of challenge, I think, and people are running into this as AI leaks its way into more and more things, is that these move us from a very deterministic world of software to something that is much more about statistical distributions. The question of, how do you evaluate your models? How do you validate? How do you make sure that if you're changing the model that you're using, or you're training it further, you're not breaking old use cases? What do you all do for that? [0:22:23] TA: That's a very good question. The interesting thing though is that the AI technologies that we have are still companions to the underlying, traditional EDA software. Therefore, at the end of the day, what I'm replacing is what a human operator does. A human operator is not without error either. There's always checks and balances, because let's say, you implement a design and let's say, I relied on some ML model to tell me which knobs to tune. In the end, there's always a verification step that is independent, where I can make sure that the result that comes out indeed is accurate. Having said that, it is true that I think it becomes a little bit maybe more difficult to debug. If there's a problem that happens, where does it come from? Is it because the statistical model in the ML model wasn't correct? It's just a different way, I would say, to debug a problem. It's not that in a non-ML, non-AI way it is all that much easier, because there, like I said, I have to rely on humans, and humans make mistakes, possibly more mistakes than a well-trained model. [0:23:30] KB: Yeah. I love a thing that you said there, which is, I think, many of the most fruitful applications of AI to date are those in which the generated output has a non-AI validator in some form, or other. [0:23:44] TA: Yeah, exactly. [0:23:44] KB: That was your space. You were validating all these things, so it's a natural fit. That makes a ton of sense. [0:23:51] TA: When people ask me, what is AI good at? I always say, well, in my mind, the intention is to, essentially, automate, or at least assist with, the human efforts. To give you an example, let's say, in our space, when I go from the logic gates to the physical design, one of the common steps there is what's called physical placement.
Let's say, you have one million logic gates that I need to place on my die; actually, today, it's way more. It's probably 10 million. Now, for these types of things, I think it's pretty obvious that a human will never be better than some algorithm. Therefore, to me, this is not really an application of AI. It's a different problem. It's an optimization problem. Optimizing this placement is similar to, say, graph optimization. I want to, for example, optimize the total wire length between all these cells that exist on my chip. There's no doubt that a traditional optimization algorithm is better than the human. It's like saying, I'm trying to compete with a calculator. There's no way I can ever do it. A human isn't really meant for that. A human is meant for cognitive tasks, creative things. That's what I'm trying to automate with AI, which is why, like I said, the AI to us is a companion, but it has not replaced, and I don't really see it replacing, the traditional computational algorithms, so to speak. [0:25:18] KB: That makes sense. You mentioned a little bit that this was not Gen AI. This was some of these more traditional models. Are you using Gen AI in any domain? [0:25:27] TA: Yeah, of course. I mean, who isn't? Everybody's working on Gen AI. The thing is, of course, Gen AI is, I think, for different applications. Gen AI is very good at summarizing lots of content, or taking lots of documents. You can ask questions about certain things. It's good at generating content also. It's a different application that is really complementary to the other AI optimization that I just talked about, but it's not replacing it. It's not necessarily better. It's just different. A good example, and I think every software company, or actually, not just software, any company is doing that, is essentially building some type of chatbot that answers questions about our tools. We have endless documents. Because our software is highly complex, it is a lot more than user manuals. We have user manuals. We have lots of other documentation that gives you guidelines on how you can use the tool to better design your chips. Building a system there that answers these questions, initially just in a static fashion, meaning like ChatGPT. I ask a question, it gives me an answer. I think that's an obvious thing. That's what we have already developed. We call this the Synopsys.AI co-pilot. Obviously, it goes much beyond that. Ultimately, what we're working on is technology where I can ask questions in the context of my actual design. I don't just ask the question, how do I run this tool? What does this command do? How do I set this up? Instead, I want to be able to have a window where I say, what's happening in my design? There's a problem. How do I fix it? These are the things we're working on. Then, of course, agentic workflows. I mean, that's the big topic. Automating some of these more complex tasks, where a human is involved today with reasoning ability, analyzing a particular pattern and saying, "Ah, I know this. I've seen this before. I'm going to do this as an action, or at least try multiple choices until it is resolved." These are all the things Gen AI is really good at and where I think we see a lot of potential in further automation. [0:27:37] KB: It sounds like, this is in progress, so I don't know how far you can go on this, but I'd be really interested to understand a little bit, first, starting with that answering questions in the context of where I am.
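Circling back to the placement objective mentioned a moment ago: "total wire length" is commonly estimated during placement with half-perimeter wirelength (HPWL), the size of the bounding box around each net's cells. Here is a minimal sketch of that metric on a toy placement; the cells, coordinates, and nets are invented for illustration, and a production placer optimizes a far richer cost than this.

```python
def total_hpwl(cell_positions, nets):
    """Total half-perimeter wirelength: for each net, the width plus the
    height of the bounding box around the cells it connects."""
    total = 0.0
    for net in nets:
        xs = [cell_positions[c][0] for c in net]
        ys = [cell_positions[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Toy placement: four cells, two nets.
placement = {"a": (0, 0), "b": (3, 1), "c": (1, 4), "d": (2, 2)}
nets = [("a", "b", "d"), ("b", "c")]
print(total_hpwl(placement, nets))  # (3 + 2) + (2 + 3) = 10
```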
One of the big challenges with all of these things is, how do you figure out what's the right context to load up into your Gen AI agent, or chatbot, or whatever you're doing, to be able to give relevant information? In the context of Synopsys.ai, or Synopsys tooling, what does that context stack look like? How are you thinking about what we pull together to put into this chatbot to allow it to understand your current situation? [0:28:15] TA: When you think about it, it's not as difficult as it may appear at first. Because, again, if I think about, what does a human do? When a human, let's say, interacts with the software, there's ways of just launching something with a bunch of commands, but there's a lot of interactive work, where he has to look at images of the layout, for example, and see there's a problem here and then they do this, or he looks at, say, reports. A typical example is if you don't meet your performance targets, like your frequency is not met, say you have too many levels of logic in your gates, or they're placed poorly. People will look at the layout, they will look at timing reports that show you the sequence of the gates in those limiting, essentially, timing paths. This is all, if you think about it, how a human, essentially, primarily interacts with our tools: through visualization, through looking at reports, and you can automate this. You can have the system read the same reports, look for patterns of what stands out, teach the system, if I see a particular pattern, what action would I take? While it's, of course, much more complex than, say, going to the doctor and saying, "Well, my knee hurts," and the doctor, he has a decision tree in his head. He will say, "Okay, let's do an X-ray. Let's see what the X-ray says." The X-ray comes back and then he either sees something, or doesn't see something. Maybe he orders another test, right? It's a relatively simple decision tree, comparatively, I think, to our world, that he goes down, but we can also create those decision trees. Humans would say, "Okay, I have this problem in my design. I'm going to look at my layout map and I see, aha, there's something we call congestion. Meaning, there's too many routes in a particular area." There's ways to fix it. I can teach those things into the system. The hard part, I would say, in this whole endeavor is getting all this information into my system, because many of the more complex things, they're not necessarily documented. User documents have lots of information, certainly, but in the day-to-day job, what people see and how they solve problems, they know it, they have it in their head, but it's not written down. One of the big tasks for us is to extract this knowledge from people's brains and get it into the system. That is one of the challenges, I think, that we're tackling. [0:30:38] KB: I'm working on this in a different domain, but this concept of, we have so much tacit knowledge. It's not written down, it's not even well codified, and most people are not very good at expressing it. They can't tell you how they're doing it, but it functions. How are you approaching that? How are you drawing out that tacit knowledge and crystallizing it into your tooling? [0:31:02] TA: Okay, that's a very, very good question. The first one is, I think you have, I would say, trusted structured data. For example, when I built a chatbot about our user manuals, we have trusted documents that exactly describe the behavior of certain things.
There's a certain amount where I can ask people to just write down the common things. But the thing is, I think that's a good baseline. The truth is, I also need to be able to have a system that can, essentially, consume all kinds of maybe less trustworthy and unstructured data, which exists out there. For example, we have annual user conferences, and people submit papers there, and they describe how they solve particular problems. They're in PowerPoint, they're in Word. They may be from five years ago, maybe they're not applicable anymore. But I have a lot of data that, I would say, isn't 100% trusted, but it does have value. The hard part there, I would say, is I need a system where I can have, essentially, sources from 100% trusted data, but I also need to consume all these other documents that may have some nuggets of good information in there. There are some pretty good breakthroughs, I think, in the industry. For example, I don't know if you are familiar with this deep research capability that was released a few months ago, that essentially, has the ability to do research, and it takes longer. It's not a quick look-up where I can give you an answer in seconds. It, essentially, almost acts like you have a PhD student and you tell him, write me a thesis about this thing, and the guy goes to the library and searches all this information. He finds something here and then, ah, there's more information there. It builds the answer from that. Anyway, we're using this type of technology and we think that's quite powerful. [0:32:56] KB: Yes. Actually, this question of what's trustable and what's not is also really interesting. We had another Software Engineering Daily interview where we're looking at knowledge graphs and talking about that. I'm actually curious. One of the ways I think about that is you essentially want to construct a knowledge graph of facts, but you want those facts to both have backlinks to their corroborating evidence and some amount of maybe Bayesian, or other confidence level on them. How are you approaching it? This is really what I want to dive into. [0:33:26] TA: I mean, yeah, it's a combination. It's like these graph RAGs, which essentially is a combination of a knowledge graph, where you start building the relationships of what is trusted and what's not trusted. The deep research uses some of this, but it does more, because it can actually search all kinds of information that's not in the RAG. The thing with the RAG is everything in the RAG is there, and I can have a knowledge graph that structures it, where I get a sense of what's more trustworthy than other things. What I also want is live documents. I want to be able to actually search on the fly, and there may always be new stuff. In fact, every day I may have new stuff. It's a combination, really, of this graph RAG, which builds this relationship of trusted and less trusted documents, or data. Then, there's also the ability to research live things that are not in my RAG system. [0:34:20] KB: That is super cool. You talked a little bit about agentic workflows and agents and things like that. What does that look like in the Synopsys world? [0:34:29] TA: Actually, recently we introduced a high-level roadmap at our annual user conference. Our CEO, Sassine Ghazi, introduced, essentially, what we call the agent engineer.
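To ground the graph-RAG idea just described, here is a minimal sketch of trust-weighted retrieval: each document chunk carries a trust score (say, 1.0 for an official manual, lower for old conference slides), and retrieval ranks chunks by similarity times trust. The documents, scores, and crude word-overlap similarity are illustrative assumptions only; a real system would use embeddings, a knowledge graph over the sources, and live search on top.

```python
# Toy corpus: each chunk carries a trust score alongside its text.
DOCS = [
    {"text": "set_max_delay constrains the longest path between two points",
     "source": "user_manual_2024", "trust": 1.0},
    {"text": "we fixed congestion by lowering target density in that block",
     "source": "user_conference_2019_slides", "trust": 0.5},
]

def similarity(query, text):
    """Crude word-overlap similarity; stands in for an embedding distance."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def retrieve(query, docs, top_k=1):
    """Rank chunks by similarity weighted by how much we trust the source."""
    ranked = sorted(docs,
                    key=lambda d: similarity(query, d["text"]) * d["trust"],
                    reverse=True)
    return ranked[:top_k]

for hit in retrieve("how do I fix congestion in this block", DOCS):
    print(hit["source"], "->", hit["text"])
```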
In fact, I'm going to go back to a quote from Jensen Huang from NVIDIA, who said this, I think, two or three months ago. They're obviously one of our biggest customers, using our software in designing their GPUs. He had a quote saying that, ultimately, and actually, not that far out, there will be a human engineer sitting with virtual engineers. He mentioned even Synopsys engineers sitting there together, designing the chips. If you think about it, the way we envision this is that you initially have a bunch of, let's call them, little task-level engineers. They do small tasks. Let's say, you're an expert user and you're working on closing a design and you have a particular problem, let's say, congestion, or a timing problem, or something like that. I can have a little task-level agent that can solve this problem for me. It's like having a junior engineer that you give a task to, but not the most complex task. It's just a sub-task, right? We're building those task-level engineers, ultimately, of course. Now, I go back to the vision that our CEO has painted. Ultimately, we envision, of course, complete automation, and we model this along the lines of the self-driving car levels. We introduce level one to level five, where level one is, essentially, just an assistant that answers questions. Like chatbots, even context-aware chatbots that give you suggestions, but they don't necessarily do anything yet. You can ask them questions and say, how do I fix this particular problem? You can even ask it about your exact design to give you a suggestion, but it's just a verbal response. It doesn't do anything. The next level, and that's level two, are those task-level agents, where I have a lot of agents and they do very specific tasks. I have a problem with my timing closure. I say, you go look at it. Then it will run all kinds of analysis, figure out what the root cause could be, and then try to resolve it. There won't always be a single solution to it. It may have to try multiple things to solve it. Then the next level is an orchestration level, where you use all these task-level agents to solve a bigger problem. Then ultimately, at the very end, when everything is fully automated, which of course, will take time, you have everything automated, including the generation of the actual content. Because right now, we're only talking about running our EDA software. The other big piece is actually generating the inputs, like writing, essentially, the software spec. They start with specs that are written in human language of how this chip should operate. Ultimately, what I want is that I just describe what it should do and in the end, the whole chip comes out. That's, of course, a long way out. [0:37:36] KB: One of the things this brings to mind is, in the software world, there's a lot of talk about the way that AI-assisted coding tools are ramping up productivity and even changing the structure and size of teams. Maybe, instead of having your classic six software engineers, a PM and a designer, you just have two or one and you have these AI agents that are assisting. How does that look in the hardware world today, or where do you see this going? Is this also changing the team structure of how people are building chips? [0:38:04] TA: Definitely. It definitely will. In fact, there is a component that's very similar to the software industry and that's the part where I'm actually writing the description of what my hardware should do. Again, that's written in software.
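As a rough illustration of the "level two" task-level agent described above, here is a minimal sketch of an agent loop for one narrow problem, timing closure: analyze, try a fix, re-check, and escalate if the candidate fixes run out. The fix names, the slack numbers, and the stubbed tool calls are all invented for illustration; they are not Synopsys commands or the company's actual agent design.

```python
# Hypothetical candidate actions a timing agent might try, in order.
CANDIDATE_FIXES = ["upsize_critical_cells", "restructure_logic",
                   "relax_placement_density"]

def run_timing_analysis(design):
    """Stand-in for launching the real timing tool and parsing its report."""
    return {"worst_slack_ps": design.get("worst_slack_ps", -35)}

def apply_fix(design, fix):
    """Stand-in for an actual tool command; each toy fix claws back some slack."""
    gain = {"upsize_critical_cells": 20, "restructure_logic": 15,
            "relax_placement_density": 10}[fix]
    design["worst_slack_ps"] = design.get("worst_slack_ps", -35) + gain
    return design

def timing_agent(design, max_attempts=3):
    """Analyze, act, re-check; give the problem back to a human if it can't close."""
    for fix in CANDIDATE_FIXES[:max_attempts]:
        report = run_timing_analysis(design)
        if report["worst_slack_ps"] >= 0:
            return "closed", design
        design = apply_fix(design, fix)
    report = run_timing_analysis(design)
    status = "closed" if report["worst_slack_ps"] >= 0 else "escalate_to_human"
    return status, design

print(timing_agent({"worst_slack_ps": -35}))
```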
You can say, that's very similar to writing C or Java code. It's just a different language, right? Verilog, for example, is one of the languages. It's just a different language. There, you have coding assistants and they help you write modules, they help you understand the context in the bigger picture of your chip design, and they, of course, will reduce the size of those design teams. Then, there's the guys that take that input and essentially run it through the tools, and there's more inputs, by the way. There's all kinds of other descriptions of how your power tables should look and your test structures, and there's lots more specification that needs to be done. All these things, I think, will be automated more and more, initially just with assistance, but ultimately, more and more, and that will reduce the size of design teams. Having said that, the other thing to consider though is those chips are getting bigger and bigger and bigger. Therefore, I'm not necessarily saying people will lose their jobs. I think people will do bigger chips, or more chips, with the same amount of people. That's how we see this. [0:39:27] KB: There's a related question I have here. One of the things I see in the software impacts is we will probably actually see many more software companies that are small, because the cost of creating software is going down. To what extent do you see these changes changing the business model in the EDA space? Is this actually making fundamental shifts to how this all works, or is it more the incremental improvements, like you were talking about today, where it's like, oh, it helps this person in this specialization do their job faster? [0:39:57] TA: That's a very good question. Even in our industry, there's a lot of smaller startups and they weren't necessarily driven by what you just mentioned, that it just becomes easier to design chips. But we actually have a lot of startups in the AI chip space, right? Because while we're building the chips to run AI, those chips are then again running our software on top. That's the interesting cycle there. There's a lot of demand for new chip designs and therefore, we have quite a lot of startups. Having said that, I think the world is a little bit different than software design, and I'll tell you why. A lot of the stuff we're building, we're obviously pre-training it with data that we have. The world is a little different in the sense that the amount of data that's, let's say, publicly available, or even available to us, is limited, and every one of the big chip companies has their own secret sauce. They have their own designs and have their own ways of writing certain things. They, of course, are very protective of their IP. Because of that, when we ship our software, we pre-train the models, but we always have to train them further at the customer site. Then this customer has a special version of our software that was trained on their designs. Why do I mention this? I mention this because, actually, for small companies, it makes it harder, because the training data isn't available to everybody. Unlike in software; in software you can argue the amount of C++ code that's out there to train a system is very, very, very large. The amount of hardware design that's out there is much smaller and it's never the cutting-edge stuff. That makes it very hard for smaller companies to be competitive, because they will not have the training data to make those systems work as well as the big companies.
[0:41:44] KB: Yeah, that makes a ton of sense. Because if you think about foundation models, they're trained on the whole Internet, essentially. Not quite, but that order of magnitude, and millions and millions of lines of code. As you say, the hardware world is much more sparse in terms of the data that's available, and you have all of these different layers that I assume have quite different data as well in terms of what they're outputting. Are there any particular ways that you think about training models to survive in that data-sparse world? [0:42:17] TA: Essentially, the data-sparse world has been with us from the first day when we started this, and I always knew that. The way to get around it is, to be honest with you, there's all kinds of approaches, like synthetic data, but I personally don't see this as the answer. I think the answer is that you have customized models that you train at the customer site. Customers are perfectly happy with that, as long as those trained models don't leave their site. To give you an example, let's say, we have a capability that generates this RTL code, right? You have a natural language description that says, write me a component that does XYZ, a counter, or something more complex. Then it will just create that for you. Now, you have the basic capabilities, but then we ship it to our big customers and then we train it on all the RTL that they have, and you can train that, A, through the RAG system, and B, through fine-tuning of the weights in the LLM. Then, again, they have their specialized version. We give them the pipelines to do that. They just enter their data. We help them, essentially, update the weights and update the vector DBs, and so on. Then they have a specialized version. That's how we deal with the sparse data. We fully recognize that, of the data that's out there, we have part of the data and our customers have the other part of the data. We essentially work together to make it happen. [0:43:40] KB: That makes a lot of sense. To understand, you're almost building these foundation approaches, but your expectation is people won't use that out of the box. You also provide to them the foundational model and the pipeline for fine-tuning it, and here's how you ingest your data sources to become part of our RAG source of truth for you. [0:43:59] TA: Exactly. The interesting thing is actually, I think our customers like that, because it finally gives them a way to be differentiated. If I just have traditional tools, everybody gets the same tools, right? Then, I remember even 10 years ago, they would come to us and they would say, "But how do we get something that's special just for us?" You know what I'm saying? That actually is now finally enabled. In the past, that wasn't the case. Everybody got the same release of the software. They can tune it a little bit. They can write layers of scripts on top and so on, but it's much more limited than what you can do today. [0:44:35] KB: Yeah. That's really nice. It also, for you, provides a little bit of lock-in, because they put that investment into getting their stuff tuned into your system. [0:44:41] TA: Exactly. Exactly. That's very true. It's an ecosystem play. That's very true. [0:44:46] KB: That's very nice. A question that comes to mind for that, though, is, say, you develop some updates on your side, or you have a new model that works better at that foundation level. How do you then roll that out to customers who've already done all their layers of fine-tuning and improvement on top?
[0:45:04] TA: Yeah, that's a very good question. I would say, I don't think we have the final answer there yet, but we have already experienced, in the development over the last two years, that there are constantly new models. Now, the nice thing is, at least for RAG-type systems, you can usually plug in a different model and it still works fine. For fine-tuning, it's true, you may have to rerun things, but it's usually not that expensive to do. It's like an overnight compile. It's, practically speaking, not a big effort to make it happen. [0:45:35] KB: Makes sense. I realized we're coming close to the end of our time together. Is there anything we haven't talked about yet that you think would be important to touch on? [0:45:44] TA: I think we talked about pretty much most of the things. I mean, I touched upon the agentic thing, and maybe I just wanted to highlight again that, in a world like ours, the agents are pretty complex to build, because there are just so many choices. It's not like an HR workflow, where there's 10 choices I can make. If it's this, then I click on this, and if it's that, I click on that. Our world is so much more complicated. Nonetheless, I think I'm very excited about agentic Gen AI and what it can do to automate, essentially, these tasks that humans don't really want to do anyway. You know what I mean? Because they're annoying. I think our goal is, and our hope is, that the creativity of the human remains. I mean, this is almost a philosophical question, at which point, or if at all, will AI supersede human creativity? It's a very philosophical question. I think at this point in time, I would say, nobody should be afraid of AI. They should embrace it, because it will make their life much easier, because it takes away the tasks that you don't want to do anyway. They're tedious and they're annoying. You can focus on much more creative things. [0:46:51] KB: Yeah. I think it's an interesting balance we're all trying to figure out, right? Is this going to take our jobs? Where is it? From what I've seen in the software world, and tell me if this is the same for you in EDA and hardware, you still have to do most of the decision making. You're still having to say, "This is what I want. Even, this is the right architectural approach." Sometimes the tool will suggest something to you. But if you turn your brain off, you're going to end up with garbage. [0:47:17] TA: Yes, very true. Honestly, I think that will be the case for many more years, that the human still has to be in charge. That is just very true. [0:47:26] KB: Awesome. One last question, so we talked about agents and maybe that's it, but are there any other big opportunity areas, or unsolved issues that you're looking at, where you think there's some really interesting stuff to come? [0:47:39] TA: There's lots of unsolved things, but I don't know what the solutions necessarily will be. I think from the agentic side, the one part that everybody is still working on is improved reasoning capabilities. You have all these AI tests, for example, that essentially test how good the reasoning is for particular problems, but these are always very specialized things. Almost the things that maybe you give a student in a SAT test, or something, or recognition of certain patterns. Almost like what you would see also in IQ tests. They train them for these particular problems, but that still doesn't mean they can solve other problems so well.
My point is the reasoning abilities still need to improve for us to make this agentic world a reality. I think there's still quite a bit of work to be done. Having said that, I think things are moving at such a fast pace that I'm sure if we were to meet next year, maybe this will already be solved. [0:48:38] KB: Well, and this is a place where you may be well positioned as well, because I think the reasoning capabilities, as you highlight, are often being trained via reinforcement on particular types of reasoning activities. The reasoning that goes into hardware design is closed off. It's not out there in the public space. [0:48:57] TA: Exactly. [0:48:58] KB: It's not going to be there in the folks that they're hiring to put these things through their paces. But who has access to that? You do. [0:49:06] TA: That is very true. That underlines the need for domain-specific LLMs that are trained on that. Yes, the data isn't publicly available. Therefore, it's a good problem to have. [0:49:18] KB: Let's call that a wrap. [END]