EPISODE 1635

[INTRODUCTION]

[00:00:01] ANNOUNCER: Antibodies are a type of protein molecule produced by the immune system. They recognize and attach to other molecules with remarkable precision. Typically, antibodies target foreign objects like viruses to mark them for destruction. However, they can also be engineered to treat diseases like cancer. And they are one of the fastest-growing classes of drugs.

Recently, AI-driven antibody engineering has taken off, and BigHat Bio is one of the leaders in this revolution. Eddie Abrams is the Chief Information Officer at BigHat. He joins the show to talk about protein engineering, what's different about software development in biotech, how the engineering team is organized at BigHat and more.

[INTERVIEW]

[00:00:51] SF: Eddie, welcome to the show.

[00:00:52] EA: Thanks, man. How's it going?

[00:00:54] SF: It's going well. Thanks for joining me. I'm excited to dive into the world of biotech and biosciences today. It's something I have a little bit of familiarity with. I did a brief stint as a postdoc in bioinformatics, so I've touched on this. But certainly, I think similar to you, more on the engineering side than necessarily the biology side. A good place to start is maybe with some basics. Who are you? What do you do? And how did you get to where you are?

[00:01:19] EA: Sure. I'm Ed Abrams. I've been a software developer for about 25 years. I actually came to it from a totally unrelated background. I was a PhD student in philosophy. And in the late 90s, I went ABD, which is all but dissertation. And I then sort of simultaneously kicked off my software career alongside finishing up my dissertation. It had been a personal interest of mine since the mid-90s. Actually, when I was a kid, I took the Boys Club BASIC class on a VIC-20. I've been doing full-stack software development pretty much ever since the web boom.
I started out on the old virtual malls, which were one of the big early things in the 90s, and then worked my way up. And, honestly, I didn't even start out doing any kind of healthcare tech. I was mostly doing web tech, website creation and imagery tech, up until about 2013, when I met Mark DePristo, who was looking for an architect for a company he was working at called Synaptics that was doing cool stuff with metabolomics and genomics for detecting autism. That's where I really cut my teeth on healthcare tech and life sciences.

And ever since then, I've been bouncing back and forth a bit. And I ran into Mark again when he was starting his company, BigHat. And he asked me to come and head up engineering. I'm the CIO right now of BigHat. We've been together for about four years.

[00:02:41] SF: Awesome. Basically, philosophy PhD to software engineer, to now doing engineering in the bioscience space. A really, really interesting career path. And I'm sure we'll get into a lot of the details of how engineering within this space might differ from more traditional software companies and so forth.

But I wanted to give a little bit of background about BigHat, just because I think that'll help ground some of the conversation. BigHat is designing safer, more effective antibody therapies for patients using a combination of machine learning and synthetic biology. For context, what is an antibody therapy? And what's an example of one?

[00:03:27] EA: Yeah. Sure. I mean, I think, typically, when you think about drug-based therapeutics right now, you're looking at biologics on the one side (large molecules, antibodies, proteins) and then small molecules, which have traditionally been where therapeutics lived.
And I think that there's been a transition point over the past decade or so, where the biologics now actually represent the biggest and most important therapeutics that are coming out year-over-year.

And so, we focus on the antibody side. We're building complex, large molecules against various kinds of targets in the viral, or cancer, or other spaces, that can eliminate or reduce disease.

[00:04:06] SF: Mm-hmm. There are a lot of things that we can do as software engineers. A lot of things that we can build. And maybe one of the exciting parts about being in this space is the sense that you're doing something that is actually helping someone directly, rather than a lot of other things we could do that are maybe less directly helpful to somebody. But what is the problem that BigHat is solving that maybe other companies haven't addressed in this space?

[00:04:30] EA: Well, I think that there are a couple of things going on in the therapeutic space. One of them is that a lot of therapeutics are hitting a wall, where the low-hanging fruit of the things that we could do with therapeutics in the old-school, rational scientific way has hit a limit.

We have this idea of this thing called Eroom's law, which is the opposite of Moore's law, that says that finding a new therapeutic gets more expensive and takes longer year-over-year, therapeutic-over-therapeutic. Because the rational method, of just biologically investigating a particular molecule, figuring out how to decompose it into parts, trying to replace one of its parts to target something new, was very, very fruitful. But you run out of low-hanging fruit, and then you start getting into this more complex space where you're like, "Well, this thing would have worked better if we did this. Except now it doesn't produce very well."
Or, "This thing would have been really on target, but it also triggers the human immune system in a bad way, or it kills other human cells."

And so, this has prompted us to think about how we could better explore the vast, vast diversity of the protein space, right? You have this idea that there are these long chains of amino acids, and they represent this huge space of possible proteins you could build. And the old rational approach can really only scratch the surface, at tiny points that are near other things that we know already work.

And what we hope to be able to do is utilize machine learning and artificial intelligence to make massive progress in mapping out that space better. That will allow us to target diseases that were not targetable, practically speaking, the old way. It'll let us do it more cheaply, more quickly and more reliably, so that the cost of such therapeutics could go down and become accessible to more people. And I think it allows us to explore other kinds of spaces that maybe we haven't even considered yet.

[00:06:27] SF: Yeah. In my understanding, in the space of drug discovery and creating therapeutics, the timeline to go from idea to actually delivering a product to market is a long time, because it's heavily regulated. You have the FDA and so forth. And there are a lot of stakeholders involved in different parts of the process. Any level of optimization you can do to reduce that time span would dramatically cut down the cost of actually delivering these therapeutics.

[00:06:59] EA: Absolutely. With some of that stuff, too, we have not yet reached the point where we've been able to reach all the way down the clinical pipeline and try to optimize other parts of it that we haven't seen yet. But it's actually kind of interesting, because we came about pretty much right at the beginning of the pandemic.
And it was actually very interesting to see just how quickly you could get a vaccine approved when you really pushed hard on every possible lever that you could.

And so, I think that there's going to be this follow-up trend where people are going to be investigating more and more exactly the things you said. Like, how can we safely, reliably and in a really thoughtful way reduce the time frame between conceiving of something, getting it into clinical trials, and then getting it back out safely and into humans?

[00:07:53] SF: Yeah, absolutely. I want to start to transition into talking a little bit about how engineering at a company like BigHat works and how that might be different from traditional engineering. I know you previously worked at GoDaddy and probably some other traditional software companies throughout your 25 years working in the space. But how is software engineering in the life sciences perhaps different from some of these other experiences?

[00:08:16] EA: There are a couple different things I think are worth mentioning. One thing that I've experienced myself, especially growing up in the 90s boom and then in the mid-2000s when Web 2.0 was getting to be pretty big, is that, when you're working on software for software's sake, the engineering organization plays a bit of a different role in the company than it does when you're in life science, or in any kind of industry where you're using engineering to do something that is fundamentally not about engineering.

We're not trying to build good software at BigHat. We're trying to make really awesome therapeutics. And so, the way you conceive of engineering isn't product-oriented in exactly the same way. When you're at a consumer-facing company, or even a B2B software company, you might be thinking, "Well, we can get to know our customers. We can make sure that the software is what the customers want."

Well, the customers at BigHat are BigHat.
The engineering team is still product-focused, but in a very different way, with a much smaller and in some ways more diverse set of needs. Because if you say, "I'm going to make a photo sharing site," well, everybody out there that's going to use it is going to use it for photo sharing.

Inside the company, there's a diversity of different roles that can all take advantage of engineering, automation, and cloud-based services. It's not that there's just the people who want to see the photos. You've got your laboratory scientists, and your program managers, and your bench scientists, your data scientists, your machine learning team. Everybody's there and they need to use your software. But it's not the same kind of relationship. You have a much smaller N of customers, and they're much more diverse.

[00:09:58] SF: Mm-hmm. Are some of the technical challenges you might face different as a result of that? I'm just thinking in terms of scale. If you're building a consumer application, then some of your scale problems are, how do I scale this to a million, 10 million, 100 million users? You're not going to have 100 million users on your end, but you probably have scale in some other aspects that are maybe more down the stack, or processing-intensive, for some of the machine learning work that you're doing, for example.

[00:10:24] EA: Yeah. Actually, one of the things that surprised me, coming at this as a generalist engineer, is that, when I look at life sciences engineering as a whole, one of the things I can see happening is that more and more people are realizing that having really good software generalists is very valuable, as opposed to getting people who, for example, might have come through the life sciences and learned to program on the side.
There are a lot of cases where you can find these pockets of really smart people who know how to make enough software to kind of get the job done, piece by piece. But then the company grows, they come together, and now, all of a sudden, you've got infrastructure problems. You've got technical debt problems. A lack of adequate CI/CD.

And so, I actually think that the really wonderful thing is, surprisingly, a lot of the problems are exactly the same as the ones you'll find elsewhere. When we look at, for example, pipelines, and data management, and ETL. Oh, my gosh, it comes straight out of analytics processing, which I did at a company that had nothing to do with life sciences. And so, actually, I think the interesting lesson is that really great engineers and really great engineering can really help life sciences.

There are differences, as you mentioned. We're not going to be as deep into enterprise kinds of features. We're not going to start off with the need for super fine-grained access control. We're not going to be trying to make third-party identity services to let people have multi-tenant access to BigHat. We're just BigHat.

On the flip side, we'll be doing a ton of asynchronous job execution. We'll be doing a ton of distributed programming. A ton of model development that might have scale, like you said, in a different area or in a different way. And then, outside of the strictly technical, the fact is that engineers and scientists come from very different backgrounds.

And so, one of the major skills that helps the engineering organization and the lab operate effectively at BigHat is that we hire both sides with the idea that everybody's going to be working really closely together. And so, we absolutely crave having people who can really stop and listen, and think about things a different way, and learn from each other, and then move forward and work together.
And that I think actually is the most unique thing. Not the technical challenges in the end.

[00:12:54] SF: Yeah, I see that. And then does the company end up thinking about resourcing differently? In an organization that is maybe not traditionally what would be considered an engineering-first organization but, in this case, a biology-first organization, would it be harder to make a case for why we need certain engineers on the team versus, "Okay. Well, they're fine. We need to bring more biologists in," or something like that?

[00:13:26] EA: A couple pieces there. One is this question about – actually, I'm not exactly sure what the thrust of the question was. There are two different pieces I can see. One is that it can be very difficult when you're in the leadership of a biology company, where everybody isn't as familiar with the needs of engineering, and not really thinking, what's the right size of the engineering organization? How do we know when to hire an engineer versus when I need another protein scientist? When I need another translational director?

That, interestingly, is where I come in. Because, in a way, operating in the engineering leadership of a company that is further from engineering than other companies I've worked at, a lot more of my job has to do with making sure everyone around me actually understands what it is engineering is doing and why, and how it connects into the company goals.

We do this sort of Google-style OKR, objectives and key results, analysis every quarter. And it's up to me to work with leadership to make sure that everybody's synced in terms of how engineering is thinking about these problems and what engineering effort is needed. And, therefore, how the team can grow. That would seem one of the possible directions of your question.

[00:14:39] SF: Yeah, that makes sense.
And then part of your job as an engineering leader within an organization is to make sure that people understand what engineering is actually doing. Now, on the flip side of that, how much do you need to know about the science and the biology in order for you to do your job?

[00:14:55] EA: It's kind of funny. I would say it's very important that I understand a cross-section of the mechanics of how the lab works, a cross-section of how the interests of the scientific team need to be represented, displayed, and visualized in software. It's not like I'm going to end up being an expert protein scientist.

But it's actually kind of funny, because I was talking to Mark about this really early on at BigHat, and he made this comment that turns out, I think, to be exactly right. He said, "You're going to end up in this situation where you know a few things, scattered across everything in the science of antibodies, that will make you seem like you're a postdoc expert at these narrow things. But then between all those things, there'll be large gaps where you don't necessarily know, or need to know, much." And I think that's exactly right.

What ultimately happens is we have these customers who have these use cases. They want to be able to run their data pipeline, get an analysis, and see progress round over round for the molecules we produce and how effective they are. And you're like, "Okay. Wait, that's a problem I can wrap my head around."

Then it goes down a level. "Oh, what about the biophysical characteristics?" "Well, tell me about that." What do these numbers coming off of these instruments in the lab mean? And so, we have these collaborations with data science, with machine learning, with the laboratory team itself, where they're sometimes like, "I need to tell you what this is. I need to explain it to you so you can actually get me something that I need." It's this spotty cross-section of moments of expertise.
[00:16:32] SF: And then how are the teams organized? I mean, you have a variety of different experts that need to be involved, both on the engineering side and machine learning, as well as on the actual science side.

[00:16:45] EA: Right. Well, we're sort of a matrixed organization. Engineering, for example, does roll up to me. And we have our own scrum, and we have our own technical review meetings, and sprint planning and stuff like that. But then each engineer ends up being part of these sub-teams that are matrixed into project teams, or matrixed into cross-company teams, that are trying to tackle a particular problem or a particular new feature that we want to develop.

And so, there ends up being this sort of complicated system where, in the engineering team, we are doing traditional scrum, and we're tracking how the engineering work is happening across all these different efforts. And the teams are organized into these sub-teams. And the sub-teams are dynamic.

For example, we only have about 10 engineers. And so, we're not trying to reorg every time the workload changes or anything like that. What we do instead is we have these epics, these sort of scrum-style epics. And every time we generate a new epic, we look at who's available, look at who's finishing other epics. And we dynamically rebalance the teams so that we can have a certain number of tracks of these epics going at the same time.

And each of those tracks almost always has some partners in other teams: data science, machine learning, laboratory, science programs. Those partners help them work with our POs, our product owners, who coordinate customer vision with engineering implementation, do user testing, and see that we solve some problems for them.

[00:18:25] SF: Is the product owner essentially your version of the product manager?

[00:18:29] EA: Yeah. Totally.
This is a traditional scrum idea of a person who owns the customer's voice to the engineering team.

[00:18:36] SF: Okay. And then you have essentially all these cross-functional stakeholders that are also involved with each of these epics.

[00:18:44] EA: Yep. Absolutely.

[00:18:45] SF: How much of the data that you're handling is regulated information? When I think of bioscience and health tech, a lot of the time you're dealing with heavily-regulated information. How much of that potentially impacts the way that you think about your engineering from a security and privacy perspective?

[00:19:06] EA: Right now, we're in the pre-clinical phase. That is to say, we're not in front of actual customers. We're only a few years old. We're getting there. We're almost there right now.

At the moment, we don't have a lot of clinical requirements. We don't have FDA requirements or HIPAA, because we don't have any customer data. What we do have is partner data. We have partnerships whose data management is governed by contracts. And we have our own data.

From the standpoint of internal privacy, data protection, partner data protection, and proprietary information, we're SOC 2 Type II audited and verified, so that we can be very certain that our data standards, our data policies and our data practices are all very good.

When we hit the clinic, when we actually start ending up with our data potentially having clinical results, we'll have to investigate exactly how we want to manage that. Typically, in a lot of cases, you partner with somebody to go into the clinic. And so, they could end up being the only ones with exposure to the regulated data. Or you can go with a CRO, and we could, again, have an organization that's just reporting back to us, and they would be responsible for that.
If we end up onboarding any systems that are going to do data ingestion, then we'll have to take those systems and expose them to a much higher level of clearance. But we're pretty familiar with how to do that. Because, again, at Synaptics, we had exposure to how we would do that in the case of a medical diagnostic device, or a software medical diagnostic device, which gets you into approximately the same kind of regulations. We're just not there yet.

[00:20:53] SF: Mm-hmm. And with the diagnostic device, you end up with some of that data falling into an analytics store. Where does that data essentially enter your systems?

[00:21:03] EA: What do you mean exactly by enter our system?

[00:21:05] SF: Essentially, you're collecting this information from the device. Going back to the use case that you were talking about before, where does that data go from the device to the point where it's touching your infrastructure, and where do you house it?

[00:21:18] EA: When I say a software medical diagnostic device: there's this class, that the FDA recognizes, of diagnostics that are implemented as software rather than as devices. You can have, for example, an X-ray machine or something. That has a whole bunch of FDA clearances that you need to get in order to be able to use it on patients.

In the case of Synaptics, for example, what we were looking to do was collect blood samples from patients and run them through machine learning analysis to determine whether the blood sample belonged to somebody who was likely autistic or likely otherwise developmentally disabled.

And so, for that kind of diagnostic device, a software-based one, you would have to be FDA-cleared through your ingestion of the data from the client, from the patient. You'd have to have an FDA-cleared software path all the way through your system that resulted in a diagnosis.
And that would be the section of the system that had to have the transparency requirements, the data privacy requirements and other requirements from the FDA clearance for its operation, in order for it to be used to provide diagnostic advice to customers.

[00:22:34] SF: I see. Okay. And then I want to talk a little bit about some of the stuff that you're doing around machine learning. Do you have any insights you can share about how the AI-guided antibody design platform works? And how is the machine learning team organized at BigHat?

[00:22:52] EA: Yeah. The way BigHat thinks about the teams that are operationally involved – we call this system the closed loop. The closed loop is how we get from data about our proteins to actual optimized antibodies. And the operating of that closed loop is divided into two broad parts: the insights part and the mechanics part.

The insights part is the science, the data science, the machine learning. These are the functions of the company that are taking raw data and extracting from it things like, this is a higher quality antibody than that. Or, this has failed a QC check and is not reasonable to advance to the next phase of optimization.

Whereas some companies organize machine learning and data science under engineering, we think of them as part of science, because these teams, for us, are helping us with insight generation. They are by and large computational biologists who have a strong expertise in statistics or modeling, and who are using those skills to provide their part of the closed loop, which I'll describe as well.

My side of the fence, the engineering and mechanics side, is all about the infrastructure.
It's sort of like, "Well, if those people at the other end of the spectrum can be providing us with functions that generate cool insights, what we can do is take that, run it automatically, track provenance of data, and make sure that everything is running reliably, efficiently, cheaply, securely. We can develop new infrastructure for them to use so that their jobs are easier, so that their work can get done faster, so that there's less friction and less manual labor to get from data to insight." That's how we split those two things up. And so, machine learning and data science sit in the science org, but we're kind of joined at the hip with them, because, of course, everything that happens really involves both sides of that equation.

And that closed loop, how we generate that is we basically have a whole bunch of source data that lets us kick off – say, we're going to design a bunch of antibodies. Say we design a plate of 96 antibodies. We assay them all for all of their biophysical characteristics. We take those characteristics and use them to train new models. The new models are basically using an active learning loop, searching through protein space for sequences that they score very highly based on the data that came in. Those produce new shots-on-goal antibodies that we then produce and assess, and we keep turning the loop.

[00:25:30] SF: And where does the raw data come from for the actual model training and the tests that you're doing around deriving these insights?

[00:25:38] EA: We have a wet lab that we operate ourselves, on-prem. And we actually run all the biophysical assays, by and large, ourselves. And so, we have a suite of, I don't know, five to 15 assays we run at any given time. And we test thermostability, we test binding. We assess ourselves. And so, we have all these instruments that can take the antibodies we produce, in a certain condition, next to an antigen or blended with some other substrate, and generate this raw data.
Raw spectral data. Raw data that we can interpret as binding, or thermostability, or other kinds of key biophysical characteristics. And that goes directly into the data science layer, which can then model it, or QC it, or otherwise interpret it. And that's what ends up going into machine learning.

[00:26:31] SF: And how does the handoff work between the team that's working on the AI modeling, the data scientists, and the engineering team? Where does your team start to get involved, to maybe support being able to run these models at scale, or whatever it is that you need to do from an infrastructure standpoint?

[00:26:52] EA: Yeah. The way that works is we have a whole bunch of our own homegrown software. And that software allows you to do things like say, "Hey, I want to run this machine learning loop on AWS SageMaker." And so, what we have are all of these code repositories where people are writing PyTorch models and other kinds of tools for creating embeddings and structural information from proteins, so that those can all be stored as datasets. And all those datasets then proliferate in our data lake. And the machine learning then happens either in an automated fashion or manually, when somebody has some code they want to automate to run on this cloud infrastructure.

And so, one of the cool things about the way we've done it is that people are familiar with just running machine learning locally. You probably execute a file and you see some kind of curve, of a loss function or something, being generated over time. We allow you to write that level of code and try it out locally. But then you can just, with a slightly different command, run that very same code in the cloud.

And since we're a dataset-in, dataset-out kind of team, we always know that the inputs and outputs are all going to be in this data lake that we have.
And so, basically, the way it works is, as more and more experiments come rolling into our software, we're managing all the experimental interface in our homegrown tool we call [inaudible 00:28:20]. All those experimental data are saved and then turned into datasets. The machine learning loops run during new design rounds. And they produce new insights, which ultimately result in new antibody production. Those antibodies get into our ordering scheme, get produced, and then, again, go back into the loop.

[00:28:41] SF: You mentioned a couple things there. You have a lake involved with storing some of this data. You mentioned AWS SageMaker. What does the AI toolchain or infrastructure look like?

[00:28:53] EA: Well, we use PyTorch and PyTorch Lightning. And we have pretty much custom-built models of various kinds that are being ensembled together and put in active learning loops, so that they're basically exploring and exploiting a whole bunch of various proposed sequences. Those are generated by some kind of a search function that will generate a series of sequences that are in some way pre-qualified, so that we're not just randomly searching that space.

And then those machine learning models are running these loops, epochs of a train, test, validate kind of loop against those sequences. And so, with all the data that we collect, it's supervised, or largely supervised, learning, where we're like, "Oh, can we now train the model with the new data and then generate a new series of scores on a set of novel proteins that we want to produce?" And based on a ranking of those scores, we then go out and actually produce them and measure them and see if the scores were accurate. And that's what feeds back effectively into the machine learning. Because we then measure it sequence by sequence, antibody by antibody, and see how it actually did versus what the score was.
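The explore/exploit loop described here (a search function proposes candidate sequences, an ensemble scores them, the top-ranked candidates get assayed in the lab, and the measurements feed back into training) can be sketched in miniature. The following is a hypothetical toy illustration, not BigHat's actual code: the candidate generator, the `assay` stand-in for a wet-lab measurement, and the in-memory `ToyModel` ensemble members are all invented for the example. A real system would use trained PyTorch models and a data lake rather than dictionaries.

```python
# Toy sketch of an ensemble-based active-learning round. All names and
# the scoring logic are illustrative stand-ins, not a real platform.
import random
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def propose_candidates(n, length=10, rng=None):
    """Stand-in for the search function that proposes pre-qualified sequences."""
    rng = rng or random.Random(0)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
            for _ in range(n)]

def assay(seq):
    """Stand-in for a wet-lab measurement (e.g. a binding score in [0, 1))."""
    return (sum(ord(c) for c in seq) % 100) / 100.0

class ToyModel:
    """One ensemble member; 'training' just memorizes labeled sequences."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.data = {}

    def fit(self, labeled):
        self.data.update(labeled)

    def predict(self, seq):
        if seq in self.data:
            return self.data[seq]
        return self.rng.random()  # uncertain guess for unseen sequences

def acquisition(ensemble, seq, explore=0.5):
    """Explore/exploit score: ensemble mean plus a bonus for disagreement."""
    preds = [m.predict(seq) for m in ensemble]
    return statistics.mean(preds) + explore * statistics.pstdev(preds)

def run_round(ensemble, labeled, batch_size=8):
    """One turn of the loop: propose, rank, 'assay' the top picks, retrain."""
    candidates = propose_candidates(96)  # one 96-well plate
    ranked = sorted(candidates,
                    key=lambda s: acquisition(ensemble, s), reverse=True)
    new_labels = {seq: assay(seq) for seq in ranked[:batch_size]}
    labeled.update(new_labels)
    for m in ensemble:  # retrain every ensemble member on all data so far
        m.fit(labeled)
    return labeled
```

The disagreement bonus in `acquisition` is one simple way to get the "exploring and exploiting" behavior mentioned above: sequences the ensemble is uncertain about get ranked higher, so the loop balances measuring likely winners against mapping out unexplored space.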
[00:30:07] SF: And is the value of introducing AI into this process primarily about efficiency, or does it offer other advantages over doing something completely different?

[00:30:19] EA: Well, I think that efficiency is certainly part of it, in the sense that we do believe that this method will, in the end, more reliably and more quickly result in either a fast fail or a success in terms of some kind of design objective. And I think that we also believe that this method, paired with other methods like de novo generation of novel sequences, might ultimately unlock the ability for us to get to antibodies that are radically better, radically higher quality, than the ones that we would have been able to achieve using the methods that we had in the past.

I think there's an efficiency question, which I think is very valuable. But I actually think that we fundamentally believe that this method will both allow us to do things that were impossible before and, for even the things that were possible, achieve a higher quality.

[00:31:18] SF: And then I guess it potentially also will lead to things that maybe a person would not have thought of before, in terms of a new approach.

[00:31:27] EA: Yeah. That's exactly right. Although there are interesting questions the further you go out in the space of possibilities. As you get further and further from things that are natively seen in human bodies nowadays, you run risks of toxicity or negative human response to foreign proteins.

Yeah. I think that we think that the space of possible proteins is really, really huge. And so, when you think about approaching it only from the expert knowledge of biological paths that we understand in proteins today, you get the sense that we could just be missing remarkable opportunities that we'd have to luck upon otherwise.
[00:32:16] SF: I mean, it seems like treating drug discovery and therapeutics as more of an engineering discipline, versus essentially a luck-based thing, is fairly new. There's a lot of information to process. And it would probably be difficult to rely on only human expertise to accomplish a lot of these things. [00:32:39] EA: Totally. [00:32:41] SF: How does BigHat stay up-to-date with all the things that are going on in the space of machine learning as well as in bioscience? I mean, this is a very, very hot space. There's a lot of innovation going on. How do you kind of stay ahead of the curve with everything that's going on in the space? [00:33:03] EA: Well, I would say we're a pretty small team, but we have a really good cross-section of people who have been in the field for a really long time and people who are really on top of the academic paper stream and the industry publications that are coming out in this area.  And so, I don't know that I can offer that novel of an answer other than that we're very sharply focused on this. Our leadership and our experts on the various teams are just constantly digesting this material. And we tactically hire people into the roles that are critical on the various programs that we're running, and in the various scientific and medical spaces that we're interested in.  We're not yet at the point where we have generative AI telling us which are the best articles. But we'd like to get there for sure. It'd be awesome if we could have some digital assistant that was helping us focus. There are a lot of articles that, after we read them, we're like, maybe this wasn't so helpful. Maybe this wasn't really that interesting in the end for what we were trying to accomplish. Right now, we just have to slog through it. [00:34:20] SF: Mm-hmm.
And when you think about your own hiring and building your team, are you thinking about that differently than you would have been at a more traditional software company? Is it that you're focused on hiring more generalists because of the breadth of things you might need to tackle at BigHat, versus building out more specific teams that handle certain areas of the stack or certain areas of the product offering?  [00:34:49] EA: Well, I think that, in both cases, the way I think about it is pretty similar, which is something like this: you have this problem statement that you need to apply engineering to. And the problem statement could be decomposed into something like – let's pretend we're playing a video game and I need to design a character that's going to be able to kill a dragon with a sword and a shield.  Well, I probably want to advance some of the physical skill trees. Okay, he's got to have a strong constitution. He's got to have really great moves. Probably good sword skills. And you'll sort of level this character up on those skill trees based on the challenge that you're facing.  I think it's pretty similar on the engineering team. What I think is happening, regardless of the kind of specialization of the team, is that the specialization of the problem just tells you which skill trees you want represented across your team as a whole. For us, we're like, "Wow. We really need people who are familiar enough with the scientific landscape to really understand how to build the user interface that we need for our scientists. We really need the cloud people who can do this natively on AWS without breaking a sweat. We really need the data pipeline people who understand event-driven architecture and data management.
All the blood and guts of really doing schema version management. Stuff like that."  And once you've got those skill trees laid out in terms of what your team needs to look like, you have a little bit of flexibility for a while in your hiring. Because you're like, "Well, I don't need to hire individual people for individual specializations so much as I need to produce a balance in the hiring that ultimately matches the problem space that I'm trying to address with that team." And that's how I've thought about it so far.  And as you get further and further down the chain, of course, you might very well end up with, "Wow. We're doing this thing in engineering, which is dealing with a very specific instrument in a very specific case." And you might think, well, I need to find that expert and hire that person. That's been pretty rare. We've only made one hire like that on the engineering team, a specialization in lab automation, which I think is something that's very useful for the engineering team to have in a company like ours. Because we are doing a lot of liquid handling automation, and we're going to be doing even more in the future.  [00:37:14] SF: And then in terms of looking towards the future and some of the areas of innovation and research that are going on, clearly, there's a lot going on in AI. But even outside of that, are there things that you're particularly excited about investing in or adopting at BigHat that would help in terms of the mission around developing these various therapeutics?  [00:37:41] EA: I think a lot of the stuff that's getting more and more interesting right now, we have dabbled significantly in doing protein structure work alongside other work in order to provide more data for our models to learn with. That's obviously very interesting. And I think it'll continue to be interesting for a while.
I think there's a lot going on in terms of the lab automation stuff, as I mentioned, that I think will be very interesting. As you can imagine, really changing both the way that people's time is distributed in the lab and how often the lab can be operating even when there might not be anybody, or at least only a very small number of people, in the lab. This has to do with, for example, getting high resource utilization on all the instruments there by using automation so that they can be running even when nobody's around.  I'm excited about that stuff. I'm really excited about having a robotic arm, liquid handling and an automated lab. Because I think that will be a game changer. We always think about what we're trying to do in a way, like you mentioned, around efficiency. We're always trying to shoot for being able to do more, more programs, more partnerships, produce more antibodies, help more people, without having to grow the company so quickly alongside that. Because we fundamentally don't think that you have to.  And so, we're incredibly process-oriented, and we're incredibly heads down and fanatical about making sure that we apply smart automation decisions to support people being able to just live in their expertise. I want the data scientists to be thinking about the data really hard all the time, not worrying about engineering problems. I want the people on the protein sciences team to be able to just make innovative decisions about assay work and protein design work without having to worry about pipetting for 10 hours a day. [00:39:44] SF: And then I imagine some of the value around automating the lab, besides the efficiency gains, is there's probably some reduction of, I don't know, risk to people as well. Just working in labs can be – I haven't worked in a wet lab myself, but there's always, I think, danger in these different situations.
If you can take some of the human element out of it, maybe you reduce some of the risk.  [00:40:09] EA: I think antibody development, from what I've seen of our lab safety situation, is a lot safer in a lot of ways than dealing with a lot of different kinds of small molecules, which can be incredibly dangerous incredibly quickly for people. But you're right. I mean, I think that the more we can get people away from sharp objects and dangerous objects, the better for sure.  And I think, ultimately, that probably translates even more so to the patients downstream, right? We want to produce ridiculously safe therapeutics for people downstream. And what we've noticed is that, going from development of a new therapeutic to actually releasing it into humans, the failure rate is something like nine out of 10. And a lot of the time, it comes down to efficacy and safety concerns.  And what we want to do is change that ratio, too. We don't think that our destiny is to have nine out of 10 shots on goal fail. We think that, by applying all these best practices and applying this new data-driven, machine learning and automation-driven methodology, we can do better for patients.  [00:41:25] SF: Why is the failure rate so high? Is there a main reason why most therapeutics are part of the nine?  [00:41:33] EA: I'm not an expert on that, for sure. Like I said, I'm just reading articles in Fierce Biotech as they come out. But a lot of the time, I see two things happening. One, there's some adverse event during the clinical trial that halts it. "These people had brain swelling. We're not confident anymore in this solution."  Or, whenever you have these trials, you're stating the endpoints. You're giving some kind of statistical model of what you expect to happen that gives you the criteria for success of your drug.
You see this with Alzheimer's trials, famously, having these very – even with the weakest success criteria, you see these drugs just not necessarily succeeding at them. Or the controversial stuff that happened recently where they didn't succeed but they were still kind of approved. I'm definitely no expert. But my observation has been that it's been efficacy and safety.  [00:42:30] SF: Mm-hmm. Yeah. And when you're talking about treating humans and the potential risk to human life, of course, the quality bar is extremely high. I'm sure lots and lots of things can happen on the way to actually landing this thing in trials and then, beyond that, to actually being able to serve the wider public.  [00:42:51] EA: Totally. [00:42:52] SF: Well, Eddie, thanks so much for being here. This was really fascinating. Really interesting to dig into some of your background and how you're thinking about, or how people should be thinking about, engineering in this space, and how it might be different from regular engineering. But ultimately, you have data and you have problems to solve. It kind of boils down to the same sort of engineering problems you might be dealing with at any company. But perhaps in this case, your stakeholder is a little bit different and there's a little bit of different domain expertise involved. [00:43:20] EA: Totally. And like you said before, the mission's a great one. It's very easy to get behind the mission of improving human life. We're super psyched about that part. [00:43:29] SF: Yeah, it sounds definitely more engaging and fun, perhaps, than, I don't know, making sure someone comes back to click on a like button somewhere. [00:43:38] EA: Exactly. [00:43:38] SF: Awesome. Well, thank you so much, and cheers. [00:43:41] EA: Thanks, man. [END]