EPISODE 1937

[INTRODUCTION]

[0:00:00] ANNOUNCER: Software engineering has developed powerful tools for observability, data management, and continuous testing, but hardware engineering has largely not kept pace. The feedback loops, tooling, and infrastructure that software engineers take for granted simply do not exist in most hardware programs. Nominal is a data platform built to help hardware organizations move at the same speed as software teams. It manages the hardware data supply chain end-to-end, from ingesting high-frequency sensor data off physical assets to enabling real-time control room monitoring, post-test analysis, and simulation correlation. Jason Hoch is the Co-Founder and CEO of Nominal, and he has a background spanning distributed data systems at Palantir and cloud infrastructure at Vercel.

In this episode, Jason joins Kevin Ball to discuss why hardware engineering has lagged so far behind software in tooling and observability, the unique data challenges of working with high-frequency time-series sensor data, how Nominal handles both real-time control room workflows and post-test analysis, why AI agents are transforming software development, but have not yet made the same leap in hardware, and what it would take to close that gap.

Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc.

[INTERVIEW]

[0:01:49] KB: Jason, welcome to the show.

[0:01:51] JH: Thanks for having me, Kevin.

[0:01:52] KB: Yeah, I'm excited. It's going to be fun to talk with you, because the subject area of hardware and hardware testing is a slant for me. It's real close to what I do, but it's not what I do, so I'm excited to dig in. Let's start with you. Can you give me a little bit of your background and how you got into what you're doing today with Nominal?

[0:02:11] JH: Yeah. I'll give you a longer story, which is that I started my life as someone who was more of a mathematician than an engineer. I think software was a way for me to enter engineering. I found love with distributed data systems. I built a lot of software at Palantir early in my career. I also found love with working for really technical users, people who had expertise in something, but not necessarily software and data systems. Nominal, for me, is the culmination of some things that I'm really passionate about. It's growing software teams and products from scratch, but also with big ambitions. Meeting those really technical users, in our case, it's hardware engineers, people who have obsessed over electrical systems, aerospace systems, and then frankly, just solving really hard, but fun, software problems.

Full stack up and down, how do you store petabytes of high-frequency sensor data, all the way up to the minutiae of the interactions that people expect out of our front-end products, because you have to consider those who are thinking about how the data is stored and moved around. I've been doing that over the course of my career. I love tooling. I love infrastructure. I've been lucky to be part of some hypergrowth organizations. I'd spent some time at Vercel, working on their cloud infrastructure and how do you make very low latency web experiences for people across the globe. That team was very fun. It was four people when I joined, and we grew to 40 in under nine months. In many ways, not my first rodeo, but gotten to do some different things over time.

[0:03:31] KB: Awesome. Let's maybe start with a quick overview of what Nominal does for folks, and then we can dive into the technical guts of it.

[0:03:39] JH: We try to help hardware organizations move at the speed of software, and we think that the right way to do that is to build a connected suite of software products specifically focusing on managing data. We talk about the hardware data supply chain. There's a physical sensor somewhere in the world, and in fast-moving modern hardware programs, you often need to get that data in front of many, many different eyeballs. They're working with really high frequency multimodal data. You can imagine a million data points per second are being produced on something like a test aircraft. You want to correlate that with video, or you might have some sensors, or other data sources that are on the ground while the airplane is in the air.

Getting that all to work well together is a software problem. It's a human and a process problem. As software engineers, we've built amazing tools for ourselves. We're very used to an elastic cloud where you can run infinite CI/CD, and the observability that we are used to is just not there in the hardware world. Some companies, like SpaceX and Tesla, and then Anduril, one of our customers, have done really good stuff here, but there's this wave of smaller companies that are trying to be really innovative at hardware and move fast, and they really need help from software engineers and people who have spent a lot of time there.

[0:04:52] KB: Most of the folks listening to this, we are software engineers, right? We know maybe what this looks like in the software industry, but have no idea what the state of the art is, or what the norm is in hardware engineering. When you say like, helping hardware move at software speeds, what does the traditional hardware speed look like and what are the pieces that go into that?

[0:05:12] JH: Yeah. When my co-founder Bryce and I were getting to know each other, and he was teaching me about like, what, he had done NASA mission operations. We have sent a probe, on which we've spent over a billion dollars of taxpayer money, to Jupiter. What is it like to deploy software procedures to that probe? More importantly, test them ahead of time to make sure that you don't crash, or destroy the mission. My mind was immediately - I started geeking out, because I realized that when I think of CI/CD, I think of these again, you're doing a unit test. Maybe a more complex integration, or end-to-end test would take minutes. But when you're talking about something like a satellite, an airplane or a spacecraft, you might spend a million dollars on a test bed, a physical table where you're arranging batteries and wires, and you need to pre-test before you test on that test bed, because if your test bed breaks, you might only have one or two of them, because they're so expensive and your whole organization grinds to a halt.

The heterogeneity, I have trouble saying that word, but it's not homogenous the way you do this, something like a CI test pipeline. Immediately, there's a lot of need for interesting tooling there. Just to ground the conversation, I always like to talk about test aircraft. It's really easy to talk about. A lot of our customers, they fly something in the air. It lands. You need to get the hard drive off of the airplane. You need to somehow get that data to an engineer who designed the system. If there was an issue, you need to debug it. For some of our customers, that process could take on the order of literally a day or two, whereas we're used to that feedback cycle being so much tighter as software engineers.

[0:06:46] KB: Yeah, it's wild to think about. To get your test data back, you have to literally transfer the hard drive. Then to run another test, you're either flying a plane again, or you're configuring some sort of hardware system. With that as our grounding statement, hardware has these physical constraints. What can you actually do with software to make that better?

[0:07:11] JH: Yeah. Again, just to bring us back to that example of the airplane lands and now you're getting the data off of it, a really common stumbling block we had seen is that the schema for the data that you're taking off of the hard drive, and now maybe trying to get into some standardized long-term format, it can change as the software you're deploying to the aircraft itself changes. In the field, you might have someone who's really tired. They're trying to get home at the end of the day and they're not a software engineer. They're not an engineer who designed the physical system. They're an operator, or a technician and they might forget to change a config value in their ground control software when they, let's say, they downgrade the software because there's an issue. Now you need to downgrade the configuration. If you don't, the data gets garbled and all of a sudden, you're back to this really hard human process of ever trying to recover it.

That's the stuff that we realized was happening in some of our customers. One of them said, "Hey, all of the data ends up in this network storage. Help us dig in there." We realized that only 10% of it even got to that point, because of these earlier pain points. All of that is something that we think that we can help with. In that example I just gave, we actually built a better software tool for the test operators. Then, they didn't even have to think about this, but one of the side effects was that that config value was getting automatically updated, because we were syncing the version that's running on the aircraft with the version of the partial that was running locally.

Then, instead of them having to upload a file and do this process, again, that was happening in the background. That's a more clunky example. I think same organization, but now let's say they're installing a Starlink receiver and transceiver on the aircraft. All of a sudden, they're going from a, the data is getting ripped off of a hard drive at the end of the flight to it is streaming. Welcome to 2026. But that can be a really hard architectural change to navigate if your software resources are focused on the guidance and navigation and control of an aircraft, not the data processing. But we help tons of customers through that transition. Our architecture's designed to handle it. It's a fun and interesting software problem to solve, but that's the thing that we can work with them on.

[0:09:20] KB: Yeah. It's fascinating, because you're almost having to be vertical integration across all these different levels from the just making it easy to collect the data at all, the automated configuration change, suddenly, the data is actually getting through without a person doing that, to like, hey, you can actually stream this data. Let's help you handle that. Completely change the way you think about data. I guess, the question would be then, what are the problems that you end up having to, like you said, we're going from, essentially, batch hard drive data to streaming data. You have to support that whole gamut. What does that end up looking like in your architecture? What did you have to do to do that and what does the end user perceive as they make that transition?

[0:10:06] JH: I would say that as with any startup, we have, over time, had to pick and choose which things we handled really well and which things we do a passable job at, so that we earned the right to do a better job down the line. Our original streaming architecture, I'm going to be honest, was pretty weak. We built a proof of concept that we could bring to our customers and open their eyes to like, "Hey, you are thinking of post-test data analysis in a certain way, but we actually think it could be this new world where you're doing the analysis on the fly." Some of it's automatic, while the aircraft is in the air.

Then one of our customers said, "We actually want to have Nominal shown in our control room." It's going to be like a human is looking at a dashboard and there's an aircraft and there's a person in that aircraft and we want to know very, very quickly if a safety check is failing. We surged on that as an engineering team, because we were pretty confident we could get the latency to be within those bounds and I guess, it was a happy accident that the more analytical user interface we had built was something other users were starting to look at and see as more of a monitoring dashboard. I would say, now we're in a place where those two use cases are both supported and very crucial to our customers.

Again, it's like this, you imagine you have a control room, you could think of it like Houston mission control, you're launching something, or there's some test and there's humans looking at a lot of screens, and then there's the very like, your propulsion engineer who's digging through hundreds or thousands of flights, trying to find statistical patterns. But from an architecture standpoint, we've had to invest in, you can think of as a hot path and a cold path that need to stay in sync. They can't be telling different stories. But that really, really low latency pipeline and the more cold storage, or I need to do OLAP style analysis, we support both of those, and are trying to almost abstract away from our customers and our users the fact that from a software standpoint those are really very different. We want them to write the logic once and be able to pivot to the different use cases seamlessly.
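To make the "write the logic once" idea concrete, here is a minimal sketch in Python. It is not Nominal's API: the ThresholdCheck class, the run_hot and run_cold helpers, the channel name, and the sample values are all hypothetical. It only shows one check definition being evaluated both sample-by-sample (hot path) and over a stored recording (cold path), so the two paths cannot tell different stories.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, sensor value)


@dataclass(frozen=True)
class ThresholdCheck:
    """A check defined once: a sample violates it when its value exceeds the limit."""
    name: str
    limit: float

    def violates(self, sample: Sample) -> bool:
        return sample[1] > self.limit


def run_hot(check: ThresholdCheck, stream: Iterable[Sample]) -> Iterator[Sample]:
    """Hot path: evaluate each sample as it arrives and yield violations immediately."""
    for sample in stream:
        if check.violates(sample):
            yield sample


def run_cold(check: ThresholdCheck, stored: List[Sample]) -> List[Sample]:
    """Cold path: evaluate the same check over a stored recording in one pass."""
    return [sample for sample in stored if check.violates(sample)]


# Hypothetical recording of one temperature channel.
recording = [(0.0, 410.0), (0.1, 455.0), (0.2, 430.0), (0.3, 470.0)]
check = ThresholdCheck(name="chamber_temp_high", limit=450.0)

live_alerts = list(run_hot(check, iter(recording)))
batch_alerts = run_cold(check, recording)
print(live_alerts == batch_alerts)
```

Because both paths call the same violates method, a change to the check applies to the control room view and the post-test analysis at the same time.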

[0:12:06] KB: It's interesting, because in one of those cases you are in a traditional development QA cycle, or post-mortem cycle, where you're like, something went wrong, or something happened. I want to look at it differently. The other one is sounding to me like live observability. Let's have health dashboards. Let's have go, no go decisions happening.

[0:12:26] JH: Yeah. It's crazy the amount of data that people will cram into one Nominal screen. We've had to adjust parts of our user interface, because we didn't necessarily predict like, oh, it looks more like an Excel spreadsheet than a web UI, but it's just because these systems are really intense and the data density becomes very valuable. It's a classic lesson in user interface, but if you have an expert user who's using this thing day in and day out, it's very different than I'm designing a web UI that someone might only - there's a funnel that I need to get them through and it's the only time in their life they're going to perform it.

Then, I think, just to highlight some of the software fun of this is we deploy off cloud, so there's customers where they need everything to be running locally. I can talk about real time. That's an interesting topic for us to get into later, but without getting into real time, like for these low latency control room workflows, you might not want to have a network hop to AWS involved. But there's customers that they're using Starlink, they don't want to initially administer their own stack of data software, and so they are using AWS. For us to get data going to AWS and then rendering back into a React application all with very high volumes of data with tight SLAs around latency, that's a really hard software problem that we've invested a lot in, and we think it makes sense for our market.

[0:13:41] KB: Going into that, the volume of data here and thinking about - well, I guess, maybe first, let's start with like, what does the data shape look like? Because when we're talking about observability, this isn't stack traces and postdoc things. This is something else going on. What is the shape of the data you end up working with mostly?

[0:13:59] JH: Yes. If you think of a satellite, or an aircraft, especially one that's in test operations, there could easily be 10,000, 20,000 sensors on that asset. Each sensor might be kilohertz, might be 10 kilohertz, sometimes we see megahertz sensors. That's 1,000, 10,000, or 1,000,000 points per second. Depending on the test that's being performed, you might never want to drop a data point. The example I always give is like, if you're building a rocket engine as a startup, you might spend years and tens of millions of dollars getting to the point of doing your first rocket fire test. Even if that rocket fire only lasts eight seconds, or something, you never want to throw away any of that data. That's your entire company is essentially, that data asset that you got from that test. That is super different from software observability, where you tend to forget everything a week later. You're aggregating things into 10 or 15-second buckets and getting some stats on them.
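The arithmetic behind those numbers can be sketched in a few lines of Python. The sensor count, the 1 kHz rate, and the eight-second test come from the conversation; the 16 bytes per point is purely an assumption for illustration (an 8-byte timestamp plus an 8-byte value, uncompressed), and real storage formats will differ.

```python
# Assumption: each stored point is an 8-byte timestamp plus an 8-byte value,
# uncompressed. This is illustrative only, not a description of any real format.
BYTES_PER_POINT = 16


def points_per_second(sensor_count: int, sample_rate_hz: int) -> int:
    """Aggregate points per second across all sensors on an asset."""
    return sensor_count * sample_rate_hz


def raw_bytes(sensor_count: int, sample_rate_hz: int, duration_s: int) -> int:
    """Uncompressed size of a test if no data point is ever dropped."""
    return points_per_second(sensor_count, sample_rate_hz) * duration_s * BYTES_PER_POINT


fleet_rate = points_per_second(10_000, 1_000)   # 10,000 sensors at 1 kHz
burn_bytes = raw_bytes(10_000, 1_000, 8)        # an eight-second test
points_per_bucket = 1_000 * 10                  # one 1 kHz sensor in a 10-second bucket

print(f"{fleet_rate:,} points/s")
print(f"{burn_bytes / 1e9:.2f} GB for 8 s")
print(f"{points_per_bucket:,} raw points collapsed into each 10 s summary")
```

Under these assumptions, an eight-second test at full fidelity is on the order of a gigabyte, while the software-observability habit of 10-second buckets would reduce every 10,000 raw points from a 1 kHz sensor to a single summary.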

You're thinking about things like P99. I always joked that as a software engineer, I was trained to think of my infrastructure like cattle, not pets. But for our customers, everything is a pet. They will have one or two aircraft that the entire company is oriented around, until you earn the right to scale up production. But yeah, so it's tons of time series data, and the video feed can be really important and doing frame precise correlation between video and time series is actually really, really hard. That's actually, I'll be honest, we underestimated the challenge of that at one point in our journey, and I'm very grateful to our - we have wonderful software engineers at Nominal. Some of them became experts in video encoding and how do you actually build a video stack. A lot of open-source software is not designed for that frame seeking use case. It's designed for social media video playing, or something like that. That's the shapes of the data we see.
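One small piece of the frame-precise correlation problem can be sketched as follows. This is an illustrative Python sketch, not how Nominal's video stack works: it assumes you already have a sorted list of per-frame timestamps on the same clock as the sensors (the frame_times values are made up), and it finds the frame that was on screen at a given sensor timestamp. Extracting those timestamps from an encoded video and seeking to the frame are the hard parts, and are not shown.

```python
from bisect import bisect_right
from typing import List, Optional


def frame_at(frame_times: List[float], t: float) -> Optional[int]:
    """Index of the frame on screen at time t: the latest frame whose
    timestamp is <= t. Returns None if t is before the first frame."""
    i = bisect_right(frame_times, t) - 1
    return i if i >= 0 else None


# Hypothetical per-frame timestamps (seconds, same clock as the sensors).
# Using real timestamps rather than index / fps tolerates dropped or
# irregularly spaced frames.
frame_times = [100.000, 100.033, 100.067, 100.100]

print(frame_at(frame_times, 100.050))  # a sensor spike at t = 100.050 s
print(frame_at(frame_times, 99.900))   # before the video starts
```

A binary search over real frame timestamps is used here instead of multiplying by a nominal frame rate, because a constant-rate assumption drifts as soon as a frame is dropped.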

Then, I'm just trying to think of like, I already talked about this network issue, but you might have radio data, you might have the data that is coming off at the end of the test off of a hard drive, you might have the streaming data, getting those all to tell the same story at the end of the day can be a challenge.

[0:16:03] KB: I want to dig in a little bit to, what are the differences implied by these companies that are working with pets, rather than cattle, where everything is tied into this one physical thing, or two or three physical things that may be subtly different? They may be nominally working towards the same endpoint, but each one has substantial changes. Are there versioning implications? What does that do to the development process?

[0:16:30] JH: Yeah. One of the interesting stories I heard recently is this one company when they detect a problem in a component, they want to trace it back to which 3D printer printed that component, and they literally named the 3D printers. It's because the other ones printed by that printer might have had issues around that time. It's this really interesting cataloging problem. At Nominal, we talk a lot about the hierarchy of assets. But you might have two aircraft and each one might have two engines, and at some point, you might actually swap out one engine from one aircraft to another. I'm used to thinking about Kubernetes and your nodes and your pods and things get shuffled around all the time, but understanding which engine. Was it engine 2, or engine 3 that was on the left side of aircraft 100A last week. That stuff is make or break.
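The "which engine was on the left side of aircraft 100A last week" question can be sketched as a time-bounded installation history. This is a minimal Python sketch under assumptions: the Installation record, the component_at helper, the second aircraft name "100B", and all dates are hypothetical, chosen only to mirror the engine-swap example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Installation:
    """One stretch of time during which a component sat in a position on a parent asset."""
    component: str
    parent: str
    position: str
    installed: datetime
    removed: Optional[datetime]  # None means it is still installed


def component_at(history: List[Installation], parent: str, position: str,
                 when: datetime) -> Optional[str]:
    """Return the component that occupied parent/position at the given time."""
    for rec in history:
        still_there = rec.removed is None or when < rec.removed
        if (rec.parent == parent and rec.position == position
                and rec.installed <= when and still_there):
            return rec.component
    return None


# Hypothetical history: engine-2 moves from aircraft 100A to 100B in March.
history = [
    Installation("engine-2", "100A", "left", datetime(2026, 1, 1), datetime(2026, 3, 10)),
    Installation("engine-3", "100A", "left", datetime(2026, 3, 10), None),
    Installation("engine-2", "100B", "left", datetime(2026, 3, 12), None),
]

print(component_at(history, "100A", "left", datetime(2026, 3, 5)))
print(component_at(history, "100A", "left", datetime(2026, 3, 15)))
```

Keeping the intervals, rather than only the current configuration, is what lets a sensor reading from any past flight be attributed to the physical engine that produced it.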

It can be really, really hard to manage. I think some organizations, they do have to figure out how to treat some things more like pets for their own sanity. But the dollar numbers are just so different. If you launch a satellite and it fails, that's months of timeline and tens of millions of dollars. You're willing to double check things a lot more than, I think, people in the software engineering world are used to.

[0:17:44] KB: Cataloging is an interesting domain. Is that something that you do as a part of your software as well, or you provide them tools to integrate their catalog with the test results, or how does that end up working?

[0:17:55] JH: Yeah. One of the things that's really crucial to our data model is this concept of events. We are basically, through our user interface, encouraging people if they identify a region of time where something has happened to tag that as an event and then it's available to anyone else who's looking at that. They could be looking at a different subsystem, different set of sensors, but still be aware that something was going on in the overall system at that point in time. Then, a lot of these organizations, you as the responsible engineer who's designed a system, a test happens, you want to be yourself, eyeballs looking at the data feeds to understand did things perform as expected, or is there something anomalous happening? Then over time, you want to encode those into automated checks, so you're writing code, or light-based logic.
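A minimal sketch of the event idea, in Python: an event is a tagged region of time, and anyone looking at any subsystem can ask which events overlap the window they are viewing. The Event fields, labels, and times below are hypothetical and are not Nominal's data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """A tagged region of time, visible to anyone looking at the same test."""
    label: str
    start: float      # seconds since test start
    end: float
    subsystem: str    # where the event was first noticed


def events_overlapping(events: List[Event], start: float, end: float) -> List[Event]:
    """Events whose time range intersects the viewing window [start, end)."""
    return [e for e in events if e.start < end and start < e.end]


# Hypothetical events tagged by different engineers during one test.
events = [
    Event("voltage sag", 12.0, 14.5, "power"),
    Event("vibration spike", 40.0, 41.0, "propulsion"),
]

# A thermal engineer viewing seconds 10 to 20 still sees the power event.
visible = events_overlapping(events, 10.0, 20.0)
print([e.label for e in visible])
```

The query deliberately ignores the subsystem field: the point of the event is that it surfaces to people looking at a different set of sensors over the same period.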

By the way, a story I always tell is that from the earliest days of our product, you could take these raw sensor feeds, like I was saying, kilohertz sensors. You've had 10,000 of them. You're doing math on them. These people, they do physics every day. I haven't done physics since I was an undergrad and it makes me sad. I miss it. But I ask them, this is a satellite customer that we were working with early on. I asked them, what type of mathematics do you need to do in our platform we support? Obviously, basic arithmetic, but we're trying to understand how wide our library scope needs to be. They were like, "Oh, nothing too complicated. Just quaternion operations." They weren't being -

[0:19:14] KB: Oh, is that all?

[0:19:15] JH: That's all. Yeah. It makes sense, if you're doing guidance navigation and control for a satellite, that's the table stakes. If anyone has used MATLAB, that is a useful thing to have in the back of your mind, when you think about the capabilities of our platform. Our goal is not to 100% replace MATLAB in an organization. There's always going to be certain types of modeling, or calculation, or custom code that our users need to write. But maybe the one person who's writing that MATLAB script can package that up and it can be running and orchestrated in our platform. The other 200 people who might need to take advantage of that logic, they don't themselves need to open up MATLAB, or definitely not need to understand the code. I forgot the original question.
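For listeners who have not met them, the "just quaternion operations" the customer asked for look roughly like this. The sketch below is plain Python using the standard Hamilton product with a (w, x, y, z) component order; the function names are our own, and it is not taken from Nominal's library.

```python
import math
from typing import Tuple

Quat = Tuple[float, float, float, float]  # (w, x, y, z), scalar first
Vec3 = Tuple[float, float, float]


def q_mul(a: Quat, b: Quat) -> Quat:
    """Hamilton product of two quaternions."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def q_conj(q: Quat) -> Quat:
    """Conjugate; for a unit quaternion this is also its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate vector v by unit quaternion q via q * (0, v) * conj(q)."""
    _, x, y, z = q_mul(q_mul(q, (0.0, *v)), q_conj(q))
    return (x, y, z)


# A 90-degree rotation about the z axis, applied to the x unit vector.
half = math.radians(90.0) / 2.0
q_z90 = (math.cos(half), 0.0, 0.0, math.sin(half))
rotated = rotate(q_z90, (1.0, 0.0, 0.0))
print(tuple(round(c, 6) for c in rotated))
```

Attitude for a satellite is commonly represented this way because composing rotations is a single product and there is no gimbal lock, which is why the customer treated it as table stakes.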

[0:19:58] KB: I was just looking for how much of the categorization. It sounds like, from event standpoint, you can correlate events, which makes sense. If you're looking at time series data and you say, "Oh, something went wrong and I don't know, this engine over here," that's probably going to impact a bunch of your other sensor time series. You want to be able to see that. But you also mentioned this example of tracing back the hardware to, for example, the 3D printer that printed it. It made me think of data lineage problems. It's not a data lineage problem. It's a component lineage problem. This piece was printed here and assembled here and connected here and being able to track all that back. I was curious if that's something that your software is supporting, or that's a third-party thing that they're doing independently, or how does that work?

[0:20:42] JH: You're right on the money, because I always talk about data tagging as well. Yes, it's cataloging. It is tagging. It is very similar to data lineage. You might have your left wing and your right wing, or you will have these arrays of sensors that are themselves modules that you're printing multiple times onto your large system that you're developing. You want to develop logic for checking each of those subsystems and then you want to repeat that logic like a software program. But doing that is really hard, because you have to have a schema to your sensors that allows that to be possible. That schema will change over time. It's the types of things that, again, people who are building for a specific software system are used to solving those problems, but we're trying to build a generic platform that can work for someone building an aircraft, or someone building a nuclear reactor.

We think about tagging of time series data in such a manner that you can make those composable. You can have logic, you can compose it, and then you can apply different things. We think the cataloging is really important, and some of the work we do with our customers is helping them, whether they're a five-person startup, or a 5,000-person organization that's starting a new program, how do you be thoughtful about that cataloging from the beginning? Really powerful example, but I think it's really grounding is if I am doing a test and something goes wrong, I want to compare it to the simulation that we ran that's supposed to be corresponding with this test. How do I summon that data and overlay them on top of each other really easily?

If you've tagged your data well, that's just a couple clicks. If you haven't tagged your data at all and it's just in a big soup, you're writing a lot of custom SQL. That's what we're trying to get our users away from is the same way that I use Datadog and the Datadog agent is cleverly tagging many aspects of my data based on the node and the pod and the other infrastructure things that it can introspect. We would love to get our users to the same place. For us, that means investing more in the ingestion agent code that's frankly running closer to the edge, or even on the hardware itself to try to make that a zero-cost abstraction for our hardware engineering users.
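Here is a minimal Python sketch of why tagging makes the test-versus-simulation overlay cheap. Everything in it is hypothetical: the tag keys (source, run, channel), the run names, the values, and the select and overlay helpers. It also assumes the two series share exact timestamps, which real data would need resampling to achieve.

```python
def select(catalog, **wanted):
    """Return every series whose tags match all of the requested key/value pairs."""
    return [s for s in catalog
            if all(s["tags"].get(k) == v for k, v in wanted.items())]


def overlay(catalog, channel, test_run, sim_run):
    """Pair test and simulation values for one channel as (t, test_value, sim_value).
    Assumes both series use identical timestamps; real data would be resampled first."""
    test = select(catalog, source="test", run=test_run, channel=channel)[0]
    sim = select(catalog, source="sim", run=sim_run, channel=channel)[0]
    sim_by_time = dict(sim["points"])
    return [(t, v, sim_by_time[t]) for t, v in test["points"] if t in sim_by_time]


# Hypothetical catalog: each series carries tags plus (timestamp, value) points.
catalog = [
    {"tags": {"source": "test", "run": "flight-12", "channel": "pitch_rate"},
     "points": [(0.0, 1.0), (0.1, 1.4), (0.2, 2.1)]},
    {"tags": {"source": "sim", "run": "sim-12", "channel": "pitch_rate"},
     "points": [(0.0, 1.0), (0.1, 1.5), (0.2, 2.0)]},
    {"tags": {"source": "test", "run": "flight-12", "channel": "altitude"},
     "points": [(0.0, 300.0), (0.1, 301.0), (0.2, 302.5)]},
]

pairs = overlay(catalog, "pitch_rate", test_run="flight-12", sim_run="sim-12")
print(pairs)
```

Without the tags, finding the matching simulation series would mean knowing file names or table layouts by heart, which is the "big soup" and custom SQL situation described above.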

[0:22:45] KB: That's super interesting. Two different dimensions I want to go down on this, based on that. One, I'm going to go down a little bit is, so you mentioned using simulation and comparing simulation and hardware. That's, I think, a space that, I think it was last year I was at a panel that was hardware, or it was automotive manufacturers. They were talking about, this was one of Tesla's big advantages is that they have all of their stuff in simulation. They can do it. Now, the traditional manufacturers are trying to get more and more of their things in simulation environments, so they can iterate and improve faster there. Is that a use case that you are directly supporting, so you're doing your test infrastructure for here, like get the outputs of the simulations and then do it with your hardware stack, the live version later, or is that, once again, like an integration point that you're integrating out to something separate?

[0:23:31] JH: I think for us right now, it's an integration point. We don't claim to be experts in simulation, but we really do want to make simulation much more leveraged within our customers. If they are running simulations, so many of them run into this chair-stool problem of like, I need to have three tools up to be able to make the correlations that I want to, or worse, I need to be reaching out to another team. I think the world is changing, and 25 years ago, true hardware innovation was pretty localized to large conglomerates, traditional primes, companies like General Electric that have massive, massive resources. It made sense for those companies to develop over the course of 30, or 40 years proprietary best-in-the-world simulation software. But as more of this innovation is moving into smaller, more insurgent companies, they need to grab some stuff off the shelf and try to piece it together.

I think that is itself really frustrating, so we're trying to meet people as close to where they're at as we can. Hey, if you're using the simulation tool, let's figure out how to get that data in Nominal, like I was saying, tag it, and catalog it in such a manner. To us, it's just data. It could have come from a simulation. It could have come from a hardware test, but we'll oftentimes talk to leaders of test organizations that have this dream of how we're always simulating, and if something goes wrong, we're going back and learning about how to improve the simulation. But their organizations can really easily not meet those aspirations, because of tooling fatigue. That's what we're trying to solve.

[0:24:57] KB: That makes a lot of sense. The other direction I wanted to go down is you talked about having agents running at the edge, at the sensor. I know in software cloud observability, oftentimes, a lot of what those agents are doing is filtering out data that is not relevant to send upstream, so they reduce the quantity. But to your earlier point, the amount of data that you want to filter in a hardware context may be much smaller, maybe very different, particularly, I love your rocket ship example. You have eight seconds of data and that is the value of your company right there. That's everything you've invested to get. You don't want to throw that away. What exactly are those edge agents doing? Are they still doing some amount of filtering, or what happens in that case?

[0:25:37] JH: Yeah, they are still filtering, because of network limitations and the fact that software is not 100% reliable. But they are then having to save the data locally and then figure out a way to backfill later. That's something. We have a customer right now. They're doing really intensive testing in a network-constrained environment. It used to be that there was massive latency between HQ reviewing the data and what was going on at the test site, because of that network limitation. We have now made it so that, like I said, the logic can be composed in one place and it can be pushed to the network-constrained location, so some of the checks can be happening automatically in a very low latency manner.

Then the data is getting uploaded at the rate it can, but there's this mesh concept of you might have your HQ and your test site each have a database and they eventually get in sync with each other. Again, to use the rocket fire example, the eight seconds is tons and tons of data, but they're not doing that constantly. They might only be doing it a few times a day. So, if you amortize data upload over long enough time horizons, you can catch up. To your question, it's like, people who have thought about things like back pressure and distributed systems, super, super relevant for our use cases. You oftentimes are having to prioritize between safety critical information, high-priority observability, and then everything else. There's a hard drive that you're having to store to and then catch up later. It's not dropping the data. It's just figuring out how to handle it, given the constraints.
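The prioritize-then-catch-up behavior can be sketched as a small store-and-forward buffer. This is an illustrative Python sketch, not Nominal's edge agent: the StoreAndForward class, the three priority tiers, and the per-tick link budget are hypothetical, and an in-memory heap stands in for the hard drive described above.

```python
import heapq
from itertools import count

# Lower number = higher priority, matching the three tiers described above.
SAFETY, OBSERVABILITY, BULK = 0, 1, 2


class StoreAndForward:
    """Buffer everything locally and upload in priority order as the link allows.
    Nothing is dropped; lower-priority data simply waits."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps arrival order within a tier

    def record(self, priority, payload):
        heapq.heappush(self._heap, (priority, next(self._seq), payload))

    def drain(self, budget):
        """Send up to `budget` items, highest priority first, and return them."""
        sent = []
        while self._heap and len(sent) < budget:
            _, _, payload = heapq.heappop(self._heap)
            sent.append(payload)
        return sent

    def backlog(self):
        return len(self._heap)


buffer = StoreAndForward()
buffer.record(BULK, "bulk-1")
buffer.record(SAFETY, "safety-1")
buffer.record(OBSERVABILITY, "obs-1")
buffer.record(BULK, "bulk-2")

first = buffer.drain(budget=2)     # constrained link during the test
waiting = buffer.backlog()
later = buffer.drain(budget=10)    # link catches up between tests
print(first, waiting, later)
```

The amortization argument from the conversation shows up in the last two calls: the bulk data that could not fit during the test is still there, and it goes out once the link has spare capacity.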

[0:27:06] KB: Nice. I'd love to dig in a little bit to your topic of the way that hardware innovation is shifting. First off, just open question. You said, 30 years ago, there's all these big companies, now it's much more insurgent. What are you seeing in terms of hardware innovation and the ecosystem? How is it changing what's happening there?

[0:27:25] JH: Well, it's really interesting, because I want to give some credit to some of the larger incumbent players, the Air Force being one of them. We work with the Air Force. They employed 20,000 test engineers. Speaking of pets versus cattle, as taxpayers we are all shared owners in arc jets and wind tunnels in these, they're literally, they're called the national labs. There are these assets that are crucial for testing. Whether those can operate well or not makes the difference between the US falling behind, or staying ahead of certain international competitors. That's still a thing. With us investing and partnering with those organizations as the software and tech industry, I think, it's really important.

But then, we have also tons of venture capital dollars and tons of brain power trying to do things in new ways. Frankly, SpaceX has trailblazed here and shown that it's possible. What's both amazing and terrifying is that these startups are trying to go even faster than SpaceX. They're trying to do more with less than ever before. So, it's aggressive timelines, it's lean teams. I think that's part of why we started Nominal is if every single one of these companies is going to be finding themselves having to set up AWS infrastructure and then build some custom tooling on top of it, that's just not what makes their beer taste better. We should be partnering with them and helping them do that, and we're seeing that happen. We have startups where this was not my prediction when I started Nominal, but they want to use Nominal and they reach out to us and we get them spun up and we send them some documentation and they're ingesting the data and running, without us even having to go on site, which is amazing.

These are seven, or eight-person organizations. We have these metrics where we keep track of what percent of the engineers of the company are logging in to Nominal on a given week. For some of the startups, it's 100%. The CEO is logging in, because he or she wants to understand how their system is performing week over week. Then, I think it's incredible. I don't know if when I was growing up, I would have thought of a nuclear reactor innovation being able to be carried out by essentially, a startup, but we're seeing that more and more. It's awesome.

[0:29:29] KB: It's fascinating, because I feel like, in my career, roughly spanning since the early 2000s until now, we went through a period where it felt like all of the fast innovation was happening in software. Software was changing quickly. It was software, software, software. Hardware was slow, moving slowly, not that much change. There was change, but it was like, "Oh, my car got better miles per gallon now," or things like that. It's not super visible. Then it definitely feels like, in the last few years, suddenly, we have private space companies and we have all these drone changes and all this other stuff going on in hardware that it's wild.

[0:30:06] JH: It really is. We have to ground that in, there's still different constraints building and innovating in hardware than there is in software. That's one of the things that I get nervous about sometimes, especially in the last couple years, where software engineers are having agents write code and the agents can run these evaluation loops and there's this feedback loop that's again, it's just so fast, and so intense. A lot of times, the supply chain is the supply chain in the world of hardware and you can't shave off these months, other than by designing things well from the beginning and having an organization that is geared towards iterative development and not waterfall style development.

If you look at a program like Starlink, they have multiple generations of Starlink design orbiting the planet simultaneously. They're constantly getting data off of the satellites and feeding them into their manufacturing improvements. That's the way every mature hardware program needs to shift to. You referenced Tesla and their simulation. The amount of effort that Tesla has put into their internal data cataloging, I think, would blow other people's minds. Apple, not an organization that's super public about the way they handle this stuff, but absolutely massive investments in test infrastructure and test data management. But the truth is that, I don't know if it's because of 3D printing, or just the fact that so many people have grown up inside SpaceX and now they can go off and build their own companies, and there's this culture of confidence in doing things more quickly.

Yeah, we're in a golden era of that innovation. As a software engineer, I'm a little bit - I get FOMO sometimes. I still love building software, and so I joke that Nominal is the closest I can get to contributing to all of this, while still myself, waking up and thinking about code every day.

[0:31:45] KB: Let's go down that, because you mentioned the coding agents, the coding agent loop. That's totally transforming software development at the moment. People are really trying to figure out what it looks like to be a software engineer, looking out a few weeks, or months, or I think years is too hard at this point, because stuff is changing so rapidly there. You had a public LinkedIn post about it's not yet doing that in hardware. AI is not changing hardware engineering to anywhere near the same degree. Can you expand on that, and maybe we can dig into some of the whys behind that and what it would take to actually accelerate, or expand it in that way?

[0:32:20] JH: One of our customers that I can't give specific details about, but they're trying to build a machine that's the size of a building. They're going as fast as possible and it's frankly the fastest I've seen any organization go on. It's still taking them months. But because of Nominal, like this with their CEO and says, "We are looking at data and we're noticing problems and we're catching them earlier. It's taking us 30 minutes to fix them, instead of something catastrophic and us losing two days." That is the tightness of loop that I still think, that's an order of magnitude that we're providing. But it is very different than, I write some code and then I run a unit test and all of a sudden, now the code is writing itself and running its own unit test, which is it's an order of magnitude, but a micro-thousand scale.

I really want, I think, as humanity, we should have the ambition of getting to that in the hardware world, but that requires this concept of like, well, the hardware can be tested in a unit testable manner, and that feedback loop can then introduce the opportunity for something like a design agent to understand what's working and not working in a hardware design. There's aspects of this. I think there's amazing companies that are - you can give certain parameters and it will spit out a proposed design, and then there's simulations that you can run on it. But at the end of the day, you still have to physically build something and wire it up, and maybe you're driving it out to a desert and flying it. Those things just take a lot of time and human coordination. There are safety risks involved, and I just think we're willing to be pretty YOLO with OpenClaw, but we do not want to with a rocket firing, or especially any flight that has a person inside the vehicle.

To get there, we need to have some place for the data to be aggregated and for people to give thumbs ups and thumbs down on that data. Talk about the event catalog again, but if you have a sensor that's at a rocket test facility and again, that rocket is only being fired for a few seconds at a time a few times a week, the rest of the time it's sitting dormant. You don't want to drop any data points. The eight seconds of tens of millions of dollars investment, that golden data asset, you have a lot of dead time in there that you can delete and filter out. Understanding where those regions of interest and disinterest are, having the people who got the degrees in physics and mechanical engineering and are working on top of a platform like Nominal and telling us where the interesting data is, like what's anomalous, what's expected, what's anomalous but fine versus anomalous but dangerous. These are the types of data assets that we need to be accumulating if we ever hope to train agents on the world of physical engineering.
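As an illustration of the region-of-interest idea described above, here is a minimal Python sketch. It is not Nominal's implementation. The function names, the simple threshold rule, and the `padding` parameter are all assumptions made for the example. It marks the windows where a sensor reading is above a threshold (for instance, while an engine is firing) and drops the dormant samples outside those windows, keeping every sample inside them.

```python
from typing import List, Optional, Tuple


def find_regions_of_interest(
    timestamps: List[float],
    values: List[float],
    threshold: float,
    padding: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) time windows where the signal exceeds `threshold`.

    Hypothetical helper: each window is widened by `padding` seconds on both
    sides, and windows that overlap after widening are merged.
    """
    regions: List[Tuple[float, float]] = []
    start: Optional[float] = None
    last_active = 0.0
    for t, v in zip(timestamps, values):
        if v > threshold:
            if start is None:
                start = t  # signal just became active
            last_active = t
        elif start is not None:
            regions.append((start - padding, last_active + padding))
            start = None
    if start is not None:  # signal was still active at the end of the data
        regions.append((start - padding, last_active + padding))

    # Merge windows that overlap once padding has been applied.
    merged: List[Tuple[float, float]] = []
    for lo, hi in regions:
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def keep_only_regions(
    timestamps: List[float],
    values: List[float],
    regions: List[Tuple[float, float]],
) -> List[Tuple[float, float]]:
    """Keep every sample inside a region and drop the dead time outside."""
    return [
        (t, v)
        for t, v in zip(timestamps, values)
        if any(lo <= t <= hi for lo, hi in regions)
    ]
```

In practice the windows would more likely come from engineers labeling events (expected, anomalous but fine, anomalous but dangerous) than from a fixed threshold; the threshold here only stands in for that judgment.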

[0:34:52] KB: Yeah. I think what's interesting, one of the reasons that software is so amenable to agents is because you have that very tight feedback loop, and you can give the agent the ability to just try shit and then see what happens and iterate in feedback, because they're probabilistic. They're not going to get it right every time, but if you can do that. Looking at the hardware world, some of what I'm hearing is that feedback loop is harder, because there's a human - you got to print a thing and do a thing. Does simulation solve this, or is simulation not good enough? How do we get to that level of an agent can have a hypothesis, try a thing, get a feedback loop and then iterate on that, without a human having to be in that loop?

[0:35:31] JH: This is where we're getting above my pay grade. But it's like, the simulation is not good enough and I almost go back to being a curious college student and thinking about these things. I start to wonder, is it actually better to send the agents thinking at the level of physics and simulation and building up from there, versus trying to give them simulation tools and have them jump in at the hardware design layer. Like, the quip I always share is that if you ask an LLM about the way a jet will perform, it's not thinking in terms of physics. It's thinking in terms of humans who have translated physics into English, and that it is itself composing the concepts of English summaries and synthesizing some results to you, and then you have to go test if the physics end up matching that or not. That is not good enough to actually make progress. Very different than software where you're working in code, which they natively understand.

[0:36:24] KB: A thing that I have observed is LLMs are systematically bad at a number of things. But one of those things is time. You talk about you're doing a lot of stuff with time and time series data. Is there an LLM component to Nominal software, or do you cut that out and not deal with that at all right now?

[0:36:40] JH: Oh, yeah. Thank you for asking that, because I should be really clear. We released just a few weeks ago, like all of our customers have access to it now, but you can interact with a workbook agent. You can have a chat interface. The reason we added that is because it actually reduces a lot of friction for our users. Because we have to support so many complex calculations and visualizations inside our platform that I don't know, it can take a hundred clicks to build up a visualization of what happened during a flight, doing a flight test analysis.

If you have done this before inside our platform, there's certain equations that we can communicate about in English and there's enough literature on the Internet about what that means, that the agent can get right, understanding like, okay, well speed is the square root of the sum of the squares of the component speeds. You can ask questions like this and it can perform a lot of clicks for you. That's something that we've built in right away and we think it's really powerful. Then, we honestly have a ton of stuff in our roadmap where we think there's more ways to reduce friction in our platform leveraging these technologies, but then, also, probably some exciting things that we can do in terms of training on time series and using these architectures to help our users identify things that are interesting that even a large organization of a hundred or more engineers might miss. Lots of research that we're doing there and excited to see where it goes.
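The speed formula mentioned above can be written out as a short Python sketch. The function names and the three-axis layout (`vx`, `vy`, `vz`) are assumptions for illustration only and do not come from Nominal's platform.

```python
import math
from typing import List, Sequence


def speed_from_components(components: Sequence[float]) -> float:
    """Speed as the square root of the sum of the squared component speeds."""
    return math.sqrt(sum(c * c for c in components))


def speed_series(
    vx: Sequence[float], vy: Sequence[float], vz: Sequence[float]
) -> List[float]:
    """Apply the same formula sample by sample across three velocity channels."""
    return [speed_from_components((x, y, z)) for x, y, z in zip(vx, vy, vz)]
```

For example, `speed_from_components((3.0, 4.0, 0.0))` evaluates to `5.0`. This is the kind of derived channel that a chat request such as "plot total speed" would have to build from the raw component channels.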

[0:37:56] KB: Yeah. I mean, there's a lot of interesting traditional ML that you could do in that space. One of the often-overlooked side effects of the boom of LLMs is we now have ML on APIs. You don't have to be an ML expert to start testing with and tinkering and training and all these things.

[0:38:12] JH: Yeah, totally. I saw a really interesting talk one time that was just reinforcement learning was just getting going and now everyone's been distracted by chat bots for a couple years. I'm excited for when we go back to reinforcement learning. But it's similar to your question about simulation, where we really want our platform to be able to support, hey, this organization is plugging into an ML vendor, or an ML consultant, they've hired some, whatever. Or, there's a feedback loop that you need help standing up where you might have the outputs of a test and want to be feeding them back into a model, the model is getting updated, that's going back to another team, who's doing parameter design.

At some point, we are going to have to vertically integrate there. I think that our customers tend to lead us in the direction of asking for more technical expertise. But at the moment, it's like, if you're training your own model, if you're bringing in something like Anthropic, whatever it is, we want to as a platform player be a good integrator with all of those.

[0:39:05] KB: I guess, that leads to the question of where do you see this going for yourself, for Nominal, but also, just this broad space of hardware development and test over the next few years? It's gotten almost cliche in software. I try not to project years anymore, but hardware is still slower, so I'm going to push you to project out a few years, not just a few months, like we do in the LLM world.

[0:39:28] JH: I think we should be projecting years in the software world, too. I think we shouldn't give ourselves that pass. But there's a lot on our roadmap and a lot that I see our customers doing, but one of the things that we are working with a few different partners on right now is how to be more efficient with testing. I think in a way, that's very natural to develop these regimes of intensive certification and testing that are very expensive, both from a time and a dollar standpoint. With the advances that we've been able to achieve in machine learning and in building models of things like aircraft, we can actually just do, as you said, good old-fashioned data science about like, hey, how do we achieve the same level of confidence, or same level of statistical inference with 100 data points, instead of a 1,000, or a 1,000 instead of 10,000?
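One textbook way to frame that trade-off is the confidence interval for a mean, whose half-width shrinks with the square root of the number of samples. The Python sketch below is an illustration under simplifying assumptions (independent samples, a known standard deviation, a normal approximation with z = 1.96 for roughly 95% confidence). It is not a description of how Nominal or its partners approach the problem, and the function names are hypothetical.

```python
import math


def ci_half_width(std_dev: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a mean."""
    return z * std_dev / math.sqrt(n)


def required_std_dev(target_half_width: float, n: int, z: float = 1.96) -> float:
    """Largest per-sample standard deviation that still meets the target width."""
    return target_half_width * math.sqrt(n) / z


# Interval width achieved with 1,000 samples of unit-variance noise.
baseline = ci_half_width(std_dev=1.0, n=1000)

# Noise level needed to match that width with only 100 samples.
needed = required_std_dev(baseline, n=100)
```

Under these assumptions `needed` equals the square root of 0.1, about 0.316. Matching the same confidence with ten times fewer tests means cutting the unexplained variance by a factor of ten, which is where a better model of the system under test could help.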

We have some partners at the Air Force who dream of a pilot who's up in the air performing a test flight, being able to get feedback from something they did five seconds ago and informing what they're going to do.

[0:40:24] KB: Oh, that's fascinating. You get to real time and you're like, "Hey, we saw a little bit of a flex on this. Can you try this maneuver and see what happens, or something?"

[0:40:31] JH: Yeah. If anyone knows what a kneeboard is, it's literally a clipboard that you have almost duct taped to your thigh, that's like, these are the 10 things I should do while I'm up in the air experiencing G forces. It's very hard to do what these guys do. It's incredible. But how do we make that an iPad and how do we make it so it's not predetermined, but dynamic? That's a dream. I think the same principles can trickle down into lots of aspects of hardware development.

We see vendor-purchaser collaboration potentially happening. How do you reduce redundant testing, or increase the fidelity of the way information is communicated about what tests were performed, what the results were? As someone who's worked in enterprise data, I love this stuff. I think it's incredible. Some people maybe find it boring. But if you boil it down to the software problem, it comes down to these things, like asset hierarchy and catalog and data lineage that we touched on earlier, which are just so, so crucial to these organizations functioning well. Then, it's this increasing drive to integrate the different roles in the design prototype development, initial manufacturer tests and then scale manufacturing.

I think a lot of these aspects of hardware development are going to be changing over the next couple of years, like the way you lay out a manufacturing facility. It might be done by AI, maybe. That's an area where I know some companies are spending time. As those things speed up, is that shifting bottlenecks to other parts of a hardware organization, or are you changing some of the paradigms? For us, we bet that sharing a single data substrate across all of that is going to make everyone move faster.

A really easy example to talk about is it used to be that 100 people were focused on one test asset. If you look at the way companies like SpaceX operate, they think in terms of, how do we get it so one engineer can monitor 100 assets, inverting that and getting much more efficient. But we all know from the world of software, when you give people that level of ownership, they actually develop more understanding and they can come up with innovative changes that would otherwise be lost in bureaucracy.

[0:42:33] KB: So far, we've pretty much entirely talked about what I might call mega scale hardware, like rockets and jets and all of these things. Are you working with smaller scale stuff, drones, or internet of things type stuff, or other types of physical devices out in the world?

[0:42:51] JH: I would say, that not micro scale things. We really think that our software product can be applied in manufacturing and operational contexts, where the devices themselves get smaller, but the scale, the number of them gets much larger. I would say that there are drones that we work with that are not mega scale, and where a lot of the interesting part comes from, it's not that you're flying once or twice a day. It's you have hundreds of them that are flying dozens of times a day, and that data challenge gets to be quite fun.

[0:43:19] KB: I was going to say, how does that change the shape of the problem?

[0:43:23] JH: It changes a lot, because with large, mega scale assets, if there's a single big problem, you might pause operations and be like, "We need to figure out what happened there before we fly tomorrow." Versus, it becomes more of the cattle than the pet. Where it's like, okay we had a thousand problems yesterday. What's the one that's the most important first to drill into? When I think about the future of something like drone delivery, it's probably - it doesn't need to be 100% accurate in the way that something like, whatever the new aircraft that Boeing is developing for commercial flight needs to be, but it needs to be 99.9% accurate and that's more of an economics problem than a pure engineering problem.

We talk about fleet monitoring and fleet observability. Then it's an interesting problem, because it translates from not just development and tests, but also to operations and maintenance and again, it's the logic that is being defined by the engineer who's first testing something, versus trying to figure out how to ramp up production, versus like, hey this satellite is orbiting the earth, or this drone is doing dozens of deliveries a day. It does not make sense to have to re-encode that logic in three different tools. That's again, part of why we're starting Nominal is we think that can all be frameworked away into a single platform. We are really focused on testing right now, but excited to get more into operations and manufacturing in the future.

[0:44:40] KB: Well, and that does end up smoothing your pathway a lot, right, instead of ending up with tool soup. Okay, it's time to operationalize everything. New set of software, got to figure it out. Well, I think we're coming close to the end of our time here. Is there anything we have not talked about yet that you think would be important to discuss before we wrap?

[0:44:59] JH: Yeah. I think one of the things that I find exciting right now at Nominal is we have this organic user growth in some of our large organizations that, again, when I was designing Nominal, I was not thinking about the seven-person startup, where the CEO was using the data. I also wasn't thinking about environments where we don't even necessarily know the users, but they're sending each other links of like, "Hey, I did this analysis in Nominal. Can you check it out?" We're at a point in our growth where we're starting to have dozens of people log in to our platform in a given week that we haven't met before. With that, comes data scale challenges, and we're also on the side, like I said, trying to research how AI can be incorporated into our platform. I would say, across all of that, if you're interested in picking up your software knowledge and walking across the aisle to the world of physical engineering, come talk to me.

[0:45:48] KB: Yeah, for sure. I think the AI integration is an interesting one, because one of the things enabled by LLMs is what I call intent-based UIs. Traditionally, if you're setting up a data dashboard, you need to understand a lot of the nuts and bolts and you need to drive through and say, "Okay, this is where I'm getting this data and I'm putting it here and I'm putting it there," and do a lot of imperative design of your data dashboard. Do this, then do this, feed this to here and go here. Whereas, with an LLM, if you understand broadly the shape of the data, you can just say, "Hey, I want something that maps a bunch of these inputs to some useful dashboards that I can watch," and the AI can just do it for you.

To your point, you can open your user spectrum to folks beyond the initial experts, who know exactly how to wire A to B and say like, just those people who know what data they care about, or even just what problem they care about, let the AI figure out what data is relevant for that.

[0:46:44] JH: Yeah. If I can riff on that for a second, it's like, I'm so excited about our platform play for exactly the reasons you're saying, where there's a lot of custom and expensive user interfaces that we wouldn't have been able to build in the past. But now our users can build it for themselves. It opens this aperture. I think the challenge then in this conversation we're having with our customers is like, well, but then when it comes to tagging, cataloging, any data write-back, something like the events that you're trying to close into a valuable proprietary data set, it needs to converge. How do you give people the tools, or give, if it's agents building the software, how do you give them the tools to allow for that intent-based user interface, but then, I don't know what's the right term, the organization needs to actually agree and there always need to be laws about like, well, okay, how is the data managed in a way that's not just individual, but organizationally beneficial?

[0:47:34] KB: Yeah, absolutely. It's an interesting time, because with great power comes great responsibility, right? If everybody can vibe code whatever they want, then you're in trouble, because you get a lot of garbage in there. But if you can use that in a way that lets you empower people, but doesn't let them screw up the underlying pieces, doesn't let them make a mess, if that makes sense. Then, yeah, it's an interesting world we're getting into.

[0:48:01] JH: Yeah. I think we're early in the journey of getting more eyeballs and more data within our customers and I think that these AI technologies will just actually accelerate that and that's just good for everyone.

[0:48:13] KB: Now, really quickly on that topic. There's an interpretability challenge, too, because it's very easy - what is the saying? Lies, damn lies, and statistics. It's very easy to get data that tells a story that may not be true if you don't understand how it's interpreted. I'm curious, yeah, as you think about those user interfaces, how do you make them interpretable by the end user?

[0:48:36] JH: I mean, that's something we haven't touched on. But since the earliest days of the company, my paranoia in designing our user interface was like, if you're decimating - you have a multi-hour flight and you're trying to plot it, and a lot of people do zoom in on regions of interest. How do you make sure that those zoomed out plots don't tell a lie and don't hide a data point that could be critical? Because it could just be a split second of behavior across a multi-hour test that is really problematic and indicative of something horrible happening.
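One common way to decimate a long recording for plotting without hiding a split-second spike is min/max bucketing: split the samples into buckets and keep both the lowest and the highest sample from each. The Python sketch below shows that general technique. The transcript does not say which method Nominal uses, so the function name, parameters, and approach are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def minmax_decimate(
    timestamps: Sequence[float],
    values: Sequence[float],
    num_buckets: int,
) -> List[Tuple[float, float]]:
    """Downsample to at most 2 * num_buckets points, keeping bucket extremes.

    Each bucket contributes its minimum and maximum sample (in time order),
    so a single outlier survives even when the plot is zoomed all the way out.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    n = len(values)
    if len(timestamps) != n:
        raise ValueError("timestamps and values must have the same length")
    if n <= 2 * num_buckets:
        return list(zip(timestamps, values))  # already small enough to plot

    out: List[Tuple[float, float]] = []
    for b in range(num_buckets):
        start = b * n // num_buckets
        end = (b + 1) * n // num_buckets
        indices = range(start, end)
        i_min = min(indices, key=lambda i: values[i])
        i_max = max(indices, key=lambda i: values[i])
        # A set avoids emitting the same sample twice in a flat bucket.
        for i in sorted({i_min, i_max}):
            out.append((timestamps[i], values[i]))
    return out
```

By contrast, naive stride-based decimation (taking every Nth sample) keeps a one-sample spike only if the spike happens to land on a sampled index.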

Our users, they lose sleep, because they are like, "What if the data has a story being told in it and I just am not seeing it, and that's the thing that makes or breaks this asset, or system that we're developing?" Again, I think we right now at least have the benefit of our users have a culture of being super disciplined. As long as the tooling is not getting in the way of that, which again, we prioritize heavily at Nominal, then they will do the right thing. Again, not to make multi-year predictions, but I am a little bit worried about if culturally, we become too dependent on automation, or if the UI is being developed by an agent and there isn't that attention to detail that we have built into the product, do you start to have compounding misunderstandings? That would be terrible.

I hope we cross those bridges in less critical realms first. Whatever it is, there's going to be some outage in the software world and people can really talk about how they built better risk management around it. Let's learn those lessons in a less safety critical space and then bring them to aerospace and nuclear engineering.

[0:50:08] KB: You're saying, you don't want your rocket designed by OpenClaw?

[0:50:11] JH: Not yet. Not yet. But let's start with video games and then we can move on to physical systems.

[END]
SED 1937		Transcript

	(c) 2026 Software Engineering Daily	1