[INTRODUCTION] [0:00:00] ANNOUNCER: The availability of high-quality AI model APIs has drastically lowered the barriers to developing AI applications. These tools abstract away complex tasks such as model deployment, scaling, data retrieval, natural language processing, and text generation. Vercel has developed a complementary set of tools for building AI web applications, including their AI SDK, v0, and the shadcn/ui component framework. Ary Khandelwal and Max Leiter are on the AI team at Vercel. In this episode, they joined Kevin Ball to talk about the AI SDK, v0, shadcn/ui, and the AI tooling ecosystem at Vercel. Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc. [INTERVIEW] [0:01:18] KB: Hey guys, welcome to the show. [0:01:21] ML: Thank you. Happy to be here. [0:01:23] AK: So excited. [0:01:23] KB: Yeah. So let's maybe start out with some intros. So actually, I'm going to go to Ary first. Can you introduce yourself? A little bit about your background? [0:01:30] AK: Yeah. My name is Ary. I was a computer science student at Princeton. And after graduating, I worked on a design-to-code startup that Vercel acquired after around a year. And after that, I started working on the v0 team, and I do product and engineering work there. [0:01:45] KB: How about you, Max? [0:01:47] ML: I'm Max. I'm a staff engineer on the AI team here at Vercel. And I've been on the AI team since inception, probably around two years ago. I was there for the AI SDK, which I'm sure we'll talk about a little bit. And then I was there for v0. And before joining Vercel, I was an intern there back in 2020. Been here two and a half years now, plus some change on the back end there. [0:02:06] KB: Nice. You sort of bring us into our topic, the AI team. Y'all, as I understand it, are both working on shadcn as well. Do you want to kind of give us a little bit of an overview? What is this thing and what's the origin story? [0:02:21] ML: Yeah, for sure. Real quick, when it comes to shadcn and the shadcn components, we work very closely with them. Shadcn works at Vercel. But on the AI team, we're largely working alongside shadcn. We want the components to not be AI-specific. We want everyone to be able to use them. You don't need to use Vercel or any of our products. But we work very closely with shadcn to make sure that they work well for v0 and our use cases. [0:02:44] AK: And the AI team itself has two different products that we really work on. One, which Max alluded to, is the AI SDK. The AI SDK is a mini TypeScript framework that makes it really, really easy to build AI-powered applications. The goal is everyone is building AI applications these days. The AI SDK lets you switch between models really easily. It gives you access to utilities that make building basic parts of this application, like streaming utilities and things like that, super easy. You can really focus on the business logic of your application and not how to stream from an OpenAI provider so that it renders in the UI super well, which is a solved problem. And the second thing that we work on is v0. And v0 is a tool for all developers to generate UIs on the fly.
And yeah, v0 uses shadcn components as a base component library to make the UI generations that it produces composable and reusable and actually built on top of a component library, as opposed to being just spaghetti JSX. [0:03:43] KB: Got it. Okay, so let's maybe dive in first then to that AI SDK a little bit. You mentioned a few different things. Would you compare this to something like LangChain, or is there another comparable that people who haven't used it should use to kind of get their heads in the right mode? [0:03:59] ML: I think it works with LangChain, for one. We have a LangChain provider and we have some tools. You can use LangChain with the AI SDK if that's what you want. And it provides a lot of the core primitives you could use to build something that LangChain might provide for you. Because I think what is very common in software is you use a library and then you hit some opinion of the library developer and you're a little stuck. With the AI SDK, we try to be very unopinionated and a little more low-level so you can piece together these parts and make your own pipelines, your own LLM apps, without us getting in the way of how you actually want to structure your program. [0:04:29] KB: Got it. Okay. And so looking at it, there's sort of a bunch of different pieces, right? There's like the Core, there's the UI pieces, things like that. If I were jumping into it and wanting to use the SDK, is there a mental model I should have in place for getting started with this? [0:04:45] AK: Yeah, I'd say there's two basic parts to the AI SDK. One, as you mentioned, is Core. And Core is really the basics around which the whole AI SDK is built. The easiest way to think about Core is this is the way that you actually call an LLM. Instead of calling an OpenAI provider, or an Anthropic provider, or whatever model you want to use under the hood, you call the LLM via AI SDK Core and you pass it a model, which could be any model across the different providers - across Anthropic, and OpenAI, and Llama, and things like that. And the goal with Core really is just how can we make it really, really easy for you to switch between models, test different things out. And if a new state-of-the-art model comes out, you should be able to use that out of the box without having to go back and rip out a provider and change a bunch of stuff. The second part of it is UI, and here we really focus on building reusable primitives that make parts of the UI, parts of the front end that are typically required for applications, very, very easy. The basic component there is a function called useChat, which does a lot of the streaming behavior for you. It handles keeping track of messages and it handles rendering those as they stream in. Together, they help you build AI applications pretty fast. [0:05:59] ML: And I think kind of an encompassing idea at Vercel is we're always building for ourselves and we love dogfooding. The AI SDK came out of us building the AI Playground, which is a web app we have on the AI SDK website that lets you try a bunch of different model providers all on the same screen. You have a bunch of columns. And then you're able to see the same response from like 20 different providers. And you can see how they all compare against each other. And we built this and we were like, "Wow. There's a lot of streaming code in here that's kind of tough to write." A lot of places you can mess up. A cool thing about Vercel is we have people here that know every part of the network or browser stack. We had experts coming in and helping us fix our streaming code. And then we were like, "Why should we be the ones with this? And why shouldn't we give it to everyone?" We pulled out the Core from the AI Playground and made it the AI SDK. And since then, it's grown a bit.
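[EDITOR'S NOTE: A minimal sketch of the Core idea described above - one call signature, swappable providers. The model IDs are only examples, and exact option names can differ slightly between AI SDK versions.]

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";

// Swapping providers is a one-line change: only the `model` value differs.
const model = process.env.USE_ANTHROPIC
  ? anthropic("claude-3-5-sonnet-latest") // example model ID
  : openai("gpt-4o"); // example model ID

const { text } = await generateText({
  model,
  prompt: "Explain React Server Components in two sentences.",
});

console.log(text);
```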
But I wanted to highlight Ary's first point about Core, which is switching between models. Every model provider has their own APIs, and they've kind of generalized around OpenAI's schema. But they all have different implementation details. They all have slightly different things that are really annoying when you run into them. The AI SDK covers all those. It gives you one interface to use all the different providers. And we also give you the ability to say, "When this model errors, switch to this model." Do a lot more advanced things like that, which are essential for production applications. [0:07:14] KB: Yeah. So one of the things that is different across all of these different models is that they expose different levels of completion. OpenAI, for example, seems to be standardizing towards chat completion. And they've done a lot of training. They're exposing that. And they've been deprecating, a lot of times, some of the lower-level APIs. Some of the models you might use are just like full-on completion models. I'm kind of curious, do you hide that layer? Do you have an abstraction above it? And what tools do you provide to let people dive through those layers as needed? [0:07:43] AK: Yeah. I think the way we think about it is that the base completion, base inference, is usually shared across all the different model providers. They all provide some kind of endpoint to do completions. In addition, they also provide additional endpoints to do more specific tasks. The way that we've thought about it for the AI SDK is that some of the functions that you can use in the AI SDK are provider-specific. You still get access to the specific things that individual providers are giving you. For example, some models let you generate images. Some of them don't. The generate-image functions in the AI SDK can only be used with providers that give you that image functionality. Our goal is that if providers come out with new cool endpoints that allow you to do new different things, we will still add those to the AI SDK. They might not be available across all the providers until each of them implements it. Typically, what we've seen is that the rate of convergence for what these endpoints actually are is very, very fast. When OpenAI comes out with something new, Anthropic, Google, et cetera, all race to kind of provide similar functionality. And the same happens in the opposite direction if Anthropic, or Llama, or Google come out with something first. [0:08:53] KB: Got it. Another question kind of related to this. For example, OpenAI versus Anthropic is a good example. They deal with system prompts quite differently. Is that something that, if I'm wanting to be able to swap quickly between model providers, you'll take care of for me? How do you navigate that? [0:09:09] ML: We don't manage the prompt format, although it's something we're exploring. Anthropic is very strong about their models working with XML. OpenAI doesn't really say the same thing, but I think a lot of people use Markdown for their prompts. And we don't have a layer for doing that. But we do have the ability to tweak certain behaviors that they support differently. A really powerful thing you can do with models is called assistant response prefills. So you can, in your messages list, give an assistant message, and then the LLM will pick up where that left off. If you wanted to write JSON, you could put a little curly bracket there, and that strongly hints to the LLM that it should continue writing JSON. Providers all treat those a little differently, and the AI SDK works around those quirks. When you switch your provider or you switch your upstream, you don't hit those bugs. But overall, we try to keep those changes to a minimum.
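[EDITOR'S NOTE: A rough sketch of the assistant response prefill pattern Max describes. Whether and how strictly the trailing assistant message is honored depends on the provider, so treat this as an illustration of the pattern rather than guaranteed behavior; the model ID is just an example.]

```typescript
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

const { text } = await generateText({
  model: anthropic("claude-3-5-sonnet-latest"), // example model ID
  messages: [
    { role: "user", content: "Describe this user as JSON with name and email fields." },
    // Assistant response prefill: the model continues from here, which
    // strongly nudges it to keep writing JSON.
    { role: "assistant", content: "{" },
  ],
});

// Providers that honor the prefill don't echo it back, so re-attach it before parsing.
const parsed = JSON.parse("{" + text);
console.log(parsed);
```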
[0:09:56] KB: That makes sense. Now, as we talk about building applications with LLMs, there's a lot more than just the model-level interaction. And there's some amount of - I don't know if I'd call it standards, but patterns that are emerging around, like, how do you build agents and agentic behavior? How do you handle memory and context loading and things like that? Is that something that y'all are tackling as well, or is that deferred to the end user? [0:10:20] AK: Yeah, 100%. The way we think about it is that there's two ways we could have done this. One is we build more complicated abstractions and force you to use those abstractions for agent behavior, for doing RAG, for doing basic things like this. We've purposefully chosen not to do that. And the reason is because we've found that our users typically care a lot about having the lowest level of primitives possible so that they can customize and do things at will. The way we really handle that is we provide you with recipes or cookbooks on how to put together the different primitives that we've provided you in the AI SDK to do very common tasks. I think RAG, and agents, and tool calling are three examples of things that we have very good recipes for. If you want, you can lift our code snippets and just use them as is. But more often than not, what we're giving you this recipe for is so that you have a good understanding of what the mental model was when we were building these primitives and how we thought about putting them together to do some of these more complicated use cases. And we found that this still allows you to have the flexibility to do whatever you want with the primitives, but still gives you an easy way to get started when you're like, "Hey, I want to do something complicated. I'm sure people have done RAG before. How do I do RAG with the AI SDK?" You can use our cookbooks and recipes for that. And one thing that's super useful for us is, because the v0 and AI SDK teams work very, very closely with each other at Vercel, a lot of the product influence for the AI SDK comes from use cases in v0, because v0 is built on the AI SDK. It's very frequent that Max will be like, "Hey, AI SDK team, we're facing this problem with streaming this kind of response on this provider." And the AI SDK team can be very responsive and be like, "Oh, there's a bug there. We can fix it and update it." And a lot of the cookbooks and recipes we have are pulled out of patterns that we built into v0. [0:12:08] ML: And real quick, I'll hop in and say that a lot of what we released in the AI SDK is also pulled from v0. We build it for ourselves there, and we're like, "This is really good." We try to nail the abstraction, and then we try to share it with everyone. One cool example I really hope more people start using is auto-continue responses. LLMs have an output limit. And when they hit that, they stop outputting. There's no reason you can't auto-loop back to the LLM and have it continue its last message. Now you can have really long code blocks or responses from LLMs with one line of code. And I think that's so powerful.
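[EDITOR'S NOTE: The AI SDK exposes a setting for this, but the idea is simple enough to sketch by hand: if the model stops because it hit its output limit, feed the partial answer back and let it keep going. Option names and finish reasons vary by SDK version, and the model ID is just an example, so treat this as illustrative only.]

```typescript
import { generateText, type CoreMessage } from "ai";
import { openai } from "@ai-sdk/openai";

// Hand-rolled auto-continue: loop while the model stops due to length.
async function generateLongAnswer(prompt: string): Promise<string> {
  const messages: CoreMessage[] = [{ role: "user", content: prompt }];
  let output = "";

  for (let round = 0; round < 5; round++) { // safety cap on round trips
    const { text, finishReason } = await generateText({
      model: openai("gpt-4o"), // example model ID
      messages,
    });
    output += text;
    if (finishReason !== "length") break; // finished normally

    // Keep a single trailing assistant message holding everything so far,
    // so the model picks up exactly where it left off.
    if (messages[messages.length - 1].role === "assistant") {
      messages[messages.length - 1] = { role: "assistant", content: output };
    } else {
      messages.push({ role: "assistant", content: output });
    }
  }
  return output;
}
```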
[0:12:39] KB: Let's maybe, actually, now that you're referencing v0 a lot, dive into v0. First, I guess from a user perspective, high-level, what is v0? What does it let you do? And then I'd love to actually get into the patterns you're using inside of it to build it as an LLM application. [0:12:53] AK: Yeah, the way we think about v0 is it's like an always-on pair programmer that's an expert in all things frontend. This means that it's an expert on topics like React, Next.js, Tailwind, shadcn/ui, all of this part of the modern frontend stack. If you ask v0 very, very specific questions about new APIs released in Next versions, or how to migrate to server components, or how to use all these kinds of new features in React and Next, v0 is very, very good at talking about those. And the other part, which is what v0 is probably best known for, is that part of building frontends is building really beautiful, aesthetic UIs. And v0 is most famous for being able to, given a prompt, generate UIs that give you the React code built with Next - full-stack applications - that actually render that UI. [0:13:41] ML: And you can preview them on v0 on the web very quickly without having to spin up a dev server or doing these other things. [0:13:46] KB: Nice. Thinking about that, I'm going to dive into a few pieces there. You said, okay, it's very good at answering particular programming questions. What layers are you influencing to do that? Are you fine-tuning some underlying model with more developer-focused data? Or do you have a search layer that you're then RAGing that context into? What does that look like under the hood? [0:14:08] ML: We can't go into too much detail on the specifics of the pipeline, but I think I can say that one reason we built the AI SDK to let you switch between models so easily is we don't want to be locked into a single provider or model. So much of what we've done on the v0 engineering side, and what I think all good LLM apps do, is the non-LLM engineering to make your results good. That can be RAG and having a great data set to RAG against. Or that can be having just a great pipeline of trimming down the user queries or doing all sorts of things. And I think that's where we've put a lot of effort. It can be very hard to make v0 write bad code sometimes. We do a lot of work to clean up the outputs. We're always improving that, though. And I think it's really important that when you're able to switch models, you have to build around all their little mistakes, and you have a really resilient product that way. And now if some provider goes down, or if we want to switch to our own Vercel model someday, that's a very easy thing for us to just plug in and do. [0:15:02] KB: I think this gets into one of the interesting things in this space, which is that, I think led by OpenAI, the big model providers are often very fuzzy in their language about what's happening in the model layer and what's happening in applications, right? People will say, "I'm using ChatGPT." And you're like, "Okay." But some of that functionality is coming from the model, and you'll talk about that. And some of that, it's making an API call off somewhere else - it's doing a tool call. Some of that, it's RAGing in the right data.
And it gets very fuzzy, and people aren't breaking those down. One, I love the idea of separating that and being like, "All right, this has to be model agnostic." Some models might work better for this, some might not, but the fundamentals of it are agnostic. [0:15:44] ML: Exactly. [0:15:45] KB: I'd love to dig into - understanding you can't go into the specifics of your pipeline - how do you think about what those different components are in building an effective application on top of an LLM? [0:15:56] ML: Yeah, I think one good way to illustrate this is to think back to when v0 started. It was August 2023, and I think we started working on it a month or two before that. We had GPT-3.5, 4K context, I think. It did not know about shadcn. Did not know about new React features. We were starting with a model that did not inherently know what we wanted it to do. And we had to teach it all of those things. And I think the key thing is having that data set, and having a data set that you can RAG against, and then going really deep into the RAGing. How do you chunk your content? How do you embed your content? Does it make sense to embed the user query? Does that user query really map to whatever your responses are if they're code or images? Evals are really essential for doing that sort of thing. We've been on a wild ride figuring out how to do evals before anyone really talked about them, or at least publicly. It's been a fun ride. And I think always treating it like it's August '23 works well for us. The model is always dumb. What can we do to make it smart? [0:16:49] KB: That is the fascinating thing I've found with these models: they seem smart at first glance. And the more you use them, the more you're like, "No, it's dumb. It's dumb. But it can do some amazing things. So how do we make it smart?" [0:17:00] ML: Absolutely. And I think a big unblocker for us with how to make it smart was digging really deep into how users used it. Because when you give someone just a prompt form or a text box, they can type whatever they want there. They can paste crazy things. They can be really rude to it. They might not write in English. When I talk to v0, I often send like two words. I don't write down my whole sentence. Because I know at this point what it will understand and take away from that. You have to learn those things first. And that just takes a lot of work and grinding and working with your model. [0:17:29] AK: I think the second part of that is, once you have a really good understanding of what your users think your application can do, you can make as many of those things non-AI-based as possible. What we've tried a lot of is like, "Okay, the LLM is dumb. We know that it's dumb." If there are simple things that we can do deterministically, especially if we have a very good idea of what users want in those cases, we can pull those out, not have the LLM actually generate those, and do those deterministically ourselves. And we found that's a really, really good way to enhance output: let's not try to give the LLM infinite information and have it be able to do everything. Let's actually limit its scope, make it do something very, very well. And as much as we can pull out that doesn't need to be done with the LLM, let's not do it with the LLM. One example of this for v0 is people very frequently want to change the text that they see in their UIs. There are two ways that this could happen.
One is you put in some new text, we give it to the LLM, and we're like, "Hey, LLM, go change this text with this new text." The problem is this is a non-trivial problem for the LLM. It has to go find where you need the change. It has to do the change itself. And it takes a really long time. The user is waiting for an entire model call to happen and be returned to actually make this change happen. And what we found is that, "Okay, the user actually knows where in the UI they want to change the text." We can use source maps to go back to the source code that actually renders this text, and we can deterministically change the text there. And we can do that all without, A, making a model call. Which means, B, there's very low latency for the user. And C, it's deterministic. It will work the same way every single time. The user can get a very good understanding of what's going to happen. And finally, it's more accurate. We do evals on all these kinds of things. But when we do a deterministic change, the chance that it's correct afterwards is way higher than if we do a non-deterministic AI-based change. [0:19:16] KB: Yeah. No, I like that a lot. One of the things that I've been playing a lot with in my day job is we'll tell the LLM, "Here's a domain-specific language that you can use." And then when it uses it, we deterministically render the pieces of the DSL out into useful code or other things that might happen. And I don't know if this is the case, but I can imagine with something like shadcn, you could expose to it the component model but then just deterministically drop in the components as you need them. [0:19:43] ML: Right. Exactly. And I think something else we do, similar to what Ary was saying about the refinement, is selecting something and being able to retype it without an LLM call. For example, LLMs are often lazy, so they will omit code. They don't want to write it all out. And we have a second pass that fixes the laziness. And I think that's a really powerful idea: don't spend a ton of time trying to prompt out these behaviors that are inherent to the models based on their training or whatever. Work with them. If you wanted to output JSON and it's messing up the JSON, make or find a lazy JSON parser and fix that JSON. Don't try to keep retrying the LLM calls or wait for it to work. Make it work for yourself. And that's how, as I was saying before, you build a really resilient product. [0:20:23] KB: Yeah. Another thing I've seen in this space that I'd be curious about, and maybe you can or can't say something about this, but tools like Cursor not only have the core LLM call, but they have a diffing model that's like, "Okay, how do I apply this to the code that I already have?" That's a separate model, totally fine-tuned on its own. Having different layers, even if you are using models rather than deterministic output - having different layers with more refined purposes can really up your reliability. [0:20:49] ML: Absolutely. And that's a great case for fine-tuned models or small models - something you don't want for your main response, but you can use them to clean up that main response. And I think that's something super powerful that we'll be seeing a lot more of in this agentic 2025.
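[EDITOR'S NOTE: One concrete version of the "make it work for yourself" idea Max mentions above - a naive last-ditch repair pass for truncated or sloppy JSON, instead of retrying the whole LLM call. This is only a sketch; real repair libraries handle many more cases, like unterminated strings.]

```typescript
// Naive repair for common LLM JSON mistakes: Markdown fences, unclosed
// braces/brackets from truncated output, and trailing commas.
function repairJson(raw: string): unknown {
  let s = raw.trim();
  s = s.replace(/^```(?:json)?\s*/, "").replace(/```\s*$/, ""); // strip code fences

  // Close anything the model never got around to closing.
  const open: string[] = [];
  for (const ch of s) {
    if (ch === "{" || ch === "[") open.push(ch);
    else if (ch === "}" || ch === "]") open.pop();
  }
  while (open.length > 0) {
    s += open.pop() === "{" ? "}" : "]";
  }

  s = s.replace(/,\s*([}\]])/g, "$1"); // drop trailing commas
  return JSON.parse(s);
}

// Example: repairJson('{"items": [1, 2, 3,') returns { items: [1, 2, 3] }
```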
[0:21:01] KB: Nice. Let's maybe talk a little bit, as we're on v0, about shadcn. You mentioned that's kind of the default component model. Can you explain a little bit what it is and why use that over something like Material as your default? [0:21:15] AK: Yeah. I mean, shadcn is an open source component library built for React, primarily, which gives you the most basic building blocks that almost every web application needs. You can imagine the kinds of things you're getting here: cards, buttons, badges, dropdowns, modals. These are all very, very basic parts of any kind of web application that gets built these days. What shadcn gives you is the code to render these components. And it's not an install. It's not a package or anything you install. It just gives you the code, you copy-paste that code, and it gives you a really, really great starting point to customize and build your own design system. And so the biggest unique part about shadcn is that you are not dependent on some third-party system, where Material UI can update their package at any given point in time and change your design system and you have to work around it. Shadcn gives you building blocks to start off with that you can customize at will and kind of build your design system from scratch around. And yeah, that's shadcn in a nutshell. [0:22:14] ML: And I think it's also important to highlight, shadcn components are built on Radix UI, which is a headless library. It means it has no styles, but it's very accessible. Starting off with shadcn, you're set up for success. It's accessible. It looks good. It's very customizable. And it turns out that it looks really good without a lot of code around it. LLMs are actually pretty good at throwing it together, because they know there should be two buttons at the end of a form. The model doesn't necessarily need to know what that button looks like to ensure it will look okay. It helps to give it some knowledge to make it look good, but it's a great place to start from where you're guaranteed something that's reasonable. [0:22:47] KB: You highlighted the thing that's different is it's not a third-party dependency. It's essentially copy-and-paste, which is kind of an inversion of traditional best practice. What's the rethinking behind that? And I think that it's ingenious, particularly in an age of LLM-based coding. But I'm curious how you all think about it. [0:23:06] AK: Yeah, I think there's two reasons why this works here and maybe doesn't work everywhere else. The first is shadcn is not a static utility that, once you use it once, you want to use exactly the same every single time. Typically, you want to import a package when it's like, "Hey, this is a date formatter. It will work exactly the same way every single time." And I want it to work exactly the same way every single time. If there are changes and updates on their end, I actually want to get those changes and updates. That's the second piece. Typically, A, I want it to work the same way every time. And B, if there's a change, I actually want to get that change and be able to use the new version of it. For a design system, typically, both of those are not true. You don't actually want to use a component the exact same way every time. Components change and grow over time. It ends up being a blocker for you if you're like, "Hey, I actually need to modify this component slightly," and suddenly you no longer have the ability to do so because you're importing it from some package. You have to wrap it and use a wrapper for it instead and then change the wrapper. It ends up getting kind of complicated. And the second is you don't necessarily want all of the updates from shadcn.
If he releases a new component, you can just use that new component and copy-paste it. But if it changes the interface by which you interact with an existing component, you don't necessarily want to get that upstream change. And so that's why our thesis is that it's actually better, for this kind of design system use case, to not have it be a package that you import and consistently keep updated. [0:24:30] ML: Yeah, I think the key there is that you're able to change the components. You really are - once you install them, they're the default shadcn components. Then you can customize them. As you're building your app or building your website, you start changing the border radiuses, you start changing the colors. And all of a sudden, it's yours now. It's no longer really standard shadcn. [0:24:46] KB: The approach here reminds me a lot of what you were talking about with regards to higher-level abstractions in your AI SDK, where you're not providing, "Here's an installable thing with all these things you have to do." You're like, "Here's a building block. Go. Create." [0:25:00] ML: Right. And I think that's all because we're building it for ourselves. This is what we would want. And we would get upset if we couldn't change our button styles or if we couldn't modify how our RAG works. [0:25:11] KB: Bringing it back a little bit to the AI SDK, we talked about Core, but there was also SDK UI. Is that related to shadcn? Are they completely distinct? Do they play nicely? What is the relationship there? [0:25:22] AK: Yeah, they're completely distinct. The way to think about UI is it is utilities for you to use on the front end to do basic parts of AI applications. A very good example: the most basic thing that people are building with AI is still chatbots, right? There's a lot of parts of a chatbot that are shared across every single chatbot that's ever been made. You have to have some kind of messages. Each of those messages has a role. They need to be rendered on the front end, and, typically, the AI messages get streamed in from some third party, from some API. Those are shared across every single chat application no matter what. The thesis with UI was, "Hey, we don't really need you to come up with these abstractions from scratch every time you want to build an AI application." We know that you're going to build some kind of message object. We know you're going to render that message object. And we know you're going to need it to be streamed in. Why don't we give you good abstractions for a message, for example, that you can use out of the box? And that way, you can get started with an AI application really, really fast. And the goal, again, the balance here is we are giving you these abstractions, and the goal is to keep them as low-level as possible such that you can customize them and use them basically at will, while still giving you the benefits of some abstraction where you're not worrying about the low-level details of how am I streaming in content and rendering it in the UI, optimistically or not optimistically. UI kind of does that part for you.
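[EDITOR'S NOTE: A minimal sketch of the useChat abstraction Ary describes, in its React flavor. Import paths and message fields have shifted a little between AI SDK versions, and the default /api/chat route is assumed to be an AI SDK Core streaming handler, so treat this as illustrative.]

```tsx
"use client";

import { useChat } from "ai/react";

export default function Chat() {
  // useChat keeps the message list, streams tokens in as they arrive, and
  // posts to an API route (by default /api/chat) that calls AI SDK Core.
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          <strong>{message.role}:</strong> {message.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Say something..." />
      </form>
    </div>
  );
}
```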
[0:26:48] KB: Got it. And for both of these libraries, you mentioned that the team works very closely with the v0 team. Are these going to be community projects at some point? Or does Vercel more or less run them top to bottom? [0:27:00] AK: I mean, the AI SDK is an open source framework. Anybody can contribute to it. We have a lot of community contributors to it already. Especially when new models come out from new providers, the community typically builds the AI SDK providers for those. The AI SDK is totally open source, already very community-driven. v0 right now is a Vercel-branded, Vercel-run product. It is built upon shadcn, which is itself open source. You could open a PR to shadcn today if you wanted to. And that's kind of how Vercel thinks about all of its products. Even our managed paid products are built on top of very, very strong open source foundations. It's built into Vercel's DNA. [0:27:40] KB: That's awesome. Looking now at these things - and you mentioned one source of changes, new models, things like that. You also said that, as you uncover needs, often from v0 or other things, those get sort of baked down into the AI SDK. Where would you say kind of the growth edge is for these libraries? What are the pieces that aren't quite where you want them to be yet, or where are these libraries really evolving? [0:28:09] ML: Something we're actively thinking about a lot - there's an RFC still, I think, on the GitHub, if anyone's interested in poking their head in there - is agents, and things like OpenAI's Swarm. How do you coordinate multiple models together, and then also render those outputs on the front end in a nice streaming-friendly manner, perhaps? That's a lot of where our headspace is at right now. I think models have gotten so good, you can now let them kind of do their own thing for a little bit. You can give their outputs to other models. And we need some kind of tools to help pair those together a little bit more. [0:28:37] KB: Is that thinking advanced enough you could sort of explain a little bit? How is the AI SDK thinking about agents? Or how do you at v0 think about agents? [0:28:46] ML: That is a loaded question, because I know there's a lot of discourse right now about what is an agent. [0:28:52] KB: Hot take. What is an agent? [0:28:54] ML: I have put a lot of thought into this. I still haven't convinced myself fully. I go more on the second. It's not just giving you an output. When you are having it do tool calls to third-party services, or having it make decisions besides what to respond, I feel like that's agentic. You have multiple steps and it has to make decisions. And that's already fully supported. But now it comes into this multi-model agent where you have this whole graph of all these different things calling each other. And that's a really fun problem that I don't think I have a great answer to on whether I consider those more agent or not, I guess. [0:29:25] AK: Yeah, I think people typically think of agents as having two requirements. It must be able to do something, right? It must do something that is not just return text. And the second is it must be able to take multiple steps to do the thing that it wants to do. Typically, I would say if it satisfies both of those criteria, we call it an agent. There are obvious questions about, "Okay, if a single call can do multiple round trips of a tool call, is that agentic behavior?" Maybe. I think that's where the discourse gets super granular and very in-depth. But I think in general, it's like, "Okay, if you're taking multiple steps, and it is doing something that is not just returning a text response, we would call that an agent."
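[EDITOR'S NOTE: By that definition, the smallest "agent" you can assemble with the AI SDK is roughly a model, a tool it can call, and permission to take more than one step. A sketch under those assumptions - the tool, its data, and the model ID are made up for illustration, and option names vary by SDK version.]

```typescript
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const result = await generateText({
  model: openai("gpt-4o"), // example model ID
  // A tool the model can decide to call: it "does something" beyond returning text.
  tools: {
    getWeather: tool({
      description: "Get the current weather for a city",
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => {
        // Hypothetical lookup; swap in a real weather API call.
        return { city, temperatureC: 14, conditions: "rainy" };
      },
    }),
  },
  // Allow multiple steps: call the tool, read its result, then answer.
  maxSteps: 3,
  prompt: "Should I bring an umbrella in Berlin today?",
});

console.log(result.text);
```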
[0:30:07] KB: You described in v0, for example, loading different types of contexts, making decisions about doing things deterministically, doing other stuff. Would you call v0 an agent? [0:30:19] ML: Yes. Again, some people might argue about that, but it is definitely an agent. [0:30:24] AK: Yeah. I think the way people typically think about agents is, "Oh, if v0 generated a UI and then fixed it and then added a bunch of features one after another, all without any human intervention," that's what they imagine when they think agent. We already do a lot of steps for every generation that you get, under the hood, that are hidden from the user. And v0 does a lot of things beyond just generating a text response. I think we probably fit those two criteria. Although I know when people say they want an agentic v0, they typically mean they want it to take multiple steps and generate a bunch of UI all at once. I mean, it can take longer to do that. But yeah. [0:31:01] KB: How do you think about how v0 plays into the product development process? And I know this is an active conversation all over the place of like, "Okay, what sets of things do you do with a v0 or a Replit or something like this, where you're generating something from scratch in kind of an agentic way?" And then when do you export it out and handle it a different way? I'm curious, what's your perspective? And since it sounds like you dogfood everything, how do you all use v0? [0:31:30] AK: The easy answer to this is that we have found that people use v0 for everything. Internally at Vercel, we have a channel on our Slack called How I v0?, which is just people posting how they've used v0 for various different things. We have found that people use it for way crazier things than us on the product team ever would have imagined. In general, I think we've seen three to four big, large use cases. The first is early in the product development process, when you're still at the ideation and feature definition stage. Typically, people spend a lot of time writing feature requirement docs, or trying to build low-level mockups, or some way to get across whatever that product vision in their head is. v0 is a super easy way to get that across. And we've had multiple features at Vercel start as a v0 generation by some maybe non-technical person and then actually make it to production, because it was very clear what that was intended to be. The second thing we've seen is people start with designs, and they want to get 80% of the way there very, very, very fast. They don't want to spend three days scaffolding out what this design is going to look like to build the front end for it. v0 is very, very good at getting 80% of the way there once you have a given design. And the third, the most v0-pilled way to use v0, is to use it as a quasi-IDE, basically, where you start the process there, you brainstorm, you kind of work with v0 to define what this is and what it looks like. Then you build out the business logic. v0 can write full-stack Next.js code. You can build out the business logic in v0 itself. And then at some point when you're ready to deploy, v0 allows you to deploy to Vercel super, super easily. You can really do the full product development process all within v0. And that's the most v0-pilled way to go about it. The final thing I'll quickly mention is that we've seen a lot of just non-app-development use cases as well. Things like our marketing team - when they have to build visuals for data that they have, they give v0 the data and ask it to build really pretty charts.
They can customize that. It's all backed by code. v0 can execute Python and run code in a Node.js environment. People are actually creating scripts and running scripts, doing SQL queries, all that kind of stuff on v0 as well. There's a whole fourth category of non-application-development use cases that we believe v0 is for. [0:33:50] ML: And I would just throw out that I think a surprising amount of v0 has been built, at least at the beginning, with v0. It is such a good tool for - I am someone that loves writing code. I am not great at writing CSS or styling my websites. v0 is a fantastic tool for me. I'll do all the implementation myself and then have v0 take it from there. I give it the implementation and I'm like, "Make a UI." And it is fantastic for that. And I think for people that know how to code but don't know how to build frontends, it is an amazing tool right now. [0:34:17] KB: Let's maybe dive into a little bit of that back-and-forth. You said, "Okay, I give it my code and ask it to build a frontend." What does that management-of-code process look like? [0:34:27] ML: Sure. I'll preface this with: we're actively exploring this. We would love to be able to have you plug in your Git repos and sync them with Vercel. And that is on the roadmap. Right now, I will take my schema file or whatever file I'm working in and paste it right in. Or I might have a chat already where I made that file. I just go back to that chat and be like, "All right. Now, I'm working on this part of it." I think those are the two main ways you start with v0. And as you use v0 more, you have chats about the things you're working on. You can go back to those chats. You can fork your old block. And you have a lot more than kind of a Git message - it's a whole chat now. You have all this context about how you got from point A to point B stored in this chat. And I think it's super powerful. [0:35:08] KB: If I'm hearing you correctly, right now, it's not directly connected to Git. If I wanted to go all the way v0-pilled, and I wanted to build my whole application with this, but I'm a seasoned, perhaps burned-a-few-times engineer, and I want to make sure that I'm committing things along the way and things like that, am I copying and pasting? Is there an export, if not an import? What does that look like? [0:35:30] ML: Right. You can copy and paste. Going back to shadcn, we always want to support the copy and paste. We also have the shadcn CLI. That's a tool that shadcn wrote that lets you easily update and install your shadcn components. You can also give it URLs to your chat or to your v0 block, which is the website in that chat. And it will pull all the dependencies, it will pull all the files, and it will make sure it's installed and integrated nicely in your project. [0:35:52] KB: Oh, so that's interesting. I can have a chat with v0, end up with a component of some sort, and then use the shadcn CLI to pull that down into whatever repo I happen to be working in. [0:36:01] ML: Exactly. And then you can edit that file if you want, or you can go back to v0, keep working on it, and run the command again to update it, or copy and paste. [0:36:10] KB: That's really interesting. [0:36:10] ML: And it's cool that - not super v0 related, but we're seeing more products around the web pick up this shadcn CLI support. It's open source. You can make your own registry. One big one is Framer Motion, or motion.dev now. They have "open in v0" buttons and shadcn blocks.
You can use the registry to pull motion components into your React app. That's one way. Again, we're taking this v0 feature and we're trying to share it with more of the open source community. [0:36:37] KB: One of the things that I think all of us in software are grappling with right now is how these tools change what our job looks like. Y'all are not only building the product that's changing our jobs, you're using it to build itself and kind of doing that. How do you think about what software development looks like in this LLM world we're in? [0:36:58] ML: For me, it really is you can just build more. Instead of working on one thing at a time, now I kick off three v0 chats. I'll tab between them, and I'll be working on three things at once while another page is responding. And I think as we go down an agent route and all these models start doing more multi-step things, you'll have more time to wait while your output is being done, which gives you more time to do more things. Maybe it's naive of me. I really am thinking it's like a force multiplier, and that's how I've seen it help me. I just am able to produce more code. I might have to review it a little more carefully if v0 wrote a lot of it. But that's still faster than me sitting there and writing it all by hand. [0:37:32] AK: I think it also lets you work at a higher level of abstraction, especially when you're working with syntax that you're not super, super familiar with. When you're working with new APIs, new functions, new languages, anything like that, AI makes it really easy for you, as someone who understands computer science and the principles behind development, to build things really, really fast without having to spend the time going through and making sure your syntax is accurate and looking through documentation to figure out what functions to call and stuff like that. Very frequently, what I'll have v0 do is I will give v0 a tech plan of sorts, where it's like, "Hey, I need to load this data in a server component, pass it down to a client and write these hooks," which is, in my head, how I plan out what I have to build. And I'll just say, "Okay, you go build that." And so the architecture decisions are still being made by me, but the implementations I don't have to worry about, because those are relatively straightforward given the architecture decisions. [0:38:30] ML: And it certainly helps if you're technical. Maybe you can guide it a little bit on debugging or what to use. We see a lot of people using v0 that are not technical, and we just try to make it really easy for them. You might get an error screen if v0 writes wrong code or you visit a page that doesn't exist. We give you a nice little button that says, "Fix with v0." And you hit that, and now your error message is sent to v0. Trying to streamline the development process also makes it easier for them, so they don't need to copy and paste the error message - even people that don't know what an error message is. And I think that's been really powerful for the adoption of v0, and then for all sorts of people at companies to be able to use it, not just the engineering or product teams. [0:39:04] AK: In fact, the top user of v0 at Vercel is not an engineer. Actually, I think the top three people who use v0 at Vercel are all non-engineers. [0:39:13] KB: That's kind of interesting. And you have this whole internal made-with-v0 channel or whatever. What are some of the most novel use cases you've seen for this?
[0:39:24] AK: Some interesting ones that are maybe not super novel, but were not things we were thinking of: our sales engineering team will use v0 to build customized demos for prospects. Instead of coming with a slide deck, they will come with a demo of whatever it is that they're trying to explain to the prospect, which works way, way better. And what they can do then is, live during a customer call, they can modify the demo based on things that the prospect is saying. This is not something we had thought about at all when we built v0 in the first place, but we've seen tremendous usage among our sales engineering team to make that happen. [0:40:00] ML: I think one of the coolest demos I've seen was - I think Guillermo, our CEO, was giving a presentation or some talk. And he gave the presentation using Google Slides and everything. And at the end, he hit escape, and it was Google Slides built entirely in v0. The UI matched. It started with a screenshot. You could add slides, edit your slides, and hit present, all in this v0 generation. I love talks that do that. I thought it was really cool. And actually, we keep using that. We keep forking the block and changing it for different things, because it's such a fun way to build a presentation. [0:40:28] KB: No, that's really interesting. And I think one of the things that these tools are enabling is that much more rapid iteration of like, "Okay, I'm building a demo, then I'm talking to them, then I'm changing it, and it's evolving." And that lets you really explore much more quickly. I am curious on the productionization side. And this is the classic engineering concern of, "Okay, it's great to build a demo super fast." And the CEO says, "All right, that's great. We're shipping it next week, right?" What does the productionization path look like? You alluded to this a little bit. But how much work does there tend to need to be to get it from the 80% demoable version to the 100% I-can-sell-this-to-customers version? [0:41:11] ML: I think it depends on a few things. One is your experience and your ability to code. We have people that have 200, 300 versions of their block, or their generation, in v0. And those will often be 20, 30 files. It'll have Supabase. It'll have a login and auth. And those are effectively production ready. People hit deploy and they deploy them to Vercel. It took a lot of prompting for them to get there, but they got there. If you're trying to beat a deadline and you're an engineer, you might wanna pull it out of v0, put it in your code base and start working on it yourself. And those are both fine. And we've seen both happen. I think what's key is that, by giving v0 the up-to-date knowledge on how to use Next.js, how to use React, web standard best practices - all this knowledge we have at Vercel that we've given v0 to try to make it the best web development agent or LLM - that all pays dividends when you actually go to deploy your application, and, "Hey, it added rate limiting to the server route." Or it properly didn't block on this really long request, or something like that. [0:42:09] KB: Yeah, it is interesting. I find particularly some of the models from Anthropic, like Sonnet, write better code than I do when I prompt them correctly, right? And if you can code some of that prompting knowledge into your application layer of v0, dang, you get good stuff out. [0:42:24] ML: Yeah, absolutely. I think one thing we had on the original v0, which we didn't touch on, is that it was not a chat app.
You gave a prompt, you got generations, no chat. We gave you three versions of something. And that's because models were a lot worse then, so we kind of had to try it three times and hope that one of them worked really well compared to the other two. And I don't know. I think still, there's something there about giving people the choices and the ability to kind of pick and choose. [0:42:46] KB: Well, I think that is one thing that's key in all of this use of LLMs for stuff: you can't turn your brain off, right? You need to be engaged. You need to be checking, "Is this actually doing what I wanted?" You need to look at it and approve it or correct it. But yeah, it makes that in-between process of "I have an idea" to "let me see a first version of it" just go so fast. [0:43:10] AK: I would say that the analogy I often use here is that it's like a higher-level programming language. It doesn't mean you need to stop programming. When you stopped writing Assembly and you started writing C, it didn't mean that you stopped programming suddenly. It meant that you were still just using programming concepts at a higher level of abstraction. You still have to be able to do that in some capacity here. And this is just another higher level of abstraction that you're now able to work at. [0:43:36] KB: Now, you all said you started this project, if I recall, it was like summer 2023, something like that. We're now a year and a half along. If you project forward six months, a year, a year and a half, what do you see happening in this space in general and with v0 in particular? [0:43:55] AK: Yeah, I think we can maybe talk about what's happened since we started and then what will happen in the future. As Max mentioned, when we started v0 in October of 2023, it was OpenAI's 3.5 model class and a 4K context window, which now seems like the craziest set of assumptions to begin with. A few things have changed. One, models have gotten 10x smarter. And we expect that that will continue. The cool thing about building applications as a layer on top of base LLMs is that, when LLMs get better, the applications we build get better. The AI SDK makes it super easy for us to keep up with the state-of-the-art model as it comes out. The number one thing we expect is models will get better. Our code quality output will get better. And LLMs will get better at coding. The second thing is that context windows are increasing. At first, you could only give the LLM a little bit of information alongside your prompt to make it do whatever it is that you wanted it to do. You can now give it probably 10x that amount of information. And in fact, Gemini has released models that have an effectively infinite context window. You can give it as much information as you want. The ability to get custom, personalized answers is getting better. Obviously, infinite context windows are not as good at dealing with all of that context as shorter context windows. There's more abstractions and more work to be done there. But in general, you're getting more personal, better responses. And the third is that we're getting better at just building the software 1.0 layer here of building good web applications around AI. We're getting better and better at being able to do that. Things like being able to do some things deterministically. Being able to use different models for different pieces of our generations and things like that - those are all only going to get better.
[0:45:45] ML: I think it's really clear now that all the major model providers are aware of the v0-style or Anthropic Artifacts-style use case of generating code - probably React, and shadcn, and Tailwind. Now we're expecting models to get even better at maybe that subset of technologies, because it's so popular right now and it's so good for us as developers. [0:46:08] KB: I'm going to push a little bit more because, yes, we can expect the models to get better. What do you see missing right now in v0? What would you want to build, or what are you already building, that's going to make this - [0:46:21] AK: I would say one interesting case is - let's talk about error rates, right? Given an LLM output, there is, let's say, an X% chance that that output has some kind of error in it. Right now, our problem is, when we want to do more complicated tasks, we can't stack those on top of each other, because we compound error rates over time. If we have an X% error rate on one generation and we have four to five different generations happening, you now have a (1 - X)^n chance of the whole thing being right, which decays very, very fast. With a 10% error rate, five chained generations all come out right only about 0.9^5, or roughly 60%, of the time. Right now, with the error rates that we see in models, even a three-to-five-step process is not something you can trust the LLM to do, because even in just five steps, you're at around a 50-60% success rate. Just a tiny increase in the success rate of base models allows us to stack model outputs and feed the output of one LLM into another LLM in a much more agentic fashion and just do more complicated things as a result. And small increases in accuracy dramatically affect our ability to do that, because of the fact that these errors compound over time. I think that's one example of something I'm super excited about: as models get better, even slightly, incrementally, we can suddenly stack a lot more of those together without seeing a huge, massive increase in our error rates. [0:47:41] ML: And I think we'll also see LLMs get better at using just modern technology they're not necessarily trained on. A big thing documentation sites are doing now is exposing llms.txt. You can visit this .txt file and it has all of the docs in one file. You can copy and paste those to an LLM, or run your RAG and chunking on them. And I think tricks like that are becoming so much more popular. Anthropic has MCP, which lets models connect to basically integrations. v0 lets you use environment variables from your Vercel projects, so you can connect to anything your production website can. I think we'll see a lot more of these integrations and third-party experiences kind of merging. Why can't you render part of your Supabase dashboard in v0? I think you will be able to soon. [0:48:24] AK: And I think part of it is, we've been talking about changes at the model companies and at the AI layer - what's going to change there to get better. But the world is also changing to adapt to how people use AI. And I think llms.txt is a really, really good example of documentation sites changing how they do things to be better ingested by LLMs. One of the reasons, for example, shadcn works so well with LLMs is that the styling and the component structure are co-located. You have inline styles - Tailwind class-name styles - where the structure of the component and the styles for the component are in the same place. And one of the reasons why this is so good for LLMs is that they no longer have to keep track of two different files and try to keep them together.
More people are adopting shadcn, and more people are adopting React, and more people are adopting Tailwind, because LLMs are good at this kind of thing. Basically, the world is also changing to reflect what AI is very good at. [0:49:19] KB: Yeah, absolutely. We're learning how to use these incredibly powerful tools, and we're building all that connective tissue. I love the idea of being able to pull in your dashboard from Supabase or whatever, run it in your v0. We were talking about: import my Git project, make these changes by chat, export it, deploy it. Do all these different things. That's an exciting world. We're getting close to the end of our time together. Are there any things we haven't talked about that you all want to make sure that people take away knowing about v0, the AI SDK, shadcn, any of these different pieces? [0:49:56] ML: I think I would say that we are building faster and faster all the time, largely thanks to v0 and AI. I would expect that to be true for a lot of LLM applications. And they're only going to get better from here, at a much quicker rate. We're all figuring out what works, what doesn't. People are writing about it and sharing. And I think that v0 and AI applications, if you used them like a year ago, revisit them. Try them again. Everything is just so much better now. [0:50:23] AK: You can get started using all of them for free. You can just go to v0.dev and try out some prompts. Play around with it, see what comes out. The AI SDK is totally open source. You can go to sdk.vercel.ai, see the docs, and start building applications with it right away. And Vercel has a lot of great templates to actually get started with all these kinds of technologies. v0 also has templates to get started with building really cool designs. Basically, there's no excuse to not be coding. [0:50:54] KB: Regardless of if you're a coder. [0:50:55] AK: Yeah. [0:50:55] ML: Exactly. [0:50:56] AK: Everybody can ship. Everybody can cook. [0:50:58] KB: I love it. That's a great place to stop. [END]