EPISODE 1800

[INTRODUCTION]

[0:00:01] ANNOUNCER: Agentic AI is seen as a key frontier in artificial intelligence, enabling systems to autonomously act, adapt in real time, and solve complex multi-step problems based on objectives and context. Unlike traditional rule-based or generative AI, which are limited to predefined or reactive tasks, agentic AI processes vast amounts of information, models uncertainty, and makes context-sensitive decisions, mimicking human-like problem-solving.

CrewAI is a platform to build and deploy automated workflows using any LLM and cloud platform. The company has rapidly become one of the most prominent in the field of agentic AI. João Moura is the Founder of CrewAI, and he joins the show with Sean Falconer to talk about his company.

This episode is hosted by Sean Falconer. Check the show notes for more information on Sean's work and where to find him.

[INTERVIEW]

[0:01:04] SF: Joe, welcome to the show.

[0:01:05] JM: Hey there. Thank you for having me, Sean. Very excited to be here to talk about AI agents.

[0:01:10] SF: Yes, yes. Yeah, before we get there, so you're the CEO and Founder of CrewAI. What's the story behind the company? How did all this start?

[0:01:18] JM: I got to say, Sean, people ask me this, and my wife hates when I tell it, just because it involves her, but I was working at Clearbit for five years prior to starting CrewAI. At Clearbit, I was leading their enterprise product and also all their AI initiatives. I was there all the way to the acquisition by HubSpot. When we got acquired by HubSpot, I was already playing with agents in my personal life, for myself, basically building agents to help me do better at LinkedIn and do better at X. My wife had been telling me, "You're doing so many cool things at Clearbit. You should be posting about it, right?" I'm very good at posting on X, but I'm not that good at posting on LinkedIn. I got a few agents to help me out, and I got hooked. It was so easy. I was getting so many posts out in a very consistent manner. I was like, you know what? I want to build more agents. I looked around, I found nothing that could help me do it the way that I wanted to, so I decided to build my own, and that's how the open source started. From that, turning it into a company was just a very organic process of companies reaching out to me and saying, "Hey, we're actually using CrewAI in production. Can you help us out with this?" I realized there was no way for me to provide that level of support by just keeping it an open-source project; it needed to have funding behind it to make sure that we could do it. That's how things came to be, and I got very excited after that.

[0:02:45] SF: That's really, I think, the dream scenario for almost anybody who ends up starting a company: you build something that's useful for you, and it starts as this side project that's doing something that you're happy with. Then suddenly, people are knocking on your door and saying, "Hey, we want to pay you to use this project, or we're already using this project." I mean, it's awesome. I think it's interesting what you said around building some of these agents to help you do some of your work, like posting on LinkedIn. Actually, for myself, one of the first public AI-based applications I built was also for helping me analyze podcasts that I do and then generate a first-draft LinkedIn post, because I don't always have time to put those things together, but I want to do right by the guests and help promote those things, and so on.
That was one of the first projects. Now, I did not, at least as yet, turn that into a business, so it's amazing that you were able to go from those humble beginnings to actually creating something of real value.

[0:03:40] JM: Yeah. I got to say, for me back then it was just like, I want to make sure that I'm finding the time to post this. But as you know, if you want to write something that is really good, that has that emotional connection, that taps into your experiences, it takes a while. It can take a few hours for us to think something through, right? The thing was, this would help me get from a crazy one or two-line idea to a two or three-paragraph post that I could then basically improve in five minutes, and it was ready to go. It was funny, because I preloaded those agents with some knowledge, so they knew about my resume, they knew about my life experiences. I wrote things about the day that I fell from my bike, or the day that I broke my jaw, and all that. I got all those things in there. It would draw all those interesting correlations that I think make the stories very special, and then I could spend a few extra minutes making sure it was hitting the points that I really wanted to. That was great. Yeah, I still remember being at a meetup in the Bay Area. I think it was at Shack15, and people from Oracle reached out to me and said, "Hey, so we're using CrewAI in production." That was a big aha moment for me. I was like, "What? Oracle is using CrewAI?" Yeah, from that point on, things started to really, basically, get a lot of traction. Honestly, I'm very grateful. If anything, as you said, I know how lucky I am to get to work on something that I love so passionately, and also for the fact that we have such a nice and incredible community behind us. I take none of that for granted.

[0:05:16] SF: Yeah, that's awesome. Yeah, and I feel like the company itself and the product have jumped onto my radar maybe in the last six months, and I feel like I'm hearing more and more. Has the growth been crazy during that time? Are you seeing, essentially, this wave that's happening around agents and interest in Crew?

[0:05:32] JM: Yes. I got to say, it's record after record. We keep breaking records every week or so. It was funny, because on the Christmas week, we broke our record for how many crews we executed in one single week. Each crew here, for the people that don't know about this, is a group of AI agents, right? You could have two, three, five, seven. I have seen crews with as many as 21 agents on them. The reason why these agents are grouped together into a crew is because they're trying to fix a specific problem, or they're trying to automate a specific process. That is why they're packed into a crew. On that week of Christmas, we ran over 3.5 million crews. Those were tens of millions of agents in one single week. That was our record high. Then fast forward a few days, and I think a couple of days ago, in one single day, we ran 1.3 million crews. It's insane how much traction things are getting. It's funny, because there's the open source, there's the enterprise, there's socials. There are all those different components that play into the company. I think the open source and the enterprise adoption have been very impressive. I think this year is going to be a little insane if things keep moving the way that they are.

[0:06:55] SF: Yeah, that's amazing.
I think that's a good point to maybe just slow down a little bit on where we're going, just for people who are a little less familiar with this topic, and talk a little bit about what an agent is and maybe get your perspective on this. Because clearly, agents are having a huge moment right now. Even if you're not in AI, I'm sure you're hearing all about it. People are making grandiose predictions for 2025 about that. There's a variety of definitions out there. You can get everything. I've seen definitions of agent where, basically, suddenly everything is an agent. There's nothing that's not an agent. Then there are really specific definitions, like agents have to be using tools. In AI itself, there's a long history of AI agents. It's not just about gen AI. Even going back to the 1960s and 70s, you had rule-based systems. It wasn't generative AI or LLMs, but you had this notion of an agent doing some work for you. Is the definition of a gen AI agent different from the historical definition, or is this just a different algorithmic approach that's happening behind the scenes to power the agentic capabilities? How do you think about the definition of an agent?

[0:08:03] JM: Yeah. No, Sean, that's a great question. I think it's something else. I think it's something different. When you think about AI agents nowadays, I think it's mandatory for you to have the AI dictate the flow of the application. If you have traditional software and you basically have a class with functions, and the functions are calling each other, and you have an API call that calls AI in there, that for me is a workflow. That's a script. That is not an AI agent. Now, if the AI, the LLM that is powering this, is actually controlling the flow of the application, and by that it has agency, then for me, that is an AI agent in its current definition. Yes, I mean, there are workflows, there are automations. Honestly, if you forget the name AI agents for a second, we're talking about AI-powered automations. That's the bulk of it, right? I think what throws an extra spin on it and makes it interesting for unlocking automations that were just not possible before is this ability to not only generate the content in real time, but also change the flow of the application in real time. So it's not always a predefined path of graphs and nodes where it's always doing the same thing; you're extracting the full value of the LLMs by having the LLM itself control it.

[0:09:26] SF: Right. Yeah. I mean, historically, when we think about engineering anything, it's essentially a person that is coming up with, if this, do this, and this and this, and you're creating this deterministic flow. But with the agent, you're handing off that decision-making process to the agent to figure out what the plan of execution is in order to accomplish a goal. I'm giving it a goal, just like you would give a person a goal, and they go and essentially figure that out.

[0:09:52] JM: Yeah. I would say that it's more of a spectrum than anything else. What we're seeing out there in the real world, across the thousands of use cases that we have been building so far, is that it's not one or the other. It's a spectrum depending on how much precision you need for your use case. If your use case requires less precision and you can rely more on those models, usually that means that you're generating more value, just because you can have fewer people overseeing this, and these things can run over and over on their own. Then you can have that just go for it.
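A minimal way to picture the workflow-versus-agent distinction he draws here is sketched below in Python. The `llm()` helper is a hypothetical stand-in for any chat-completion call, not CrewAI's API: in the workflow, the code fixes the sequence of steps, while in the agent, the model's own output chooses the next step.

```python
# Illustrative sketch only. `llm(prompt) -> str` stands in for any
# chat-completion call; plug in a real model provider to run it.

def llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real model call")

# Workflow: the code dictates the flow; the LLM only fills in content.
def workflow(ticket: str) -> str:
    summary = llm(f"Summarize this support ticket:\n{ticket}")
    return llm(f"Draft a polite reply based on this summary:\n{summary}")

# Agent: the LLM dictates the flow by choosing the next action each turn.
def agent(goal: str, tools: dict, max_steps: int = 10) -> str:
    history = f"Goal: {goal}\n"
    for _ in range(max_steps):
        decision = llm(
            history
            + f"Available tools: {list(tools)}.\n"
            + "Reply with 'TOOL <name> <input>' or 'FINAL <answer>'."
        )
        if decision.startswith("FINAL"):
            return decision.removeprefix("FINAL").strip()
        _, name, tool_input = decision.split(" ", 2)
        history += f"Used {name}: {tools[name](tool_input)}\n"
    return "Stopped after max_steps without a final answer."
```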
Sometimes, for certain aspects of the automation, you want to have more control. You want to have programmatic control of what is going to happen. We allow you to do that as well with the feature that we call Flows, where you can basically say, all right, if this, then something else is going to happen. That's basically a producer and observer, event-listening approach. That makes it actually flexible, because you can decide when you want to have agency and when you want to have more control. I would say it's more of a spectrum than one versus the other, but definitely, depending on the use cases and the precision that you're trying to get, you might optimize for certain aspects.

[0:11:06] SF: Why do you need something like a framework like CrewAI to build an agent, versus doing something bare bones yourself?

[0:11:12] JM: Yeah. I got to say, I love this idea in engineering that you should try to do the least amount that you can get away with, right? That's always a good way for you to get started with things. Don't over-engineer things. I always think from the ground up, what is the minimum that I can get away with, and that's usually a good baseline. The problem is, what we're seeing out there is that these agents start very simple conceptually, because it's like, all right, it's an LLM in a loop, and I'm going to give it a prompt and it's going to do something. Then you start thinking about the tools. Like, all right, so I need to give it a tool. That's not a big lift. I can do tool calling. I can do JSON parsing. All right, I got that done. Now you're going to run this at scale, and you're like, well, if I'm going to be using tools, I need to have a caching layer. If I have a caching layer, I need to have an expiration mechanism for it. If I have two agents and they're working together, I want this caching layer to be shared. Well, if there are two agents, they probably need to share a memory as well. Well, if they're doing memory, maybe this memory should be a RAG, so it can select what's relevant, and everything starts to get more and more complex. I think that's where frameworks come in handy, and I think where CrewAI really shines is that it abstracts away a lot of that and gives you a DSL, so for simpler use cases, you can just use the DSL. But if you want to go super low level and change the prompts, change the templates, change all the inner workings, you can do that as well. It checks all these boxes, so you have agent delegation, the ability to communicate, caching, tools, and memory, and you get all of that out of the box. You don't need to rebuild those same things over and over and over. The other thing as well is, as patterns are forming, you might try to do something for your company, but the market is moving so fast that you end up doing something that is suboptimal. A framework that's getting exposed to a thousand companies is going to be able to learn from all those use cases. Those would be a few things that really can make a difference for some of these enterprises when they're using CrewAI.

[0:13:17] SF: What about in terms of the challenges around getting the abstraction right? All this stuff is so new. Things are moving quickly. You're going to have to make a decision. You're having an opinion about how people should develop agents, and then you're encoding that, essentially, into this framework and you're saying, this is what our opinion is about how to build these things. How do you know that your opinion is the correct one?
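For reference, the DSL he mentions above looks roughly like the sketch below. The parameter names follow CrewAI's public documentation but can shift between versions, so treat this as an approximate illustration rather than canonical code.

```python
# Rough sketch of CrewAI's DSL; names may differ slightly by version.
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Research Analyst",
    goal="Find recent, credible information on the given topic",
    backstory="You dig through sources and keep only what is well supported.",
)

writer = Agent(
    role="LinkedIn Copywriter",
    goal="Turn research notes into a short, engaging post",
    backstory="You write in a personal, concrete voice.",
    allow_delegation=True,  # unlocks the built-in delegate/ask tools
)

research = Task(
    description="Research the topic: {topic}",
    expected_output="A bullet list of 5 key findings with sources",
    agent=researcher,
)

post = Task(
    description="Write a three-paragraph LinkedIn post from the research",
    expected_output="A ready-to-publish LinkedIn post",
    agent=writer,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research, post],
    memory=True,  # turns on the shared short-term/long-term/entity memory
)
print(crew.kickoff(inputs={"topic": "AI agents in production"}))
```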
[0:13:40] JM: That's the thing. A lot of the time, in the early days, you don't, right? That's the beauty of it, if you ask me. Early days, you don't. But then, as people start to use it, and you start to basically polish those rough edges and the ideas, you start to have a clearer take on what works and what doesn't. I think one thing that we did with Crew from the early days is to have a very opinionated way of doing things. I think, even with how fast the market is moving, that's what people need. People want an opinionated take on, how should I be building this? What is actually working out there? I don't want to have to think this through from scratch. I think the fact that we had an opinionated take, if anything, helped the framework get as much adoption as it did. This idea that we, from the get-go, are saying, hey, this is what works. I remember, looking back, a lot of the inspiration technically for the DSL came from the Ruby community, specifically the Ruby on Rails community. I spent quite a few years doing Rails in my career, and I always loved how they prioritize the developer experience and how that's a big piece of the language and the framework. Some people might say that sometimes it's too magical, especially Rails, and I agree with that. I think there's something pretty about the ability to make something that almost reads as plain English, to the point that anyone could try and get access to that technology, right? The thing that we're trying to do is, as long as we give you ways to go as low level as you want, it's okay to have something that is more high level that you can basically figure out pretty quickly.

[0:15:12] SF: Yeah, I do think that, especially with framework adoption. It depends a little bit on what it is and what problem you're solving, but a lot of times, DX is a big factor in terms of something catching fire and people wanting to adopt it. Especially when we talk about something like agents, which is probably not necessarily something that most people have spent decades working in, they really just want to be able to solve their problem and get going and get to that aha moment as quickly as possible. If you can lower the barrier to entry and the amount of friction involved with getting to that hello-world moment with an agent, then of course, they're going to want to do more and invest time and get into more complicated use cases, but you need to solve that problem to start with.

[0:15:53] JM: Yeah, and I guess the other thing that adds to that is, you've got to use it, right? You can't build something and not use it. I think, from the get-go, we have been using Crew so much within our own company, and that gives you clarity, right? Because you know what works, you know what doesn't. You see where you're struggling, you see new engineers joining your team and non-technical people joining your team and trying to build crews and update crews. It gives you the clarity that you get when you're a major user of your own product, and I think that really, really helps. I think, honestly, the community is a flywheel. The more people you get in the community, the more feedback you get, the more you see what works, the more you can polish the edges. I think the other thing that also plays a role in this is that, because I have experience with enterprise software, and a lot of our team also does, I think we understand that you can't just move fast and break things in the enterprise, right? You can absolutely move fast, but you've got to not break that many things.
I think being able to make things backwards compatible and really fighting hard against having breaking changes, I think those are things that really help drive adoption, just because people know that the docs change, they know that the features are coming, but there were very few times where we actually shipped something where it was like, this is a breaking change, you're going to need to update your code to make this work.

[0:17:24] SF: Okay. Can you break down what the anatomy of an agent typically looks like?

[0:17:28] JM: Oh, yeah. Usually, again, there's an LLM in the middle that serves as the brain of the thing. There is a task that you're giving to this LLM. This LLM, in CrewAI, is impersonating someone, or a role. It's behaving as a certain role and trying to accomplish this task. There's going to be an output. That would be the core. Then you have tools, and you can think about tools as integrations. They can be API calls, database connections, RAG implementations, whatever it might be. SAP, Salesforce, SharePoint, you name it. Those are integrations that your agents are going to be able to call on their own as they try to accomplish that task. Then, if you put those agents together into a crew, they now automatically have the ability to use certain special tools. Those tools are delegating work to one another and asking questions of one another. Because they are using all those tools, they get that caching layer that we were mentioning earlier. Now, if they use the same tool passing the same arguments, they're going to hit this caching layer. And because they're all in the same crew, you're going to have a few different types of memory. A lot of these concepts are cross-framework, by the way; this applies to any framework out there. You're going to have a short-term memory that basically allows your agents to remember everything that everyone is doing during an execution. You're going to have a long-term memory that is basically going to be improving your agents across many executions. As they learn about what they did wrong and what they did right, they basically improve over time. Then, we also have an entity memory, so they remember specific definitions of things. For example, if they learn what a software engineer is, they won't have to try to learn that again. They already know about it. Those would be the memories. The only other thing that I would say adds to this is that usually, when you're bringing this into real-world applications, there is a trigger and a destination. These things don't run in isolation. There is usually a component of, oh, this is going to be kicked off when a new entry in my HubSpot appears, or a new Zendesk card is created. The destination is usually something else, or the same thing. Like, oh, it goes back into my Zendesk, or it goes back into my Salesforce, or it goes into my email. I would say, at a high level, those would be the major components that you see in there. Then again, things can get very complex as well if you just keep working on that.

[0:20:08] SF: One of the things you talked about there was how an agent might have a role. A lot of times, you're configuring an agent role through things like the system prompt, where you're saying, you're an expert researcher, or an expert copywriter, or whatever it is. What's happening underneath the covers with the LLM when you give direction like that?
Is that directing the LLM to a certain set of parameters, or adjusting different vectorized terms that influence each other to change how the LLM is going to interpret and respond to things?

[0:20:36] JM: Yes, a thousand percent. I think a lot of people that are using these AI APIs right now have a high-level understanding of how these models work behind the scenes. Honestly, if you look back at any AI model out there, or most of the AI models out there before LLMs even, if you think about prediction models or classification models, at the end of the day, they're all trying to do the same thing. They have something that they want to predict, and they have other things, what people call features, that they already know. That's what people call them in data science, or ML. You have the features that you know, and you have the one thing that you're trying to predict. A lame example would be, what is the likelihood that it's going to rain? You don't know if it's going to rain or not, but you know what the season of the year is, you know what the temperature is, and you know your location. Given those, if you have enough data from three to four years, you could pass this into a model that is going to try to create a mathematical function that, given the data that it knows, can predict the data that is unknown, and that's the likelihood of rain. That is any AI model out there. That's what they're trying to do. With LLMs, they're just no different. The only thing that changes is that in this case, the data that you know, the features, are the text that you have written so far, or the tokens that have been outputted so far. It's going to use those tokens to try to predict what the next token should be. The more information that you put before the token that you're trying to predict, or the more qualified information, the better steering power you will have on the next token. If you say it's 37 degrees Celsius in the winter, but then you change it to the summer, that has a direct impact on the likelihood of rain, right? I would recommend people try this out themselves. They can go into ChatGPT, or Claude, or whatever it'd be, and ask, "Give me a stock analysis on Tesla." Then open a new chat and try the same thing, but start with, "You're a FINRA-approved investor." You're going to see that you get widely different responses, because you're steering the model in a different way. Role-playing can definitely play a big role in how those models perform, if you do it right.

[0:22:55] SF: In terms of the long-term memory versus short-term memory, how that works behind the scenes is, essentially, you're automatically populating the context. You're doing some context data augmentation before whatever the new behavior is, or the new event sequence that the agents will execute. For the long-term memory, you're just holding on to that as persistent, so it's always there within the context window. Is that the idea behind the long-term memory?

[0:23:19] JM: A little bit. The long-term memory is actually RAG data, because it uses Chroma behind the scenes, but you can customize it if you want to. The reason why we can do something like that is that on CrewAI, from day zero, we made sure that when you're defining a task - it was funny, because when we created this back then, a lot of engineers didn't understand it and gave us trouble because of it - when you're creating a task for an agent, you have to not only describe the task, but you have to say what the expected output is.
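The reason for requiring an expected output, which he expands on next, is that it gives an LLM-as-judge something concrete to compare the actual output against, and the resulting learnings can be stored for later runs. Below is a rough sketch of that idea, using hypothetical `llm()` and `store_learning()` helpers; it illustrates the mechanism, not CrewAI's internal code.

```python
# Sketch of the long-term-memory loop: judge the output against the
# declared expected_output, extract a learning from any discrepancy,
# and persist it so future runs can retrieve it as extra context.

def llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for any chat-completion call")

def store_learning(text: str) -> None:
    """Persist into a vector store (Chroma or similar) for later retrieval."""
    raise NotImplementedError

def record_learning(task_description: str, expected_output: str, actual_output: str) -> None:
    verdict = llm(
        "You are judging an agent's work.\n"
        f"Task: {task_description}\n"
        f"Expected output: {expected_output}\n"
        f"Actual output: {actual_output}\n"
        "If the actual output falls short, state in one sentence what the "
        "agent should do differently next time. Otherwise reply 'OK'."
    )
    if verdict.strip() != "OK":
        store_learning(verdict)  # queried and injected as context on future runs
```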
A lot of engineers got mad at us because of that. They were like, "Well, I've already given a task description. Why do I need to say what the expected output is?" What they didn't realize is that, one, we're forcing you to do better prompting, even if you can't tell. If you haven't been doing much prompting up to this point, this is a forcing function for you to think the way you're supposed to think when writing the prompt. Then, two, we can programmatically use an LLM as a judge to compare your actual output with what you said the expected output would be. We can extract learnings from the discrepancies in there, and then we can save those into a vector database. Then your agents can query those learnings during execution to learn from what they have done wrong in the past, so they can correct it. A lot of the long-term memory is memory that doesn't apply to the specific run that you're doing with the specific data that you're putting through it. It's more about the way they're supposed to behave, and things that they haven't complied with in the past that you expect them to, so they start complying with that over time. The long-term memory is injecting extra context with learnings from the discrepancies between what you expected them to give in the past and what they actually gave you, but everything happens automatically.

[0:25:12] SF: In terms of learning, are people actually training agents in the same way that you train or fine-tune an LLM? Or are you primarily relying on this prompt augmentation to provide the right context?

[0:25:26] JM: Both. We do have a training feature that is basically similar to DSPy in a way. It automatically tunes your prompt to be the optimal prompt it could be, so you don't have to worry about it. It does it in a way that's conversational. The agent does part of the work, and it comes back to you and says, "Hey, this is how I'm doing it. Do you like it?" You can give it feedback. Then, throughout that process of conversing with the agent, it basically updates its prompt to make it better. That feature is extremely useful and saves you a lot of prompt engineering, but we also have cases of people fine-tuning models, especially small models. Small models suck a lot of the time as agents. They're amazing for a lot of different pieces, but as agents, they have a very hard time complying with the specific formats that you expect, and things like that. But if you fine-tune them, they become beasts. We have seen people fine-tune models to have agents that basically output content in a voice, right? Like a company voice, in the same way every time, so it doesn't sound like AI. That's the most common use case that we see out there.

[0:26:43] SF: What about in terms of an agent having access to files and tools? How do you control access? Data governance and controlling access is a hard enough problem in any distributed system and enterprise architecture and stuff like that, because you end up having to have, essentially, different rules coded into every piece of software independently. How do you do that with agents?

[0:27:06] JM: Yeah. There are a couple of answers. In our open source, it's all programmatic. You're going to have to figure out these things on your own. In our enterprise offering, you do have features around this, where you have an internal repository of tools. I think that is even available on our free tier, so if you're listening to this, you can try it out. You can go to crewai.com and sign up. On the free tier, I think we already have the tool repository.
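The access model he goes on to describe boils down to roles gating which tools a person, and the agents that person builds or triggers, can even see. A hypothetical illustration of that idea follows; it is not the enterprise product's actual API.

```python
# Hypothetical illustration of role-based access to a tool repository.
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolEntry:
    name: str
    allowed_roles: frozenset

REGISTRY = [
    ToolEntry("salesforce_query", frozenset({"sales-eng", "admin"})),
    ToolEntry("sap_export", frozenset({"finance-eng", "admin"})),
    ToolEntry("zendesk_lookup", frozenset({"support-eng", "admin"})),
]

def visible_tools(user_roles: set) -> list:
    """Tools this user, and any agent they build or trigger, may use."""
    return [t.name for t in REGISTRY if user_roles & t.allowed_roles]

print(visible_tools({"support-eng"}))  # ['zendesk_lookup']
print(visible_tools({"admin"}))        # all three tools
```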
As you build these tools, you can push them into this private repository, and you can control access levels based on roles. You can create a specific role, and not only will that role apply to the agents, it will also apply to the people. You can have certain engineers on your team that are going to have access to tools that give them access to SAP data, or to Salesforce data, or to Zendesk data. You're going to have other engineers that won't have access to that, meaning that they won't even be able to see those tools, they won't be able to use them when building their agents, and their agents won't be able to use them either. Now, that, I think, is just the tip of the iceberg. I think when you're talking about the actual agents having permissions, that's where things start to get very interesting. I was talking with folks from Okta a few weeks ago. They're doing some amazing exploratory work on AI agent permissioning and authentication. Because you start thinking about it and you start having all these questions. For example, do an agent's permissions change depending on the person that is triggering the agent? If I have extra permissions, if I have a lot of access, should the agent that I execute have the same level of access? If so, what happens to the logs that are being generated? How do I guarantee that this is not exposing PII and personal information? We take care of a bunch of that in the enterprise. We have a PII sanitizer and a bunch of other things. I think that's a big topic, and that is just the tip of the iceberg. There's a lot more that we're going to start encountering as we're rolling out these things at large scale in the wild.

[0:29:06] SF: Yeah, absolutely. I mean, if you have some financial analyst agent that is available to your employees for some reason, well, the person on customer support probably should have a different level of permissions into the financials of the business than, perhaps, the CEO, or the CFO.

[0:29:24] JM: Exactly, exactly. We are seeing that firsthand in a lot of companies out there. I think, again, right now, it's early days for agents. Even though the theme is so hot, it's a brand-new industry. It's very early days. A lot of the time, it's all about permissions and access, about approving certain tool usage. I think as we get more into impersonation for agents, things are going to become more interesting.

[0:29:50] SF: Yeah. Given that we're in the early days, how do you think about the maturity curve for companies being ready to adopt agents? It's hard enough to do any AI application well at scale. Then singular agents, and we're talking about not even singular agents, but multi-agent. Does multi-agent end up compounding the problems that you run into with a singular agent?

[0:30:13] JM: I got to say, it's insane to see how fast a few of these companies are moving. Honestly, I think what is happening there is, one, there's some FOMO that's driving the market, right? No one wants to have their competitors eat their lunch because they're being more efficient. I think the other thing is that a lot of people like to compare AI agents with the Internet back in the day and the Internet boom. I think that's a fair comparison in a lot of different ways. There's one thing that is a major difference, though, and I think it's driving a lot of the adoption and the eagerness to adopt agents. That is, if you were on the Internet on day zero, that would have zero impact on your bottom line.
No one was online. That wouldn't help your business in any way, shape, or form. With AI agents, where people are getting wind of it is with these public companies, because they have to file their reports. You start to see reports from companies like Walmart, and you see how many millions of dollars they are saving because, one quarter ago, they implemented agents. They put AI in place. I don't know if they're using AI agents or not yet, but you start to say, "All right, so this has actual bottom-line impact." I think that is driving a lot of the fast adoption that we are seeing in some customers. That said, a lot of times those teams are still heavily technical. What we're seeing is, even though you have companies like Microsoft pushing for very non-technical folks to build agents, which I strongly believe will be the way that agents go for the long term, right now, the teams that are being most successful in these companies are technical teams. You need to have some kind of sponsorship on the technical side to help you build those very custom use cases that apply to your company. That leads to more success for the companies that are being a little more eager. For non-technical teams to drive adoption like this, I think it's still very, very early for that to happen.

[0:32:10] SF: I totally agree that I think there is a really big difference between the early days of the Internet and what we're seeing in AI right now in terms of the business risk. It's hard to think about now, because the Internet has become what it's become. Back in the 90s, it wasn't clear that people could actually make money doing business online, because, one, there were not that many people online. Then two, people just didn't know, if I take my storefront and put it online, does that suddenly lead to money? If you can have an agent that can reduce cart abandonment when people are shopping by 5%, or something like that, well, clearly that has tremendous value for your business. I think there's inherently less business risk if you can get these things working.

[0:32:53] JM: A thousand percent. I think, honestly, the combination of this idea of, all right, this is able to actually drive impact as soon as next quarter, together with this FOMO of, all right, my competitor might be doing something, I think that is driving a lot of this eagerness to adopt. Yeah, and those projects, honestly, as I said, depend a lot on the use case. For a lot of companies that come to us, we usually start with lower-precision use cases, things like back-office automation, sales, marketing, maybe some support. Then we start working our way up to higher-precision use cases, like user-facing, pricing, accounting, things like that. We do get some crazy wild cards sometimes, people that are coming in hot, first time doing it, and they want to conquer the world. We have some of those customers as well. Usually, people start with low-precision use cases and get comfortable with them, and then they start scaling from there.

[0:33:52] SF: What's the most sophisticated use case, in your opinion, that you've seen?

[0:33:57] JM: The most sophisticated use case that I have seen. Well, I got to say, there are things that I never expected to see people do. One was a big Fortune 500 consulting firm working for another Fortune 500, a media company. They were using CrewAI to mimic video and audio editors.
As the media company was streaming a live sports feed, they had agents that were using fine-tuned video and audio models to track the ball on the screen, automatically cut it, lay sound over it, and then push that out as social media content. That was something that, I mean, I was not expecting to see anytime soon. Again, a very complex use case, but more on the unusual side. Another one that comes to mind that is more complex is filling out IRS docs. A very complex use case. Honestly, I thought that doing my taxes sucked. Now that I know what some of these banks have to go through, it's insane. They sometimes have to fill out forms that are 70 pages long, full of questions. Don't worry, they come with a manual, but the manual has 620 pages. How do you read that manual and fill that out? Using agents to do something like that is another use case that comes to mind that was tricky at first, but was very interesting once we got it running.

[0:35:38] SF: Yeah. I think filling out RFPs is probably going to be a big one; I'm sure companies are already doing that. Then I know for sure that there are a bunch of companies in the bioinformatics drug design space that are using, I don't know if they're technically agents, but at least AI, to help fill out some of the forms. Because there are just a lot of forms you have to fill out in order to go through the legal channels to get a drug to a place where you can test it. I'm sure there's human expertise checking those things over, but there's just a lot of work that you have to put into filling out these forms. It slows things down, and every second costs you money when it comes to delivering a drug to market.

[0:36:13] JM: Yeah. The thing is, these forms, it's not like you can have just anyone fill them out, right? It needs to be someone that is very much an expert in their field to make sure that they're doing it right. Yeah, those things can be very frustrating, for sure. I think the other case that comes to mind now has a lot of overlap with factories, people that are building things. We worked with a pharma company on a use case like this. This one is not running in production just yet, but it's very interesting. Agents are monitoring the sensors from the machines. Again, you could have just regular AI do this, but where the agent part comes in is, if they pick up something that is odd about the data, they automatically cross-check that live with FDA databases, and then they launch an internal investigation into what might be happening. The agents produce a full report that then goes to the people who are actually going to check that batch of production. Yeah, very interesting use cases all around. I think it's going to be incredible. 2025 and 2026 are definitely going to be very exciting years for AI agents.

[0:37:19] SF: When it comes to designing an agent, there are all these different agent design patterns that exist, like the reflection pattern, ReAct, and so forth. How do people make decisions about what pattern makes sense for the problem that they're trying to solve?

[0:37:33] JM: Yeah. I got to say, I think if you're building something from scratch, you'll find people a little more curious to experiment with different patterns and see what happens with the different architectures that you can go about things with. What we're finding is that for a lot of these use cases, customers want to focus on the end result. Does this produce the end result that I need? Is this cost efficient and time efficient?
I think time efficiency is not as big of a problem with agents as we have seen with other AI applications, just because people assume that these things are going to take a while to run anyway. Cost efficiency was a big problem if you go back a year ago. Now, with the race to the bottom in LLM prices, not as much. Usually, people start with ReAct, just because that's the go-to and it works the best. But one thing that we do internally is we map all the papers that are coming out, and we implement a lot of features based on some of these papers. We implemented a new feature recently around hallucination checking, using LLMs as judges as well. That was entirely based on a paper from a big university, a very cool paper as well. Honestly, I think as AI is getting more and more mainstream, you're getting more of the folks that are not necessarily reading the papers. They want to get to value as fast as they can to actually drive value within their companies. If you really want to see all the cool, interesting things, you've got to be tuned into that.

[0:38:59] SF: Yeah. If I'm using Crew, then I don't need to be thinking necessarily about this. This stuff's all abstracted away for me, and Crew is going to go and essentially execute whatever agentic patterns make sense to solve the problem.

[0:39:10] JM: Yeah, exactly. I mean, again, you can change things, you can go a level deeper if you want to and customize a lot of different things. You can customize the inner prompt. You can customize whether you want to have hierarchical agents or not. You can customize the templates and everything. Yeah, in Crew, a lot of those decisions, if you just want to get going, are already made for you.

[0:39:31] SF: What is the biggest challenge in getting enterprises to adopt agent technology now?

[0:39:36] JM: I got to say, I think it's very much early days, as we were talking about before. It's interesting, because education becomes intertwined with selling to some extent. Especially when we're talking about enterprises, as you said, automation is almost a consultative process. There's no process that is equal between those bigger companies. It's a very hands-on process of understanding their needs and making sure that they are getting the value that they want, while helping them get educated on evaluation, what evaluations mean and how they should be thinking about them. I know that you probably spend a lot of time on that yourself, just educating people on all of this and how they should be thinking about it. I would say, it's not necessarily a challenge, but it's definitely different from other sales motions that I had in the past, where educational content was not as intertwined with the selling process as it is here. Then the other thing is just making sure that you're measuring results. You want to make sure that you're tracking ROI. That's another thing that you want to make sure that customers understand from the get-go. You're focusing on practical use cases. I hate when people come up to us saying, hey, what should I be using this for? What are the use cases I should be using it for? We have helped some customers down that route, and they became customers, and that was great. At the end of the day, the better ones are the ones where there's a clear need. They might not know how they're going to do it, but they know they want help with this.
I would say, if you're thinking about using AI agents, be an adopter and don't wait for other people. Really think through how this could benefit you specifically and what your pain points are, and then reach out to Crew.

[0:41:20] SF: What about the technical challenges? What are some of the things that people need to be aware of there?

[0:41:24] JM: Well, data, right? Data is the big thing. At the end of the day, these LLMs are engines. They need the proper data for them to work. I think integrations are a big one. We have a bunch of integrations that we have built ourselves, and that's usually how people get started. There's always that internal system that no one has touched for seven years, and you need to get data out of it. It's like a random API. I would say, the integrations, as one would expect, are still probably the main hurdle in getting a lot of those major use cases into production. Again, if it's something that we have already seen, things like SharePoint, SAP, Salesforce, yes, you can get to them pretty quickly. Now, if it's an internal system, or a more specific CRM, then things get a little more complex. That does slow down a lot of these implementations. Because, again, people are eager to get to value, and now they need to deploy engineering resources to build integrations, and it feels like moving backwards.

[0:42:25] SF: What's next for Crew?

[0:42:26] JM: Well, I got to say, I was talking with someone before the new year, and people were asking about what 2025 would look like. I got to say, we closed more customers in the last six weeks of the year than in the months before it, and we closed quite a few. Things are definitely taking off. I mean, I signed a deal on December 31st. On December 31st, usually, people are not even working, and we were there signing deals with executives at major Fortune 500 companies. I would say that for CrewAI, the year is going to be either great or insane. What insane looks like in terms of not only customers and revenue, but actually in terms of growing the team, growing the company, the maturity of the framework, and everything that we're building, those are the things that I'm going to be very interested and very curious to see play out. I think 2025 is when we're going to see other major players making a big move, right? Microsoft is doing their thing. Salesforce is definitely doing their thing. ServiceNow is doing their thing. SAP is cooking some stuff. We know there are going to be a lot of different players coming into this. I think, if anything, it's going to make everything extra interesting.

[0:43:39] SF: Yes. Well, I'm excited for you. Joe, thanks so much for being here.

[0:43:44] JM: Thank you so much for having me. I really appreciate it. Thank you all for listening. Yeah, if you want to know any more about Crew, or myself, feel free to reach out to me over X, or LinkedIn. Thank you so much, Sean.

[0:43:54] SF: Awesome. Cheers.

[END]