EPISODE 1668

[EPISODE]

[0:00:00] ANNOUNCER: Convex is a serverless backend platform to simplify full-stack application development. Its underlying database is written in Rust and it uses TypeScript to integrate with reactive UI frameworks. The platform is growing, which has presented new reasons to make the code open source, and Convex recently released the source code for a self-managed version of their platform. The question of whether or not to open source is one that many companies consider.

We were curious to explore the decision-making landscape around open sourcing, and today, are speaking with James Cowling, the co-founder and CTO at Convex. James joins the show to talk about prioritizing developer experience, the choice to open source, risks of open sourcing, software licenses, and much more.

Gregor Vand is a security-focused technologist and is the founder and CTO of Mailpass. Previously, Gregor was a CTO across cybersecurity, cyber insurance, and general software engineering companies. He has been based in Asia Pacific for almost a decade and can be found via his profile at vand.hk.

[INTERVIEW]

[0:01:18] GV: Hi, James. Welcome back to Software Engineering Daily.

[0:01:21] JC: Thanks, Gregor. It's really nice to be here again.

[0:01:24] GV: Yes. So, I believe James, we've had you on Software Engineering Daily, I believe, twice already. Is that right?

[0:01:30] JC: Three times, actually. I gave a podcast way back in the Dropbox days talking about building the storage system at Dropbox. So, that was same founding team at Convex who built a lot of infrastructure at Dropbox. So, in a past life, I was also on the podcast.

[0:01:43] GV: There we go. So, you've semi-answered it. But could you, just for those that maybe are - just joining us new or have forgotten, maybe just give a quick introduction, kind of what led you into founding Convex?

[0:01:56] JC: Absolutely. So yes, like I said, we were at Dropbox for approximately forever, and built a lot of the large-scale infrastructure there. As a team, we've done a lot of - when I said large, I mean, like multi-exabyte storage systems and distributed databases that process several million transactions per second. Very large scale.

When we left Dropbox, we kind of promised ourselves, we'd never build a database ever again, because I have a bit of like a pet peeve about companies where infra people take core technology and turn it into a startup without really thinking much about who was benefiting? Does the world need this new complexity? Do we need to be doing things like Google? Or when people are on day one of a product they're building, should they be doing things quite differently?

So, we went down the path of various startup ideas and building stuff, and I think we were a little bit dismayed to find that the landscape for building products, I thought, had gotten kind of worse in many respects than it had been 10 years earlier when we were more product people. When we were building web apps and things. It felt frustrating that folks were getting told on day one, "Hey, you should be going and deploying a Kubernetes cluster." Where Kubernetes, 10 years earlier, was only the domain of multibillion-dollar or multitrillion-dollar companies now. And even stuff like modern developments, this happened since the founding of Convex, but modern developments like React Server Components, where developers are really thinking about whether the data is rendered on the server kind of somewhat statically, or rendered from the client dynamically, or it's on the edge. It has felt like the landscape had really gotten complicated in a way that I don't think was necessarily accelerating developers the way I would have hoped technology would have achieved by now.

So, it became clear that a rethinking of how back ends work was necessary. I think back ends weren't providing the right abstractions to developers to move fast and build stuff and make problems go away. So we developed Convex. There's a lot of very smart people working on software development, or like, there's a ton of really smart people working on, say, web dev frameworks. There's a ton of really smart people working on databases. I think the real problem is that some of these problems require a bit of a holistic perspective. I think some of these problems require thinking about how the database is designed and how the developer uses it at the same time. That was kind of the central thesis of Convex: to take an approach to building a back end as a service for actual product developers that can actually really accelerate people. That required us building, yes, development libraries for front-end development, but also building our own database. So, we violated our rule to never build a database again. But here we are.

[0:04:35] GV: Yes. So, Convex, it is the, I guess, proverbial kitchen sink. It is everything. Is that right? 

[0:04:41] JC: Convex is the backend-as-a-service. So yes, you can think about it as the tool for a full-stack developer. If you want to build an application, and say you're a React developer, and you're really good at building React apps, you have your React, and then you have Convex and that's the end of the story. There's hosting, obviously, using - it's hosting the actual static content of your website somewhere. But Convex is the database. It's the compute. It's the scheduling and the crons and the storage, and the vector search, and all these things that people need for applications. A lot of folks are building AI applications on Convex now as well.

It's a bit of everything. But it's designed to not feel like that. I think that's a very important distinction. Maybe one way of thinking about Convex is it occupies the same space as platforms like Firebase and Supabase. I have a lot of respect for these platforms. Firebase, in particular, for really kind of raising the bar for what people expect from like an out-of-the-box managed service. But I just don't think it's gotten all the way there for what developers need.

So, yes, Convex fits the same space. It's a database in the cloud and there's some similarities, but there's a lot of very big differences. I think one big difference is type safety, end-to-end type safety and relational data model, and all the things that kind of you need to build like a really industrial strength, scalable, back end. But I think there's a very, very big conceptual difference, which is in Convex your queries are code. Convex is, I don't know - I don't think we've settled on the grandiose phrase for this yet. But as you know, you could think of it as a programmable database in some respects. But that sounds scary.

What Convex really is, is this: you're a product developer, you're building front-end code. The queries you write are TypeScript programs. It's TypeScript or JavaScript. You write a full program: hey, read from this table, look up the results from over here, make a decision, write back, do whatever. Your real query is a TypeScript program, and it runs as a single atomic transaction on the database. So, there's no thinking about race conditions. There's no thinking about the client in the loop and high latency as a result, and rendering waterfalls that come up from these multiple rounds of communication. There's no worries about exposing your data model directly to end clients, because you don't have your end clients talking directly to a table that is exposed to the world like you might see in Firebase or Supabase or GraphQL, for that matter.
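The idea of a query that is a whole program running as one atomic transaction can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not Convex's API or implementation: `ToyDb`, `transact`, and `transfer` are hypothetical names, and atomicity is simulated by working on a private copy that is published only if the function finishes.

```typescript
// Toy model, NOT Convex's API: a transaction is a plain function whose
// writes become visible all at once, or not at all if it throws.
class ToyDb {
  private data = new Map<string, number>();

  transact<T>(fn: (tx: Map<string, number>) => T): T {
    const draft = new Map(this.data); // private working copy
    const result = fn(draft);         // a throw here discards the draft
    this.data = draft;                // publish every write in one step
    return result;
  }

  get(key: string): number | undefined {
    return this.data.get(key);
  }
}

// Hypothetical "mutation": read, write, make a decision, write again.
function transfer(db: ToyDb, from: string, to: string, amount: number): void {
  db.transact((tx) => {
    const remaining = (tx.get(from) ?? 0) - amount;
    tx.set(from, remaining); // write first...
    if (remaining < 0) throw new Error("insufficient funds"); // ...then abort
    tx.set(to, (tx.get(to) ?? 0) + amount);
  });
}

const db = new ToyDb();
db.transact((tx) => {
  tx.set("alice", 10);
  tx.set("bob", 0);
});

transfer(db, "alice", "bob", 7); // succeeds: alice 3, bob 7

let rejected = false;
try {
  transfer(db, "alice", "bob", 7); // aborts after its first write
} catch {
  rejected = true;
}
// The aborted transfer left no partial write behind: alice is still 3.
```

Because the second transfer throws after it has already written a negative balance to its draft, the example shows the property described above: either every step of the program takes effect or none does.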

Once you have this concept that you're running real functions inside the database as transactions, there's this new superpower that it unlocks, which is Convex knows what those functions are doing. So, if you do a query in Convex, we know exactly what data that query read. And we can then automatically cache that transaction for you. There's no such concept of manual caching in Convex. We know exactly, like perfectly, what data is cacheable consistently and we just serve it to you fast. You don't have to think about what's cached and what's not. You don't have to put Memcached in front of this thing. You don't have to worry about how to reconcile or invalidate caches.
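One way to picture caching driven by what a query read is the following toy sketch. It is an assumption-laden illustration, not Convex internals: `CachingStore`, `run`, and the key names are hypothetical. Each cached result is stored with the set of keys the query touched, and a write drops only the entries whose read set contains the written key.

```typescript
// Toy model, not Convex internals: cache each query result together with
// the keys it read, and invalidate when a write touches one of those keys.
type Reader = (key: string) => number | undefined;
type Query<T> = (read: Reader) => T;

class CachingStore {
  private data = new Map<string, number>();
  private cache = new Map<string, { result: unknown; readSet: Set<string> }>();
  executions = 0; // counts real (non-cached) query runs

  write(key: string, value: number): void {
    this.data.set(key, value);
    this.cache.forEach((entry, name) => {
      if (entry.readSet.has(key)) this.cache.delete(name);
    });
  }

  run<T>(name: string, query: Query<T>): T {
    const hit = this.cache.get(name);
    if (hit) return hit.result as T;
    const readSet = new Set<string>();
    const result = query((key) => {
      readSet.add(key); // record every key the query looks at
      return this.data.get(key);
    });
    this.executions++;
    this.cache.set(name, { result, readSet });
    return result;
  }
}

const store = new CachingStore();
store.write("cart:apples", 2);
store.write("cart:pears", 1);
store.write("profile:age", 30);

const cartTotal: Query<number> = (read) =>
  (read("cart:apples") ?? 0) + (read("cart:pears") ?? 0);

store.run("cartTotal", cartTotal); // executes the query
store.run("cartTotal", cartTotal); // served from cache
store.write("profile:age", 31);    // unrelated key: cache entry survives
store.run("cartTotal", cartTotal); // still served from cache
store.write("cart:pears", 4);      // key in the read set: entry dropped
const total = store.run("cartTotal", cartTotal); // executes again
```

The caller never names a cache key or an invalidation rule; both fall out of observing the query's reads, which is the point being made in the conversation.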

So, this one kind of superpower that's unlocked is when you're running code in the database, the database can just cache things for you automatically and perfectly. And the next extension to that, and there's a lot one can talk about Convex and I'll stop talking in a second. One extension or one kind of corollary of caching is subscriptions. So, if you execute a query in Convex, that query says, "I don't know. Give me all the items in my shopping cart. Or give me all the unfinished to-do items. Or give me all the chat messages in this chat." Some kind of semi-complex query that you're running. We know when that query changes. You can execute the query once and just subscribe to it.

We have a client-side development library. So, you can just bind the result of that query to a hook in React, and then anytime the data changes on the server, we'll just dynamically update the client components for you, the components in the client application, so you can build dynamic applications with no polling code, no concerns about end-to-end consistency. You just know the data on the client, the components on the client, are consistently updated automatically.
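Subscriptions can be pictured as the same read-set idea with a callback attached. The sketch below is a toy, in-process model and not the Convex client library or its React hooks: `ReactiveStore`, `subscribe`, and the to-do keys are hypothetical. A subscribed query is re-run, and its callback invoked, only when a write touches a key that the query read last time.

```typescript
// Toy model of reactive queries; not the Convex client or React bindings.
type Read = (key: string) => boolean | undefined;

class ReactiveStore {
  private data = new Map<string, boolean>();
  private subs: { readSet: Set<string>; rerun: () => void }[] = [];

  subscribe<T>(query: (read: Read) => T, onChange: (value: T) => void): void {
    const sub = { readSet: new Set<string>(), rerun: () => {} };
    sub.rerun = () => {
      sub.readSet = new Set<string>(); // rebuild the read set on every run
      onChange(
        query((key) => {
          sub.readSet.add(key);
          return this.data.get(key);
        }),
      );
    };
    this.subs.push(sub);
    sub.rerun(); // deliver the initial result
  }

  write(key: string, value: boolean): void {
    this.data.set(key, value);
    for (const sub of this.subs) {
      if (sub.readSet.has(key)) sub.rerun(); // push, no polling
    }
  }
}

const reactive = new ReactiveStore();
reactive.write("todo:1:done", false);
reactive.write("todo:2:done", false);

const seen: number[] = [];
reactive.subscribe(
  // "How many of my to-do items are unfinished?"
  (read) => ["todo:1:done", "todo:2:done"].filter((k) => read(k) === false).length,
  (unfinished) => seen.push(unfinished),
);

reactive.write("todo:1:done", true);  // read by the query: callback fires
reactive.write("todo:3:done", false); // not read by the query: no callback
// seen is now [2, 1]
```

In a real client the callback would update component state; here it just records each value so the sequence of updates is visible.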

There's many more things I can talk about with Convex. But I think the real key kind of ethos and philosophy is, let's take a step back and rethink how databases and platforms should be developed. There's all this amazing history in developing programming languages and abstraction and information hiding, how to have clean primitives that fit together really, really, really well. How do we apply that to back ends? So, yes, Convex is running functions on databases and subscribing to them. That's it. That's the end of the story.

Yes. It's a whole bunch of stuff. There's a whole bunch of features you can use, just like in every other platform. But I think the real key thing is they just fit together really well. The whole point at the end of the day, is to allow developers to move fast, build real scalable systems.

[0:09:37] GV: So yes, I had a couple of questions. Maybe the one that hadn't maybe quite sunk in with me, if I'm being totally honest, but it is a question, is that Convex is itself its own database. Is that correct?

[0:09:52] JC: Absolutely. Yes. Convex is a database.

[0:09:54] GV: Yes. Because for example, Supabase is very sort of, they fly the flag of, "We are a layer on top of Postgres. That is what we are." So, I think I hadn't quite fully understood that Convex is its own database, and that's pretty fascinating.

[0:10:09] JC: Convex, the backend is written in Rust. We do our own kind of transaction coordination. We use our own kind of timestamp oracle to assign timestamps, and we do our own optimistic concurrency control and multi-version concurrency control, and other fancy database stuff that you hear about.
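For readers unfamiliar with the terms, here is a minimal sketch of optimistic concurrency control with a timestamp oracle. It is a generic textbook-style illustration, not Convex's Rust implementation, and it leaves out multi-version storage entirely; `TimestampOracle`, `OccStore`, and the `stock` key are hypothetical. Transactions run without locks, and at commit time a transaction is rejected if any key it read was overwritten after it began.

```typescript
// Generic OCC sketch; names and structure are assumptions for illustration.
class TimestampOracle {
  private last = 0;
  next(): number {
    return ++this.last; // strictly increasing logical timestamps
  }
}

interface Txn {
  startTs: number;
  reads: Set<string>;
  writes: Map<string, string>;
}

class OccStore {
  private oracle = new TimestampOracle();
  private values = new Map<string, string>();
  private lastWriteTs = new Map<string, number>();

  begin(): Txn {
    return {
      startTs: this.oracle.next(),
      reads: new Set<string>(),
      writes: new Map<string, string>(),
    };
  }

  read(txn: Txn, key: string): string | undefined {
    txn.reads.add(key);
    return txn.writes.get(key) ?? this.values.get(key);
  }

  write(txn: Txn, key: string, value: string): void {
    txn.writes.set(key, value); // buffered until commit
  }

  // Validate, then apply: abort if a key we read changed after we began.
  commit(txn: Txn): boolean {
    const conflict = Array.from(txn.reads).some(
      (key) => (this.lastWriteTs.get(key) ?? 0) > txn.startTs,
    );
    if (conflict) return false;
    const commitTs = this.oracle.next();
    txn.writes.forEach((value, key) => {
      this.values.set(key, value);
      this.lastWriteTs.set(key, commitTs);
    });
    return true;
  }
}

const occ = new OccStore();
const setup = occ.begin();
occ.write(setup, "stock", "1");
occ.commit(setup);

// Two transactions race to claim the last item in stock.
const a = occ.begin();
const b = occ.begin();
occ.read(a, "stock");
occ.read(b, "stock");
occ.write(a, "stock", "0");
occ.write(b, "stock", "0");

const aOk = occ.commit(a); // true: nothing it read has changed since it began
const bOk = occ.commit(b); // false: "stock" was rewritten by a's commit
```

A rejected transaction would normally be retried from the start against the new state, which is why the application code never has to take locks itself.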

Now, at the very bottom of the stack in Convex, we write that data currently to a kind of distributed write-ahead log, which is actually stored in RDS, in Amazon right now. Eventually, we'll build our own thing for efficiency purposes. But the actual database itself, it's a real ground-up database, and I think that's a necessary ingredient. I don't think everyone who's come before has been stupid. There's a lot of smart people who have tried to solve this problem. But I just don't think you can solve it adequately unless you really have control, so that the front-end development library, the React hooks, for example, the WebSocket protocol, and the database itself are working together.

[0:11:06] GV: Yes. That makes a lot of sense, and will be very pertinent to what we will go into very shortly, which is quite a big announcement that Convex is going open source, and it's going to form a lot of what we talk about. A second question, this is actually, I'm just going to ask on more of a personal level. I'm sorry, listeners, if you hate this framework, and you're like, "Gregor, why are you asking me about this?" Meteor. Do you remember Meteor?

[0:11:28] JC: I do remember Meteor.

[0:11:30] GV: Very different conceptually, but at the same time, I feel for its time, it was almost trying to kind of move the needle in this direction. Did that in any way influence Convex or not?

[0:11:42] JC: So, Meteor and App Engine are these two platforms that I think were really cool, and also, ahead of their time. I think they were ahead of the time in the way that they just kind of weren't there in terms - I guess, App Engine is still around and Meteor is still around. But they were ahead of their time, insofar as that they just didn't quite get all the way to the finish line in terms of being the platform people can use to build real applications and not feel like you are contorting yourself in strange ways and not feel too foreign and all that kind of stuff.

But trailblazing, I think. I'm not sure whether Convex was influenced by Meteor. I think Convex's thinking was influenced by App Engine, to some extent. It's so easy in hindsight to think things are obvious when they weren't at all obvious at the time. And to have folks come along and say, "Hey, you shouldn't be using a whole bunch of infrastructure tools. You should just use a platform that solves these problems for you." I think that was really impressive stuff. I'm not sure how much direct Meteor DNA is in Convex, but certainly, there's a lot of similarities. Again, like I think Meteor, just a great team. Cool product ahead of its time. And the time is now, frankly. I think the time is now, one, because people are moving faster, wanting to get products out faster. They're realizing the gaps in their current products.

Everyone we talk to about Firebase, for example, they love Firebase. No one hates Firebase. It's just like, "Oh, I'm just going to grow out of it." Like, "Oh, I liked using it until I didn't." So, I think that time is now, especially when you start seeing all these movements around the edge, and how do we accelerate applications and data placement and developers. I have like a ton of respect for the, whether you call it the Jamstack movement, or the serverless movement, or the Netlify or Vercel, because they took what was a really kind of cumbersome, annoying barrier to development, which was deployment workflows, building in the cloud, distributed CDNs, hosting. They just did a really amazing job of it. They're really great platforms and very ergonomic.

But as larger customers start coming along with larger demands, the kind of level of complexity and sophistication you need to engage in to use these platforms properly starts increasing, right? You see this, especially, with the new stuff coming out of whether you say it's coming out of React or Vercel. The server components, like I mentioned, and these companies pushing edge databases and stuff. I just think that we've made a bunch of problems like managing, caching, and hosting go away. Now, we're introducing some new problems like distributed data sync and cache invalidation. These are tremendously complicated problems, right? All of a sudden, like life of a developer got easier, and easier and easier, and then bam, all of a sudden, it's getting harder. I think there's just, at certain points, a limit to how much you can make problems go away before rethinking how these development models work. So, I feel very strongly that if you're using Convex, you shouldn't do your own caching. You shouldn't do your own polling. These should be taken care of by the platform, so you don't have to think about this yourself.

[0:14:45] GV: I think that's a very good point. Yes, my Meteor question, I totally agree. It was ahead of its time. I had a lot of fun with it at the time. It did a ton of things by magic; real-time subscriptions was one that came to mind, especially when you mentioned Convex subscriptions. But yes, it then missed a whole ton of other things and many other things evolved outside of it. Just Node itself being one of them, evolved at a far faster pace than Meteor could ever keep up with. Same went for MongoDB, which is, again, what it was based upon. Unfortunately, the framework was too disconnected from the underlying technology, and there we go. So, with Convex, especially the database not being another technology, it is in-house, so to speak. I think that must be a pretty powerful thing that has been brought to the table.

[0:15:34] JC: The best feedback we get from customers, because I'm obviously going to be on this podcast and I'm going to talk about subscriptions and caching and new paradigms, blah, blah, blah, right? That might be why someone chooses to use Convex initially. But when we survey our developers, we have quite a lot of customers building real applications on Convex. Thousands of applications on Convex. We asked them, like, "Hey, why do you use Convex?" They just say, "Oh, it's like the DX. I don't know. I just move faster." And that's the right answer. That's the right answer. If I'm going to say like, "I use Convex because it uses optimistic concurrency control and snapshot isolation. But it's actually serializable." That's a weird reason to use a platform. Right? But they use Convex because they just get to move fast and not think about it. Great. That's the perfect answer.

[0:16:18] GV: Yes. For example, Google Cloud, I love Cloud Run, and people try and ask me, "Why do you use Cloud Run? Why don't you use this or that?" It's the DX. It's very simple. It is a feeling I get when I use that product. It is simple, easy to use. Again, if that's what you get with Convex, then it makes perfect sense. It's not being able to kind of one-up your developer friend on what little part of Convex is the thing that's better, faster. It does what I need it to do, and it feels great to use, and what more can I ask for?

[0:16:50] JC: Yes. We've kind of iterated on our messaging over time. Because there is a lot of pretty serious tech inside of Convex that makes it work. But then, at the end of the day, it's like, "Hey, developers should be just focusing on their unique value add." For the vast majority of companies, the value add is the product they've built. We see this with AI apps as well. So, I have this belief and I'll kind of stand behind this, that for the vast, vast majority of AI applications, the hard part of the app is just building the app. It's not the AI. Most AI applications these days, they're just talking to OpenAI or talking to Llama, or whatever platform you're using that's hosting some off-the-shelf model, and the hard part is like connecting it all together. How do you have an API? Or how do you schedule background jobs? How do you like render content to the user? That's the hard part, right?

So, you have folks out there building AI applications, but all they want is to solve what's unique. What's unique about the app isn't that they're using ChatGPT. It's that they've built a cool app around it. So, that's the part that I like to kind of focus on. It's like, "Hey, use Convex, and then you can spend all your time focusing on the part that differentiates your app, which is the app itself." Very few companies succeed because, "Hey, our consensus protocol is better than your consensus protocol." Those aren't differentiators. Unless you're building an infra product like us.

[0:18:17] GV: Absolutely. We're going to move to the topic of the day in maybe 30 seconds. I think it might just also be helpful to kind of set the scene before we jump into the open source. For developers using Convex today, what is the difference between, say, a free and a paid tier today on Convex?

[0:18:36] JC: Yes. So, Convex is a freemium product, so there's the free product, you can use it, you can host real applications. Many people do host real companies that make money on the free plan. The limits are, one, you can only have two developers on a team on the free plan. And there's some usage limits that are pretty generous for most applications. But some folks are going to kind of spill through those usage limits. So, pro is the next plan. We're actually going to rejigger our plans soon and make them a little bit more compatible with kind of early-stage startups. But, basically, pro is the plan which is, like, right now, 25 a month for developers, with a whole bunch of usage bundled in.

So, the real difference right now, some folks are going to want some extensions, like Fivetran streaming in and out, because Convex exists within like a bigger organization that they manage. But people are upgrading for the developer efficiency, for having more people on the team, and having higher usage.

[0:19:30] GV: Yes. Great. Okay. So, highlight of the day. Convex is open-sourcing. That's the sort of the headline message here. We've got to ask, why? The first big question there. Why is Convex open-sourcing?

[0:19:43] JC: Yes. So hopefully, this part of the conversation is going to be interesting to folks because this is a platform that is currently closed source. At least the back end. Our client libraries have always been open-source because you have to run them on your client, so we need to open-source those for you. We're taking some code, a lot of code that we went and built and we paid a lot of expensive software engineers to develop, and we're now going to give it away.

So, why would we want to do that? It's not because the business is struggling, because it's certainly not. I think there's a few motivations. One motivation is Convex is quite a new paradigm for how to think about developing applications. And having the back end as a black box can be a little bit of a barrier to folks in terms of trust. So, being able to see the source code, being able to fork it and make changes and do what you will, I think just gives people a little bit more trust when there's a new way of doing things, a new way of thinking about things.

Obviously, when people - what we deeply understand is when people use Convex to build their businesses, they're putting their business in our hands. And that's something we take very seriously. The second, just kind of pragmatic, issue is a lot of bigger customers want to do stuff like run localized testing. They want to like load large datasets. They want to do a lot of automated release tests and stuff. They want to have the source code to do that. They want to be able to have Convex running locally in the cluster.

To be honest, those are the main reasons. Some people would argue one would open source Convex because we want to solicit free development from the community. That's not what we're doing. Convex is an opinionated platform that has a very kind of cohesive and very intentional strategy around what we build and when we build it. We're open-sourcing Convex so people can use the platform and trust it.

Now, there's also an added demographic, I guess, we'll be reaching, which is just folks who don't want to pay for something. I'm frankly really happy for those people to be using open-source Convex. Now, there's a lot of downsides that come with hosting something yourself. So, one caveat, I should say, at the very start, the thing we're open-sourcing is the single-node version of Convex. We're open-sourcing the version that runs on a single machine, stores data locally on that machine. We're not open-sourcing this big distributed system that we manage, partly because it's our secret sauce, I guess. But the real reason is, there's just a lot of machinery that can't easily be outsourced for someone else to manage. We have a huge amount of complexity required for managing a system at scale for many, many tens of thousands of instances.

So, we're putting out this open-source version that, "Hey, you can run it." But there's a few caveats that come with running something by yourself. You have to make sure that machine is reliable. You have to deal with like routing and your traffic layer. You have to deal with database replication and backups. You also have to deal with migrations. Future versions might come out, and hey, you might have to kind of back up and restore, export, and reimport to deal with migrations. And these are the things you pay for. So, the reality is, I expect most customers will not want to run the open source version in production, because it's worth it to their business to pay someone else to do this. Certainly, if I was on the customer side, I'd be paying someone to do this. We often are paying for software as a service that makes problems go away for us.

If you're paying for managed Convex, you have something that automatically scales. It's reliable. It has support to handle these things for you. But there's definitely many folks who will use this to run applications for free and that's just great, and they're probably not the people who are going to pay for Convex anyway. So, that's fine with me.

[0:23:25] GV: Yes. So, this is very interesting, and maybe if I could just sort of read back to you, it does feel like this is less about making Convex free for people. It's great that people who don't want to or let's go more with cannot afford to pay for Convex, this opens up some kind of cool opportunities for them. But it maybe feels like there's more on the side of the code being available. There are other benefits to that. It's not about someone who's able to take it and run it for free. There's actually a whole ton of other benefits with the act of open-sourcing, for example, like security. I mean, that's the area that I look at the most. Just by definition, by open sourcing, security tends to improve, because people see things and there's no kind of way to kind of hide it, so to speak in a closed source. But I don't know. Has that been a consideration? What other considerations around the code simply being visible now, would you say are like the benefits, or why Convex is doing this now?

[0:24:26] JC: Yes. I mean, I think, for example, the security argument. Look, I'm looking forward to people being able to see the source code and then to be able to trust it. I don't see that as a means to improve our security because at the end of the day, that's my job is to ensure that Convex is secure by itself. My job is to pay people who are really, really skilled to actually go and build a good platform.

So, I don't think there's like, well, I wouldn't call it lazy. But I don't [inaudible 0:24:48], okay, let's throw it out and have the community improve this thing. No. It's our job to make something high-quality for people to use. But I do think a lot of trust comes from putting stuff out into the community. I do think that cool stuff will get built with Convex out there. People will be able to go in and make changes and do what they want. And again, like a lot of companies are very excited about this because they can use it for their local testing and for doing all that kind of internal stuff. Hey, if there's folks out there who want to use Convex open source and run the program, hey, that's great. If that gets them excited, and talking about Convex on wherever they talk about, Hacker News or whatever, then that's awesome.

There's a mindset that comes around open source. I'm talking about like open source as if it's a no-brainer right now. But I talk to a lot of other companies when doing this, a lot of founder friends and thinking through, "Hey, this is our secret sauce. There's some pretty serious stuff in it. There's some pretty serious technology in there, that now open-sourcing people can see."

What are the concerns and considerations? I think everyone's afraid of repeating the Elastic story. I'm not sure if you're familiar with the story of Elasticsearch. I'm going to get this slightly wrong. But essentially, it boils down to Amazon, AWS, forking Elastic, and launching Elasticsearch and cannibalizing the business, frankly. I'm not making any legal accusations here. I think everything was done legally. But at the end of the day, if you're putting your source code out there, there's this, I think, it's referred to as the harmful free rider problem. Some kind of free rider problem is what they're talking about. The harmful free-riding problem people talk about in the open source kind of world.

So specifically, we're licensing Convex under a license called FSL. It's the Functional Source License. Which, by the way, is not, according to the Free Software Foundation, technically open source, because there's some restrictions on its use. And what that FSL says, it's very, very similar to the Business Source License, which came before it. It's kind of a simpler version. There's no kind of conditions attached to it. There's no per-package kind of implementation details. It's just a very straightforward license that says, "For two years after source code's release, you cannot use that source code to directly compete against Convex." So, that means you can't take our source code and go resell it as, you know, Bonvex, and it's exactly the same source code. You can go and use it and run your company on it. That's fine. You can do anything you want with it. It's just not directly competing with us as a company.

After two years, that becomes Apache 2. I like Apache 2 because it's very unrestrictive. There's no like GPL or whatever you want. Basically, yes, after two years, do whatever the hell you want with it. It's your source code to use, within whatever the licensing requirements of Apache 2, which are very permissive.

So, there's a little caveat there, which is that yes, we have this minor protection, which says people can't directly compete with Convex for two years. Now, the reality is, two years for us is a lot of time. The FSL license came out of Sentry. I know the folks over at Sentry very well. Spoken to David Cramer a bunch about this. Yes, for fast-moving businesses like ours, two years is an eternity. Typically, with a BSL, the Business Source License, the restriction is four years. It's four years non-compete. That felt a bit too long. Four years can somewhat be described as an eternity in startup land. But yes, we innovate fast. We have new features coming out every day. So, two years is more than enough time, because Convex should always be doing a lot of interesting stuff in two years. If not, then, hey, that's on us.

[0:28:28] GV: Are there any other companies, especially ones listeners might be familiar with, that have gone this exact route already, with that exact license, and gone through the two-year non-compete? Did you follow any companies kind of previously? Or does this feel quite a bit of a new idea to do it this way?

[0:28:46] JC: It's a little new. It's a little bit new. BSL, Business Source License, has been around for a while. Like I said, the FSL license came out of Sentry. So, Sentry is the exception logging, exception reporting platform, very, very popular. They're FSL licensed. In Sentry's case, they've gone through multiple rounds of licensing. They had some more permissive license and then moved to BSL and then moved to FSL. They had gone from kind of open, to like having some restrictions. Obviously, the community is going to push back and there's always going to be [inaudible 0:29:20]. The open-source boards are pretty opinionated group of folks.

But I think in this case, we're going from closed to open, or closed to what I consider open, or maybe the Free Software Foundation doesn't consider it 100% open because you can't directly compete with us for two years. But I think, talking to a lot of folks in the industry who have Convex, who are thinking the same way. I mean, at the end of the day, what we want to do is encourage an industry. The competition, might I say like, is the competition for Convex like Supabase or Firebase, in some respects, but also the competition is like legacy ways of doing things. The number one goal for us is to get people excited about building things in a better way, and I think the best way of achieving that goal and driving kind of interest in the product is to open source the code base, or open up the code base, so people will be able to use it and fork it and make changes.

[0:30:10] GV: Yes. So, talking of forking, making changes, how does Convex plan to deal with community effectively? Community, community contributions, community feedback, everything community?

[0:30:25] JC: It's interesting you say that because when we thought about the risks of open sourcing, and I'm saying open sourcing in quotation marks. My definition of open sourcing. We talked about the risk of open sourcing, we pretty quickly got past the onus of what if someone sees our code? What if someone steals our code? And then started thinking about like, wait a second, how does this affect our business? What are the risks?

One risk is if someone like misuses it and loses the data somehow, and then makes us look bad? Because they had some forked version of Convex, whatever, and then it just didn't work very well. That's a problem. One risk is, what if it slows us down? What if we spend so much time worrying about what the community is going to think about, and the releases we make, and it slows down our iteration? That's a real problem. One problem is how do we deal with the community, right? As I deal with the community, obviously, the community is the whole point of this. But it takes effort.

So, the things I'm very excited about is, for example, on the client side of Convex, the kind of primary client libraries right now for Convex are in React. If someone goes and is inspired to build, there are libraries out there for Solid, and if someone wants to go build client libraries for Solid, or Svelte, or Vue, what have you, Remix. Great, awesome. That's really good. I'd love to see it. This expands the audience for Convex. If someone wants to take Convex and go make some tweaks for their own usage, awesome. Go for it. If someone wants to shoot bugs back to us, yes, even better, right? If someone comes along and says, "Hey, I have changed how Convex works, I think y'all should operate differently with this different syntax." We're probably not going to accept those changes, right?

It's just the reality is, when you have a company, you have responsibility to your customers. We have a really, really talented team. This is not like open sourcing to get free work done. We have to own our strategic direction. But hey, the community can take this wherever they want. There's also the question about support. At the end of the day, we have to support the people using hosted Convex and managed Convex. But developing community support around this is also a very important thing.

[0:32:38] GV: How does the team feel about this? I can sort of imagine maybe, at least, in maybe the early stages. It will be sort of maybe some of the team feel they've got to kind of keep one eye on the community contributions and one eye on actually what Convex is doing internally moving forwards, especially, I'm coming up with a pretty sort of very hypothetical scenario here. But what if a contribution looks very similar to actually what you guys are doing internally? Could there be any conflicts between someone then putting a hand up and saying, "Hey, like that thing you've just released, well, that looks like what I've been kind of - I've been doing PRs for the last month on this, trying to get you guys to merge this. And suddenly, bang, there's this feature that looks strikingly like it." How's the team thinking about that? Or like, is that something you thought about?

[0:33:23] JC: Yes. Look, we have a contributor license agreement that takes care of a lot of the stuff. If you're contributing PRs back to Convex, they become owned by the Convex project, right? But what I find amusing is, maybe people don't kind of think about companies this way, right? But companies are just full of regular people who like doing cool stuff, and who are excited about doing good things. So yes, we want Convex to make money because we want to have money to put back into the business and build more cool stuff. But the team is just excited about doing good things, and the team's excited like, if someone uses Convex and says it's great, we all get excited and we follow those messages around on Slack and stuff.

So, I think the team's just - we went through these early stages of like, "Oh, what's this going to mean? How they're going to pull it off? Is there a risk of intellectual property kind of eroding by putting stuff out there?" But I think people are really pretty excited about this. Because at the end of the day, we just want folks to be excited about this, too. It feels good to have a platform out there to feel like you're contributing to - if there's someone out there who never could have afforded to run a managed Convex instance, and as a result, they get to build a cool app. That's great. Because at the end of the day, we're just human beings who will feel good about that.

Now, I will say, I think it's extraordinarily unlikely that someone can run Convex locally for cheaper than we can do it. Because I used to kind of oversee a 400 plus million dollar-a-year infra budget and various rosters. We spent a lot of time doing very, very deep infrastructure optimization and how to run things fast with economies of scale and all this kind of stuff. So, I really do sincerely believe. This is not like marketing speak. I sincerely believe that for the vast majority of folks, you can't and shouldn't be trying to save money by running things locally. Just pay someone else to do it. Unless that company has obscene profit margins, and some companies do have obscene profit margins. But unless someone does, you can't do things cheaper yourself, if you value your time above zero. But I think for new use cases, they get empowered, then that's great. I don't think that risks the business at all. I think that's just the strict positive.

[0:35:34] GV: Yes, I completely agree. I think as soon as anyone understands the real effort involved with running any real product on their own infra, the likelihood is a paid tier on that product. Own infra is just - it's slightly about the dollar amounts, but it's certainly a lot more about the time amounts. The time amounts that someone is going to have to spend dealing with this.

[0:35:57] JC: Yes, one of my pet peeves is when I'm on the Internet, and someone says AWS is overpriced. Sure. AWS has the kind of the core features and the kind of value-add stuff. So, they've got their fancy products, and then they've got the core stuff like, EC2 and S3. S3 is not overpriced. I know this, because I've designed and built a multi-exabyte storage system. Yes, you can build a cheaper storage system by cutting some corners and having less replication and using cheaper hard drives, and this and that.

But I can tell you, Dropbox ended up building a cheaper storage system than S3, by a reasonably significant margin. But the economies of scale it took to get there, they kind of directly negotiated deals with hard drive suppliers and capacity teams, and that kind of data center optimization, and less just spending - when we had deployed what was probably around a million hard drives, when that system first migrated from S3 to our own storage. Yes, we were doing things cheaper with a million hard drives, as the first company in the world to be using something called SMR storage, which was a new type of hard drive technology.

If you're not doing that, you probably can't do things cheaper, unless you're willing to accept weaker guarantees. Yes, I just think there are some software-as-a-service platforms where you're just falling afoul of a business model, of a pricing model that doesn't match your usage. But for the vast majority of use cases, someone else who's an expert at this can do this better and faster and cheaper and more reliably than you can. Isn't that a great thing? Because that's the whole like, standing on the shoulders, kind of argument, right? If you let someone else take care of these problems, then you can move fast.

[0:37:40] GV: Yes. I think the only supposed success story I've heard of moving away from something like AWS is DHH, hey.com, and Basecamp, where they made a big splash saying that they've moved to their own infra and they have MRSK, which is a framework that they devised to run all your own infra, and they claim it's saving them millions and millions. But there's a key little asterisk there, which is, but you have a team that you employed to do all that shifting and lifting back and et cetera, et cetera. And it's in their best interest, they have two very specific products that they run on this infra. So, devil's in the detail and I -

[0:38:18] JC: Yes. I'm real skeptical. Because yes, engineers cost a lot of money. And guess what, if you build something yourself, you're not just doing it for today. You have to build for tomorrow, too. Because like the platform providers continue to innovate as well. So, it's a moving target. Maybe Basecamp will do an amazing job with this, and maybe it makes sense to them. But for the majority of folks, you don't just have to build this once, you have to continue to innovate.

[0:38:43] GV: Do you, in any way, wish you had started Convex as an open source?

[0:38:47] JC: No. Because moving fast was very important for us and there's maybe a pride dimension to the work people are doing. I think we've always done high-quality work at Convex, but there's a degree of thrashing around and figuring things out and iterating and adding features and taking them away. Convex has long since passed 1.0. So, we have a 1.0 compatibility guarantee. We're not going to break your application if it's built on Convex 1.0. Before then, it was Convex 0.whatever, and we would break it sometimes. We would change features.

I think, the ability to move fast, thrash around, just figure things out, I think was really critical for us. Look, this is still something we think about and there's going to be work happening on Convex internally. The reality is when we do our PRs, there's a public section and a private section, right? The public section goes to the open source repo and the private section is for internal discussions. Those are discussions that probably require some context that someone outside the community doesn't have.

So, operating in the open is scary to a lot of folks and it does affect how you work. It's very important to me that this doesn't slow Convex down, that other developers can keep moving fast. They don't worry about what someone will think when they do something. Now, we're at the level of maturity where I don't believe this will slow the business down. But I think previously, it would have. This is the degree of scrappiness early in the startup.

[0:40:17] GV: Yes. That makes a lot of sense. I mean, I guess we see more on the side where a company has started open source from day one. And actually, that was kind of their whole shtick. It's always people saying, "Well, how are you going to monetize? Or what's the business model? How are you going to commercialize this?" Yes, less, at least just from where we see, starting closed, and then going open source. But I think it's really exciting and I'm really excited just to see how Convex evolves in this way.

[0:40:43] JC: Thanks. Me too. It's just like, capitalism is a real thing, right? You got to pay for servers. You got to pay for engineers. All these things. I'm glad to be in a position now where we have a real business model and we have paying customers and then we can make a very conscious decision to like open source, because we think, by the way, we think it's going to be good for the business. It's not like just pure charity, right? We think open sourcing is going to make people build stuff on Convex and that's good for us, too. But I'm happy to be in this position of strength, and being able to very consciously open source, rather than trying to figure out a business model on top of something.

[0:41:17] GV: Is there an exact slated date for the reveal? And just a follow on to that, whenever it does happen? Customers and users, would they expect to see any real differences sort of immediately, or not?

[0:41:30] JC: Current customers of Convex will see no differences. Life will keep continuing just as before. We'll keep shipping just as fast. In fact, we'll probably ship a little bit faster, because we spent a fair bit of time getting Convex open-sourceable. By the way, what I mean by that is like, you have to add new kind of layers into the stack to kind of split out your monitoring framework, and split out all these things, mock out certain parts of the stack. So yes, customers won't see any difference, apart from the fact that we will follow up later on with maybe a better local development story. So, developers, now using the open source framework, can just develop locally, when offline, when on a plane. All this kind of stuff will come out. Currently slated for March 12th, although I haven't publicly announced that, so hey, I'm saying it right now, March 12th. So, I guess we better go hit it. That's a week from when we're recording this.

[0:42:16] GV: Very exciting. Actually, can we just rewind very briefly. You just touched on the technicals involved with what you had to kind of go through to open source, and I think that's something quite a few listeners could be interested in, just to kind of get a very brief understanding. You've gone from closed, going to open. What were the kind of key pain points there from the pure technical side? 

[0:42:41] JC: Yes. The big ones are kind of extricating the kind of details of our specific infrastructure from the rest of the platform. As much as we think that Convex is well layered and architected well internally, there's code in there that just assumes our monitoring stack. Code in there that assumes our kind of usage of secrets in production. Code that talks to something called Big Brain, which is an internal service no one knows about, but Big Brain is kind of our customer manager, our metadata system for storing like all our customers and who their backends are, and this and that.

All this stuff had to get split up. Because obviously, customers, users of the open-source platform don't have access to this. Also, just making sure, now Convex has always worked locally on a single machine, because we use it for testing. So, I said earlier, that at the very, very bottom of the stack right now in Convex, even though we write our own database, the actual persistence layer, like how the blocks get stored durably is in RDS, on AWS currently. We might change these things in the future. The open-source version uses our SQLite backend. So, same database, same transaction coordination, but it writes in SQLite. There's just quite a bit of untangling that has to happen. It's also a bit of a search and replace for any profanity in the codebase. But I think it was pretty clean. I was pleasantly surprised at how clean the codebase was.
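
As an illustration of the kind of untangling described here, the following is a minimal TypeScript sketch, not Convex's actual code. It assumes the database logic depends only on narrow interfaces, so that a hosted build and a self-hosted build can plug in different persistence and monitoring implementations. Every name in it (`Persistence`, `Metrics`, `Database`, and the stub classes) is hypothetical, and the in-memory map merely stands in for a durable backend such as RDS or SQLite.

```typescript
// Hypothetical sketch: none of these names come from the Convex codebase.

// The narrow interface the database layer is allowed to depend on.
interface Persistence {
  write(key: string, value: string): void;
  read(key: string): string | undefined;
}

// Monitoring also sits behind an interface so it can be swapped or stubbed out.
interface Metrics {
  increment(counter: string): void;
}

// Stand-in for a durable backend. A Map keeps the example dependency-free;
// a real build would talk to a managed SQL service or to SQLite here.
class InMemoryPersistence implements Persistence {
  private rows = new Map<string, string>();
  write(key: string, value: string): void {
    this.rows.set(key, value);
  }
  read(key: string): string | undefined {
    return this.rows.get(key);
  }
}

// A self-hosted build has no internal monitoring stack, so it gets a no-op.
class NoopMetrics implements Metrics {
  increment(_counter: string): void {}
}

// Records counters in memory; stands in for a real monitoring client.
class CountingMetrics implements Metrics {
  readonly counts = new Map<string, number>();
  increment(counter: string): void {
    this.counts.set(counter, (this.counts.get(counter) ?? 0) + 1);
  }
}

// The database logic is identical whichever implementations are injected.
class Database {
  private persistence: Persistence;
  private metrics: Metrics;
  constructor(persistence: Persistence, metrics: Metrics) {
    this.persistence = persistence;
    this.metrics = metrics;
  }
  put(key: string, value: string): void {
    this.persistence.write(key, value);
    this.metrics.increment("writes");
  }
  get(key: string): string | undefined {
    return this.persistence.read(key);
  }
}

// Same Database class, two different wirings.
const hostedStyle = new Database(new InMemoryPersistence(), new CountingMetrics());
const selfHostedStyle = new Database(new InMemoryPersistence(), new NoopMetrics());
hostedStyle.put("greeting", "hello");
selfHostedStyle.put("greeting", "hello");
```

The point of the sketch is only that code which previously assumed one specific monitoring stack or storage service has to be routed through seams like these before it can be published and run elsewhere.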

[0:44:00] GV: Not too many to-dos. Just sitting around.

[0:44:03] JC: There's not too - well, hey, there's probably some to-do James in there from a long time ago. I have to confess. I'm not a particularly active developer on Convex these days. I'm busy doing kind of managing this stuff. But yes, I'm sure there's a few to-dos floating around in there. So, you can go and find something that I never finished two years ago.

[0:44:20] GV: Anything that just legitimizes the code base. It's not a real code base, unless it has that, especially from the founding team.

[0:44:26] JC: Yes. You need to have a to-do from the birth of the company that just lived there forever.

[0:44:31] GV: Definitely. Well, it has been fascinating to hear all this. I can only wish Convex the best with what is a really brave decision in a really exciting way. So, I really, really can't wait to kind of follow along. I hope we get a chance to, I don't know. Is it episode five would be the next one, if we have you on again? We get to -

[0:44:52] JC: Yes. Hey, I always love being on Software Engineering Daily. I'll definitely be back if I'm invited back.

[0:44:56] GV: I think you must be up there with the top guests. So yes, I hope we can do this again and, yes, I look forward to hearing more.

[0:45:01] JC: Thanks so much, Gregor.

[0:45:03] GV: Thank you.

[END]
SED 1668		Transcript

	(c) 2024 Software Engineering Daily