EPISODE 1886

[INTRODUCTION]

[0:00:00] Announcer: JavaScript has grown far beyond the browser. It now powers millions of backend systems, APIs, and cloud services through Node.js, which is one of the most widely deployed runtimes on the planet. Keeping such a critical piece of infrastructure fast, secure, and stable is a massive engineering challenge, and the work behind it is often invisible. Rafael Gonzaga is a Principal Open Source Engineer at NodeSource and a member of the Node.js Technical Steering Committee. He has spent years digging into the performance and security layers of Node's core, helping shape the direction of the runtime itself. Rafael joins the show to talk about the state of Node.js performance, how benchmarking really works, the balance between speed and stability, and what it means to contribute to one of the world's most important open-source projects.

This episode is hosted by Josh Goldberg, an independent full-time open-source developer. Josh works on projects in the TypeScript ecosystem, most notably typescript-eslint, a powerful static analysis toolset for JavaScript and TypeScript. He is also the author of the O'Reilly book Learning TypeScript, a Microsoft MVP for developer technologies, and a co-founder of SquiggleConf, a conference for excellent web developer tooling. Find Josh on Bluesky, Fosstodon, and joshuakgoldberg.com as JoshuaKGoldberg.

[INTERVIEW]

[0:01:40] JG: With me today is Rafael Gonzaga, Principal Open Source Engineer at NodeSource. Rafael, welcome to Software Engineering Daily.

[0:01:46] RG: Hello. Thank you. I'm happy to be here.

[0:01:49] JG: Well, we're excited to have you. You do a lot of great stuff with Node.js, and performance, and user libraries. But before we get into all that, how did you get into coding?

[0:01:58] RG: Well, it started when I was a teenager. I've always been into computers since I was young. I played a lot of games. But my father is blind, and back then, he couldn't do a few things on the computer because of the disability. So I started trying to create a kind of screen reader, a talk-back tool. I tried to, at least. And that's where my journey in computer science started. I tried to make it happen with Python. Python 2, back then, I don't remember. I couldn't make it work, but through it I learned, "Okay, this is how I could make a calculator. This is how I could make things work." I started there. Even now, I never quite made it. But there are plenty of good apps nowadays that he uses, and I'm still helping with some plugins and things like that.

[0:02:50] JG: Do you have experience then with accessibility technology and writing things for folks who are, for example, blind?

[0:02:56] RG: To be honest, my focus is on backend. Accessibility on web browsers is not my thing. But I know exactly the problems it causes, because very often my father calls me, "Okay, Facebook is not working anymore. They changed something. Can you help me?" And then I need to write a Chrome extension that replaces the element with the old layout he's used to. But on frontend, I'm not that person.

[0:03:27] JG: But you do work in an area that folks sometimes erroneously call frontend. I think we should take a moment here to recognize that JavaScript is not a purely frontend technology. It's very possible to spend much of your career in JavaScript, say in Node.js, while being entirely backend. What are the areas your code has touched, and what do people use your work for?
[0:03:48] RG: Well, I always loved low-level programming. I started with Python, but then I suddenly jumped to C and C++. I learned most of these things outside of university. By the way, I haven't completed my degree, so I don't have a degree nowadays. And during the process, I learned many, many languages. I came from Elixir and Erlang. I went to PHP, C#. And I still learn. I'm working on a personal project with C++ and C. And then I found that, okay, there's JavaScript. People use JavaScript on the frontend, but I don't like frontend too much. Then I saw, okay, people were running JavaScript on the backend. I tried it. "Let's see how it works." My first attempt with Node.js was in a hackathon a long, long time ago. I could make an HTTP server, and then I saw, "Okay, that's nice. Let me see how it was created." And then I saw the GitHub repository and thought, "Okay, they use C++. So I could help in that part." I started looking into it, but I didn't make the contribution back then. But yeah.

[0:04:55] JG: How did you go from starting to look at Node to becoming a member of the Node Technical Steering Committee?

[0:05:00] RG: Okay. I started on Fastify, a Node.js web framework. I was working as a software engineer at a company that was about to switch all their PHP projects to Node.js. They were looking at a microservice approach, and I was responsible for making that migration. I was investigating, "Okay, we are about to use Node.js. But let me run some tests to see how well this new platform scales." I looked at some HTTP frameworks like Express, and a lot of people were advocating for Express. All the resources you find on the internet use Express. But I ran some benchmarks, and Express was not as fast as I thought it could be. So I looked around: "Let me see if there are other HTTP frameworks." Because Express is not official from the Node.js team, right? It's a library, and so is Fastify. I tried Fastify, and I saw their benchmarks page, and I thought, "Okay, this is nice. The results are exactly what I was expecting." But I found a bug during the process, and I decided, "Okay, let me create a pull request." I wrote the pull request, and it was accepted. I saw that the community was very welcoming, and I decided to keep contributing to Fastify. During that process I met Matteo Collina, one of the Fastify creators, who asked me, "Do you want to apply to a position at NearForm?" I said yes. I took the test, which is very curious, because something went wrong in the process and I ended up taking the test for a senior frontend engineer. And I passed. I had never worked as a frontend engineer at all, but I passed anyway. After passing, I got moved to Matteo's team, and I got some tickets to work on Node.js core. One of my first Node.js tasks was fixing a memory leak on a Windows 2012 server, 32 bits. I spent, I don't know, two or three weeks working on that. It was very hard, because I had to SSH into a machine that was far away from me, debugging memory leaks remotely. But I could fix that bug. After that, I learned a lot about Node.js core, and I started doing a lot of stuff. I got nominated as a Node.js core collaborator. And then, almost three years ago, I got nominated to the Node.js TSC.

[0:07:41] JG: What does it mean for someone to be on the TSC for Node?

[0:07:45] RG: It's very important to be part of, or to have a voice in, technical discussions.
But to be honest, if you are a Node.js collaborator, or even if you are outside the Node.js team, you can still advocate, you can still share your opinion. It's more about being proactive in helping new members, discussing new features, and having more context about Node.js in general. If someone proposes, "Okay, let's reshape Node.js. Let's remove C++ and move to Rust," normally the TSC is the set of people who will say whether that's possible or not. And in case of contentious discussions, they have a vote on whether we should do it or not. It's important, but it shouldn't be the thing that drives people to seek that role, to be honest. It's more about having a voice. Actually, voice is not the correct term here, because, technically, the TSC was created just to guide the project. We shouldn't have more power than any other contributor. That's what I'm trying to say. A TSC member shouldn't be anything different from a Node.js core collaborator. Anything a core collaborator does, a TSC member can do, and vice versa. I don't know if I explained that well, but yeah, that's the idea.

[0:09:15] JG: It's kind of amorphous, being a technical steering committee member on a project that doesn't have one single backing company or one single charter. But your area has typically been performance, and you've written quite a lot about Node performance. How do you see that evolving these days?

[0:09:30] RG: Currently, I work on Node.js security and performance. Performance I normally do because all my studies, all my research, are in performance. However, I'm paid to work full-time on Node.js with a focus on security. So I have these two fields. For performance, I have been monitoring Node.js performance for quite a while. I have a lot of projects around that. And recently, I've been leading the performance initiative of Node.js. However, due to restricted bandwidth, I don't have much time to work on it, because I'm not paid full-time for performance, but for security. I normally write reports like The State of Node.js Performance. And Node.js has been evolving a lot, mostly due to V8 improvements. V8 is doing a very good job with JavaScript. Some performance improvements come from V8 itself. And some come from Node.js migrating many of its APIs from JavaScript to C++. In the past, we had a lot of discussions about whether we should write APIs in C++ or JavaScript. The reason some of them were written in JavaScript is that it's way easier to get contributors for JavaScript than for C++. "Okay, for this API, let's write it in JavaScript because it's easier to maintain." But then we saw, "Okay, for these specific APIs, JavaScript will be a bottleneck. Let's move them to the C++ part of Node.js." We moved them, and we got a significant performance improvement. If you look at Node.js startup, for instance, we got significant improvements, and that matters for Lambda providers, if you are using Lambda on AWS, or Vercel. Even for Cloudflare Workers, you see that it's very important for them to have a very fast cold start. But I would say that the new runtimes, like Bun and Deno, have pushed Node.js to a different point of view in terms of performance. We were more stable in the past, not releasing many features, and being more conservative. But now we have shifted our focus a bit to hear more from the community. If the community wants modules like SQLite, let's do that, even if it's not so satisfying for maintainers to build. We have that now.
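For readers who haven't tried it, here is a minimal sketch of the built-in SQLite module RG mentions (node:sqlite is still experimental at the time of recording, so older versions may require the --experimental-sqlite flag; the table and values here are made up):

```js
const { DatabaseSync } = require('node:sqlite');

// An in-memory database; pass a file path for a persistent one.
const db = new DatabaseSync(':memory:');
db.exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Prepared statements, no third-party driver needed.
const insert = db.prepare('INSERT INTO users (name) VALUES (?)');
insert.run('Rafael');
insert.run('Josh');

const rows = db.prepare('SELECT id, name FROM users ORDER BY id').all();
console.log(rows); // [ { id: 1, name: 'Rafael' }, { id: 2, name: 'Josh' } ]
```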
And we have a stricter focus on performance. We are actively monitoring benchmarks, microbenchmarks of Node.js. And to be honest, it's very hard to write benchmarks in JavaScript in general, because of V8. V8 can be very tricky. Most of the time, it's smarter than you. If you are writing a microbenchmark, possibly you are looking at the wrong metric. It's very hard to make comparisons, because some API might be fast on your machine or with your workload, and with a different workload, or running for more time, it will trigger deoptimizations or garbage collection, and then you see different performance. But yeah, I would say that Node.js is way faster than it was two years ago.

[0:12:44] JG: I want to walk through a couple of scenarios with you just to help flesh out what you're saying. Do you recall a performance optimization you attempted at Node that did not work out after running all those optimizations and benchmarks?

[0:12:56] RG: Yes, there are plenty of them, to be honest. At Node, there is a working group called Node.js Performance, which holds most of the issues people file about regressions in Node.js in general. Some of them we investigated. I believe it was from Node.js 18 to Node.js 20, or in a specific version of Node.js 20, that Maglev, a compiler inside V8, was introduced and enabled by default. After that, we received some issues saying, "Okay, a microbenchmark is telling me that my code is 50% slower than in a previous version of Node.js." We came in to investigate, and the reason is that if you run that piece of microbenchmark for a short period of time, Maglev kicks in, and that was causing some slow operations. But it turns out that if you run it with a production workload, for a reasonable amount of time, Maglev was being called, but then the code was promoted to TurboFan, which is V8's optimizing compiler, and then the performance was similar. For short scripts, that might be a performance bottleneck. But for long-lived HTTP servers, the performance was the same. It's very tricky to measure microbenchmarks in JavaScript, because you might be measuring a no-op, because V8 optimized your code away. You might be measuring a non-production, unrealistic workload, because you will never run that function as many times as you are measuring it. For instance, in a specific version of V8, if you converted a string to an integer using parseInt compared to the unary plus sign, the plus sign was, I don't know, 10 times faster than parseInt. I shared that on Twitter. I have a repository that monitors all those micro-operations, and I have also written about it. And it was fixed by V8. It turns out that in most cases, if you switch all your parsing to the plus sign, it will only give you 0.0005 milliseconds of improvement. Most of the time, it isn't worth the change. So it's tricky.
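As an illustration of the kind of micro-operation RG's repository tracks, here is a deliberately naive sketch comparing the two conversions. The caveat is the whole point: numbers from a loop like this are specific to one V8 version and workload, and rarely justify a code change.

```js
// Naive micro-measurement of string-to-integer conversion.
const input = '12345';
const N = 1e7;

// 'sink' consumes each result so V8 can't optimize the loops into no-ops.
let sink = 0;

console.time('parseInt');
for (let i = 0; i < N; i++) sink += parseInt(input, 10);
console.timeEnd('parseInt');

console.time('unary plus');
for (let i = 0; i < N; i++) sink += +input;
console.timeEnd('unary plus');

console.log(sink > 0); // keep 'sink' observable
```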
[0:15:34] JG: One of the open source projects I work on - someone spent, I think it was about 2,500 words, trying to explain to us why we needed to switch from one library to another. That would have saved, don't quote me on this, about 104 bytes of node module size, and about a fraction of a hundredth of a hundredth of a hundredth of a second of runtime improvement at startup. But at the same time, as I'm sure you're going to say within the next 10 minutes, sometimes these optimizations really do result in userland improvements, where something that's a fraction of a second faster, compounded over time, really is better. So how do we know, when we're looking at, say, optimizations in Node or userland libraries, when it's worth it to make that performance investment?

[0:16:13] RG: Okay, it's very important to have a baseline. If you are trying to optimize your code, you need to have a benchmark. Most people that use Node.js use it as an HTTP server. Measure how your routes are responding, how they are performing. Are you getting, I don't know, 10,000 requests per second? A hundred thousand requests per second? Measure CPU utilization, measure memory, while a benchmark tool runs - wrk2, or autocannon, or ApacheBench. It doesn't really matter which. Actually, it does matter, and I can explain a bit more later. But create a baseline: "This is the state of the art of my HTTP server." Then you start measuring. You can't optimize things without measuring them first. Otherwise, you fall into a trap where you spend a significant amount of time to get just a few milliseconds of improvement, as you said. But there are a lot of improvements you can make. Mainly, when comparing HTTP frameworks: most of you use Express, because Express is spread everywhere. But what many of you don't know is that if you switch from Express to Fastify, you gain a lot of performance. I'm not saying that because I'm part of the Fastify team - I'm also part of the Express performance team. And what I can say is that Fastify, in its latest versions, is way faster than it was, I don't know, a year ago. Always measure. Fastify is still, as far as I know, the fastest framework in terms of stability and performance out there for Node.js. And more importantly, watch everything you write to the logs. Logging is a huge problem. A lot of people lose a lot of performance just by choosing the wrong logging library. People are using console.log, which is a huge problem - if you're doing that, please stop. People are using, for instance, Winston. I wrote an article called "The Cost of Logging," I believe in 2022. It is still very popular, and it's still applicable. As far as I know, the fastest logging library out there is Pino. The reason is that Pino is written in a way that queues messages and doesn't block the event loop when writing to the terminal. I wrote in depth about that in the article, so I suggest people check it out; it's very easy to find. Pino also uses SonicBoom as a dependency, which is very important to making Pino fast. By choosing the right HTTP framework and the right logging library, I'm pretty sure your app will be way faster.

[0:19:15] JG: One of the advantages, as I understand it, of writing things in userland - for example, first Express, and then Fastify - is that it allows users to iterate and experiment. But at the end of the day, some of the things you've alluded to, such as console.log being slow, are part of Node core and, therefore, the default that a lot of people go with. What would it take for Node itself to have an equivalent to an Express, or Fastify, or Pino in core?

[0:19:37] RG: Okay, Pino in core - we are discussing that, to be honest. So that might happen.
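A minimal sketch of the Pino setup RG recommends (the logger fields and messages are made up):

```js
// npm install pino
const pino = require('pino');

// Pino emits newline-delimited JSON and funnels writes through
// SonicBoom, so bursts of log calls don't block the event loop
// the way synchronous console.log writes to a terminal can.
const logger = pino({ level: 'info' });

logger.info({ route: '/users', method: 'GET' }, 'request received');
logger.error(new Error('boom'), 'request failed');
```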
Yes. A lot of people ask me, "Why don't the Node.js folks simply rewrite the HTTP module to make it similar to Fastify, and as fast as Fastify?" There are reasons. The first one is maintainers. Fastify has a huge team maintaining it. And if you move things into Node.js, you should expect fewer contributors working on a specific module, because it's way harder to understand Node.js source code than Fastify's. The second one is that we can't break the world. Node.js is used by tons of devices, and even releasing things in a major version, where breakage is expected, we can't touch the HTTP module very hard, because that would migrate people away from Node.js. If you look at the recent Node.js downloads dashboard, you see that a lot of people are stuck on Node.js 12 just because of breaking changes. They don't know how to migrate; it's very hard for them to migrate. And breaking changes are a big part of why they stick to an end-of-life version. This is a problem, because when you stay on an end-of-life version of Node.js, you are not safe. There are plenty of vulnerabilities that affect you. I have delivered a talk called "Five Ways You Could Have Hacked Node.js," where I show some vulnerabilities that Node.js has fixed. If you are using an outdated version of Node.js, possibly this will happen to you. And that's it. I mean, we can make some improvements to the HTTP module, but we can't change it too much. How it was designed is legacy. First, it's very hard to change. And second, changing it means you are prone to breaking a lot of people.

[0:21:41] JG: You have two conflicting desires and needs here. One is you want to keep things stable, because breaking changes hurt the community's ability to migrate. But you also want to have modern APIs and improvements. How do you strike that balance? How do you know what's a valid or correct breaking change to make?

[0:21:56] RG: Most new features are behind opt-in flags. For instance, we brought in the permission model. The permission model is a feature where, when you enable it, Node.js will restrict access to the file system, to the network, to the V8 inspector protocol, to child processes, to worker threads. All of these will be restricted whenever you pass --permission when you run Node.js. This is a security measure. And people ask me, "Why not enable it by default?" The reason is that, first, it would break all the scripts out there. People would never be able to migrate. People do not read changelogs. They just don't want to read them, or, for some of them, it's hard. Some of them use Node.js behind the scenes. For instance, if you are using Next.js, you are using Node.js behind the scenes, and in some situations you don't even have a direct Node.js command - it's behind a next command or something like that. This would break frameworks, and frameworks would need to upgrade, and then users would need to upgrade their framework versions to fix it. It creates a kind of chain of breakage that reaches all the way to the end users, and it would be very difficult for them to migrate. Frameworks would be mad at us; users would be mad at us. And in most cases, if we enabled the permission model by default, a lot of people would just pass a flag to disable it, because they don't care about it. So we decided, "Okay, if people care about security, they will opt in."
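A sketch of that opt-in flow, using the permission model flags as they exist in recent Node.js versions (older releases spelled the flag --experimental-permission; the file paths here are made up):

```js
// app.js - run as:
//   node --permission --allow-fs-read=/app/data app.js
const fs = require('node:fs');

// Allowed: /app/data was granted at startup.
console.log(fs.readFileSync('/app/data/config.json', 'utf8'));

// Grants can also be checked at runtime:
console.log(process.permission.has('fs.read', '/etc/passwd')); // false

// Anything outside the grant throws ERR_ACCESS_DENIED:
fs.readFileSync('/etc/passwd');
```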
It would be much easier if we rewrote Node.js from scratch. Let's say we were releasing a Node.js 2.0. Possibly those features would be enabled by default, because we would teach people, "This is how this Node.js was designed. If you want to use Node.js, that's how you go." But since Node.js is still the same product, people just use nvm to upgrade. They don't expect a new design, a new structure. They don't want to learn Node.js again. So we can't do that.

[0:24:04] JG: Yeah. What's the latest major version at the time of recording - 24.10? We're not quite at the 2.0 level anymore.

[0:24:13] RG: Yes, that's correct. I'm working on Node.js 25.0, which will go out in two days.

[0:24:19] JG: Oh. Well, that's exciting. What are the major changes or features for 25?

[0:24:23] RG: One important thing about SemVer releases: a lot of people believe that SemVer majors are the most exciting releases of packages, of runtimes, of platforms like Node.js. But it turns out they're very, very boring. The exciting features go into SemVer minor releases. Performance improvements, most of the time, are categorized as SemVer minor or SemVer patch, which means it's very likely that Node.js 24.10 will be way more exciting than 25.0, because 25.0 mostly contains breaking changes. For Node.js 25.0, we will have the V8 upgrade to version 14.1, which brings the major JSON.stringify performance improvement. If you look at v8.dev, they have a blog post about how they made JSON.stringify fast, and this version of Node.js brings the V8 version that includes that improvement. We have the new built-in Uint8Array methods for base64 and hexadecimal conversion, plus WebAssembly and JIT pipeline optimizations. We are also adding a new feature to the permission model, which is --allow-net: the network will be restricted whenever you use the permission model on Node.js 25. But other than that, we don't have a lot of features coming in. Most of the changes are breaking changes, or considered breaking changes. If you are using the permission model on Node.js 24, it's likely you'll be affected on Node.js 25 by this new network restriction. We are unflagging the experimental web storage - Web Storage will be enabled by default in Node.js. And we are deprecating and removing a lot of APIs. APIs that were runtime-deprecated have been removed in Node.js 25, and some others become runtime-deprecated in Node.js 25. If, whenever you run Node.js, some API emits a warning in your console that it will be removed, Node.js 25 will possibly break you. Upgrade that.

[0:26:39] JG: That is exciting, removal of dead code. Does that impact Node performance, download size, and so on at all?

[0:26:44] RG: Yes. I mean, you remove code that needs maintenance. For us maintainers, it's very good to remove code.
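One of the Node.js 25 features RG lists, the built-in Uint8Array base64 and hex conversion, comes from a TC39 proposal implemented in V8; a sketch of how it looks (exact availability depends on the V8 version your Node.js ships):

```js
// Encode and decode without Buffer or third-party helpers.
const bytes = new Uint8Array([72, 101, 108, 108, 111]);

console.log(bytes.toBase64()); // 'SGVsbG8='
console.log(bytes.toHex());    // '48656c6c6f'

const decoded = Uint8Array.fromBase64('SGVsbG8=');
console.log(new TextDecoder().decode(decoded)); // 'Hello'
```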
[0:26:51] JG: I want to take us a bit back to the second of two scenarios. We talked about a performance optimization attempt that ended up, well, not being optimal. Is there an upcoming or recent change you've made that you're excited about for performance, that did work out?

[0:27:05] RG: There's one whose pull request is still open. I wrote it with Robert. It will bring a significant performance improvement to the HTTP module of Node.js. We talked about breaking changes, and that's why I remembered it: it's an opt-in feature. We are releasing it that way so we don't break people. It's a new option. Whenever you create an HTTP server, there's an option to optimize empty requests. Basically, if you are relying on the REST pattern - GET requests should not have a body, HEAD requests should not have a body - and you enable that option, your HTTP server will be faster, because we will clean up the stream faster. We will dump the stream. However, there are people who still read bodies from GET and HEAD requests. If we enabled that by default, we would break those people. Okay? I believe we will have that option, and we will have some discussions about enabling it by default, but that's a separate discussion. For now, let's bring people the option. Frameworks like Fastify or Express can enable it if they want to, so they will be the gate. If Express enables it by default and people are not mad about it, it means Node.js is likely to enable it by default too. That's the idea behind this pull request.

[0:28:31] JG: Before we move on, just looking at pull request 59778, optimize empty requests - you've got a screenshot in the description showing requests per second, and it looks like you're going, in one of the cases, from 32,000 to 69,500. That seems like a significant increase in throughput.

[0:28:51] RG: Yes. We ran that benchmark for 30 seconds on a dedicated machine. We went from, let's say, 32,000 requests per second at the first percentile without this option, to 69,000 requests per second at the first percentile with it. So we are more than doubling the throughput of HTTP. But this is specific to GET and HEAD requests.

[0:29:20] JG: It's pretty incredible. You can see how the decision is not easy to make, of whether and when you might want to make the breaking change and turn this on by default. If you're doubling the requests per second, but also breaking so many people - surely, that's got to be a long, slow, deliberative process to turn on.

[0:29:35] RG: That's correct, yes. And I just saw that we need to rerun the CI and land it. This is one of the problems. We run so many CIs, in different environments, because Node.js runs everywhere. We have SmartOS, we have macOS, we have Windows on different platforms. We have Linux on different platforms. And we need the CI to be green on all of those. I believe we have more than 60,000 assertions in the Node.js tests. It takes a significant amount of time. Getting a green CI usually takes, say, six hours. If you hit a flaky test, you need to rerun it. So it takes a while.

[0:30:20] JG: Well, okay. You were about to look up some more exciting improvements that you're looking forward to in the next versions of Node.

[0:30:26] RG: There are some improvements we made to internal Node.js code that don't translate to an API where people will experience them in real-world applications. There are improvements to assert.partialDeepStrictEqual. There are improvements to how Node.js is benchmarked. For instance, people always ask me, "If Node.js has so many benchmark files - we have benchmarks for every feature we release - why do regressions still exist? Why don't you run benchmarks before or after every release of Node.js to find regressions?" The reason is that, nowadays, running a full Node.js benchmark CI takes 84 hours. Right around that.

[0:31:26] JG: 84 hours. Wow.

[0:31:27] RG: The reason is that, because measuring JavaScript code is tricky, we can't run a benchmark just once.
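RG describes Node's methodology next; as a rough sketch of the statistical idea - repeated runs compared with a t-test, rather than a single before/after number - consider the following (the sample data is synthetic):

```js
// Decide whether an ops/sec difference is signal or noise.
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
const variance = (xs) => {
  const m = mean(xs);
  return xs.reduce((a, b) => a + (b - m) ** 2, 0) / (xs.length - 1);
};

// Welch's t-statistic for two independent samples.
const tStatistic = (a, b) =>
  (mean(a) - mean(b)) /
  Math.sqrt(variance(a) / a.length + variance(b) / b.length);

// 30 synthetic ops/sec measurements before and after a change.
const before = Array.from({ length: 30 }, () => 50000 + Math.random() * 2000);
const after = Array.from({ length: 30 }, () => 51000 + Math.random() * 2000);

// With ~58 degrees of freedom, |t| above roughly 2.0 corresponds to
// p < 0.05 - only then treat the difference as a real change.
const t = tStatistic(before, after);
console.log(Math.abs(t) > 2.0 ? 'statistically significant' : 'probably noise');
```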
We rely on null hypothesis testing, the Student's t-test approach: we run each benchmark, for each configuration, 30 times before the change and 30 times after the change. Imagine the situation where, in our benchmarks, we have configurations. Let's say we want to create a benchmark file for util.inspect or console.log. We create a configuration where we call console.log with a long string, with a short string, with an integer, with an object, with other options, like a large object or a small object. All of those are configurations. We run the benchmark 30 times for each configuration before the change - 30 times for integers, 30 times for short strings, 30 times for long strings - and then we run it all 30 times again after the change. Then we produce a statistical analysis that accounts for the variance, to tell whether the performance change is statistically significant or not. Because sometimes you run the benchmark before the change just once, and after the change just once, and then you compare: "Okay, I got more requests per second, or more operations per second, so my code is faster." But that's not the reality. Your machine is doing a lot of things in the background. If you are on a Zoom call while you are running a benchmark, it's very likely that your runs will have a lot of variance, and that will impact the result. The operations-per-second number you get at the end isn't true; it doesn't show the reality. Brendan Gregg once did an experiment, in 2008 I believe, where he shouted at a disk array in a data center and showed that slow disk I/O operations spiked just because he stood in front of the rack and shouted. That shows how much variance the environment can introduce. So you calculate whether your result is statistically significant: if the p-value is below 0.05, the two sets of runs don't belong to the same group, which means your performance improvement is valid, or your regression is valid. Otherwise, the difference is just within the standard deviation. That's why people normally plot the data as a normal distribution, so they can see that. And that's why it takes so long to run benchmarks. But we are improving it. We are adding more machines. We are sharding the runs. But it will take a while.

[0:34:21] JG: That's an incredibly statistical, mathematical way to look at this. That must have taken a long time to get to.

[0:34:27] RG: I love that part, to be honest. Most of my research and my studies are about that. If you look at my blog posts, at my blog in general, I talk a lot about how to prove whether your benchmark is valid or not, and how not to get tricked. I've written talks about that. There's a talk I created with help from my team called "Lies, Damn Lies, and Benchmarks," which basically shows you why, most of the time, your benchmark has been lying to you. And this points back to when I was talking about HTTP load tools. I said there's no difference between ApacheBench, or wrk, or autocannon - but the reality is, there is. There's a problem with HTTP load tools called coordinated omission. There's an HTTP load tool called wrk, and Gil Tene created a second version, a fork of it, wrk2, which fixes coordinated omission. Coordinated omission is simple to explain. Imagine the tool sends a request to your server. Only once the server replies does it send the next request, and that's how it measures latency.
But this doesn't reflect reality. In the real world, you should send the next request after a fixed interval, whether or not the previous response has arrived, because that's how real-world clients behave. Then the second request records the long latency it actually experienced while the server was still busy with the first one. There's a long explanation in the wrk2 repository. I strongly suggest you check it out - it's valid, and it points to a very good solution. So that's what I normally use for HTTP benchmarks in Node.js.

[0:36:18] JG: Let's talk about your blog a little bit. You're the author of a series, "The State of Node.js Performance." What is "The State of Node.js Performance"? What does that series cover?

[0:36:27] RG: It started in 2023. I've been doing Node.js major releases for quite a while - since Node.js 18, or Node.js 17, I don't remember. I did all the Node.js major releases: 17, 18, 20, 21, 22, 23, 24, and, in two days, 25, hopefully. And I always wondered, "Okay, we are releasing a new major version of Node.js. But what does this mean for performance enthusiasts? Does it mean that if I migrate from Node.js 18 to Node.js 20, my code will run faster?" A lot of people don't know, and they create issues on the Node.js core repo: "After migrating from Node.js 16 to Node.js 18, my application is behaving worse." And because benchmarks take a while, we don't run them very often; we don't have the machines or the capacity. Node.js core developers had to investigate those reports. They need to replicate the environment of the issue - some are only reproducible on Windows, some only on macOS. It was very hard. So I decided, "Okay, after releasing Node.js 20, I want to know exactly which workloads or which APIs will be affected, either by a regression or by a performance improvement." We can't do that as part of the Node.js release process, because it takes a significant amount of time. For the 2023 edition, I did it on my own time. For 2024, I got sponsored by the company I work for, NodeSource. They gave me dedicated machine time so I could do it. And I'm planning to do the 2025 edition. In the report, I used three versions of Node.js. For the 2024 edition, I used Node.js 20.17.0 and 22.9.0, also comparing against Node.js 18. I split the benchmark setup into the Node.js internal benchmark suite, which takes 84 hours; nodejs-bench-operations, a repository I maintain where I monitor small operations - like converting strings with parseInt versus the plus sign: which is faster in this specific environment? - and specific APIs. For instance, an HTTP server with Fastify and Express: did it get worse or better with the new version? Because in every major version of Node.js, we upgrade V8, and by upgrading V8, we never know exactly which APIs will be affected, because it's JavaScript in the end. If they change something in the JIT compilation, it might affect the whole ecosystem, or it might affect just specific operations in Node.js. For instance, one V8 update brought parseInt performance up to match the plus sign. The report is a more comprehensive study of how the Node.js benchmarks are evaluated, how we guarantee we are not measuring no-ops, and how the release will affect you. For instance, upgrading from Node.js 18 to Node.js 20 - or better, from Node.js 16 to Node.js 20 - if you were relying on Node.js crypto operations, your code got much slower.
That was because of OpenSSL. We tracked it down through the Node.js dependencies - Node.js depends on libuv, on OpenSSL, on many other dependencies; just print process.versions and you will see all of them - and it got worse because of OpenSSL. Now it's fixed, because we have upgraded Node.js to OpenSSL version 3.5, if I'm not wrong. The report contains all the information and all the resources you need before upgrading, or even to diagnose your Grafana dashboards. Because when people upgrade a Node.js version, they will see, "Okay, my app is now consuming 25% more memory, but it's using 20% less CPU. Is that better?" In the report, there's normally an explanation of why that is good or why that is bad.

[0:41:04] JG: Do you find that people understand the metrics you put forth? Are there ways to convey their nuances and complexities that the common Node user would understand?

[0:41:14] RG: In the reports, I normally use percentages, so it's easy for them to see, "Okay, this API is 15% faster, or 15% slower." And I always create two kinds of sections: one for people who just want the value, and one for people who want to understand how the benchmark was executed.

[0:41:36] JG: That makes sense. Before we move on to the last area of technical topics, I do want to ask you about your company, NodeSource. Your employer - they sponsor this work. What is it that NodeSource does?

[0:41:45] RG: NodeSource is a consulting company, but they also create products. They have an APM which measures the performance, all the metrics, of your Node.js application without sacrificing performance. It has always been very weird to me, as a performance enthusiast, that if I deploy my application to production and want to monitor it - measure the performance of my app - the monitoring itself can reduce the performance of my app by 30%. It's weird. It's just bad. That's why N|Solid, a product from NodeSource, exists. It's an APM. Basically, we have forked Node.js - we still rebase on top of Node.js - but we have added changes that make the APM fast. So whether you run without an APM or with N|Solid, the performance will be the same; we have no direct impact on your app. You're able to see in the console CPU utilization, event loop utilization, memory utilization, and even create CPU and heap profiles without losing performance. You can do all of that in production. That's why N|Solid exists, and that's why I like it.

[0:43:03] JG: Is this something that could be upstreamed to Node as an option?

[0:43:06] RG: Yes, we upstream a lot to Node.js, but we also understand that the process of moving things into Node.js is slightly slower, because it relies on consensus from all members to make some changes. And some changes are business-oriented enough that they don't go into Node.js but into a product like N|Solid. But yeah, we do upstream a lot of patches.

[0:43:36] JG: Great. I love to hear that. It's good to see people and companies in the community that not just work around Node, but on Node itself, to make it a better product.

[0:43:45] RG: Yeah.

[0:43:46] JG: Let's talk at the end of this section about userland libraries. Let's say that I'm writing an app, and I'm using Pino for logging and N|Solid as an APM. I'm using Fastify rather than, say, Express or core HTTP. What are some of the tips and tricks you would give me to make sure my app is fast?
[0:44:05] RG: It really depends on the SLO, on the expected requests per second you want. I know a lot of people will say, "I want my app to be as fast as possible." But some don't need to spend as much time on fine-tuning as other companies do. I would say: choose a good HTTP framework like Fastify, and make good use of it. Because it's not just using Fastify; it's creating Fastify routes with the expected request and response schemas, so Fastify can optimize them. It's structuring the project well. Most of the bottlenecks are in the business code, right? If you receive a request, you need to connect to the database, so make sure you have a pool of connections. Measure your code using --trace-opt and --trace-deopt to check whether you are making use of V8 optimizations. Some people don't understand how hidden classes in V8 make your code faster. For instance, they have an object and want to delete one property from it, and they use the delete keyword. What they don't know is that whenever you use the delete keyword, the hidden class goes away, and that object becomes way slower than it should be. So instead of using delete, assign undefined to the property, so you don't lose the hidden class. Create objects with the correct, expected shape. All of this you can learn by using --trace-deopt: if you get too many deoptimizations in your code, something is wrong. Make use of a good logging library. Whenever you call an API a lot, you need to make sure that API is as fast as possible; Pino is a good logging library. When you create a database connection, inspect the network. Check whether you are creating too many sockets. Check that connections are kept in the pool, so you don't create a new database connection every time you send a query. These are more generic performance tips than Node.js tips, I would say. Measure the errors on your machine: check the events, check the syscalls. And check that you are using a safe version of Node.js, please. There is a package I wrote called is-my-node-vulnerable. Just run npx is-my-node-vulnerable in your terminal, and it will tell you whether you are using a safe version of Node.js or not. I think upgrading dependencies is very important, and I always suggest using as few dependencies as possible. For instance, if you use util.styleText, it replaces some usage of chalk. Node.js now has TypeScript support. Node.js has SQLite. We have built-in modules, and I strongly suggest you use them, because we can make them faster.

[0:47:21] JG: What are your thoughts on lock files, in applications and in libraries?

[0:47:25] RG: I haven't worked on business products for quite a long time; I've been working on open source and libraries for the last five years, so I don't have a strong opinion on that part. I know, as a maintainer, that we normally don't use lock files. We don't release lock files, so people can upgrade the dependencies. At least that's how we handle it on Fastify, and I believe Express as well. But I don't have a strong opinion.

[0:47:59] JG: What a rare and beautiful statement from someone who works deeply in tech: "I don't have a strong opinion."

[0:48:05] RG: Yeah.
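A minimal illustration of the delete-versus-undefined tip RG gave above (the property names are made up, and, as he says, always measure on your own workload):

```js
// Objects created with the same shape share a V8 hidden class,
// which keeps property access on the optimized fast path.
function makeUser(name) {
  return { name, sessionToken: 'abc123' };
}

const fast = makeUser('rafael');
// Keeps the hidden class: the property still exists, its value is cleared.
fast.sessionToken = undefined;

const slow = makeUser('josh');
// Changes the object's shape; V8 may drop this object into a slower
// dictionary-mode representation.
delete slow.sessionToken;

// V8 flags work directly on the node CLI, so you can watch for
// deoptimizations with: node --trace-deopt app.js
```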
[0:48:05] JG: All right, we're heading towards the end of the interview. Are there any topics you wanted to bring up when it comes to Node, or security, or performance?

[0:48:13] RG: Well, if you are interested in Node.js core and you want to make your first contribution, I've been doing live streams on Twitch, YouTube, and also X, teaching people how to contribute to Node.js. I've been planning a live stream during the Node.js 25 release to show people how I release a major version of Node.js. I normally announce the live streams on the Node.js mentoring channel, either on the OpenJS Foundation Slack or on the Node.js Discord server.

[0:48:46] JG: Have you found that a lot of people have started contributing to Node through these live streams?

[0:48:51] RG: Well, I know at least 10 people who got their first contribution to Node.js with some guidance from the live streams. That's a very good number for me.

[0:49:00] JG: In a sense, you've become a 10x developer.

[0:49:04] RG: Yeah, kind of.

[0:49:05] JG: Let's say that I'm somewhat interested. I've never contributed to a major open source project before. I would love to do it, but I don't really know what that entails, or what the benefits and drawbacks are. How would you pitch this to me?

[0:49:16] RG: I would say that my career would be way more boring if I didn't contribute to open source, because it opened up so many challenges for me, and I learned a lot. I'm pretty sure that if I hadn't started contributing to Node.js, I would have stopped my C++ studies. I would have stopped a lot of the performance research I did. It also helped me find good offers - I have received many, many offers because of my open source work. And I could travel the world. I'm from Brazil, and for me, going to Europe is like a 12-hour flight; it's very far. I have been to China delivering talks. I have been all over Europe delivering talks. None of this would have been possible without my contributions to open source.

[0:50:11] JG: That's lovely. You've become a member of a global community of people.

[0:50:15] RG: Yes. I mean, I know more people by their GitHub handles than in real life.

[0:50:22] JG: That's lovely. To close out the interview, I like to ask something explicitly non-tech-related, and that's a great transition. Let's talk about friendships and sports. You used to play tennis, and you've recently switched to what you've kindly, behind the scenes, been calling soccer, and what others may call football. Can you tell us about that?

[0:50:38] RG: I recorded some fireside chats at NodeConf, I don't know, two or three years ago, where I told people I had started playing tennis, because finding people, finding friends, to play soccer is way more difficult. For tennis, I just need to find one person; for soccer, I need to find a whole team. Now, after three years, I've found a good number of people, and I'm playing more soccer than tennis. I've been playing soccer, I don't know, three times per week, as a good Brazilian. We love soccer. It's in my blood.

[0:51:17] JG: Do you root for any clubs or players still?

[0:51:19] RG: I support Santos. It's Neymar's team in Brazil. But, I mean, we are not good now. Historically, we are very good, to be honest, but we are not good now. I also really like Manchester United, but it's in the same situation as Santos.

[0:51:39] JG: A person must have hope in this one. Well, great. Rafael, you've talked about a lot of incredibly exciting and interesting things.
We talked about security in Node. We talked about performance - how Node's internal performance works, how it's improved, and how it's benchmarked and optimized. We talked about tips for userland libraries. And also, of course, your regular streams, where one can learn how to contribute themselves. If we wanted to learn more, if we wanted to reach out to you on the internet, where would you direct us?

[0:52:05] RG: Join the Node.js mentoring channel. I'm mostly available on X and also on Slack. If you want, send me a message on Slack or Discord, and I'm happy to guide you to your first contribution. Also, on X, my DMs are open, and I'm happy to help.

[0:52:23] JG: For Software Engineering Daily, this has been Rafael Gonzaga and Josh Goldberg. Thank you for listening, everyone. Have a great day.

[END]