The lifecycle of browser code
When would be a good time to run bytecode? As Mathias Bynens (developer advocate for Google’s V8) points out in this episode, bytecode is great for running one-off code. Sequences like initialization and setup are a great fit for this step because you can generate the bytecode, run it once, and never have to deal with it again. But what if you do have to keep using this code?
This is where the optimization step comes in. As the interpreter translates the AST into bytecode, it keeps track of the various idiosyncrasies of your code. Are you calling the same function over and over again with numbers? Code like this is called hot code, and these scenarios are a good opportunity for the engine to translate that code even further into highly optimized machine code, which targets the CPU’s own instruction set and can therefore be executed very quickly. This final step is performed by the engine’s optimizing compiler, and the resulting machine code runs directly on the CPU.
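As a hedged illustration (the function and loop below are hypothetical, not from the episode), hot code can be as simple as a small function invoked many times with the same argument types:

```js
// Hypothetical example of "hot" code: a tiny function called many times
// with the same argument types (two numbers).
function add(a, b) {
  return a + b;
}

let total = 0;
for (let i = 0; i < 100000; i++) {
  // The engine's profiler observes that `add` keeps receiving numbers,
  // which makes it a candidate for machine code specialized for
  // number addition.
  total = add(total, i);
}
```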
One final thing to note: all code has to run as machine code eventually. So even though the interpreter runs bytecode while the compiler creates optimized machine code for the CPU, the interpreter’s bytecode still ultimately executes as machine code. The difference is in how much control the engine exerts: bytecode on the virtual machine is run however the VM sees fit, whereas the optimizing compiler’s machine code is crafted very carefully to run only the exact instructions the CPU requires. Hence this optional third step: if we see patterns worth optimizing, we want to control how that machine code runs. If the code doesn’t warrant optimization, we’re happy to let the virtual machine execute it as it sees fit.
Differences between engines
Considerations across devices and environments
The biggest reason against forking is maintainability. While it would be great to create application-specific engines that are highly optimized for specific machine interactions, the difficulty becomes maintaining an ever-growing, fragmented set of engines that all conform to the same ECMA-262 standard. Intuitively, it seems much easier to just update V8 when ES6 added arrow functions than to have to update V8 for Chrome, V8 for Node, and so on.
Before we get started, I want to mention that we are looking for writers. We are making a big push towards written content on the site. So you can apply at softwareengineeringdaily.com/jobs. We’re also looking for podcasters potentially. We have really high standards for podcasters, but we also have a job posting there. So please do apply.
[00:01:51] JM: Your audience is most likely global. Your customers are everywhere. They’re in different countries speaking different languages. For your product or service to reach these new markets, you’ll need a reliable solution to localize your digital content quickly. Transifex is a SaaS-based localization and translation platform that easily integrates with your Agile development process.
Your software, your websites, your games, apps, video subtitles and more can all be translated with Transifex. You can use Transifex with in-house translation teams, language service providers. You can even crowd source your translations. If you’re a developer who is ready to reach a global audience, check out Transifex. You can visit transifex.com/sedaily and sign up for a free 15-day trial.
With Transifex, source content and translations are automatically synced to a global content repository that’s accessible at any time. Translators work on live content within the development cycle, eliminating the need for freezes or batched translations. Whether you are translating a website, a game, a mobile app or even video subtitles, Transifex gives developers the powerful tools needed to manage the software localization process.
Sign up for a free 15-day trial and support Software Engineering Daily by going to transifex.com/sedaily. That’s transifex.com/sedaily.
[00:03:39] JM: Mathias Bynens, you are a software engineer at Google working on developer advocacy on the V8 team. Welcome to Software Engineering Daily.
[00:03:48] MB: Yeah, thanks for having me. It’s great to be here.
But that’s not the end of the story, because sometimes code gets repeated quite often or we see that the same code paths get hit over and over and over again, and at that point it might make sense to start optimizing those specific functions, right?
On an average webpage, for example, you will have lots of code that is used to initialize the page, and you only really want to run that once. In that case, it makes sense to run it in the interpreter. We produce bytecode, which can happen pretty quickly. We can produce bytecode quite fast, and then we only have to run it once. So we just run it in the interpreter, and that’s it. We’re done with it. That’s the main benefit of the interpreter. It can actually get you running quite quickly. It is very quick at producing some runnable code.
For some code, that’s where it ends. It never leaves the interpreter. It only runs once and that’s fine. But for those specific functions that we detect are hot, we take that bytecode that we have from the interpreter, together with the profiling data that we collected while we were running the code – like we saw, “Oh! This function keeps getting called with two number arguments.” So we’re going to optimize specifically for that case. Then the optimizing compiler produces highly optimized machine code.
But once we have it, we can run that optimized machine code very efficiently. You can see there’s a bit of a tradeoff there: you have to be very careful in selecting what code and what functions you optimize, because if you optimize the wrong functions, or if you try to optimize everything, then nothing will be fast, because it takes a long time to produce that code.
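To make the tradeoff concrete, here is a minimal sketch (the function names and page setup are hypothetical, not from the episode): run-once setup code is cheapest left as interpreter bytecode, while a function called on every frame is worth the cost of optimizing.

```js
// Hypothetical sketch of the tradeoff described above.

// Runs exactly once at startup: spending time and memory producing
// optimized machine code for this would be wasted, so interpreter
// bytecode is the right fit.
function setupPage() {
  document.title = 'Ready';
}
setupPage();

// Runs roughly 60 times per second: the upfront cost of optimization
// is repaid across thousands of calls.
function computeFrame(t) {
  return Math.sin(t) * 100;
}

let t = 0;
setInterval(() => {
  computeFrame(t += 0.016);
}, 16);
```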
[00:10:33] JM: This is ahead of time compilation versus just in time compilation.
[00:10:38] MB: Yeah, pretty much. Yeah. With the interpreter, you produce bytecode right away as soon as you get to the abstract syntax tree. Then there’s the optimizing compiler, and that’s what’s called just-in-time compilation, so JIT.
[00:10:50] JM: Can you zoom in – because something that’s been confusing me for a while is – when the interpreter is going to run bytecode, what’s going on there?
So I understand that in the path where the code is hot, it goes through the optimizing compiler, and the optimizing compiler takes the bytecode and turns it into machine code. The interpreter, doesn’t that also have to turn bytecode into machine code, but I guess it’s just less optimized?
[00:11:57] JM: Yeah.
[00:11:58] MB: Does that make sense?
[00:12:00] JM: It does. I guess the thing that I keep wondering about – and maybe I’m just not well-educated on this, and maybe you don’t know in detail what goes on here – but I’m really curious about the transition from that bytecode to machine code. What kind of application is responsible for making that execution happen? What does that process look like?
[00:12:23] MB: Okay. The interpreter in V8 is called Ignition, and that’s what produces the bytecode, and the bytecode then runs in this interpreter. The optimizing compiler I haven’t mentioned yet. Its name is TurboFan, for what we have in V8. That’s the name of the optimizing compiler.
So the transition goes – Wow! This is difficult to explain without being able to visualize it, I guess.
[00:12:45] JM: We can move beyond it. I’m probably nit-picking details. I guess if people want to know the process of bytecode turning into machine code, they can look it up on the internet.
But talking more abstractly, I think there are a lot of people out there who have heard this term “virtual machine” –
[00:14:00] JM: There are different virtual machines for different browsers. So Google has a virtual machine called V8.
But running the code takes a little bit longer. It’s not super-efficient at running the code.
But optimizing compilers produce this highly optimized machine code. But if you count the number of instructions, you’ll see that it’s a lot larger, by a factor of, let’s say, 8, compared to the bytecode. So memory-wise, the memory footprint of this optimized machine code is a lot larger than it is for bytecode that you can just run in the interpreter. That’s another tradeoff. Running code in the interpreter is more memory-efficient, but producing this optimized machine code requires more memory.
[00:16:23] JM: What are some of those examples? What would be something that maybe Google would subjectively say, “Oh, we want to optimize for this kind of use case” – maybe, I don’t know, ads or something, or something related to search – “We want to be compiling these kinds of hot code paths more aggressively,” while Apple might say, “Oh, actually, we don’t care about those as much, and maybe we want to compile something related to privacy more aggressively,” or is it not that abstract? At what level are these subjective decisions taking place? Does it have a user experience component?
In fact, last year, we made Node.js a first-class citizen in our testing infrastructure alongside Chrome. Node is very important for us, and we do everything we can to support the Node project when it comes to what we change in V8. For example, we can’t land new changes in V8 if doing so would break a Node.js test. We run the entire Node.js test suite for every commit that we land. If it breaks anything, then we can’t even land it.
Of course, that also determines kind of the architectural decisions that we make when we design things or we change components in V8, because we don’t just want to support the web. We also want to support all these other embedders, including Node.js. As you can imagine, the code that one can write for a website or for a long-running Node.js server can be quite different and can have very different characteristics.
Of course, the more optimization tiers you add to your codebase, that’s a tradeoff you have to make in terms of code complexity and maintenance, and the maintenance cost goes up, of course, the more code that you produce and the more code that you add. There are also some benefits, because you have more fine-grained control over how much time you want to spend generating optimized code and how optimized this code should be.
[00:19:27] JM: For all of the advances in data science and machine learning over the last few years, most teams are still stuck trying to deploy their machine learning models manually. That is tedious. It’s resource-intensive and it often ends in various forms of failure. The Algorithmia AI layer deploys your models automatically in minutes, empowering data scientists and machine learning engineers to productionize their work with ease. Algorithmia’s AI tooling optimizes hardware usage and GPU acceleration and works with all popular languages and frameworks.
Deploy ML models the smart way. Head to algorithmia.com to get started and upload your pre-trained models. If you use the code SWEDaily, they will give you 50,000 credits free, which is pretty sweet. That’s code SWEDaily.
The expert engineers at Algorithmia are always available to help your team successfully deploy your models to production with the AI layer tooling, and you can also listen to a couple of episodes I’ve done with the CEO of Algorithmia. If you want to hear more about their pretty sweet set of software platforms, go to algorithmia.com and use the code SWEDaily to try it out or, of course, listen to those episodes.
[00:21:18] MB: Yeah. So if you look at V8 from 10 years ago, when the project was first open-sourced, it had a completely different pipeline than what we’re looking at today. In fact, we only just launched Ignition and TurboFan, our new interpreter and our new optimizing compiler, last year, in Chrome 59 I believe it was. Of course, when we were working on this new pipeline, Node was already a thing. We already knew we wanted to support Node more actively. Yeah, that went into the design of Ignition and TurboFan.
[00:21:47] JM: Okay. Well, let’s talk a little bit more about the specific optimizations that a JavaScript engine like V8 performs.
This means that even if you have a thousand or a million of these objects with the same X and Y shape, we don’t have to store that shape a million times. We can store it just once, and all of these objects then point to that same shape. Hopefully you can see that this actually saves us a lot of memory, because we only have to store this once. That’s the first optimization that shapes enable.
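A minimal sketch of what this looks like from the JavaScript side (the objects below are hypothetical, not from the episode):

```js
// Both objects have the same shape: properties `x` and `y`, added in
// the same order. An engine like V8 can store that shape metadata once.
const p1 = { x: 1, y: 2 };
const p2 = { x: 3, y: 4 };

// Even a million such objects share the single shape record; only the
// property values are stored per object.
const points = [];
for (let i = 0; i < 1000000; i++) {
  points.push({ x: i, y: i * 2 });
}
```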
[00:24:59] JM: So how might those shapes vary in different situations, and how might that lead to some savings? Could you rephrase it and emphasize why that would lead to savings?
[00:25:32] JM: Sure. Yeah, please.
[00:25:33] MB: There are some visualizations that might make this easier to understand. Basically, let’s say you have a function that loads a property off of an object. Maybe the function is called getX, and all it does is take an object as its argument and return object.x, right?
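Written out, the function he describes is just a property load, and it is the kind of call site where an inline cache can remember where `x` lives for a given shape (the call sites below are hypothetical):

```js
// The property-load function described above.
function getX(object) {
  return object.x;
}

// If every call passes objects of the same shape, the inline cache at
// this call site can remember the offset of `x` for that shape and skip
// the full property lookup on later calls.
getX({ x: 1, y: 2 }); // first call: look up `x`, record shape and offset
getX({ x: 7, y: 8 }); // same shape: reuse the cached offset (fast path)
```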
[00:27:29] JM: Are there some developments or improvements underway right now around the area of code shapes and inline caching?
In V8, we have some special mechanisms in place, which have also been in place for a long time. They’re part of the fundamentals of the V8 runtime. We call them elements kinds. It’s kind of a weird name. Basically, whenever you create an array in your codebase, V8 keeps track of the kinds of values that the array contains and it tries to give the array a label, or a tag if you will, that says, “Okay, this array contains only integers,” for example, “or this array contains doubles,” which is a more generic collection.
Another thing would be if the array contains not just numbers, but other values as well, like objects or maybe undefined, or strings; then we would call those just regular elements. We have all these different elements kinds available, and whenever people perform any kind of operation on an array – it could be looping over the array with forEach, or it could be using map, or reduce, or filter, or any of that stuff – V8 can look at the elements kind of that particular array and use that to optimize specifically for that elements kind.
That’s why if you have an array that only consists of numbers that, behind the scenes, we know are doubles, or even better, if behind the scenes we know that they’re all small integers, or SMIs as we call them, then we can produce optimized machine code specifically for that case, specifically knowing that we only have to deal with small integers.
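A hedged sketch of how these elements kinds transition (the values are hypothetical; the comments paraphrase V8’s terminology):

```js
// V8 tags each array with an "elements kind" and only ever transitions
// toward a more generic kind, never back.
const a = [1, 2, 3];   // all small integers (SMIs)
a.push(4.5);           // now doubles: a more generic kind
a.push('hello');       // now regular elements: the most generic kind here

// Keeping an array homogeneous lets operations like map run machine
// code specialized for that elements kind.
const smis = [1, 2, 3, 4];
const doubled = smis.map((n) => n * 2); // can stay on the integer fast path
```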
[00:31:06] MB: I can tell you that when it comes to – You want to talk about garbage collection?
[00:31:09] JM: We could talk about garbage collection. I think I’m talking about garbage collection. I’m also talking about just the quantity of memory, whatever memory is – because I feel like my browser – so, my browser: if I open enough tabs – Gmail, and Google Docs, and Zencastr – these things consume a lot of memory. At a certain point, my browser reaches a point where it starts to slow down, and my overall system usage goes to maybe 40% or 50%, which is high. If I look at the memory pressure, then I start to see like 40%, 50%, and things start to feel slow on my computer.
But it never gets to the point where my browser is eating up so much memory that my computer slows to a crawl. I guess that was too big of a question. But I’m really trying to understand how memory is managed, both within a browser and between a browser and an operating system.
Even if you build some kind of web app experience that has good performance, every now and then you could get unlucky when the garbage collector kicks in at the wrong time, and that would cause your frame rate to drop and your user to have a not so smooth experience anymore.
One big thing that happened there over the last couple of years, at least in V8, is that the memory team, who can answer these questions in a lot more detail than I could, have been working on making our garbage collector almost entirely concurrent. Most of the garbage collection that is happening now does not block the main thread anymore in V8 and in Chrome, which means that there’s less jank. Even when garbage collection kicks in, you won’t even notice, because it’s happening on a separate thread and it doesn’t lock up your browser. It doesn’t lock up your web application. It doesn’t decrease your frame rate.
[00:34:04] JM: Is that the same for Node and for the browser?
[00:36:24] JM: DigitalOcean is a reliable, easy to use cloud provider. I’ve used DigitalOcean for years whenever I want to get an application off the ground quickly, and I’ve always loved the focus on user experience, the great documentation and the simple user interface. More and more people are finding out about DigitalOcean and realizing that DigitalOcean is perfect for their application workloads.
This year, DigitalOcean is making that even easier with new node types: a $15 flexible droplet that can mix and match different configurations of CPU and RAM to get the perfect amount of resources for your application. There are also CPU-optimized droplets, perfect for highly active frontend servers or CI/CD workloads. Running on the cloud can get expensive, which is why DigitalOcean makes it easy to choose the right size instance. The prices on standard instances have gone down too. You can check out all their new deals by going to do.co/sedaily, and as a bonus to our listeners, you will get $100 in credit to use over 60 days. That’s a lot of money to experiment with. You can make a hundred dollars go pretty far on DigitalOcean. You can use the credit for hosting or infrastructure, and that includes load balancers and object storage – DigitalOcean Spaces is a great new product that provides object storage – and, of course, computation.
Get your free $100 credit at do.co/sedaily, and thanks to DigitalOcean for being a sponsor. The cofounder of DigitalOcean, Moisey Uretsky, was one of the first people I interviewed, and his interview was really inspirational for me. So I’ve always thought of DigitalOcean as a pretty inspirational company. So thank you, DigitalOcean.
[00:38:31] JM: There’s also the concern of the de-optimization process. If you have hot code that you no longer need – I guess this is one form of garbage collection, so I’m not sure if you’re super familiar with it. But if something changes, like if the type of an argument to the function changes and you need to de-optimize a piece of hot code, then you might need to throw a piece of code out and replace it with a new version. Tell me a little bit more about the de-optimization process. How is code evaluated for whether or not it is still hot?
[00:39:07] MB: Okay. I can talk a little bit about that. It’s actually separate from garbage collection, although, conceptually, I can see how you could think of it that way. That’s an interesting topic. Okay, to go back: we have some code. It runs in the interpreter, and we collect some profiling data while it’s running as bytecode in the interpreter, right?
Then imagine, after the code gets hot, you call the same function, but now with different argument types. Like you said, maybe you’re no longer calling it with numbers; now suddenly, by accident, you pass in a string, or two strings. In that case, the optimized machine code includes a check when we generate it – a check to see if the assumptions that we made are still correct. In this case, it would include a check: “Are the two arguments still numbers?” If so, “Okay, run this highly optimized code that we put together.” The else branch for that check says, “Okay, if it’s not two numbers, then I don’t know what to do with it. I don’t have optimized code for this yet.” In that case, you have to de-optimize.
Now, when we de-optimize, we get back to the interpreter. Let’s say after that, you keep calling the same function, now with two string arguments, over and over and over again. We already had optimized code for the case where we have two numbers. But if the function becomes hot again, if we keep getting the same types as arguments, then eventually the optimizing compiler will kick in again and say, “Okay, now I know about these two cases. So I will just optimize and create some code for the case where we have two strings,” and it will add that to the optimized machine code that was there before.
Instead of an if, it basically becomes a switch statement, where it checks, “Okay, are the two arguments numbers like I saw in the first case? In that case, run this optimized code. If not, then check if the two arguments are two strings, in which case I have brand new optimized machine code. Just use that instead. If it’s none of these things, then we have to de-optimize and start the whole thing over again.”
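As a hedged sketch (the function below is hypothetical, not from the episode), the whole cycle he describes looks like this from the JavaScript side:

```js
// Hypothetical sketch of the optimize / deoptimize / reoptimize cycle.
function add(a, b) {
  return a + b;
}

// Phase 1: hot with two numbers. The optimized machine code is guarded
// by a check along the lines of "are both arguments still numbers?".
for (let i = 0; i < 100000; i++) add(i, i);

// Phase 2: the guard fails on string arguments, so the engine
// deoptimizes and falls back to the interpreter for this call.
add('foo', 'bar');

// Phase 3: hot again with two strings. The compiler reoptimizes, and
// the generated code now effectively switches between the number case
// and the string case, as described above.
for (let i = 0; i < 100000; i++) add('foo', 'bar');
```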
There’s a security team at Google called Project Zero, and they found a bunch of these vulnerabilities, and they actually fixed the spec as well. There were some spec issues that they found, and fixing them helps reduce the chances of this happening again in the future. They made the spec more robust.
[00:47:35] JM: You mentioned that bugs can be found in the spec – for example, a security issue was found in the spec. How does a spec get evaluated for flaws? Because you can’t actually run the spec code, right? How do you evaluate a spec for security issues?
[00:47:54] MB: Right. That’s the thing. It is a very dry technical document, but in the end it’s still humans that are interpreting it and they read between the lines sometimes. Sometimes people have different interpretations of the same piece of text.
So by making the spec more robust – by adding things like, “Oh, yeah. Assert that this value is smaller than this value” – small things like that can actually make a big difference when you’re writing spec text.
[00:49:58] JM: Yeah. It’s interesting you say that. In that last conversation with the JVM guy about GraalVM, one of the things he touched on was how fast Java has gotten. I remember even when I was in college, people talked about Java as if the language were so much slower than C or C++. I don’t know about the speed differences today between C++ and Java.
[00:50:52] MB: Mm-hmm.
[00:50:53] JM: Cool. Well, Mathias, thank you for coming on Software Engineering Daily. It’s been really great talking to you.
[00:50:57] MB: Yeah, thanks for having me. I had a great time.
[END OF INTERVIEW]
[00:51:02] JM: Nobody becomes a developer to solve bugs. We like to develop software because we like to be creative. We like to build new things, but debugging is an unavoidable part of most developers’ lives. So you might as well do it as best as you can. You might as well debug as efficiently as you can. Now you can drastically cut the time that it takes you to debug.
Rookout rapid production debugging allows developers to track down issues in production without any additional coding or redeployment; you don’t have to restart your app. Classic debuggers can be difficult to set up, and with a classic debugger, you often aren’t testing the code in a production environment. You’re testing it on your own machine or on a staging server.
Rookout lets you debug issues as they are occurring in production. Rookout is modern debugging. You can insert Rookout non-breaking breakpoints to immediately collect any piece of data from your live code and pipeline it anywhere. Even if you never thought about it before or you didn’t create instrumentation to collect it, you can insert these non-breaking breakpoints on the fly.
Go to rookout.com/sedaily to start a free trial and see how Rookout works. See how much debugging time you can save with this futuristic debugging tool. Rookout integrates with modern tools like Slack, Datadog, Sentry and New Relic.
Try the debugger of the future. Try Rookout at rookout.com/sedaily. That’s R-O-O-K-O-U-T.com/sedaily. Thanks to Rookout for being a new sponsor of Software Engineering Daily.