In our previous articles, we talked about how Node.js works under the hood and how V8 compiles code so efficiently. Most of that efficiency is related to compiler optimizations, so in this article we'll finally get to know what those optimizations are and how they work!
This is a brief summary of several compiler optimizations V8 might perform on your code. The whole point of this article is to introduce the sort of things that are included when we say "optimization"; we won't dig deeper into how compilers actually perform them.
All of the optimizations below are done while the compiler is analyzing the code.
On Stack Replacement
On Stack Replacement is an optimisation technique that replaces a chunk of unoptimised code with a chunk of optimised code during execution. V8 does this whenever it decides to optimise a function that is currently running. In short, on stack replacement means that the current stack frame is replaced by a stack frame of optimised code, without losing any state, while the code is still executing. It's like changing the tires of a car in the middle of a race without stopping the car.
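There is nothing to call from JavaScript to trigger this; the engine does it transparently. As a rough sketch, a long-running loop like the one below is a typical OSR candidate: execution starts in the interpreter, and once the loop gets hot, V8 swaps the running frame for an optimised one mid-loop. (Running Node with the V8 flag --trace-opt usually logs these OSR events, though the exact output varies between versions.)

```js
// A typical OSR candidate: the loop becomes hot long before the
// function returns, so waiting for a future call would waste the
// optimised code. V8 replaces the running stack frame instead.
function sumUpTo(n) {
  let total = 0
  for (let i = 0; i < n; i++) {
    total += i // hot after a few thousand iterations
  }
  return total
}

console.log(sumUpTo(1e8))
```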
Constant Folding
Replaces constant expressions with their final value at compile time, rather than doing the calculation at run time.
Example:
before folding:
const j = 3 + 9
after folding:
const j = 12
Induction Variable Analysis
In a loop, if a variable is a simple linear function of the index variable, for instance const p = 4 * i + 1, then it can be updated appropriately each time the loop variable changes.
This is what is called strength reduction, a form of optimisation where costly operations are replaced by equivalent, less costly ones; for instance, a costly multiplication is replaced by a series of cheaper additions, as sketched below.
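A source-level sketch of the transformation (the compiler does this on its internal representation, not on your JavaScript; n and use here are just illustrative placeholders):

```js
const n = 5
const use = (p) => console.log(p)

// before: a multiplication on every iteration
for (let i = 0; i < n; i++) {
  const p = 4 * i + 1
  use(p)
}

// after induction variable analysis: p becomes a running sum and the
// multiplication is replaced by a cheap addition (strength reduction)
let p = 1 // value of 4 * 0 + 1
for (let i = 0; i < n; i++) {
  use(p)
  p += 4 // next value of 4 * i + 1
}
```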
Rematerialization
The act of recalculating a value instead of loading it from memory, which prevents memory accesses from being performed too many times.
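Rematerialisation happens during register allocation, at the machine-code level, so it has no direct JavaScript syntax; the sketch below (with an entirely illustrative function) only shows the trade-off involved:

```js
function touch(buffer, i) {
  // With registers scarce, a compiler could spill `offset` to the
  // stack and load it back at the second use (a memory access), or
  // rematerialise it by recomputing i * 4, which is cheaper than a load
  const offset = i * 4
  buffer[offset] = 1
  return buffer[offset] // `offset` may be recomputed here, not reloaded
}

console.log(touch(new Array(8).fill(0), 1)) // 1
```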
Removing Recursion
Recursion is often very expensive, as we saw when we talked about stack overflows. Tail-recursive algorithms (code that ends by returning a call to itself) can be converted into iterative algorithms, which eliminates the stack problems. This is often done with Tail Call Optimisation, the process by which you can avoid allocating a new stack frame for a function because the calling function simply returns the value it gets from the called function, so the last call can be replaced by the function itself.
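A sketch of what this conversion looks like at the source level. (Note that proper tail calls are part of the ES2015 spec, but at the time of writing most engines, V8 included, don't enable them, so the recursive version below can still overflow the stack for large n.)

```js
// Tail-recursive: the recursive call is the very last thing the
// function does, so its stack frame could, in principle, be reused
function factorial(n, acc = 1) {
  if (n <= 1) return acc
  return factorial(n - 1, n * acc) // tail call
}

// The equivalent iterative form the optimisation effectively produces
function factorialIter(n) {
  let acc = 1
  while (n > 1) {
    acc *= n
    n -= 1
  }
  return acc
}

console.log(factorial(10), factorialIter(10)) // 3628800 3628800
```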
Peephole Optimisations
These are usually performed late in the compilation process, after the machine code has been generated. This optimisation technique examines a few adjacent instructions (like looking through a peephole) to see whether they can be replaced by a single instruction or a shorter sequence of instructions. An example is a multiplication by a power of 2, which can be replaced by a bitwise left shift (which is also a strength reduction optimisation).
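In JavaScript terms (the real rewrite happens on machine instructions, not on your source), it looks like this:

```js
const y = 5

// what the programmer wrote
const a = y * 8

// what a peephole optimiser can emit instead: a single left shift,
// equivalent as long as y fits in a 32-bit integer
const b = y << 3

console.log(a, b) // 40 40
```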
Inline Expansion
This is the technique of replacing the call to a function with its body. This saves the overhead of adding another stack frame and also adds a great opportunity for parameter-specific optimisations, but it comes at the cost of space. If the procedure is called several times during a program, its body will be copied several times, which can lead to bigger, heavier code.
Generally, inlining is very useful in performance-critical code that makes a large number of calls to small procedures, since it results in fewer jumps.
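A sketch (the function names are just illustrative):

```js
function square(x) {
  return x * x
}

function sumOfSquares(a, b) {
  return square(a) + square(b)
}

// After inline expansion, sumOfSquares behaves as if it were written
// like this: no call overhead and no extra stack frames
function sumOfSquaresInlined(a, b) {
  return a * a + b * b
}

console.log(sumOfSquares(3, 4), sumOfSquaresInlined(3, 4)) // 25 25
```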
Inline Caching
Inline caching relies on the observation that repeated calls to the same method tend to occur on the same type of object. V8 maintains a cache of the types of objects that were passed as parameters in recent method calls and uses this information to make an assumption about the type of object that will be passed as a parameter in the future. If this assumption holds, the next call can bypass the process of figuring out how to access the object's properties and instead use the stored information from previous lookups to that object's hidden class.
This relates specifically to the concept of hidden classes: whenever a method is called on a specific object, the engine has to look up the hidden class in order to find the memory offset for the property being accessed. After two successful calls of the same method on the same hidden class, V8 omits the hidden class lookup and adds the offset to that property to the object pointer itself. This greatly increases execution speed.
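A sketch of how object shapes affect this (all names here are just for illustration):

```js
function getX(point) {
  return point.x // this property access gets an inline cache
}

// Both objects have the same hidden class (same properties, added in
// the same order), so the cached offset for `x` is reused: the call
// site stays monomorphic and fast
getX({ x: 1, y: 2 })
getX({ x: 3, y: 4 })

// A differently-shaped object (properties in a different order) has
// another hidden class, making the call site polymorphic and slower
getX({ y: 5, x: 6 })
```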
Dead Code Elimination
This process eliminates code that is never called in the program. Roughly, it does this by walking through all the bytecodes, generating a graph, and eliminating the parts that do not belong to any code path.
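Two common cases, sketched:

```js
function double(n) {
  return n * 2
  console.log('unreachable') // after the return: can be eliminated
}

const DEBUG = false
if (DEBUG) {
  console.log('debug info') // branch can never be taken: eliminated
}

console.log(double(21)) // 42
```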
Code Block Reordering
Code-block reordering alters the order of the basic blocks in a program in order to reduce conditional branches and improve locality of reference, which is the tendency of a processor to access the same set of memory locations repetitively over a short period of time.
Jump Threading
Consecutive conditional jumps predicated entirely or partially on the same condition can be merged, as long as the condition does not change in between. E.g.:
if (c) { foo() }
if (c) { bar() }
becomes:
if (c) { foo(); bar() }
Trampolines
Many CPUs have smaller subroutine call instructions that can only access low memory. The compiler can save space by using these small calls in function bodies. Jump instructions placed in low memory (the trampolines) can then reach the routines at any address, multiplying the space savings from code refactoring.
Common Subexpression Elimination
Whenever we have repeated subexpressions, as in (a + b) * 2 + (a + b), the common subexpression is a + b. The compiler calculates the value of a + b only once and reuses the result across the expression, assuming the common subexpression will not change in between.
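A sketch of what the compiler effectively does internally:

```js
const a = 2
const b = 3

// before
const result = (a + b) * 2 + (a + b)

// after common subexpression elimination: a + b is computed once
const tmp = a + b
const result2 = tmp * 2 + tmp

console.log(result, result2) // 15 15
```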
Conclusion
You DID IT! You finally got to the end of our 10-part series about Node.js under the hood! I hope you liked it and feel a little more excited to learn more!
Below I'll leave all the references I used to compose these articles, along with a link to the original article draft on my GitHub. That's it! Thanks a lot for reading and for giving me feedback about it :D
References
- LibUV
- N-API
- Esprima AST generator
- TurboFan docs
- TurboFan JIT
- Native modules
- JS History
- Node.js history
- Element Kinds in V8
- WHATWG spec on microtasks
- V8 Under the Hood
- FS module source
- TurboFan creation motives and performance reasons
- FS read_file_context source
- V8 Under The Hood Examples
- Internals of Node with crypto library
- Microtasks and Macrotasks
- Launching Ignition and TurboFan
- Performance Optimisations in V8
- In-Depth of Inline caching
- Sea of Nodes approach
- Sea of Nodes explanation
- How to get bytecode from NodeJS
- Understanding V8 Bytecodes
- V8 Bytecode List
- How Node's GC Works
- V8 Interpreter Generator
- What are Stacks?
- What are queues?
- Compiler Optimisation list
- What are Static Single Assignments?
- On stack replacement in V8
- Why is Node.js so Fast
- You don't know Node.js
- V8 - A tale of Turbofan
- Optimisation tricks in V8
- V8 Internals for Developers
- How V8 Optimizes the Code
- My personal notes (in Portuguese) about V8
- [BOOK] Node.js Under the Hood
- Tracing de-optimisations in Node.js
- Understanding Promises Once and for All
- JS Rendering Engine
- Memory Allocation in Javascript
- How JavaScript works: an overview of the engine, the runtime, and the call stack
- My talk guidelines (first version, incomplete, also in Portuguese) about this topic
- How JavaScript works: inside the V8 engine + 5 tips on how to write optimized code
- [VIDEO] High performance JS in V8
- [VIDEO] Ryan Dahl's Introduction to Node.js
- [VIDEO] BlinkOn 6 Day 1 Talk 2: Ignition - an interpreter for V8
- [VIDEO] MNUG 2017.03.23 TurboFan: A new code generation architecture for V8
- [VIDEO] Benedikt Meurer: A Tale of TurboFan: Four years that changed V8 forever
- [VIDEO] Marja Hölttä: Parsing JavaScript - better lazy than eager? | JSConf EU 2017
- [VIDEO] Franziska Hinkelmann: JavaScript engines - how do they even? | JSConf EU 2017
- [VIDEO] TDC 2017 - Stadium: How Node.js Works Internally by Kirmayr Tomaz (in Portuguese)