Node.js Under The Hood #4 - Let's Talk About V8

Lucas Santos - Oct 17 '19 - Dev Community

In our previous article we discussed the last bit of JavaScript and JavaScript engines.

Now we've hit the bottom of Node.js, and this is where things get messy and complex. We started by talking about JavaScript, which is the highest-level concept we have, and then we got into a few concepts like the call stack, event loop, heap, queues and so on.

The thing is: none of this stuff is actually implemented in JS; it is all part of the engine. JavaScript is basically a dynamically typed interpreted language: everything we run in JavaScript is passed on to the engine, which interacts with its environment and generates the bytecode and machine code needed to run our program.

And this engine is called V8.

What is V8

V8 is Google's open source high-performance JavaScript and WebAssembly engine. It's written in C++ and used in Chrome and Chrome-like environments, as well as in Node.js. V8 fully implements ECMAScript as well as WebAssembly. It does not depend on a browser, though: V8 can run standalone or be embedded into any C++ application.

Overview

V8 was first designed to increase JavaScript execution performance inside web browsers, which is why Chrome had a huge difference in speed compared to other browsers back in the day. To achieve this performance, V8 does more than just interpret JavaScript code: it translates the code into more efficient machine code. It compiles JS into machine code at run time by implementing what is called a JIT (Just In Time) compiler.

As of now, most engines actually work the same way. Historically, the biggest difference between V8 and the others was that it did not produce any intermediate code at all: it compiled JavaScript straight to machine code. Since version 5.9, V8 runs your code the first time through Ignition, a non-optimising interpreter that generates bytecode and executes it. Then, after a few runs, another compiler (the optimising JIT compiler, called TurboFan) receives a lot of information on how your code actually behaves in most cases and recompiles the frequently executed parts so they're optimised for how the code is running at that time. This is basically what it means to "JIT compile" some code. It's different from languages like C++, which use AoT (ahead-of-time) compilation: you first compile, generate an executable, and then run it. There's no separate compile task in Node.
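To make this more concrete, here is a minimal sketch of a "hot" function, one that is called often enough for the engine to consider optimising it. The file name `hot.js` and the function `add` are hypothetical, and whether and when V8 actually optimises or deoptimises this code depends on the V8 version and its internal heuristics.

```javascript
// hot.js (hypothetical file name)
function add (a, b) {
  return a + b
}

// Call add many times with numbers so the engine sees a stable pattern
let total = 0
for (let i = 0; i < 1e6; i++) {
  total = add(total, 1)
}
console.log(total) // 1000000

// Calling it with strings breaks that pattern; the engine may discard
// the optimised code (deoptimise) and fall back to the generic path
const mixed = add('a', 'b')
console.log(mixed) // ab
```

Running it with `node --trace-opt --trace-deopt hot.js` should print V8's optimisation and deoptimisation decisions alongside the program output, although the exact log format varies between versions.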

V8 also uses a lot of different threads to make itself faster:

  • The main thread is the one that fetches, compiles and executes JS code
  • Another thread is used for optimisation compiling, so the main thread continues the execution while this one optimises the running code
  • A third thread is used only for profiling, which tells the runtime which methods need optimisation
  • A few other threads to handle garbage collection

Abstract Syntax Trees

The first step in the compilation pipeline of almost every language out there is to generate what is called an AST (Abstract Syntax Tree). An abstract syntax tree is a tree representation of the syntactic structure of a given piece of source code in an abstract form, which means that it could, in theory, be translated to any other language. Each node of the tree denotes a language construct that occurs in the source code.

Let's recap our code:

const fs = require('fs')
const path = require('path')
const filePath = path.resolve(`../myDir/myFile.md`)

// Parses the buffer into a string
function callback (data) {
  return data.toString()
}

// Transforms the function into a promise
const readFileAsync = (filePath) => {
  return new Promise((resolve, reject) => {
    fs.readFile(filePath, (err, data) => {
      if (err) return reject(err)
      return resolve(callback(data))
    })
  })
}

(function start () {
  readFileAsync(filePath)
    .then()
    .catch(console.error)
})()

This is an example AST (part of it) from our readFile code in JSON format generated by a tool called esprima:

{
  "type": "Program", // The type of our AST
  "body": [ // The body of our program, one entry per top-level statement
      {
          "type": "VariableDeclaration", // We start with a variable declaration
          "declarations": [
              {
                  "type": "VariableDeclarator",
                  "id": {
                      "type": "Identifier", // This variable is an identifier
                      "name": "fs" // called 'fs'
                  },
                  "init": { // We equal this variable to something
                      "type": "CallExpression", // This something is a call expression to a function
                      "callee": {
                          "type": "Identifier", // Which is an identifier
                          "name": "require" // called 'require'
                      },
                      "arguments": [ // And we pass some arguments to this function
                          {
                              "type": "Literal", // The first one of them is a literal type (a string, number or so...)
                              "value": "fs", // with the value: 'fs'
                              "raw": "'fs'"
                          }
                      ]
                  }
              }
          ],
          "kind": "const" // Lastly, we declare that our VariableDeclaration is of type const
      }
  ]
}

So as we can see in the JSON, we have an opening key called type, which denotes that our code is a Program, and we have its body. The body key is an array of objects in which every index represents a single top-level statement (in our code, that is one per line). The first line of code we have is const fs = require('fs'), so that's the first index of our array. In this first object we have a type key denoting that what we're doing is a variable declaration, followed by the declarations for this specific variable fs (since we can write const a = 1, b = 2, the declarations key is an array with one entry for each variable). Inside it we have a type called VariableDeclarator, which identifies that we're declaring a new identifier called fs.

After that we are initialising our variable, that's the init key, which denotes everything from the = sign onwards. The init key is another object defining that we're calling a function named require and passing a literal parameter of value fs. So basically, this whole JSON defines a single line of our code.
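To check that the tree really carries everything needed to describe that line, we can go the other way and print source code back from it. This is only a sketch: `generate` is a hypothetical helper that handles just the five node types appearing in the JSON above, not a general code generator.

```javascript
// The VariableDeclaration node from the JSON above, as a plain object
const declaration = {
  type: 'VariableDeclaration',
  declarations: [
    {
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'fs' },
      init: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'require' },
        arguments: [{ type: 'Literal', value: 'fs', raw: "'fs'" }]
      }
    }
  ],
  kind: 'const'
}

// Hypothetical minimal generator: only supports the node types used above
function generate (node) {
  switch (node.type) {
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map(generate).join(', ')}`
    case 'VariableDeclarator':
      return node.init
        ? `${generate(node.id)} = ${generate(node.init)}`
        : generate(node.id)
    case 'Identifier':
      return node.name
    case 'CallExpression':
      return `${generate(node.callee)}(${node.arguments.map(generate).join(', ')})`
    case 'Literal':
      return node.raw
    default:
      throw new Error(`Unsupported node type: ${node.type}`)
  }
}

const source = generate(declaration)
console.log(source) // const fs = require('fs')
```

Each case simply recurses into the child nodes, which is the same shape a real compiler pass has when it walks the tree.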

ASTs are the base for every compiler because they allow the compiler to transform a higher level representation (the code) into a lower level representation (a tree), stripping out all the information that is useless for execution, like comments. In addition to that, ASTs allow us, mere programmers, to fiddle with our code. This is basically what IntelliSense or any other code helper does: it analyses the AST and, based on what you've written so far, suggests more code that can come after it. ASTs can also be used to replace or change code on the fly. For instance, we can replace every instance of let with const just by looking at the kind keys inside each VariableDeclaration.

Just as ASTs let us spot performance issues and analyse our code, they do the same for compilers. This is what a compiler is all about: analysing, optimising and generating code which can be run by a machine.
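The let-to-const idea can be sketched with a small recursive walk. The AST below is hand-written for illustration (it represents `let a = 1` followed by `let b = 2`), and `letToConst` is a hypothetical helper. Note that a real tool would also have to check that the variable is never reassigned before making this change.

```javascript
// Hand-written AST for: let a = 1; let b = 2
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'let',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'a' },
        init: { type: 'Literal', value: 1, raw: '1' }
      }]
    },
    {
      type: 'VariableDeclaration',
      kind: 'let',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'b' },
        init: { type: 'Literal', value: 2, raw: '2' }
      }]
    }
  ]
}

// Visit every node in the tree and flip the kind of each let declaration
function letToConst (node) {
  if (Array.isArray(node)) {
    node.forEach(letToConst)
    return
  }
  if (node === null || typeof node !== 'object') return
  if (node.type === 'VariableDeclaration' && node.kind === 'let') {
    node.kind = 'const'
  }
  Object.values(node).forEach(letToConst)
}

letToConst(ast)
const kinds = ast.body.map((statement) => statement.kind)
console.log(kinds) // [ 'const', 'const' ]
```

The walk mutates the tree in place to keep the example short; returning a new tree instead would work just as well.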

Conclusion

This is the beginning of our talks about V8 and how it works! We'll be talking about bytecodes and a lot of other cool stuff! So stay tuned for the next chapters :D
