Why is node_modules heavier than the universe? (No seriously, why is it so big?)

Faiz Bachoo Shah - Oct 28 '23 - Dev Community

The problem

Have you ever waited what seems like an eternity for dependencies to download after hitting that yarn (or npm install, for you masochists out there) in a typical Node project? Of course you have!! Who are we kidding? :)

But why does it happen? 🤔 Well, one of the reasons is obviously that package managers like npm have several inefficiencies in how they download node modules. That part can be eased by using a more efficient package manager like pnpm. But the other reason is that JS dependencies are usually larger than their counterparts in other languages.

Why are JS dependencies usually larger than the corresponding dependencies in other languages such as Go, Rust, Java, etc.?

The reason

The reason for this is simple: dependencies in languages such as Go are source code + binaries, while dependencies in JS are JS files + CSS assets + HTML assets and so on. A JS file is usually larger than the equivalent compiled binary, and thus a typical JS dependency is usually larger than its counterparts. But the question arises: why are JS dependencies not binaries? To answer that, we first need to understand how code gets compiled in JS and in another language like Go.

So in a language like Go, your entire project is compiled into a binary executable by the Go compiler. This binary can then be shared with anyone else, who can run it on their machine. When you install a Go dependency, its source code is installed too, but because Go's rich standard library covers a lot of ground, a Go package usually has fewer external dependencies, and even those dependencies are statically linked into the binary. For this reason, even with the source code installed, Go packages stay small.

But what is the output of a JS project? It's usually a single JS file called bundle.js or index.js, which contains the project's entire JS source code in a highly condensed, minified form, possibly bundled with some CSS and HTML assets. This is usually achieved using tools like Webpack.

Now, to execute the JS in this minified bundle file, we use a JS engine like V8, which is found in Chromium and Node.js, and it executes the JS using a compilation technique called Just-In-Time (JIT) compilation.

JIT compilation is a technique in which the compiler compiles the source code at the time of execution, i.e. at run time, instead of ahead of time as we see in other languages. It means the engine receives the source text when the program runs, parses it, and compiles functions as they are needed. Most modern JS engines optimise this process further with tiered compilation: code starts out in an interpreter, and functions that run often are recompiled by an optimising compiler (in V8, these are called Ignition and TurboFan).
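To make the "compiled at run time" idea a bit more concrete, here is a minimal sketch using Node's built-in vm module. The source string and the add-example.js filename are made up for illustration. The point is only that V8 is handed plain text while the program is already running.

```javascript
const vm = require("node:vm");

// The "program" starts life as plain text, just as it would sit in a .js file.
const source = "function add(a, b) { return a + b; } add(2, 3);";

// vm.Script hands the text to V8, which parses and compiles it right now,
// while this program is already running. The filename is only a label
// used in stack traces (hypothetical name).
const script = new vm.Script(source, { filename: "add-example.js" });

// Running the script returns the value of its last expression statement.
const result = script.runInNewContext();
console.log(result); // 5
```

Nothing here was compiled before the process started: the engine only ever saw JS text.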

Now, JIT is not a technique used only in JS. Java uses it too, to execute the generated bytecode on the machine. The difference is that JS has no standard intermediate form that packages can be shipped in, like Java's bytecode. (Engines such as V8 do generate bytecode internally, but that is an implementation detail, not a distribution format.) Hence, JS engines need the JS source code itself to do the JIT compilation.

Now, since JS engines compile the source code right at the time of execution, and the only thing they can compile is JS source, the dependencies also need to be written in JS instead of being binary executables. If they were executables, the JS engine could not handle them because, as we just read, JS engines need the JS source code itself, and most modern engines are built around running JS source.
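You can check this for yourself: an installed dependency is just readable text on disk. The sketch below creates a throwaway package in a temp directory (the name tiny-dep and its contents are hypothetical, made up for this example), reads it back as text, and then requires it.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Create a throwaway, hypothetical package called "tiny-dep" in a temp dir.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "dep-demo-"));
const pkgDir = path.join(root, "node_modules", "tiny-dep");
fs.mkdirSync(pkgDir, { recursive: true });
fs.writeFileSync(
  path.join(pkgDir, "index.js"),
  "module.exports = function double(n) { return n * 2; };\n"
);

// Resolve the package the way code living in `root` would find it.
const entry = require.resolve("tiny-dep", { paths: [root] });

// What sits in node_modules is readable source text, not machine code...
const text = fs.readFileSync(entry, "utf8");
console.log(text.trim());

// ...and requiring it is the moment the engine compiles that text.
const double = require(entry);
console.log(double(21)); // 42
```

Try the same readFileSync on any real package in your own node_modules and you will get JS text back.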

Thus, the modules in node_modules need to be shipped as JS files, which increases their size and makes them bulky.
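If you are curious where the bulk actually sits in your own project, a rough sketch like the one below can help. The helper names dirSize and packageSizes are my own, not part of any tool. It adds up plain file sizes, skips symlinks, and counts a scoped folder like @babel as a single entry, so treat the numbers as approximate.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively add up file sizes (in bytes) under `dir`.
// Symlinks are skipped, so linked layouts (e.g. pnpm's) are not followed.
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSize(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
  }
  return total;
}

// Return [{ name, bytes }] for each immediate subdirectory, largest first.
function packageSizes(modulesDir) {
  return fs
    .readdirSync(modulesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => ({
      name: entry.name,
      bytes: dirSize(path.join(modulesDir, entry.name)),
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// Optional usage: pass a directory as the first CLI argument.
const target = process.argv[2];
if (target) {
  console.table(packageSizes(target).slice(0, 10));
}
```

Saved as, say, sizes.js (a hypothetical filename), it could be run as node sizes.js node_modules to list the ten largest entries.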

Extra Buzz

It is still possible to run code written in other languages such as C/C++ by loading it dynamically as a native addon in Node.js, built with tools like node-gyp. But in that case the code is already-compiled machine code that the operating system loads into the Node.js process; it is not compiled by the JS engine.

The JS ecosystem is developing fast⚡️, so there is a possibility that one day we might be able to solve this bulky modules issue, but till then we can do our part as developers to contribute to the ecosystem as much as we can!!

Happy Engineering!!😎👍🏻
