Memory leaks in Node.js???
In my early career, I spent a lot of years writing code in C and C++. Memory management in those languages was a real art, and disasters like memory leaks, dangling pointers, and segmentation faults were no strangers to my life. Then, at some point, the world, along with my career, all moved to memory-managed languages like Java, .NET, Python, and of course - the inevitable JavaScript. At first, coming from C/C++, the concept of automatic memory management and garbage collection seemed too good to be true - can I really stop worrying about memory leaks?? I'll take two of those, please.
But as is often the case in life, if something is too good to be true - it might indeed not be (completely) true. Automatic memory management is great, but it's not a foolproof silver bullet, and memory leaks are still lurking out there even when you write code in languages that possess this trait - like JavaScript. This means that for us, the Node.js developers, there are still concerns to be aware of regarding memory leaks.
Let's dive into memory leaks in Node.js and see how they can occur, how to identify them, and, of course, some tips on how to avoid them.
How do memory leaks occur?
Memory leaks occur when your application keeps references to memory it no longer needs, so the Garbage Collector in Node.js cannot release it. Ultimately, this causes the application's overall memory utilization to keep climbing, even without any demanding workload, which can significantly degrade the application's performance in the long run.
And, to make things worse, these memory blocks can grow in size, causing your app to run out of memory, which eventually causes your application to crash.
Therefore, it's essential to understand what memory leaks are and how they can occur in Node.js apps so that you can troubleshoot such issues quickly and fix them before a user experiences a problem in your app.
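To make that growth visible, here is a minimal sketch that reads the counters exposed by process.memoryUsage() before and after retaining a batch of objects. The helper name memorySnapshotMb and the retained array are made up for illustration, and the exact numbers will differ from machine to machine.

```javascript
// Hypothetical helper: current memory counters of this process, in megabytes.
function memorySnapshotMb() {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 100) / 100;
  return { rss: toMb(rss), heapTotal: toMb(heapTotal), heapUsed: toMb(heapUsed) };
}

// Simulated leak: objects pushed into a long-lived array are never released.
const retained = [];
const before = memorySnapshotMb();
for (let i = 0; i < 100000; i++) {
  retained.push({ id: i, payload: `item-${i}` });
}
const after = memorySnapshotMb();

console.log('heapUsed before (MB):', before.heapUsed);
console.log('heapUsed after (MB):', after.heapUsed);
```

In a real service you would log these values periodically rather than once. A heapUsed figure that keeps rising across many samples under a steady workload is a hint worth investigating, not proof of a leak.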
How does Garbage Collection happen in Node.js?
Before diving in any further, it's essential to understand the process of Garbage Collection in Node.js. This is crucial when troubleshooting memory leaks in Node.js.
Node.js uses Chrome's V8 engine to run its JavaScript code. The data your JavaScript code works with lives in memory in two main places:
- Stack: The stack holds static data, method and function frames, primitive values, and pointers to stored objects. As usual with Stacks (and in particular call stacks), they get pushed and popped in a LIFO order, and popping from the stack automatically frees the relevant stack memory. Nothing for us to worry about :)
- Heap: The heap keeps the objects referenced by the stack's pointers. Since almost everything you work with in JavaScript is an object, all dynamic data, like arrays, closures, sets, and all of your class instances, is stored in the heap. As a result, the heap becomes the biggest block of memory used in your Node.js app, and it's where Garbage Collection (GC) will ultimately happen.
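The practical difference between the two areas shows up in how values behave when you copy them. The sketch below is a simplified mental model (the engine is free to optimize where values actually live), and the variable names are just for illustration.

```javascript
// Primitive values are copied by value: changing the copy leaves the original alone.
let count = 1;
let copy = count;
copy += 1;

// Objects live in the heap; variables only hold a reference (pointer) to them.
const user = { name: 'Ada' };
const alias = user; // second reference to the SAME heap object
alias.name = 'Grace';

console.log(count, copy);    // 1 2
console.log(user.name);      // Grace
console.log(user === alias); // true
```

As long as either user or alias is reachable, the object they share stays in the heap.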
Why is Garbage Collection Expensive in Node.js?
Node.js needs to periodically run its garbage collector, which is code that has to walk the heap objects and identify the ones that are unreachable (no longer referenced). As the heap (and the reference tree) grows, this becomes an expensive computational task.
Since your JavaScript runs on a single main thread, at least part of this work interrupts the application flow until it is completed. V8 does offload some garbage collection work to helper threads, but the pauses do not disappear entirely, and that is one of the main reasons why full GC cycles run infrequently.
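If you want to see how much time garbage collection takes in your own process, one option is the PerformanceObserver from the built-in perf_hooks module, which can report 'gc' entries. This is a rough sketch: the churn function is a hypothetical workload that creates short-lived garbage, and whether any GC cycle is observed during such a short run is up to the engine.

```javascript
const { PerformanceObserver } = require('perf_hooks');

// Collect the duration (in milliseconds) of every GC cycle that gets reported.
const gcPauses = [];
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    gcPauses.push(entry.duration);
  }
});
observer.observe({ entryTypes: ['gc'] });

// Hypothetical workload: allocate many short-lived arrays to create garbage.
function churn(rounds) {
  let total = 0;
  for (let i = 0; i < rounds; i++) {
    const temp = new Array(1000).fill(i);
    total += temp.length;
  }
  return total;
}
churn(5000);

// Give the observer a moment to deliver its entries, then stop observing.
setTimeout(() => {
  observer.disconnect();
  const totalMs = gcPauses.reduce((sum, ms) => sum + ms, 0);
  console.log(`GC cycles observed: ${gcPauses.length}, total time: ${totalMs.toFixed(2)} ms`);
}, 100);
```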
What causes a memory leak in Node.js?
With this information, it's safe to assume that most memory leaks in Node.js happen when expensive objects stay referenced in the heap even though they are no longer used. So, ultimately, memory leaks are caused by the coding habits that you adopt and by how well you understand the workings of Node.js.
Let's look at four common cases of memory leaks in Node.js so we know what patterns we want to avoid (or minimize).
Memory Leak 01 - Use of Global Variables
Global variables are a red flag in Node.js. They contribute heavily to memory leaks in your app if they are not handled correctly. For those of you who don't know what it is, a global variable is a variable that's referenced by the root node, the global object. That object is the Node.js equivalent of the Window Object for JavaScript running in the browser.
So, these global variables never cease to be referenced. Therefore, the garbage collector will never clean them up throughout your app lifecycle, and anything they hold keeps occupying memory during execution. If you're managing highly complex data structures or nested object hierarchies in the root of your app, your app has a high chance of being impacted by memory leaks.
For example, if you're working with dynamic data structures, as shown below, your app will likely have memory leaks:
// Global variable holding a large array
global.myArray = [];
function addDataToGlobalArray(data) {
  // Push data into the global array
  global.myArray.push(data);
}
// Function to remove data from the global array
function removeDataFromGlobalArray() {
  // Pop data from the global array
  global.myArray.pop();
}
// Function to do some processing with the global array
function processData() {
  // Use the global array for some computation
  console.log(`Processing data with ${global.myArray.length} elements.`);
}
// Call functions to add and process data
addDataToGlobalArray("Item 1");
processData();
// Call functions to add and remove data
addDataToGlobalArray("Item 2");
removeDataFromGlobalArray();
// Call processData again
processData();
// The global.myArray variable is still in memory, even though it's no longer needed.
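One way to soften this pattern, sketched below, is to keep the data in a module-scoped variable with an explicit size cap and a way to clear it. The names MAX_ITEMS, recentItems, addItem, and clearItems are hypothetical, and the cap of 1000 is an arbitrary assumption you would tune for your app.

```javascript
// Assumed upper bound on how many items we are willing to keep in memory.
const MAX_ITEMS = 1000;

// Module-scoped instead of attached to `global`, so only this module can grow it.
const recentItems = [];

function addItem(item) {
  recentItems.push(item);
  // Evict the oldest entry once the cap is exceeded, so the array cannot grow forever.
  if (recentItems.length > MAX_ITEMS) {
    recentItems.shift();
  }
}

function clearItems() {
  // Drop every stored item so the garbage collector can reclaim them.
  recentItems.length = 0;
}

for (let i = 0; i < 5000; i++) {
  addItem(`Item ${i}`);
}
console.log(recentItems.length); // 1000
console.log(recentItems[0]);     // Item 4000
```

The array is still long-lived, but its memory use now has a ceiling, and clearItems() gives callers a way to release everything when the data is no longer needed.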
Memory Leak 02 - Use of Multiple References
The next issue is something that we have all done at some point. It's when multiple references point to one object in the heap. This is usually a developer oversight: the same object gets assigned to several variables or stored in several places.
Therefore, if you release one reference, the object is not collected, because other references still point to it. For example, the code shown below is a classic scenario to watch out for:
// Define two objects with circular references
const obj1 = { name: "Object 1" };
const obj2 = { name: "Object 2" };
// Create circular references between obj1 and obj2
obj1.reference = obj2;
obj2.reference = obj1;
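By doing so, obj1 and obj2 keep each other referenced. On its own, that is not a leak: V8's garbage collector works by reachability, so a cycle is reclaimed once nothing reachable from the roots points into it. The leak appears when one long-lived reference (a global, a cache, an event listener) still points at either object, because the whole cycle then stays in memory.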
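In practice, the forgotten reference often lives in a long-lived collection. The sketch below uses a hypothetical sessionsById registry to show that dropping your own variable is not enough, and that a WeakMap can hold extra data about an object without keeping that object alive. All names here are made up for illustration.

```javascript
// Long-lived registry: every entry is a second reference to the session object.
const sessionsById = new Map();

function openSession(id) {
  const session = { id, buffer: new Array(10000).fill(0) };
  sessionsById.set(id, session);
  return session;
}

function closeSession(id) {
  // Removing the registry entry releases the last long-lived reference.
  return sessionsById.delete(id);
}

let session = openSession('a');
session = null;                 // our variable is gone...
console.log(sessionsById.size); // 1 (...but the Map still keeps the object alive)

closeSession('a');
console.log(sessionsById.size); // 0 (now the object can be collected)

// A WeakMap does not keep its keys alive: once `request` is unreachable,
// the metadata entry becomes eligible for collection as well.
const metadata = new WeakMap();
let request = { url: '/health' };
metadata.set(request, { startedAt: Date.now() });
request = null;
```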
Memory Leak 03 - Use of Closures
Closures remember their surrounding context. When a closure holds a reference to a large object in the heap, it keeps the object in memory as long as the closure is in use. For example, consider the snippet below:
function createClosure() {
  const data = "I'm a variable captured in a closure";
  // Return a function that captures the 'data' variable
  return function() {
    console.log(data);
  };
}
// Create a closure by calling createClosure
const closure = createClosure();
// The closure still references 'data' from its outer scope
// Even though 'createClosure' has finished executing
closure();
// The 'data' variable is not eligible for garbage collection while 'closure' is still referenced
As shown above, the data variable defined inside createClosure() is being used by the function that is returned from createClosure(). And since JavaScript refers to the lexical scope when getting references to the variables it has used, data will not be collected by the garbage collector for as long as the returned function itself is reachable. If you manage more complex or dynamic data inside a closure, this pattern is prone to memory leaks.
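A common way to reduce the cost of a long-lived closure is to capture only the small value it needs instead of the large object that value came from. The two factory functions below are hypothetical, and the exact retention behavior is up to the engine, so treat this as a sketch of the idea rather than a guarantee.

```javascript
// This closure references bigList directly, so the whole array stays in memory
// for as long as the returned function is reachable.
function makeHeavyHandler() {
  const bigList = new Array(1000000).fill('x');
  return function () {
    return bigList.length;
  };
}

// This closure only references `size`, a small number derived from the array.
// No inner function refers to bigList, so it can be collected after the call returns.
function makeLightHandler() {
  const bigList = new Array(1000000).fill('x');
  const size = bigList.length;
  return function () {
    return size;
  };
}

const heavy = makeHeavyHandler();
const light = makeLightHandler();
console.log(heavy()); // 1000000
console.log(light()); // 1000000
```

Both handlers return the same result, but only the first one needs the million-element array to stay alive.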
Memory Leak 04 - Unmanaged use of Timers and Intervals
If you're using setTimeout or setInterval with Node.js, you should know they are a very common source of memory leaks. Node.js will keep referencing the function Object passed to setTimeout or setInterval until the timeout has fired or the timer is stopped, and an interval never finishes on its own. If you do not store the returned id from setTimeout and setInterval in order to call clearTimeout / clearInterval, you have no way to stop them, so those function Objects will stay referenced and won't get garbage collected. If, on top of that, you don't wisely manage the variables you create inside your function Object, you are prone to memory leaks.
Consider this snippet:
function thisWillLeak() {
  let numbers = [];
  return function() {
    numbers.push(Math.random());
  }
}
setInterval(thisWillLeak(), 2000);
In this example, the numbers array will keep growing in memory forever and will not get garbage collected since the Interval is never cleared. You should make sure to store the returned timeoutId/intervalId in a variable and clear it as soon as it is no longer needed:
const thisWillNoLongerLeak = setInterval(thisWillLeak(), 2000);
// .... do some things with this Interval
clearInterval(thisWillNoLongerLeak);
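Putting both ideas together, the sketch below wraps an interval in a small object that owns the timer, caps the data it collects, and exposes a stop() method. The startSampler name and its options are hypothetical. The call to unref() is optional and only tells Node.js not to keep the process alive for this timer alone.

```javascript
// Hypothetical helper: a periodic sampler that can always be stopped.
function startSampler({ intervalMs = 2000, maxSamples = 100 } = {}) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(Math.random());
    // Cap the array so it cannot grow without bound while the interval runs.
    if (samples.length > maxSamples) {
      samples.shift();
    }
  }, intervalMs);
  // Do not keep the process alive just because this interval is pending.
  timer.unref();

  return {
    samples,
    stop() {
      // Clearing the interval releases the callback and everything it captured.
      clearInterval(timer);
    },
  };
}

const sampler = startSampler({ intervalMs: 10, maxSamples: 5 });
setTimeout(() => {
  sampler.stop();
  console.log(`Collected ${sampler.samples.length} samples (never more than 5)`);
}, 200);
```

Returning stop() next to the data makes it hard to forget the cleanup, because whoever starts the sampler also receives the means to end it.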
How can I identify a memory leak in Node.js?
The snippets I provided in this article might make it seem like memory leaks are pretty easy to diagnose. But your codebase is not as simple as these examples and has far more lines of code. If you wish to find memory leaks by reviewing your codebase, you'll have to go through an unreasonable number of lines of code in your app to find issues related to global scopes, closures, or any of the other points I've covered.
It is therefore best to rely on tools that specialize in debugging memory leaks in Node.js apps. Here are a few tools to help you detect memory leaks.
Tool 01 - node-inspector
Figure: Node Inspector
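node-inspector (GitHub | NPM) lets you connect to a running app by running the node-debug command. This command will load Node Inspector in your default browser. Node Inspector supports Heap Profiling and can be useful for debugging memory leak issues. Note that recent Node.js versions ship with a built-in inspector (started with node --inspect), so check whether node-inspector is still maintained for your Node.js version before adopting it.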
Tool 02 - Chrome DevTools
Figure: Chrome DevTools
The next option is to use a tool already built into your browser - Chrome DevTools.
Chrome DevTools lets you analyze the application memory in real-time and troubleshoot potential memory leaks.
Figure: A sample DevTool inspection
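To get data into DevTools, you can start your app with node --inspect and attach from chrome://inspect, or write a heap snapshot file from code and load it in the Memory tab. The sketch below uses v8.writeHeapSnapshot() from the built-in v8 module. The takeHeapSnapshot helper, the label, and the choice of the temp directory are assumptions for illustration. Keep in mind that writing a snapshot is synchronous and can pause a large app for a noticeable time.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Hypothetical helper: write a heap snapshot to the temp directory and return its path.
function takeHeapSnapshot(label) {
  const file = path.join(os.tmpdir(), `${label}-${Date.now()}.heapsnapshot`);
  // writeHeapSnapshot blocks until the file is written and returns the file name.
  return v8.writeHeapSnapshot(file);
}

const snapshotPath = takeHeapSnapshot('baseline');
console.log('Heap snapshot written to', snapshotPath);
```

Taking one snapshot early and another after the app has been running for a while lets you compare the two in DevTools and see which kinds of objects keep accumulating.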
Wrapping up
In order to make sure your services are robust and won't crash, it's essential to look closely into your codebase and identify potential patterns that might cause memory leaks. If they remain untreated, your app's memory footprint will keep increasing the longer the app runs, which could drastically impact app performance for your end users.
So, do take note of the areas I mentioned above - Closures, Global Variables, Multiple/Circular References, Timeouts, and Intervals - as these are the key areas that can cause memory leaks in your app.
I hope that you will find this article helpful on your journey to make your services robust.
If you are indeed all about making your Node.js microservices robust and coded to the highest standards, there is one more tool that can help you with that... 😉:
How can Amplication Help?
Amplication lets you auto-generate Node.js code for your microservices, enabling you to build high-quality apps with high-quality code that take extra precautions for the issues discussed above to ensure that your app will not cause any memory leaks (well, at least not in the boilerplate code we generate. The rest... is up to you 😊).