Node.js Performance Monitoring - Part 1: The Metrics to Monitor
When dealing with performance in Node.js, there are several metrics that can be vitally important when digging deep into how your Node.js apps are performing and how you can improve that performance.
It can be hard to figure out which metrics are important when you're new to Node.js and are really trying to squeeze every ounce of perf out of it. There are literally thousands of ways to get metrics out of Node.js for you to explore, but which core metrics can really help?
In this post I'll discuss three Node.js metrics that are extremely helpful to start out with when beginning to analyze performance.
CPU usage in Node.js
Node.js applications don’t typically consume an extraordinary amount of CPU time. High CPU usage is an indicator that your app is doing a lot of synchronous work, and that work can block the event loop, which in turn means the asynchronous work that Node.js does will also be blocked.
While high CPU usage isn’t necessarily bad, if you’re managing a web server and you know you’re going to have a CPU-intensive task, that task should be spun out to another process, as this could otherwise cause your service to be unavailable or sluggish, impacting end users.
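As a minimal sketch of spinning CPU-heavy work out to another process, the example below runs a summing loop in a child Node.js process through `child_process.execFile`, so the parent's event loop stays free while it runs. The task itself, the `runInChildProcess` helper, and passing the script inline with `-e` are assumptions made to keep the example in one file; a real service would more likely start a dedicated worker script.

```javascript
const { execFile } = require('child_process');

// Hypothetical CPU-heavy task: sum the integers below ten million.
// It is passed to the child as an inline script purely to keep this sketch in one file.
const heavyTask = `
  let sum = 0;
  for (let i = 0; i < 1e7; i++) sum += i;
  process.stdout.write(String(sum));
`;

// Runs a script in a separate Node.js process and resolves with its stdout.
function runInChildProcess(script) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', script], (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout.trim());
    });
  });
}

// The parent is not blocked: this timer is free to fire while the child works.
const heartbeat = setInterval(() => console.log('parent event loop is still free'), 100);

const pending = runInChildProcess(heavyTask).then((result) => {
  clearInterval(heartbeat);
  console.log(`child finished, sum = ${result}`);
  return result;
});
```

The trade-off is the cost of starting a process and passing data to it, so this tends to pay off only for work that would otherwise hold the event loop for a noticeable amount of time.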
Given how key asynchronous operations are to success with Node.js, digging down into apps that are hogging CPU - and resolving the operations that are causing it - is a good first step in understanding performance for Node.js applications.
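One way to get a first CPU reading from inside the process is the built-in `process.cpuUsage()`, which reports user and system CPU time in microseconds. The sketch below compares that CPU time against wall-clock time for a block of work; the `cpuPercentSince` helper and the square-root loop are hypothetical stand-ins, not part of any tooling mentioned in this article.

```javascript
// Percentage of a single CPU core used since `startUsage` was captured.
function cpuPercentSince(startUsage, elapsedMs) {
  const delta = process.cpuUsage(startUsage); // { user, system } in microseconds
  const busyMs = (delta.user + delta.system) / 1000;
  return (busyMs / elapsedMs) * 100;
}

const startUsage = process.cpuUsage();
const startTime = process.hrtime.bigint();

// Hypothetical synchronous workload standing in for a CPU-hungry operation.
let total = 0;
for (let i = 0; i < 2e7; i++) total += Math.sqrt(i);

const elapsedMs = Number(process.hrtime.bigint() - startTime) / 1e6;
const percent = cpuPercentSince(startUsage, elapsedMs);
console.log(`used about ${percent.toFixed(1)}% of one core over ${elapsedMs.toFixed(0)} ms`);
```

Because the figure covers every thread in the process, it can exceed 100% on a multi-core machine. A reading that stays near 100% while requests are waiting is a hint that synchronous work is holding the event loop.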
Heap Usage, Memory Leaks, and Garbage Collection in Node.js
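Node.js has a unique restriction around memory - by default, V8 caps the heap for a single process (historically at around 1.5GB on 64-bit systems), regardless of how much memory is available on the machine running the process. The exact default depends on the Node.js version and platform, and it can be raised with the --max-old-space-size flag, but keeping the cap in mind when architecting and testing your application is vitally important.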
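To see the heap limit that applies to a given process rather than relying on a remembered number, the built-in `v8` module exposes it through `v8.getHeapStatistics()`. The sketch below is a small, hypothetical `heapReport` helper that converts the raw byte counts into megabytes.

```javascript
const v8 = require('v8');

// Summarises how much of the V8 heap limit this process is currently using.
function heapReport() {
  const { heap_size_limit, used_heap_size } = v8.getHeapStatistics();
  const toMb = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    limitMb: toMb(heap_size_limit),
    usedMb: toMb(used_heap_size),
    percentUsed: Number(((used_heap_size / heap_size_limit) * 100).toFixed(1)),
  };
}

const report = heapReport();
console.log(report);
```

Logging this at startup makes it easy to confirm what limit a deployment is actually running with.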
Memory leaks are a common issue in Node.js, and are caused when objects are referenced for too long - in other words, when a variable is stored even though it is no longer needed. Normally, garbage collection frees up unused memory, making it available for your application to use again. However, the garbage collector cannot free up memory used by these variables that have hung around long past their expiration date. If your application's memory usage is growing steadily and not periodically being reduced by garbage collection, you may well have a memory leak, which should be addressed.
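The "growing steadily and not periodically being reduced" pattern can be checked roughly in code. The sketch below is a hypothetical heuristic, not an established algorithm: it samples `process.memoryUsage().heapUsed`, splits the samples into windows, and treats the lowest reading in each window as an approximation of the heap size just after a garbage collection. If that floor rises in every window, the process is flagged. The `looksLikeLeak` name, the window size, and the sample data are all assumptions for illustration.

```javascript
// Hypothetical heuristic: the lowest reading in each window approximates the
// post-GC floor; a floor that rises in every window suggests a leak.
function looksLikeLeak(samples, windowSize = 5) {
  const floors = [];
  for (let i = 0; i + windowSize <= samples.length; i += windowSize) {
    floors.push(Math.min(...samples.slice(i, i + windowSize)));
  }
  if (floors.length < 3) return false; // not enough data to judge a trend
  return floors.every((floor, i) => i === 0 || floor > floors[i - 1]);
}

// Live sampling: record heapUsed once per second without keeping the process alive.
const heapSamples = [];
const sampler = setInterval(() => heapSamples.push(process.memoryUsage().heapUsed), 1000);
sampler.unref();

// Synthetic data (arbitrary units) to show the two shapes.
const healthy = [10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30];
const leaky = [10, 20, 30, 12, 22, 32, 14, 24, 34, 16, 26, 36, 18, 28, 38];

console.log(looksLikeLeak(healthy)); // false: the floor stays at 10
console.log(looksLikeLeak(leaky)); // true: the floor climbs 10, 14, 18
```

In a running service, calling `looksLikeLeak(heapSamples)` periodically gives an early warning; a positive result is a prompt to investigate, not proof of a leak.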
In a perfect world, you’d focus on preventing memory leaks rather than diagnosing and debugging them. Once a leak is present in your application, it can be extremely difficult to track down the root cause. You’ll need to take heap snapshots of your application over time and inspect them to really dig into the memory usage of your Node.js application.
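One way to take those snapshots from inside the process is the built-in `v8.writeHeapSnapshot()`. The sketch below wraps it in a hypothetical `takeHeapSnapshot` helper; the temp-directory location and the file naming scheme are assumptions chosen for the example.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Writes a heap snapshot to the temp directory and returns the file path.
// Generating a snapshot is synchronous and blocks the event loop while it runs.
function takeHeapSnapshot(label) {
  const file = path.join(
    os.tmpdir(),
    `${label}-${process.pid}-${Date.now()}.heapsnapshot`
  );
  return v8.writeHeapSnapshot(file);
}

const snapshotFile = takeHeapSnapshot('baseline');
console.log(`heap snapshot written to ${snapshotFile}`);
```

Taking one snapshot as a baseline and another after the application has been under load for a while, then comparing the two in the Memory tab of Chrome DevTools, shows which object types are accumulating between them.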
Lag in the Node.js Event Loop
One of the core strengths of Node.js is that it’s fast. It’s been built to process multiple events quickly and asynchronously. This strength comes from the event loop, which allows applications to respond to these events rapidly.
Understanding when and why the event loop is slowing down is important when optimizing an application for speed. As each cycle of the event loop slows down, each event will take longer to process and act on. Functionally, this can slow Node.js down to the point of unresponsiveness.
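A simple way to observe this slowdown is to schedule a repeating timer and record how late each firing is: if the loop is busy, the timer cannot run on time. The sketch below takes that approach; the `startLagMonitor` and `blockFor` helpers, the 20 ms interval, and the simulated 200 ms block are assumptions for illustration. Node.js also ships `perf_hooks.monitorEventLoopDelay()`, which collects similar data into a histogram.

```javascript
const { performance } = require('perf_hooks');

const lagSamples = [];

// Schedules a repeating timer and reports how late each firing is, in milliseconds.
function startLagMonitor(intervalMs, onSample) {
  let expected = performance.now() + intervalMs;
  return setInterval(() => {
    const now = performance.now();
    onSample(Math.max(0, now - expected));
    expected = now + intervalMs;
  }, intervalMs);
}

// Hypothetical stand-in for long synchronous work that holds the event loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait
  }
}

const monitor = startLagMonitor(20, (lagMs) => lagSamples.push(lagMs));

setTimeout(() => blockFor(200), 50); // simulate a slow tick
setTimeout(() => {
  clearInterval(monitor);
  console.log(`worst lag: ${Math.max(...lagSamples).toFixed(0)} ms`);
}, 400);
```

The worst sample lands close to the length of the blocking call, which is the delay every other pending event experienced during that tick.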
Some common causes of event loop lag include:
Long-running, synchronous tasks
Spending too much time during a single tick of the event loop can be the source of performance issues. You can't eliminate the CPU-bound work your server does, but you do need to be mindful of how long you're spending on it at any given time. If the work takes longer than your acceptable response time, it might make sense to perform that work in a different process.
Constant increase in the tasks per loop
Node.js keeps track of all the functions and callbacks that need to be handled at the various stages of the event loop. As your server sees an increase in load, the number of tasks per loop starts to increase. Your users will start to see an increase in response times when this count gets too high. The good news is that scaling up the number of processes running your application can often alleviate this, and return your website's performance to normal levels.
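On a single machine, one way to scale up the number of processes is the built-in `cluster` module, which lets several worker processes share one listening port. The sketch below is a minimal illustration under assumptions: the `startCluster` helper, the trivial HTTP handler, and the choice of one worker per CPU core are hypothetical, and many deployments use a process manager or container orchestrator instead.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// In the primary process: fork the workers and replace any that exit.
// In a worker process: start an HTTP server on the shared port.
function startCluster(port, workerCount = os.cpus().length) {
  if (cluster.isWorker) {
    http
      .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
      .listen(port);
    return;
  }

  for (let i = 0; i < workerCount; i++) cluster.fork();

  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited, starting a replacement`);
    cluster.fork();
  });
}
```

Calling `startCluster(3000)` at the bottom of the application's entry file would start the workers. Each worker has its own event loop, so the tasks queued per loop drop roughly in proportion to the number of workers sharing the load.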
Just one more thing...
We've built out a ton of tooling around performance monitoring in production for Node.js apps with N|Solid, including the metrics in this article. If you'd like to start monitoring Node.js in production with tooling built exclusively for Node.js, give N|Solid a shot.
If you want to stay in the (event) loop with the tools, tutorials, tips, and more around Node.js performance, be sure to follow @NodeSource on Twitter and keep an eye on the NodeSource Blog to keep up to date.