A few months ago I began working on one of our backend Node.js applications. The application in question was not as stable as we needed it to be, and as I began to talk with former colleagues at another company I found that this is a common experience with Node.js apps. I was not alone.
As it turned out, the core of the problem in this application was the lack of proper exception handling. Node.js has a rather idiosyncratic way of handling exceptions, which can be unintuitive to developers unfamiliar with the core Node APIs. It isn’t obvious at first glance when errors in a large, complex Node application aren’t being handled properly. I suspect that this may be the root cause of many stability problems in Node applications.
Everything is asynchronous
One main difference between Node and most other environments is that almost every I/O operation is asynchronous by default. This is in stark contrast to Java, for example, where the programmer has to explicitly use a concurrency API to make an operation asynchronous. Node.js is asynchronous unless the programmer goes out of their way to make the code synchronous.
In most of the Node core APIs, try/catch cannot be used to properly handle exceptions. This is because an error arising in an asynchronous operation typically surfaces after the try/catch block where the operation was started has already exited.
For example, say we want to open a file called “file.txt” and print its contents. To do this asynchronously, we call fs.readFile and provide it with a callback function to execute upon completion.
var fs = require("fs");
fs.readFile("file.txt", "utf8", function (err, contents) {
  // The first argument is ignored for now
  console.log(contents);
});
So far so good. But what happens when the operation fails? JavaScript includes the try/catch language feature familiar to users of Java which, at first glance, appears to be the way to go: wrap the call in a try block and deal with the failure in the catch block.
Unfortunately, this doesn’t work as expected. The file operation is asynchronous, and try/catch only catches exceptions thrown synchronously inside the block. By the time the I/O error surfaces, the try block has already exited, so the error will not be handled.
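To make the failure concrete, here is a minimal sketch of the try/catch attempt. The helper name readWithTryCatch and the events array are illustrative choices of mine, and the sketch assumes that no file named no-such-file.txt exists in the working directory.

```javascript
var fs = require("fs");

// Hypothetical helper: tries to guard an asynchronous read with try/catch
// and records the order in which things happen.
function readWithTryCatch(path, done) {
  var events = [];
  try {
    fs.readFile(path, "utf8", function (err, contents) {
      // Runs on a later turn of the event loop, after the try block exited.
      // The I/O failure arrives here as `err`; nothing is thrown.
      events.push(err ? "callback got error" : "callback got contents");
      done(events);
    });
    events.push("try block finished");
  } catch (e) {
    // Only reached if fs.readFile itself throws synchronously,
    // for example when it is given invalid arguments.
    events.push("caught synchronously");
    done(events);
  }
}

readWithTryCatch("no-such-file.txt", function (events) {
  console.log(events);
});
```

Assuming the file is missing, "try block finished" is recorded before "callback got error". The catch block never runs, because the error is delivered as an argument to the callback rather than thrown.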
How Node.js handles exceptions
When an asynchronous function in Node is called, it is typically passed a callback argument that is executed when the operation completes. This callback pattern is ubiquitous across Node.js applications because it fits naturally with the event loop.
The most common way of asynchronously signaling an exception is to pass the error object in as the first argument of the callback. The core Node.js API is designed to follow this convention and most third-party libraries also follow it.
Following our example above, let’s implement error handling the asynchronous way. Now we can check if an error has been produced in the callback itself.
var fs = require("fs");
fs.readFile("file.txt", "utf8", function (err, contents) {
  if (err) {
    // The I/O error arrives as the first argument
    console.error("An error occurred!", err);
  } else {
    console.log(contents);
  }
});
This allows us to properly handle the error condition.
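The same convention applies when we write our own asynchronous functions. As a rough sketch, a hypothetical helper called readJson (my own name, not part of any library) could pass errors to its caller like this:

```javascript
var fs = require("fs");

// Hypothetical helper: reads a file and parses it as JSON, reporting
// any failure through the first argument of the callback.
function readJson(path, callback) {
  fs.readFile(path, "utf8", function (err, text) {
    if (err) {
      // Pass the I/O error along instead of throwing it
      return callback(err);
    }
    var data;
    try {
      // JSON.parse throws synchronously, so try/catch is valid here
      data = JSON.parse(text);
    } catch (parseErr) {
      return callback(parseErr);
    }
    // By convention, a null first argument means success
    callback(null, data);
  });
}

readJson("config.json", function (err, config) {
  if (err) {
    console.error("Could not load config:", err.message);
  } else {
    console.log(config);
  }
});
```

The file name config.json is only a placeholder. Note that try/catch is still useful inside the callback for synchronous calls such as JSON.parse; the caught error is then handed to the caller the asynchronous way.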
Unexpected exceptions
It is worth thinking about what we mean by exception. An exception may signal that an undesired but predictable state in the program has been reached, such as an I/O error.
There are also unexpected exceptions – things that the programmer did not, and perhaps could not, anticipate. This category can include errors emitted by core libraries and third-party packages. We still want to be able to handle these cases, but we don’t know when or how they will arise.
This second category – unexpected exceptions – is the critical piece in Node.
Domains
The Node.js API offers a solution to unexpected exceptions in the form of “domains.” A domain is essentially a closure with an error handler that can wrap sections of code. It provides a means of catching exceptions occurring in the enclosed code, including those thrown by asynchronous operations started there. In effect, domains serve as an async-friendly version of try/catch.
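Here is a minimal sketch of a domain in action. The wrapper function runInDomain is a hypothetical helper of my own; domain.create, on('error') and run are the calls provided by the domain module. Be aware that current Node.js documentation marks the domain module as deprecated, although it is still available.

```javascript
var domain = require("domain");

// Hypothetical helper: runs `work` inside a fresh domain and reports
// errors thrown by asynchronous work started there to `onError`.
function runInDomain(work, onError) {
  var d = domain.create();
  d.on("error", onError);
  d.run(work);
  return d;
}

runInDomain(function () {
  // Thrown on a later turn of the event loop, so a try/catch around
  // the call to setTimeout would never see it.
  setTimeout(function () {
    throw new Error("unexpected failure");
  }, 10);
}, function (err) {
  console.error("Caught by domain:", err.message);
});
```

Without the domain, the error thrown inside the timer would go uncaught and terminate the process.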
Learning the hard way
The application I worked on had as its central component an Express HTTP server. I added some middleware to attach a domain to the request, which I thought would resolve the issue. This worked well for errors emitted within the life cycle of the request. Unfortunately, the main issue affecting stability in this application ended up being another component that ran outside of the Express HTTP server – a service that talks to Redis. Creating another domain wrapping that component brought it under control.
Below is a simple example of how to use domains to handle errors in such an application where Express runs alongside another component.
In this example, we initialize an Express server and attach a domain through the middleware. We then create another domain for the non-Express component and initialize it through the domain’s run method, and we design our non-Express component to initialize in a start method. We have a separate error handler component called errHandler that encapsulates all of our error handling logic.
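var express = require('express'),
    domain = require('domain'),
    errHandler = require('./errHandler'),
    componentOutsideExpress = require('./component');

var dom = domain.create(),
    server = express();

// Middleware that attaches a domain to each incoming request
server.use(errHandler.handleExpressError);
// Route errors from the non-Express component to our error handler
dom.on('error', errHandler.handleError);
// Start the non-Express component inside its own domain
dom.run(componentOutsideExpress.start);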
It’s not done for you
It is important to realize that in any Node.js application, handling of uncaught errors is not done automatically. There is no global domain set by default. If an error is thrown somewhere in the guts of your application and nothing is in place to catch it, that error can bring down the Node process, with dire consequences for application stability.
To evaluate your Node.js application’s error handling, I suggest carefully diagramming its component architecture. Make sure you aren’t running any code outside of a domain scope, and don’t make the assumption that exceptions will not arise in any given component. Ideally, handling of unexpected errors should be a part of your design process when you first build a Node.js application. By consistently using domains we can achieve greater application stability.
Originally published March 23, 2015