Getting the Most Out of Your Node.js Errors

How many times have you found yourself viewing a stack trace in your terminal or inside your monitoring systems and been unable to understand anything from it? If the answer is ‘a lot’, then this blog post is for you. If you do not suffer from this problem often, you might still find this interesting.

When a Node.js server handles complex flows, getting the most out of the errors it returns to the requesting party is vital. The problem starts when an error created while handling a request causes a second error to be created somewhere higher up in the chain. Once you generate that new error and pass it up the chain, you lose all relation to the original error.
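
To make the problem concrete, here is a minimal sketch in plain JavaScript with no libraries. The function names (`readConfig`, `loadSettings`) are hypothetical and exist only for illustration. The higher-level function catches the low-level failure and throws a new error, so the original message is gone by the time the caller sees it.

```javascript
// Hypothetical low-level step that fails with a detailed error.
function readConfig() {
    throw new Error('ENOENT: config.json not found');
}

// Hypothetical higher-level step that replaces the failure with a new error.
function loadSettings() {
    try {
        return readConfig();
    }
    catch (err) {
        // The original `err` is dropped here, so its message and stack are lost.
        throw new Error('Failed to load settings');
    }
}

var caught;
try {
    loadSettings();
}
catch (err) {
    caught = err;
}

console.log(caught.message);                  // only the top-level message survives
console.log(caught.stack.includes('ENOENT')); // false: no trace of the root cause
```

Anyone reading the log sees that settings failed to load, but not why.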

At Codefresh, we spent a lot of time trying to find the best patterns for dealing with these scenarios. What we really want is the ability to create an error that can be chained to a previous error and can aggregate information from that chain. We also want the interface for doing this to be very simple, yet extensible for future enhancements.

We searched for existing modules that could support our needs. The only module we found that answered some of our requirements was WError.
‘WError’ provides you with the ability to wrap existing errors with new ones. The interface is very cool and simple, so we gave it a go. After a period of intensive use, we came to the conclusion that it was not enough:

The stack trace of an error did not cover the whole chain; it showed only the stack trace of the topmost error that was generated.
It lacked the ability to easily create your own types of errors.
Extending the errors with additional behavior would require extending their code.

Introducing CFError

Drawing on that experience, we built an error module that answers all of our requirements. You can find all the information and documentation here: http://codefresh-io.github.io/cf-errors.

Let’s see how you can use CFError with a real example using Express. We will create an Express app that will handle a single request to a specific route. This request will handle a query for getting a single user from a Mongo database. We will define a single route and an additional function that will be in charge of actually retrieving the user from the db.

var CFError    = require('cf-errors');
var Errors     = CFError.Errors;
var Q          = require('q');
var express    = require('express');

var UserNotFoundError = {
    name: "UserNotFoundError"
};

var app = express();

app.get('/user/:id', function (request, response, next) {
    var userId = request.params.id;
    if (userId !== "coolId") {
        return next(new CFError(Errors.Http.BadRequest, {
            message: "Id must be coolId.",
            internalCode: 4001, // no leading zero: 04001 would be parsed as a legacy octal literal
            recognized: true
        }));
    }

    findUserById(userId)
        .done((user) => {
            response.send(user);
        }, (err) => {
            if (err.name === UserNotFoundError.name) {
                next(new CFError(Errors.Http.NotFound, {
                    internalCode: 4041,
                    cause: err,
                    message: `User ${userId} could not be found`,
                    recognized: true
                }));
            }
            else {
                next(new CFError(Errors.Http.InternalServer, {
                    internalCode: 5001,
                    cause: err
                }));
            }
        });
});

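// `User` is assumed to be a Mongoose model defined elsewhere in the app.
var findUserById = function (userId) {
    // exec() returns a promise; wrapping it with Q() gives the route access to Q's .done().
    return Q(User.findOne({_id: userId}).exec())
        .then((user) => {
            if (user) {
                return user;
            }
            else {
                return Q.reject(new CFError(UserNotFoundError, `Failed to retrieve user: ${userId}`));
            }
        });
};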

A few things to notice:

When creating an error, you can pass one of the predefined HTTP errors and then extend it.
You can add a ‘cause’ property when creating an error to chain a previous error to the new one. When you print the stack of an error, you get the full stack trace of the whole chain in a readable form.
You can add any additional fields you want to the error object. We will explain the use of ‘internalCode’ and ‘recognized’ later.
You can define your error objects outside of your code and then just reference them when creating an error.
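
The chaining idea can be pictured with a small stand-alone sketch. This is not CFError's implementation: `ChainedError`, `fullStack`, and the 'Caused by: ' separator are names assumed purely for illustration, and the real module's output format may differ.

```javascript
// Illustrative stand-in for a chainable error; NOT the cf-errors implementation.
class ChainedError extends Error {
    constructor(name, options) {
        super(options.message);
        this.name = name;
        this.cause = options.cause; // previous error in the chain, if any
    }

    // Joins the stack of every error in the chain, top-level error first.
    fullStack() {
        var parts = [];
        for (var current = this; current; current = current.cause) {
            parts.push(current.stack);
        }
        return parts.join('\nCaused by: ');
    }
}

var dbError = new Error('connection refused');
var notFound = new ChainedError('UserNotFoundError', {
    message: 'Failed to retrieve user: 42',
    cause: dbError
});
var httpError = new ChainedError('NotFoundError', {
    message: 'User 42 could not be found',
    cause: notFound
});

// Prints three stacks, one per error in the chain.
console.log(httpError.fullStack());
```

Printing once at the top of the chain is enough to see every error that led to the failure, which is the behavior the list above describes.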

Let’s go ahead and add an error middleware to our Express app.

app.use(function (err, request, response, next) {
    var error;
    if (!(err instanceof CFError)){
        error = new CFError(Errors.Http.InternalServer, {
            cause: err
        }); 
    }
    else {
        if (!err.statusCode){
            error = new CFError(Errors.Http.InternalServer, {
                cause: err
            });
        }
        else {
            error = err;
        }
    }
    
    console.error(error.stack);
    return response.status(error.statusCode).send(error.message);
});

A few things to notice:

We make sure that the final error printed to the log and returned to the user is always a ‘CFError’ object. This lets you add further logic to the error middleware.
All predefined HTTP errors have a built-in ‘statusCode’ property and a ‘message’ property already populated for your use.
Extending your errors lets you keep all the error-handling logic in one place. You do not need to print every error object when it is created; instead, you print the stack trace once and get the whole execution flow and context.

Let’s now change the way we return errors to our clients and return an object instead of just the top level error message.

return response.status(error.statusCode).send({
    message: error.message,
    statusCode: error.statusCode,
    internalCode: error.internalCode
});

Great! We now have a unified process of returning errors to our clients.

Reporting errors to monitoring systems

At Codefresh, we use New Relic as our APM system. We noticed that the errors we generated and reported to New Relic fell into two groups. The first consisted of errors caused by unexpected behavior of our servers, such as thrown exceptions. The second (business exceptions) consisted of errors that we generated deliberately after our servers analyzed a request and handled it correctly. Reporting the second type of error to New Relic lowered our Apdex score in unpredictable ways, which triggered false-positive alarms from our alerting systems.

So we came up with a new convention. Whenever we conclude that a generated error is a result of correct behavior of our system, we construct an error and attach an additional field named ‘recognized’ to it. We wanted the ability to put the ‘recognized’ field on a specific error in the chain, but still be able to get its value even if higher errors did not contain this field. We exposed a function on the CFError object named ‘getFirstValue’ which will retrieve the first value it encounters in the whole chain. Let’s see how we use this in Codefresh.
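
Before the real middleware, here is a rough stand-alone picture of what such a lookup might do. The `firstValueInChain` helper and the plain `Error` objects linked through `cause` are assumptions made for this sketch; CFError's own ‘getFirstValue’ may be implemented differently.

```javascript
// Walks an error chain (linked through `cause`) and returns the first
// defined value found for `field`, starting from the top-level error.
// Hypothetical helper for illustration; not part of cf-errors.
function firstValueInChain(error, field) {
    for (var current = error; current; current = current.cause) {
        if (current[field] !== undefined) {
            return current[field];
        }
    }
    return undefined;
}

var original = new Error('User 42 does not exist');
original.recognized = true; // marked where the business rule was detected

var wrapper = new Error('Could not load profile');
wrapper.cause = original;   // the wrapper itself has no `recognized` field

console.log(firstValueInChain(wrapper, 'recognized'));   // true
console.log(firstValueInChain(wrapper, 'internalCode')); // undefined
```

The middleware below uses the actual ‘getFirstValue’ function exposed by CFError.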

app.use(function (err, request, response, next) {
    var error;
    if (!(err instanceof CFError)){
        error = new CFError(Errors.Http.InternalServer, {
            cause: err
        });
    }
    else {
        if (!err.statusCode){
            error = new CFError(Errors.Http.InternalServer, {
                cause: err
            });
        }
        else {
            error = err;
        }
    }

    if (!error.getFirstValue('recognized')){
        nr.noticeError(error); // report to New Relic; `nr` is the agent returned by require('newrelic')
    }

    console.error(error.stack);
    return response.status(error.statusCode).send({
        message: error.message,
        statusCode: error.statusCode,
        internalCode: error.internalCode
    });
});

A few things to notice:

Because we already know we are only dealing with CFError objects, we only had to add two lines of code to support this.
Since we explicitly decide which errors to send, New Relic users need to disable the automatic reporting of all errors. Currently, the way to do this is to add every HTTP error status code to the ‘ignore_status_codes’ field in the ‘newrelic.js’ config file. We have already opened a ticket with the New Relic support team asking for an easier way to do this.
```
exports.config = {
  error_collector: {
    ignore_status_codes: [400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 421, 422, 423, 426, 428, 429, 431, 451, 500, 501, 502, 503, 504, 505, 506, 507, 508, 510, 511]
  }
};
```
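
Maintaining that list by hand is easy to get wrong. One possible alternative, sketched below, is to derive it from the `http.STATUS_CODES` map that ships with Node.js. The resulting list depends on the Node.js version and may not match the hand-written one exactly, and the `errorStatusCodes` and `config` names are ours, so treat this as a starting point rather than an official New Relic recommendation.

```javascript
var http = require('http');

// Collect every 4xx and 5xx status code known to this Node.js version.
var errorStatusCodes = Object.keys(http.STATUS_CODES)
    .map(Number)
    .filter(function (code) {
        return code >= 400 && code <= 599;
    });

// Same shape as the 'newrelic.js' fragment above.
var config = {
    error_collector: {
        ignore_status_codes: errorStatusCodes
    }
};

console.log(config.error_collector.ignore_status_codes);
```

In an actual ‘newrelic.js’ file, the object would be assigned to `exports.config` as in the fragment above.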

In conclusion

Getting the best out of your errors requires not only a good error module, but also well-defined processes for when, where, and how you create and report them. You will need to follow the patterns you define; otherwise things tend to get messy.
Reporting only real errors to a monitoring system is vital for your ability as a company to detect and solve problems after they have occurred.

CFError github page: https://github.com/codefresh-io/cf-errors
CFError npm page: https://www.npmjs.com/package/cf-errors

2 thoughts on “Getting the Most Out of Your Node.js Errors”

Nadav Ivry says:

August 31, 2016 at 3:05 pm

I implemented this module today in my Express app and it’s great. thanks!

One thing I’m still thinking about is the best way to send Raygun (which is the service I’m using for error reporting) all the data of each error.

Currently, I’m sending the error object to Raygun, but it seems that Raygun doesn’t know how to process the object, and as a result, it displays the error details poorly compared to the regular reporting approach provided by Raygun’s official module.

Itai Gendler says:
  
  September 1, 2016 at 9:12 pm
  
  Thanks for your comment, Nadav. I have evaluated Raygun in the past and faced the same problem you are facing right now.
  Raygun’s Node.js library parses the error stack in a specific way: it takes each line of the stack and pushes it to an array as a custom object. The stack field of a CFError object may contain multiple stacks (through the cause chain), and these will not be shown nicely because Raygun can’t handle the ‘Cause by’ string. The result is one huge stack with no way to tell where a specific part of it starts and ends. Raygun does not allow passing just a string; it expects a specific object, which seemed very nice at first but turned out not to be extensible. This was one of the reasons we didn’t continue using it. Regarding the other issue you mentioned, you can pass additional information to Raygun using a second parameter, so you can write your own logic and append every field you like to provide extra context on the error.