How many times have you found yourself viewing a stack trace in your terminal or inside your monitoring systems and been unable to understand anything from it? If the answer is ‘a lot’, then this blog post is for you. If you do not suffer from this problem often, you might still find this interesting.
When a Node.js server handles complex flows, getting the most out of the errors it returns to the requesting party is vital. The problem starts when an error created while handling a request causes another error to be created higher up in the chain. Once you create that new error and pass it up the chain, you lose all relation to the original error.
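To make the problem concrete, here is a minimal sketch in plain JavaScript. The function names and messages are hypothetical and only serve the illustration: a low-level failure is caught and replaced by a more general error, and the detail of the original failure disappears.

```javascript
// Hypothetical low-level function: fails with a detailed, specific error.
function readUserRecord(userId) {
  throw new Error('connection timed out while reading user ' + userId);
}

// Hypothetical higher-level handler: catches the failure and creates a new,
// more general error. Nothing links the new error back to the original one.
function handleGetUser(userId) {
  try {
    return readUserRecord(userId);
  } catch (originalError) {
    throw new Error('Failed to get user');
  }
}

let reported;
try {
  handleGetUser('42');
} catch (err) {
  reported = err;
}

// Only the top-level message and stack are left. The stack starts in
// handleGetUser, so neither the timeout detail nor readUserRecord appears.
console.log(reported.stack);
```

Whoever reads this log sees that getting the user failed, but not why.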
At Codefresh, we spent a lot of time trying to find the best patterns for dealing with these scenarios. What we really want is the ability to create an error that is chained to a previous error and to aggregate information from the whole chain. We also want the interface for doing this to be very simple, yet extensible for future enhancements.
We searched for existing modules that could support our needs. The only module we found that answered some of our requirements was WError.
‘WError’ provides you with the ability to wrap existing errors with new ones. The interface is very cool and simple, so we gave it a go. After a period of intensive use, we came to the conclusion that it was not enough:
- The stack trace of an error did not cover the whole chain; it showed only the stack trace of the topmost error that was generated.
- It lacked the ability to easily create your own types of errors.
- Extending the errors with additional behavior would require extending their code.
Introducing CFError
Drawing on that experience, we built an error module that answers all of these requirements. You can find all the information and documentation here: http://codefresh-io.github.io/cf-errors.
Let’s see how you can use CFError with a real example using Express. We will create an Express app that will handle a single request to a specific route. This request will handle a query for getting a single user from a Mongo database. We will define a single route and an additional function that will be in charge of actually retrieving the user from the db.

    var CFError = require('cf-errors');
    var Errors = CFError.Errors;
    var Q = require('q');
    var express = require('express');
    // Assumption: a Mongoose model defined elsewhere (hypothetical path).
    var User = require('./models/user');

    // Error definition kept outside the request-handling code.
    var UserNotFoundError = {
        name: "UserNotFoundError"
    };

    var app = express();

    app.get('/user/:id', function (request, response, next) {
        var userId = request.params.id;
        if (userId !== "coolId") {
            return next(new CFError(Errors.Http.BadRequest, {
                message: "Id must be coolId.",
                internalCode: 4001, // no leading zero: 04001 would be parsed as an octal literal
                recognized: true
            }));
        }

        findUserById(userId)
            .done((user) => {
                response.send(user);
            }, (err) => {
                if (err.name === UserNotFoundError.name) {
                    // Expected case: chain the lookup error to a 404.
                    next(new CFError(Errors.Http.NotFound, {
                        internalCode: 4041,
                        cause: err,
                        message: `User ${userId} could not be found`,
                        recognized: true
                    }));
                } else {
                    // Unexpected case: chain the original error to a 500.
                    next(new CFError(Errors.Http.InternalServer, {
                        internalCode: 5001,
                        cause: err
                    }));
                }
            });
    });

    var findUserById = function (userId) {
        // Wrap the Mongoose promise in Q so that callers can use .done().
        return Q(User.findOne({_id: userId}).exec())
            .then((user) => {
                if (user) {
                    return user;
                } else {
                    return Q.reject(new CFError(UserNotFoundError, `Failed to retrieve user: ${userId}`));
                }
            });
    };

A few things to notice:
- When creating an error, you can pass one of the predefined HTTP errors and then extend it with your own fields.
- You can add a ‘cause’ property when creating an error to chain a previous error to the new one. When you print the stack of an error, you get the full stack trace of the whole chain in a readable format.
- You can add any additional fields you want to the error object. We will explain the use of ‘internalCode’ and ‘recognized’ later.
- You have the ability to define your error objects outside of your code and then just reference them when creating an error.
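The chaining idea from the notes above can be sketched without any library. The `ChainedError` class and the `fullStack` helper below are hypothetical and are not the cf-errors implementation; they only illustrate how a ‘cause’ link and extra fields make a full-chain stack trace possible.

```javascript
// Hypothetical minimal chained error; NOT how cf-errors is implemented.
class ChainedError extends Error {
  constructor(message, { cause, ...extraFields } = {}) {
    super(message);
    this.name = 'ChainedError';
    // Link to the error that triggered this one (may be undefined).
    this.cause = cause;
    // Copy any additional fields, such as internalCode, onto the error.
    Object.assign(this, extraFields);
  }
}

// Walk the cause links and join the stack of every error in the chain.
function fullStack(err) {
  const parts = [];
  for (let current = err; current; current = current.cause) {
    parts.push(current.stack || String(current));
  }
  return parts.join('\nCaused by: ');
}

const dbError = new Error('connection timed out');
const notFound = new ChainedError('User 42 could not be found', {
  cause: dbError,
  internalCode: 4041
});

// Prints the stack of the top error followed by the stack of its cause.
console.log(fullStack(notFound));
```

Printing once at the top of the chain is enough here, because each error carries a reference to the one below it.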
Let’s go ahead and add an error middleware to our Express app.

    app.use(function (err, request, response, next) {
        var error;
        // Wrap anything that is not a CFError with a status code
        // in an internal server error, keeping the original as the cause.
        if (!(err instanceof CFError) || !err.statusCode) {
            error = new CFError(Errors.Http.InternalServer, {
                cause: err
            });
        } else {
            error = err;
        }

        console.error(error.stack);
        return response.status(error.statusCode).send(error.message);
    });

A few things to notice:
- We make sure that the final error that is printed to the log and returned to the user is always a ‘CFError’ object. This will allow you to add additional logic to the error middleware.
- All predefined HTTP errors come with a built-in ‘statusCode’ property and a ‘message’ property already populated for your use.
- Extending your errors allows you to keep all the error handling logic in one place. You do not need to print every error object when it is created. Instead, you print the stack trace only once and get the whole execution flow and context.
Let’s now change the way we return errors to our clients and return an object instead of just the top level error message.

    return response.status(error.statusCode).send({
        message: error.message,
        statusCode: error.statusCode,
        internalCode: error.internalCode
    });

Great! We now have a unified process of returning errors to our clients.
Reporting errors to monitoring systems
At Codefresh, we use New Relic as our APM system. We noticed that the errors we generated and reported to New Relic fell into two groups. The first consisted of errors caused by thrown exceptions and other unexpected behavior of our servers. The second (business exceptions) consisted of errors that our servers generated deliberately, as part of correctly analyzing and handling a request. Reporting the second type to New Relic lowered our Apdex score in unpredictable ways, which triggered false-positive alarms in our alerting systems.
So we came up with a new convention. Whenever we conclude that a generated error results from correct behavior of our system, we construct the error with an additional field named ‘recognized’. We wanted to be able to put the ‘recognized’ field on a specific error in the chain and still get its value even if the errors above it did not contain the field. For this, CFError objects expose a function named ‘getFirstValue’, which retrieves the first value it encounters for a field in the whole chain. Let’s see how we use this at Codefresh.

    // Assumes the New Relic agent was loaded earlier: var nr = require('newrelic');
    app.use(function (err, request, response, next) {
        var error;
        if (!(err instanceof CFError) || !err.statusCode) {
            error = new CFError(Errors.Http.InternalServer, {
                cause: err
            });
        } else {
            error = err;
        }

        // Report only errors that no one in the chain marked as recognized.
        if (!error.getFirstValue('recognized')) {
            nr.noticeError(error); // report to monitoring systems (New Relic in our case)
        }

        console.error(error.stack);
        return response.status(error.statusCode).send({
            message: error.message,
            statusCode: error.statusCode,
            internalCode: error.internalCode
        });
    });

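For readers who want a feel for the chain lookup used in this middleware, here is a rough sketch of how a first-value lookup could work. The `firstValueInChain` helper is hypothetical and is not the cf-errors code; it assumes that the lookup starts at the top error, follows the ‘cause’ links, and skips errors where the field is undefined.

```javascript
// Hypothetical helper; cf-errors ships its own getFirstValue method.
function firstValueInChain(err, field) {
  // Assumption: start at the top error and walk down through the causes.
  for (let current = err; current; current = current.cause) {
    if (current[field] !== undefined) {
      return current[field];
    }
  }
  return undefined;
}

// The low-level error is marked as recognized; the error above it is not.
const lowLevel = Object.assign(new Error('User 42 does not exist'), {
  recognized: true
});
const topLevel = Object.assign(new Error('Request failed'), {
  cause: lowLevel
});

const recognized = firstValueInChain(topLevel, 'recognized');
const internalCode = firstValueInChain(topLevel, 'internalCode');

console.log(recognized, internalCode);
```

With a lookup like this, the code that knows an error is expected can mark it once, and the middleware at the top still sees the mark.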
A few things to notice:
- Because we already know we are only dealing with CFError objects, we only had to add two lines of code to support this.
- Since we explicitly decide which errors to send, New Relic users need to disable the automatic reporting of all errors. Currently, the way to achieve this is to add every HTTP error status code to the ‘ignore_status_codes’ field in the ‘newrelic.js’ config file. We have opened a ticket with the New Relic support team asking for an easier way to do this.

    exports.config = {
        error_collector: {
            ignore_status_codes: [
                400, 401, 402, 403, 404, 405, 406, 407, 408, 409,
                410, 411, 412, 413, 414, 415, 416, 417, 418, 421,
                422, 423, 426, 428, 429, 431, 451,
                500, 501, 502, 503, 504, 505, 506, 507, 508, 510, 511
            ]
        }
    };

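If you would rather not maintain that list by hand, one possible alternative is to derive it from the status codes that Node.js itself knows about. This is only a sketch: the variable names are ours, and the resulting list depends on your Node.js version, so it may not match the hand-written list above exactly.

```javascript
const http = require('http');

// Every status code known to this Node.js version that is 400 or above.
const errorStatusCodes = Object.keys(http.STATUS_CODES)
  .map(Number)
  .filter((code) => code >= 400)
  .sort((a, b) => a - b);

// In newrelic.js, this object would be assigned to exports.config.
const newRelicConfig = {
  error_collector: {
    ignore_status_codes: errorStatusCodes
  }
};

console.log(newRelicConfig.error_collector.ignore_status_codes);
```

The trade-off is that the list is no longer visible in the config file, so it is worth logging it once at startup.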
In conclusion
Getting the best out of your errors requires not only a good error module, but also well-defined conventions for when, where, and how errors are created, chained, and reported. Stick to those conventions; otherwise things tend to get messy.
Reporting only real errors to a monitoring system is vital for your ability as a company to detect and solve problems after they have occurred.
CFError github page: https://github.com/codefresh-io/cf-errors
CFError npm page: https://www.npmjs.com/package/cf-errors