Enterprise-grade Node.js Promises with Async and Bluebird
This post covers lessons learned at XO Group while adopting promises at an enterprise level, including the benefits and drawbacks of the Promise implementations currently available for Node.
Apologies in advance to all readers: this is not a short read, but it should be an informative one.
The concept of a Promise is nothing new in the programming world. Other languages have equivalents: C# has Task and Java has Fork/Join. Every new concept introduced to a language or framework brings questions about best practices, or the lack thereof. The JavaScript community is different in that the language is expressive but offers few guidelines for best practices; it is left to end users to decide for themselves. Finding that guidance is not easy either, because search engines act as an echo chamber: the same few pages with high click rankings keep bubbling up to the top.
In my opinion, the use of Promise in JavaScript, along with the best practices surrounding it, is a casualty of war, and the feature has inherited a bad rap as a result. I want to expose the beauty of Promise as an alternate control flow when best practices are applied thoroughly.
When I look at language or framework features I’m interested in, there are several items on my checklist to gauge whether a feature is a good candidate for incorporation into my/our workflow.
- Maintainable
  - Is it easy to refactor?
  - Does it obey some SOLID principles?
  - Can I find and define logic routes easily?
    - For debugging
    - For extension
  - How do we normalize our code base so these features read and behave the same throughout?
- Well Defined Structure
  - Can I read it easily and create a mental story?
    - Without pulling my hair out
    - And be able to keep it in context while looking at other stuff
- Catching Errors
  - How do we catch one?
  - How granular are they?
  - How can we act on them?
  - What is the unhappy path behavior?
  - How does it recover?
- Scalable
  - What would this look like if I had to apply it to...
    - One other code base
    - 100 other code bases
  - What would education be like for my fellow engineers if this were adopted?
- Performant
  - Does this run fast?
  - Does it run fast enough for me/us?
  - Does this make the development cycle faster?
  - Does it make onboarding faster?
Why Consider Promises?
Promises provide a control flow mechanism that treats reading comprehension as a first-class citizen. The default Node style of structuring code with callbacks often leads to the rightward-growing pyramid of death.
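// Assumption: `request` is the npm "request" package.
const request = require('request');

function doSomething(param, cb) {
  // Node callbacks are error-first: the error comes before the response.
  request.get('http://xyz.com/endpoint' + param, function (error, response) {
    cb(error, response);
    // This can keep growing out as you need more chaining involved.
  });
}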
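To make the pyramid concrete, here is a minimal sketch with three dependent steps. The functions `getUser`, `getOrders`, `getTotal` and `totalForUser` are hypothetical stand-ins, not part of any library; they simulate I/O with `setTimeout` and follow Node's error-first callback convention.

```javascript
// Hypothetical async steps that follow Node's error-first callback style.
function getUser(id, cb) {
  setTimeout(() => cb(null, { id: id, name: 'user-' + id }), 10);
}
function getOrders(user, cb) {
  setTimeout(() => cb(null, [{ price: 5 }, { price: 7 }]), 10);
}
function getTotal(orders, cb) {
  setTimeout(() => cb(null, orders.reduce((sum, o) => sum + o.price, 0)), 10);
}

// Each dependent step nests one level deeper, and every level
// repeats its own error check.
function totalForUser(id, cb) {
  getUser(id, function (error, user) {
    if (error) { return cb(error); }
    getOrders(user, function (error, orders) {
      if (error) { return cb(error); }
      getTotal(orders, function (error, total) {
        if (error) { return cb(error); }
        cb(null, total);
      });
    });
  });
}

totalForUser(1, function (error, total) {
  console.log(error || total);
});
```

Three steps already push the happy path four levels to the right, which is the shape the rest of this post tries to avoid.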
Promises are adaptable to regular Node callbacks for existing libraries and standalone callback functions.
var someModule = require('some-module');
// Promise adaptor: wraps an error-first Node callback API.
var someModulePromisified = function (param) {
  return new Promise((resolve, reject) => {
    someModule(param, (error, result) => {
      if (error) { reject(error); }
      else { resolve(result); }
    });
  });
};
// Using the Promise adaptor
someModulePromisified(1).then((result) => { ... });
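Node 8 and later ship `util.promisify`, which builds the same kind of adaptor for any function whose last argument is an error-first callback. A minimal sketch, where `someModule` is a hypothetical stand-in for a callback-style library function:

```javascript
const { promisify } = require('util');

// Hypothetical callback-style function standing in for a real library call.
function someModule(param, callback) {
  setTimeout(function () {
    if (typeof param !== 'number') {
      callback(new TypeError('param must be a number'));
    } else {
      callback(null, param * 2);
    }
  }, 10);
}

// util.promisify returns a function that returns a native Promise.
const someModulePromisified = promisify(someModule);

someModulePromisified(1)
  .then(function (result) { console.log('result:', result); })
  .catch(function (error) { console.error('failed:', error.message); });
```

The hand-written adaptor is still useful when a library's callback does not follow the error-first convention.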
Promises allow for easy handling of function composition, or the unfurling thereof.
f ∘ g
or
f(g(x))
In regular synchronous control flow, the composition above is written as nested calls.
doSomething2(doSomething1(param));
With Promises, it turns into
doSomething1(param).then(doSomething2);
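A small sketch of the same composition with more than two steps. The step functions are hypothetical; note that a `then` handler may return either a plain value or another Promise, and the chain waits for it either way.

```javascript
// Hypothetical steps: one asynchronous, two synchronous.
function doSomething1(param) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(param + 1); }, 10);
  });
}
function doSomething2(value) {
  return value * 2;
}
function doSomething3(value) {
  return 'result=' + value;
}

// Equivalent in spirit to doSomething3(doSomething2(doSomething1(param))),
// but read left to right in execution order.
function composed(param) {
  return doSomething1(param)
    .then(doSomething2)
    .then(doSomething3);
}

composed(1).then(console.log);
```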
Common Examples and What Not To Do
You will often see Promises used in this manner.
doSomething()
  .then(function () {
    return request.get('http://xyz.com/endpoint');
  })
  .then(function (response) {
    return response.status === 200 ? 'AWESOME' : 'FOOBAR';
  })
  .then(function (mapped) {
    if (mapped === 'FOOBAR') {
      throw new Error('unexpected status');
    }
    return mapped;
  })
  .catch(function (error) {
    console.error(error);
  });
Does a function return a promise?
Let’s talk about what is wrong with the above. First off, how do you know that doSomething() returns a Promise object for you to chain off of? You do not; without documentation it is a guessing game at best. There was a phase when Promise was trending and many authors created packages that returned one, but without reading through the code and looking at the tests, you just cannot be sure. Some package authors provide a dual interface that returns a Promise object only when no callback is supplied in the parameter list.
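When you cannot tell whether a function returns a Promise, one defensive option is to pass its result through a Promise's `resolve` function, which adopts a returned Promise and lifts a plain value into a resolved one. A minimal sketch; `maybeAsync` and `toPromise` are hypothetical names used only for illustration:

```javascript
// Hypothetical function with an undocumented return type:
// sometimes a Promise, sometimes a plain value.
function maybeAsync(flag) {
  return flag ? Promise.resolve('from promise') : 'plain value';
}

// Calling fn inside the executor means a synchronous throw
// also becomes a rejection instead of escaping the chain.
function toPromise(fn, ...args) {
  return new Promise(function (resolve) {
    resolve(fn(...args));
  });
}

toPromise(maybeAsync, true).then(console.log);
toPromise(maybeAsync, false).then(console.log);
```

This removes the guessing at the call site, though documenting the return type remains the better fix.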
Thenable chaining with anonymous functions, how do I keep all that context in my mind?
The example above is relatively short. In a real use case, each thenable block will most likely contain 10 or more lines of code. With several thenable blocks chained together, you quickly end up with a huge page of spaghetti code, which leads to quicker mental exhaustion while evaluating it.
What about incorporating this?
Within a thenable block, how do you use this? What does this inherit its context from anyway?
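With native promises, a handler written as a regular function is called without a receiver, so `this` is undefined in strict mode, while `Function.prototype.bind` is one native way to supply a context explicitly. A short sketch of both behaviors; the `demo` function and `context` object are hypothetical:

```javascript
function demo() {
  'use strict';

  const context = { url: 'http://xyz.com/endpoint' };

  // Regular function: native promises call handlers with `this` undefined.
  const seenByPlainFunction = Promise.resolve().then(function () {
    return this;
  });

  // Explicitly bound function: `this` is the object passed to bind().
  const seenByBoundFunction = Promise.resolve().then(function () {
    return this.url;
  }.bind(context));

  return Promise.all([seenByPlainFunction, seenByBoundFunction]);
}

demo().then(function (results) {
  console.log(results);
});
```

Binding every handler by hand gets noisy quickly, which is part of the motivation for a chain-wide context.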
A general catch is cool, but what if I needed to…
Do something specific for a single thenable block, like console.warn(), because it was just a validation error and does not need to ripple out as a server error? Or emit a reply with a different http.statusCode based on different Error constructors?
How can we unit test this?
Each thenable is composed into the overall Promise chain, so the example above forces you to write end-to-end (e2e) tests. Because of how the structure is composed, changing a single then block could ultimately affect the overall assertions of your test.
Let’s refactor the above into something more readable and maintainable.
File: src/index.js
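const Promise = require('bluebird');
const helper = require('./helper');
// Assumption: the original does not show where ValidationError is defined;
// this path is hypothetical.
const ValidationError = require('./errors/ValidationError');

// setup for the this context within the promise chain
const context = {
  options: {
    url: 'http://xyz.com/endpoint'
  }
};

// root promise chain
Promise
  .resolve()
  .bind(context)
  .then(helper.getFromXYZ)
  .then(helper.mapResult)
  .then(helper.validateResult)
  // Handlers are regular functions, not arrow functions, so that `this`
  // is the context bound above.
  .catch(ValidationError, function (error) {
    console.warn('validation missed', error.message);
    return this.mappedResult;
  })
  .catch(Error, function (error) {
    console.error(error);
  });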
Let’s walk through the above and talk about what is new and what it is used for. There are a few changes peppered in there.
What is Bluebird?
const Promise = require('bluebird') substitutes the Promise engine. The module-level Promise constant shadows the native global Promise, so every use of Promise in that file goes through Bluebird; the global itself is not monkey patched. At the time of writing, Bluebird provided significant performance improvements over the native ES6 Promise. Bluebird's API is also a superset of the Promises/A+ specification. Some of the Bluebird APIs I use regularly are bind, all and the filtered catch(ErrorClass, handler). Of these, all also exists in the native Promise implementation, while bind and the filtered catch do not.
Binding a context
.bind(context) sets up the this context within the thenable chain of your Promise calls. Setting it up establishes a known state, so each of the functions (helper.getFromXYZ, helper.mapResult and helper.validateResult) can process and test for an expected state. this can now also be used to save content from the runtime context of a single invocation of the Promise chain, which guards against state leaking from one call to another. Another benefit is sharing data through the whole composition of functions. Lastly, all thenable functions can push and pull data through a single object, which removes the need for parameters on those functions. Note that the handlers must be regular functions; arrow functions take this from their enclosing scope and ignore the bound context.
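Bluebird's `.bind(context)` is library-specific. To show the idea without depending on it, here is a minimal native sketch in which a hypothetical helper, `runWithContext`, calls each step with the same `this` object. It approximates the behavior and is not how Bluebird implements it; `loadStatus`, `mapResult` and `run` are illustrative names as well.

```javascript
// Hypothetical helper: runs steps in order, each with `this` set to context.
function runWithContext(context, steps) {
  return steps.reduce(function (chain, step) {
    return chain.then(function (value) {
      return step.call(context, value);
    });
  }, Promise.resolve());
}

// Steps read and write shared state on `this` instead of taking parameters.
function loadStatus() {
  this.status = this.options.expectedStatus;
}
function mapResult() {
  this.mappedResult = this.status === 200 ? 'AWESOME' : 'FOOBAR';
}

// A fresh context per invocation keeps state from leaking between calls.
function run(expectedStatus) {
  const context = { options: { expectedStatus: expectedStatus } };
  return runWithContext(context, [loadStatus, mapResult]).then(function () {
    return context.mappedResult;
  });
}

run(200).then(console.log);
```

Because the context object is created inside `run`, two concurrent invocations never see each other's state.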
Thenables are now readable as a story
Your named functions now compose themselves into a readable story. Isn’t it nice not having to read through request.get(...) in order to understand that it accesses data from another REST endpoint? Or to see, without reading through if statements, that the next function simply returns some mapped results? This structure helps remove mental fatigue as you piece together the big picture without having to dig into each part.
.then(helper.getFromXYZ)
.then(helper.mapResult)
.then(helper.validateResult)
Multiple catch
Each of the functions can optionally throw unique Error types to allow for controlled error evaluation. I can’t stress enough how important this piece is. You are now able to fine-tune exactly what happens for any negative behavior in a processing chain. As much as we love happy paths, much of the work we do day to day involves putting in guards for edge cases as they come into play.
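This post does not show how ValidationError is defined. One plausible definition is a plain subclass of Error, sketched below together with a native-Promise equivalent of Bluebird's filtered catch that branches on `instanceof`. The names `ValidationError`, `validate` and `handle`, and the `'RECOVERED'` value, are illustrative assumptions.

```javascript
// Assumed definition: a distinct Error subclass so handlers can tell it apart.
class ValidationError extends Error {
  constructor(message) {
    super(message);
    this.name = 'ValidationError';
  }
}

// Hypothetical step that fails validation for one input.
function validate(mapped) {
  if (mapped === 'FOOBAR') {
    throw new ValidationError('unexpected status');
  }
  return mapped;
}

// Native promises have a single catch, so the handler branches on type.
function handle(mapped) {
  return Promise.resolve(mapped)
    .then(validate)
    .catch(function (error) {
      if (error instanceof ValidationError) {
        console.warn('validation missed:', error.message);
        return 'RECOVERED';
      }
      throw error; // anything else keeps propagating
    });
}

handle('FOOBAR').then(console.log);
```

Bluebird's `.catch(ValidationError, handler)` expresses the same branch declaratively, one handler per error type.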
Code splitting for maintainability
Each thenable body is now moved to its own module for four reasons.
- Separation of concerns
- Making code into smaller units so it is less frightening to change
- Making each function testable on its own
- Allowing easier extension and substitution of any thenable part
Below is what each of the split-out thenable functions looks like as a standalone, self-contained export. The corresponding test for each shows how to test the function in isolation, without composing the overall root Promise chain.
File: src/helper/getFromXYZ.js
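const Promise = require('bluebird');
// Assumption: `request` is a promise-returning HTTP client (for example
// request-promise); the original does not name the library.
const request = require('request-promise');

const getFromXYZ = function () {
  return Promise
    .resolve()
    // this was bound from the root promise chain.
    // because we are creating a new Promise chain, it needs to be rebound.
    .bind(this)
    .then(function () {
      return request.get(this.options.url);
    })
    .then(function (response) {
      this.resultFromXYZ = response;
    });
};

module.exports = getFromXYZ;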
File: test/helper/getFromXYZ.mocha.js
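const Promise = require('bluebird');
// Assumption: chai's "should" style; the original does not name the assertion library.
require('chai').should();
const getFromXYZ = require('../../src/helper').getFromXYZ;

it('should respond with good option', function () {
  return Promise
    .resolve()
    .bind({
      options: {
        url: 'http://xyz.com/endpoint'
      }
    })
    .then(getFromXYZ)
    // A regular function, so `this` is the bound context.
    .then(function () {
      this.resultFromXYZ.should.be.instanceof(Object);
      // `status` is the same field mapResult reads.
      this.resultFromXYZ.status.should.equal(200);
      // more tests
    });
});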
File: src/helper/mapResult.js
const mapResult = function () {
  this.mappedResult = this.resultFromXYZ.status === 200 ? 'AWESOME' : 'FOOBAR';
};

module.exports = mapResult;
File: test/helper/mapResult.mocha.js
const Promise = require('bluebird');
// Assumption: chai's "should" style; the original does not name the assertion library.
require('chai').should();
const mapResult = require('../../src/helper').mapResult;

it('should create mapResult when the request is valid', function () {
  return Promise
    .resolve()
    .bind({
      resultFromXYZ: {
        status: 200
      }
    })
    .then(mapResult)
    // A regular function, so `this` is the bound context.
    .then(function () {
      this.mappedResult.should.be.a('string');
      this.mappedResult.should.equal('AWESOME');
      // more tests
    });
});

it('should create mapResult when the request is invalid', function () {
  return Promise
    .resolve()
    .bind({
      resultFromXYZ: {
        status: 404
      }
    })
    .then(mapResult)
    .then(function () {
      this.mappedResult.should.be.a('string');
      this.mappedResult.should.equal('FOOBAR');
      // more tests
    });
});
File: src/helper/validateResult.js
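// Assumption: hypothetical path; the original does not show where
// ValidationError is defined.
const ValidationError = require('../errors/ValidationError');

const validateResult = function () {
  if (this.mappedResult === 'FOOBAR') {
    throw new ValidationError('unexpected status');
  }
};

module.exports = validateResult;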
File: test/helper/validateResult.mocha.js
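const Promise = require('bluebird');
// Assumptions: chai's "should" style and a hypothetical ValidationError path.
require('chai').should();
const ValidationError = require('../../src/errors/ValidationError');
const validateResult = require('../../src/helper').validateResult;

it('should throw ValidationError when mappedResult === `FOOBAR`', function () {
  return Promise
    .resolve()
    .bind({
      mappedResult: 'FOOBAR'
    })
    .then(validateResult)
    .then(function () {
      // Fail the test if validateResult did not throw.
      throw new Error('expected a ValidationError');
    })
    .catch(function (error) {
      error.should.be.instanceof(ValidationError);
    });
});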
Performance Considerations
There are no free lunches in this world. The niceties Promise brings to the table come at a cost. Promise libraries basically act as a state machine, hence there is overhead.
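To give a feel for where that overhead comes from, here is a toy promise reduced to its state machine. `TinyPromise` is a hypothetical teaching sketch: it is not Promises/A+ compliant (it does not adopt thenables returned from handlers) and it is not how Bluebird or V8 implement promises. Every `then` allocates a new object and a handler closure and schedules a microtask, work a bare callback does not need.

```javascript
// A toy promise: one-way transitions from pending to fulfilled or rejected.
class TinyPromise {
  constructor(executor) {
    this.state = 'pending';
    this.value = undefined;
    this.handlers = [];
    const settle = (state) => (value) => {
      if (this.state !== 'pending') { return; } // transitions happen once
      this.state = state;
      this.value = value;
      this.handlers.forEach((handler) => this.schedule(handler));
      this.handlers = [];
    };
    try {
      executor(settle('fulfilled'), settle('rejected'));
    } catch (error) {
      settle('rejected')(error);
    }
  }

  // Run a handler asynchronously once the state is settled.
  schedule(handler) {
    queueMicrotask(() => handler(this.state, this.value));
  }

  then(onFulfilled, onRejected) {
    return new TinyPromise((resolve, reject) => {
      const handler = (state, value) => {
        const callback = state === 'fulfilled' ? onFulfilled : onRejected;
        if (typeof callback !== 'function') {
          // No callback for this state: pass the outcome through.
          return state === 'fulfilled' ? resolve(value) : reject(value);
        }
        try {
          resolve(callback(value));
        } catch (error) {
          reject(error);
        }
      };
      if (this.state === 'pending') {
        this.handlers.push(handler);
      } else {
        this.schedule(handler);
      }
    });
  }
}

new TinyPromise((resolve) => setTimeout(() => resolve(2), 10))
  .then((n) => n * 21)
  .then((n) => console.log(n));
```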
Let’s see the difference
Running some compute-intensive operations: processing Math.pow(num, 2) over 1 million iterations.
Using async library with basic node callback
var Async = require('async');
var numbers = [];
function test() {
  for (var i = 1; i <= 1000000; i++) {
    numbers.push(i);
  }
  Async.map(numbers, function (num, callback) {
    setTimeout(function () {
      // Error-first callback: null error, then the result.
      callback(null, Math.pow(num, 2));
    }, 200);
  }, function (err, result) {
    console.log('done');
  });
}
test();
Result for async library with default Node callback
time node ./promise/none-promise-test.js
done
2.19 real 2.08 user 0.20 sys
Using Bluebird library
var Promise = require('bluebird');
var numbers = [];
function test() {
  for (var i = 1; i <= 1000000; i++) {
    numbers.push(i);
  }
  return Promise.map(numbers, function (num) {
    return new Promise(function (resolve, reject) {
      setTimeout(function () {
        resolve(Math.pow(num, 2));
      }, 200);
    });
  });
}
Promise
  .all(test())
  .then(function () {
    console.log('done');
  });
Result for Bluebird Promise
time node ./promise/promise-test.js
done
2.56 real 2.37 user 0.24 sys
So using regular Node.js callbacks with the async library nets you about 17% in performance (2.19 versus 2.56 seconds of wall time). You literally pay 17% more in compute cost to sustain the developer ergonomics this control structure provides. Unless the application you are writing is near Facebook, Netflix or Salesforce scale, the actual monetary savings are minimal compared to the cost of the engineering resources needed for maintenance day in and day out.
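The 17% figure follows from the two "real" timings reported above. A quick sketch of the arithmetic, using only those two numbers:

```javascript
// Wall-clock ("real") times reported above, in seconds.
const callbackSeconds = 2.19;
const promiseSeconds = 2.56;

// Extra time the Promise version takes relative to the callback version.
const overhead = (promiseSeconds - callbackSeconds) / callbackSeconds;
const overheadPercent = Math.round(overhead * 100);

console.log(overheadPercent + '% more wall time');
```

This is a single run on one machine, so treat the percentage as indicative rather than precise.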
For any hot-path code, like low-level server middleware or client drivers for datastores, callback control flow is definitely the way to go, converting back to a Promise flow only in the last mile.
Other Considerations
One of the points I made was the frustration of figuring out whether a function returns a promise or not. An easy standard is to append Async to the name of any function that returns a Promise, such as doSomethingAsync().
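A sketch of the naming convention applied mechanically: a hypothetical helper, `addAsyncVariants`, adds an `...Async` twin for each callback-style method using Node's `util.promisify`. Bluebird's `promisifyAll` follows the same suffix convention; the helper and the `store` object here are only illustrations, not that implementation.

```javascript
const { promisify } = require('util');

// Hypothetical helper: for every function property "name", add "nameAsync"
// that returns a Promise instead of taking an error-first callback.
function addAsyncVariants(target) {
  Object.keys(target).forEach(function (name) {
    if (typeof target[name] === 'function') {
      target[name + 'Async'] = promisify(target[name]).bind(target);
    }
  });
  return target;
}

// Hypothetical callback-style API.
const store = addAsyncVariants({
  prefix: 'item-',
  find: function (id, callback) {
    setTimeout(() => callback(null, this.prefix + id), 10);
  }
});

// The suffix tells the reader this call returns a Promise.
store.findAsync(7).then(console.log);
```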
Understand microtasks and macrotasks. These two kinds of tasks determine how callbacks are queued in the event loop: callbacks pushed from a Promise chain are queued differently from outside events such as other I/O.
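A minimal sketch of the ordering with native promises: a Promise callback is a microtask and runs as soon as the current synchronous code finishes, while a timer callback is a macrotask and waits for a later turn of the event loop. Bluebird uses its own scheduler, so treat this as an illustration of the native behavior only.

```javascript
const order = [];

// Macrotask: a timer callback, queued for a later turn of the event loop.
setTimeout(function () { order.push('timeout'); }, 0);

// Microtask: a native Promise callback, run right after the current
// synchronous code finishes and before any timer fires.
Promise.resolve().then(function () { order.push('promise'); });

// Synchronous code always runs first.
order.push('sync');

setTimeout(function () { console.log(order.join(' -> ')); }, 20);
```

Even though the timer was registered first, the Promise callback runs before it.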
About the Author: Lam Chan
Lam is a Software Architect for the Locals Squads @ XO Group. He is a seasoned polyglot engineer with over 16 years of professional experience working with startups and multiple Fortune 500 companies. When he is away from the office, he enjoys contributing to OSS projects and dabbling in woodworking projects. Find out more about Lam on LinkedIn.