Bulletproof Node.js Coding March 21st, 2011
I’ve been actively doing node.js coding for about 4 months now. I’m working with a couple of others on a suite of mobile apps and my time has been split between building the Android client (one of my partners drew the long straw and got iOS for this round) and building out the node.js based backend. It’s currently a Node.js + CouchDB + Redis server that combines user auth/account management with real-time signalling between connected clients. The core component, the “sessionserver”, exposes no HTML UI and is really just a combination of JSON-based services, background agents and client signalling shims that speak in WebSockets and HTTP long polling.
The details of what I’m building aren’t very important to the topic I want to talk about here, but I mention them because, based on my experience, the sessionserver is a pretty core idiomatic node.js usage. This layer contains no view engine, no HTML templating, no sharing code with clients, etc. It is a raw communication crunching engine, representing a pretty pure node.js use case that may be worth some study. Throughout this post, I’ll be posting excerpts from this project instead of contrived examples as much as possible.
When I first started doing node.js coding, my first thought was “Wow! This is insanely powerful but it is really easy to slice your toes off!” It turned out that was also my second, third, and 150th thought as well! Right around the time that I started the third refactoring/rewrite of the sessionserver, I felt like I had gotten a feel for how to write bulletproof code and I thought it would be worth sharing some of the style and conventions I came to adopt. (As an aside, when learning a fundamentally new and different technology, never expect your first or second attempt to be any good straight out of the gate.) It was actually kind of funny: Pretty early on, I found that I was crashing my process so much that I wired it up so that it would play a loud door slamming sound on abnormal exit. I heard that sound enough that it kind of got stuck in my head and I found myself humming a melody that I’d made up to the steady beat of the slamming door. Seriously, it was that bad.
I’ve long been a believer that no matter what the language or environment, developing a bulletproof coding style and conventions for how you approach the code is one of the most critical parts of the learning process. We all know there are an infinite number of ways to write the same chunk of logic, and after a fashion, many of them can even be considered good and reasonable. In my opinion, however, the best styles are those that, when followed, make it difficult or impossible to code most common types of bugs. Some of the most powerful features of a language or environment can also be the most deadly when misapplied. A bulletproof style will balance these features so that you get all of the power but it is difficult to abuse. In addition, dangerous, high-octane areas are properly cordoned off as such, and the style will also fill in for some of the inherent weaknesses of the language.
There are several “macro” solutions to writing more robust node.js code:
- node-fibers: Adds the concept of “fibers” to node so that asynchronous code can be written in an imperative style.
In addition, I’ve come across some library level patterns that are also good if applied in the right context:
- Promises: There are various promise libraries floating around. Although it was before my time, node core had promise support in its very early days. Now it’s just a pattern people apply if they want to.
- Tim Caswell’s “Do” library
All of these are quite good and worth looking into. I generally prefer solutions that work with the toolset instead of trying to replace it, and the library solutions certainly fit the bill. Be careful about picking your core metaphors, however – they will stick with you for the life of your software.
My goal in writing these tidbits down is to share what I’ve learned and to stimulate a conversation about good node.js programming practices. If you agree or disagree with anything I present, either leave some comments or start a discussion on the node.js mailing list. We all benefit from talking about this stuff more.
Here are the learnings that I’ve taken away from my odyssey with node.js thus far:
- Return on the last statement
- Put your callbacks in sequence
- Define a respond function for complex logic
- Centralize your exception handling
- Embrace a functional coding style with futures or promises
- Differentiate between system interfaces and user interfaces
- Examine dependencies closely
- Prefer copying simple, idiomatic code locally
- Read the source but code to the docs
- Write good tests
1. Return on the Last Statement
This one’s easy but it happens everywhere. How many times have you done something like this:
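A minimal sketch of the pattern in question. The names `loadProfile` and `fetchDoc` are hypothetical and stand in for any callback-based function; `fetchDoc` is assumed to follow node's `callback(err, result)` convention.

```javascript
// Hypothetical example: look up a document and hand its `profile` field to the caller.
// `fetchDoc(id, cb)` is an assumed data-access function that calls cb(err, doc).
function loadProfile(fetchDoc, id, callback) {
  fetchDoc(id, function (err, doc) {
    if (err) {
      callback(err); // reports the error, but execution does not stop here
    }
    // BUG: this line also runs on the error path, so the callback fires a second time.
    callback(null, doc && doc.profile);
  });
}
```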
The problem with this code is that on error, you are calling your callback with the error and the result (most likely null/undefined). This is almost always a violation of the declared API and will cause all manner of badness to happen on error. Making it worse, error paths are notoriously under-tested. You will almost certainly be hearing the door slamming in response to this one. While it’s easy to spot in a simple function like this, many real world cases are not so obvious. You could choose to just add a “return;” after “callback(err)”, but there is a better way if you can get your eye used to seeing it.
Look around on GitHub for 20 minutes. I bet you can find instances of this class of error in places that will really make you worried (although you may find fewer – I have emailed authors when I’ve found this pattern after reviewing their code).
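One way to read that advice, sketched with hypothetical names (`loadProfile`, and an assumed `fetchDoc(id, cb)` that calls `cb(err, doc)`): make every branch end by returning the callback invocation, so that whichever path is taken, the callback call is the last statement executed.

```javascript
// Hypothetical example: every exit from the inner function is `return callback(...)`.
function loadProfile(fetchDoc, id, callback) {
  fetchDoc(id, function (err, doc) {
    if (err) return callback(err);
    if (!doc) return callback(new Error('no document for id ' + id));
    return callback(null, doc.profile);
  });
}
```

The `return` on the final line is redundant to the interpreter, but it keeps the shape uniform: when a later edit adds code below it, the missing exit is easy to spot.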
2. Put Your Callbacks in Sequence
3. Define a respond function for complex logic
If you have a standard node callback-based function with more than two ways to complete (one for errors and one for successful results), consider defining a secondary “respond” function to guard against hard to find situations where your mild-mannered control logic finishes more than precisely once.
This is an older example and I don’t generally hold it up as a bastion of good code. In particular, I should have broken out more named callbacks to distinguish between the synchronous and asynchronous parts of the flow. The key thing to note, though, is that there are multiple “ways out” of the function where the callback can be invoked. Instead of adding logic everywhere to determine if we’ve error’ed, responded yet, or determining if the callback is even defined, I use an explicit respond(…) function locally which invokes the callback with the results and then clears it so it won’t be invoked again. An even better solution would have been to add a warning if invoked more than once.
The rule here is in the same vein as those that come before. If your function is simple, keep it simple and don’t add an explicit respond function. However, if the control flow is getting a little dicey (and itself cannot be simplified), protect yourself by making the callback an explicit local respond function.
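A minimal sketch of the idea, not the sessionserver code. `runJob` and its parameters are hypothetical: `job` is any function that takes a node-style callback, and the function has three ways out (the job completes, the job throws synchronously, or a timeout fires). This version also includes the warning on repeated invocation.

```javascript
// Hypothetical example: all exits go through respond(), which fires the callback at most once.
function runJob(job, timeoutMs, callback) {
  function respond(err, result) {
    clearTimeout(timer);
    if (!callback) {
      console.warn('runJob: respond() invoked more than once; ignoring');
      return;
    }
    const cb = callback;
    callback = null; // clear it so later exits cannot invoke it again
    cb(err, result);
  }

  // Exit 1: the job takes too long.
  const timer = setTimeout(function () {
    respond(new Error('job timed out after ' + timeoutMs + 'ms'));
  }, timeoutMs);

  try {
    job(respond); // Exit 2: the job reports an error or a result.
  } catch (e) {
    respond(e);   // Exit 3: the job throws synchronously.
  }
}
```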
4. Centralize your exception handling
Functional programming in node is a lot of fun, expressive and compact except for one part: exception handling. I don’t really see this talked about that much, but in my opinion the lack of a coherent way of dealing with errors and exceptions is node’s biggest weakness. Node-fibers takes the approach of switching to a completely imperative style to achieve this, but I prefer to stay with a functional style and define a coherent exception handling structure.
I could write an entire post just on this topic (and maybe will one day) but I’ll just cover the high points here. The problem with error codes (which node core is based on) is that for higher level logic, the code that detects the error (i.e. the first responder) is almost invariably not in the right position to determine what to do about the error condition. This is where try/catch structures in threaded systems make more sense. Someone up the stack will typically know what to do about the error.
For this tip, you are going to need library support of some kind. What is needed is a way to define a Block with an Error Handler and be able to tear this off and take it with you when your callbacks go into foreign territory. Then when they raise an exception, the exception gets routed back to the Block that was in effect when the callback was sent out to do its master’s bidding. I found that most of the solutions out there munged Futures, Promises, Fibers, etc together with this simple need to define an exception handling Block. The following snippet defines a Block class that fulfills what I’m looking for:
This example also uses a Future class which I’ll cover in a bit. The key thing to keep in mind is that any exception thrown by code in or called by the process() function will be routed to the rescue handler (in this case next). In order to get a callback into the block scope, it should be wrapped by calling Block.guard(originalFunction). This will capture the current Block at the time that Block.guard is called and reestablish it for the duration of any call to originalFunction. The Future class does this internal to the force(…) call, which allows me to rest certain that anything I place as the target of a force(…) will have its exceptions routed appropriately. More on that later.
Here’s an example of explicitly capturing the block in your callbacks. In this case, we are invoking an HTTP request, accumulating the text results and resolving a Future with a constructed CouchResponse object (which does some parsing and other things that could conceivably throw an exception).
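A minimal sketch of both pieces, under stated assumptions. The `Block` class here is a hypothetical reconstruction with just the `begin`, `guard` and `errorHandler` operations described in the text, not the author's class. `fetchJson` is also hypothetical: it delivers its result through a plain callback instead of a Future, uses `JSON.parse` in place of the CouchResponse constructor, and takes an optional `get` parameter (defaulting to `http.get`) so it can be exercised without a network.

```javascript
const http = require('http');

// Hypothetical Block: an error-handling scope that callbacks can carry with them.
class Block {
  constructor(rescue) {
    this.rescue = rescue; // function(err) that receives exceptions raised in this block
  }

  // Run `process` inside a new block; anything it throws goes to `rescue`.
  static begin(process, rescue) {
    return new Block(rescue).run(process, null, []);
  }

  // Wrap `fn` so that later calls re-enter the block that is current right now.
  static guard(fn) {
    const block = Block.current;
    if (!block) return fn; // no block in effect, nothing to route to
    return function guarded() {
      return block.run(fn, this, arguments);
    };
  }

  // Callback for 'error' events: forwards the error to this block's rescue handler.
  errorHandler() {
    const block = this;
    return function (err) {
      block.rescue(err);
    };
  }

  // Make this block current while fn runs, then restore the previous one.
  run(fn, self, args) {
    const previous = Block.current;
    Block.current = this;
    try {
      return fn.apply(self, args);
    } catch (err) {
      this.rescue(err);
    } finally {
      Block.current = previous;
    }
  }
}
Block.current = null; // the block in effect for the code executing right now

// Fetch `url` and pass the parsed JSON body to `done`. Must be called inside Block.begin.
function fetchJson(url, done, get) {
  const block = Block.current;
  if (!block) throw new Error('fetchJson must be called inside Block.begin');
  let text = '';

  // Guarded while the block is current, so a JSON.parse failure is routed to rescue.
  const onEnd = Block.guard(function () {
    done(JSON.parse(text));
  });
  const onError = block.errorHandler();

  const req = (get || http.get)(url, function (res) {
    // Not guarded: an exception thrown directly in here would still crash the process.
    res.setEncoding('utf8');
    res.on('data', function (chunk) { // not guarded either
      text += chunk;
    });
    res.on('end', onEnd);
    res.on('error', onError);
  });
  req.on('error', onError);
}
```

A caller would wrap the request in `Block.begin(function process() { fetchJson(url, onResult); }, rescue)`, where `rescue` receives connection errors and parse failures alike.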
There are still a couple of places in this example where an unexpected exception would crash the process:
- directly within the “function(res)” callback
- in the ‘data’ callback
I could have wrapped Block.guard statements around these bits as well but chose not to because it costs a little extra and I am 100% confident that a failure here is a critical breakage and is completely covered by unit tests. The ‘end’ handler, however, does some stuff that I can’t immediately see (and I happen to know it contains a JSON.parse call) so I protect it with guard. Finally, I use the block’s standard errorHandler() callbacks to catch request and response error events. I’ve found that this simple pattern of centralizing exception handling makes it very easy to visually understand where exceptions are going and route them at the levels where it makes sense. You can also nest calls to Block.begin. This is useful in framework code that needs to go off and do some other work in response to something the Block initiated but not intrinsically owned by it.
5. Embrace a functional coding style with futures or promises
I actually like node’s callback style a lot for low level stuff: you know, those times when you feel like coding in C is the right thing to be doing and you’re thankful that someone lets you operate at that level but without malloc/free. For higher level logic/abstractions, though, I prefer something with a bit more functional heritage. A lot of people have used Promises, which are just a construct for converting a callback into a return value. You return a Promise instead of invoking a callback and then you can ask the promise to give you its result. Futures are similar as far as metaphors go and I prefer them. A Future has two intrinsic operations: resolve and force. Resolve sets the value on the future and force either gets the value if it is immediately available or gives it to you later when it is available. Given the Block based exception handling I illustrated above, my Future class doesn’t really need to think much about capturing and propagating exceptions, so it’s pretty simple. It does build on the Block by making sure to call Block.guard(…) to wrap any functions that are bound to be invoked as callbacks later by force(…). Here’s the class:
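A minimal sketch of a Future with those two operations; it is an assumed reconstruction, not the author's class. The `guard` constructor parameter is an assumption made to keep the sketch self-contained: pass `Block.guard` (or an equivalent wrapper) to get the exception routing described above. By default callbacks are stored unwrapped.

```javascript
// Hypothetical Future: resolve() sets the value once, force() delivers it to a callback.
class Future {
  // `guard` wraps each callback before it is stored or invoked; identity by default.
  constructor(guard) {
    this.guard = guard || function (fn) { return fn; };
    this.resolved = false;
    this.value = undefined;
    this.waiters = [];
  }

  // Set the value and deliver it to every callback that is waiting.
  resolve(value) {
    if (this.resolved) throw new Error('Future has already been resolved');
    this.resolved = true;
    this.value = value;
    const waiters = this.waiters;
    this.waiters = [];
    waiters.forEach(function (callback) {
      callback(value);
    });
  }

  // Deliver the value to `callback`: right away if resolved, otherwise on resolve().
  force(callback) {
    const guarded = this.guard(callback);
    if (this.resolved) {
      guarded(this.value);
    } else {
      this.waiters.push(guarded);
    }
  }
}

// Usage
const greeting = new Future();
greeting.force(function (value) {
  console.log('got', value);
});
greeting.resolve('hello'); // prints "got hello"
```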
The key advice here is not necessarily to use my Future class, but to use someone’s Future or Promise implementation. I like mine because it is so brain-dead simple and integrates with Block.guard so that when I’m scanning my code and I see a function being passed to a force() call, I can mentally tell myself “this function is safe for exceptions to be thrown from.”
There are examples of using the Future in the previous sections.
6. Differentiate between system interfaces and user interfaces
This one is more philosophical advice about using the right tool for the job. Some things are best done with node’s callback(err)/EventEmitter machinery, and sometimes it’s better to use a higher level abstraction like a Future/Block. Don’t be afraid to use both. I tend to use the lower-level machinery for stuff that is interfacing with the system. For some reason it feels right to me to be passing around error codes in these situations, but this probably has more to do with the time I spent in C hacking on the Linux kernel than anything else. If you’re writing code to be consumed outside of your project, make sure it speaks the callback(err)/EventEmitter pattern since that is the lowest common denominator that every node programmer on the planet is going to intrinsically understand.
7. Examine dependencies closely
You can get a little cavalier in threaded environments like Java, Ruby or Python when it comes to relying on third party bits. After all, you can always just catch Throwable, right? Remember that in Node, everything you put into your project and call has the very real potential to kill you. Don’t just run the tests and assume a happy future. Look at the code and make a critical evaluation. If you get the feeling that it’s playing fast and loose with control flow, it probably is, and it might just kill you. Also, and I mean this with all respect to the node community, do not rely on the popularity of a module to assume that others have given its internals a critical evaluation. Remember too that most of the node modules floating around on GitHub started as internal bits for someone else’s project and they have built-in assumptions to those ends.
I don’t mean to be too melodramatic here, but the point is simple: pulling in an external dependency is a lot more like inviting someone into your bed than into your living room. There are lots of great things that can come from it, but just be safe about it.
8. Prefer copying simple, idiomatic code locally
This runs counter to most of my experience in other environments and it might not hold up over time as the ecosystem evolves. For now, however, I generally prefer to take simple external dependencies, copy them locally and modify them rather than trying to share. There’s just no reason why we need to have one “copyObject”, “clone”, etc. to rule them all. Find one that does what you want, make sure you understand it, stick it in your project and use it with a local require (require('./myCoolObjectCopy')).
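As an illustration of how small such a local module can be, here is a sketch of what a `myCoolObjectCopy.js` might contain. The file name comes from the require above; the contents and the `copyObject` name are hypothetical.

```javascript
// myCoolObjectCopy.js (hypothetical contents)
// Deep-copies plain objects and arrays. Other object types (Date, Map, class instances)
// and cyclic structures are not handled; extend it locally if you need them.
function copyObject(value) {
  if (Array.isArray(value)) return value.map(copyObject);
  if (value !== null && typeof value === 'object') {
    const copy = {};
    Object.keys(value).forEach(function (key) {
      copy[key] = copyObject(value[key]);
    });
    return copy;
  }
  return value; // primitives and functions are returned as-is
}

module.exports = copyObject;
```

Because the module lives in the project, its limits are visible in review and can be changed without waiting on anyone else.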
9. Read the source but code to the docs
The great thing about node is that the code is flayed open for all to see. And with most of the modules out on GitHub, it’s just a few clicks before you are reading anything. Just remember that all of those interesting bits in the source code are not necessarily part of the public API. Rely on the docs for what you are supposed to be calling. If you see something internally that you think should be part of the public API, email the appropriate people and ask/make a suggestion.
10. Write good tests
Really, however optional they may have seemed in other environments, they are not optional here. There are quite a few testing frameworks about, but I tend to use nodeunit. Here’s a simple one to get you started:
For some reason, I always include a ‘test for smoke’ that does nothing as my first test. If there’s a parse error or some other setup problem, then it’s pretty obvious on the console because I’ll see the error and the line that says “test for smoke” ran successfully won’t be there.
Here’s my runtests.js file. I just customize this slightly (to add require paths, etc) and drop it into any project.