Sequence Diagrams for Asynchronous Flows in Node.js
Ben Kittrell
06/13/2012 05:09PM
If you're getting started with Node.js, like I am, chances are you're having a hard time grokking asynchronous workflows. I'm typically a "just dive right in and kick it till it works" kind of guy, but I've been trying to discipline myself a little more and do some planning. After trying some different techniques I found sequence diagrams to be very helpful. I guess I did learn something from all of those years of corporate Java development.
Sequence Wha?
For those unfamiliar, in a sequence diagram, the vertical lines represent components in your application or architecture and the horizontal lines represent messages between them. The white vertical rectangles represent the lifetime of that particular component's activity.
Examples
I'm working on a Node.js media server. Here's a diagram of a potential slave replication flow.

Right away you can see that there is nothing asynchronous about this. The browser has to wait for a lot of stuff to finish before the connection is released. In reality, the connection could be released as soon as the file is saved.
Here's an improved flow, with the addition of video transcoding.

The file is uploaded, and an asynchronous video transcoding job is started. Right away we replicate the original file to the slaves. At this point, everything is in the hands of the transcoding job. When it's complete, we download the completed files, set the metadata such as the duration, replicate to the slaves and set the status as finished.
You can easily see that the browser session is very short, which is what we want in an asynchronous flow. As soon as the uploaded file is saved to disk, the server returns a 200 and then starts a video transcoding job. Also, we can start copying the original file to the slaves before the transcoding is finished.
It's also clear that some pieces still need to be synchronous. We can't replicate to the slaves until we have the files and the metadata, and we can't set the status to finished until the files are replicated.
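To make the ordering concrete, here is a minimal sketch of the improved flow in Node.js. Every function name in it is a hypothetical stand-in that only records its own name (none of this is the media server's real code), so the sequence of steps can be inspected. The independent tasks are started together, and the dependent ones are awaited one after another.

```javascript
// Sketch only: each step is a hypothetical stand-in that records its name.
const events = [];
const record = (name) => async () => { events.push(name); };

const saveToDisk = record("save upload to disk");
const startTranscodeJob = record("start transcoding job");
const replicateOriginal = record("replicate original to slaves");
const downloadTranscodedFiles = record("download transcoded files");
const setMetadata = record("set metadata");
const replicateTranscodedFiles = record("replicate transcoded files to slaves");
const markFinished = record("set status to finished");

// Upload request: the browser is released as soon as the file is on disk.
async function handleUpload(respond) {
  await saveToDisk();
  respond(200);
  // These two tasks do not depend on each other, so they start together.
  // A real server would not hold the request open for them; the promise
  // is returned here only so the demo can observe the ordering.
  await Promise.all([startTranscodeJob(), replicateOriginal()]);
}

// Message from the transcoder: each step needs the previous one,
// so this part of the flow stays strictly sequential.
async function handleTranscodeComplete() {
  await downloadTranscodedFiles();
  await setMetadata();
  await replicateTranscodedFiles();
  await markFinished();
}

// Runs both halves of the flow and resolves with the recorded order.
async function demo() {
  await handleUpload((status) => events.push(`respond ${status}`));
  await handleTranscodeComplete();
  return events;
}
```

Calling `demo().then(console.log)` prints the recorded steps in order, with the 200 response appearing right after the save and before any of the background work.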
However, it also shows a weakness in this flow. The only thing that can complete the flow is a message back from the video transcoder. If that message is never sent, we have an orphaned file. This suggests that we may need something to check for failed jobs and retry.
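One way to cover that gap is a periodic sweep that looks for jobs that have been transcoding for too long. The sketch below is only an illustration: the job fields (`status`, `startedAt`, `attempts`), the option names, and the `retry` and `giveUp` callbacks are all assumptions, not part of the actual server.

```javascript
// Hypothetical job record: { id, status, startedAt (ms), attempts }.
// A job is considered stalled if it is still "transcoding" after timeoutMs.
function findStalledJobs(jobs, now, timeoutMs) {
  return jobs.filter(
    (job) => job.status === "transcoding" && now - job.startedAt > timeoutMs
  );
}

// Retries stalled jobs until maxAttempts is reached, then marks them failed
// so they no longer sit around as orphans.
function sweep(jobs, now, options, retry, giveUp) {
  for (const job of findStalledJobs(jobs, now, options.timeoutMs)) {
    if (job.attempts < options.maxAttempts) {
      job.attempts += 1;
      job.startedAt = now; // restart the clock for the new attempt
      retry(job);
    } else {
      job.status = "failed";
      giveUp(job);
    }
  }
}
```

In a running server this could be driven by something like `setInterval(() => sweep(jobs, Date.now(), options, retry, giveUp), 60000)`, where the interval is again just a placeholder value.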
I used OmniGraffle for these, but you can also just draw them out on paper.

Comments
http://www.websequencediagrams.com/ ???
this doesn't seem like it would scale if the master is receiving files from the browser. the master would be a bottleneck if you plan on having multiple users upload files.
slaves should be receiving the files and either queueing jobs for the master to handle and replicate, or they should be doing the work themselves and be aware of other slaves in the array that they need to replicate to.
Patrick, Nice Link!
Langdon, That's a good point, and something I'll consider. One part of the design that isn't in this article is the load balancing. Let's say I need to get the URL for an <img> tag. The app that generates the URL will have a "Registry" from the media server that tells it the host names of the master and all of the slaves. If the app knows that the file is "Finished" (replicated) then it will choose a random URL from the registry. Otherwise it uses the Master URL. To your point, that would add even more stress to the master.
I guess I could have the app keep track of where it uploaded the file, instead of just assuming it was the master. In that case there would be no need for a master, just a cluster of nodes. Something for me to chew on.
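As a rough sketch of the registry lookup described above (with the master still in place): the registry shape, the host names, and the `status` field are placeholders I made up for illustration.

```javascript
// Hypothetical registry; the host names are placeholders.
const registry = {
  master: "media-master.example.com",
  slaves: ["media-slave-1.example.com", "media-slave-2.example.com"],
};

// Until a file is finished (replicated), only the master is known to have it.
// Afterwards any host in the registry can serve it, so pick one at random.
// `random` is injectable so the choice can be made deterministic in tests.
function pickHost(registry, file, random = Math.random) {
  if (file.status !== "finished") {
    return registry.master;
  }
  const hosts = [registry.master, ...registry.slaves];
  return hosts[Math.floor(random() * hosts.length)];
}
```

For example, `pickHost(registry, { status: "uploaded" })` always returns the master, while a finished file is spread across all three hosts.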
JavaScript is very good at handling variable args :)
[code]
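function foo() {
  // `arguments` holds every value passed in, whatever parameters are declared
  alert(arguments.length);
}
foo(1);    // alerts 1
foo(1, 2); // alerts 2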
[/code]