This post is just a dump of a set of thoughts that have been running around in my brain.
All three major cloud platforms, Google App Engine, Amazon Web Services, and Microsoft Azure, offer a way to run worker tasks. Google recently introduced cron jobs. Amazon has Elastic Compute Cloud. Azure has the worker role. Each of these mechanisms works in a similar manner: a process looks somewhere for work to do (in the distributed database, a work queue, or elsewhere) and then performs the task. I don’t have any issues here; this all makes sense. You need a headless process to take input and produce output. This is a staple of most systems I have worked on.
I do have another issue: how much is this going to cost me? While many people are busy climbing the Gartner hype cycle and are close to the “Peak of Inflated Expectations,” I chose to enter at a personal “Trough of Disillusionment.” I believed the hype with SOAP, worked with the leaders on WS-* at Microsoft, and taught WCF for a while. In the process, I grew up. I’m working on living on the “Slope of Enlightenment” as I learn what the platforms do and do not do. (Hopefully, I’m not fooling myself!)
At this point, I’m investigating when it will make sense to process worker information locally vs. in the cloud. At the end of the day, it comes down to costs. Of the big three, only Microsoft has yet to announce its pricing, and those numbers will come out by the end of the year (2009). So, what does it cost per hour?
Amazon: $0.03/hour, $21.59/month
AppEngine: $0.10/CPU hour
Microsoft has not announced costs. However, I have been advised to monitor the metrics Microsoft collects as those will likely be the things Microsoft uses to determine bills. Over the last 24 hours, I have had a WebRole and a WorkerRole running constantly. The metrics chart shows me consuming 2 virtual machine hours/hour. I’m fine with this so long as the baseline cost is competitive with web hosting. I’d probably spend as much as $20/month per VM in use for a given role to use this model. That’s the value to me for being able to hit web scale if and when my site gets to be popular. It’s hard to build in scalability, so I don’t want to face a rewrite/refactoring when moving from a web host environment to a cloud environment. It pays to start right.
If your current processing load sits at about 30% CPU usage, Amazon and Google cost the same ($0.03 per VM hour vs. 30% of $0.10 per CPU hour). However, if you have a new site where your processing usually finds 5 or 6 items waiting, processes those in a few seconds, and then waits 5 minutes, AppEngine might be a LOT cheaper. On a moderate transaction web site, you may only do 10 minutes of processing per HOUR, bringing your cron cost down to roughly half the cost of Amazon.
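To make the comparison concrete, here is a rough Python sketch of the arithmetic. It uses only the two rates quoted above. The function names are mine, and it assumes a deliberately simple model: VM-hour billing charges for every wall-clock hour the instance is up, while CPU-hour billing charges only for the fraction of the hour the CPU is actually busy. Real bills include storage, bandwidth, and other charges that this ignores.

```python
# Rates quoted in this post; everything else here is an illustrative assumption.
AMAZON_PER_VM_HOUR = 0.03       # USD per wall-clock hour the VM is running
APP_ENGINE_PER_CPU_HOUR = 0.10  # USD per hour of CPU time actually consumed


def cost_per_hour_vm_billing(vm_rate=AMAZON_PER_VM_HOUR):
    """Cost of one wall-clock hour when billed per VM hour (usage is irrelevant)."""
    return vm_rate


def cost_per_hour_cpu_billing(cpu_fraction, cpu_rate=APP_ENGINE_PER_CPU_HOUR):
    """Cost of one wall-clock hour when only the busy fraction of it is billed."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    return cpu_fraction * cpu_rate


def break_even_cpu_fraction(vm_rate=AMAZON_PER_VM_HOUR,
                            cpu_rate=APP_ENGINE_PER_CPU_HOUR):
    """CPU utilization at which both billing models cost the same."""
    return vm_rate / cpu_rate


if __name__ == "__main__":
    busy = 10 / 60  # ten minutes of processing per hour
    vm_cost = cost_per_hour_vm_billing()
    cpu_cost = cost_per_hour_cpu_billing(busy)
    print(f"break-even utilization: {break_even_cpu_fraction():.0%}")
    print(f"VM-hour billing:  ${vm_cost:.4f} per hour")
    print(f"CPU-hour billing: ${cpu_cost:.4f} per hour")
    print(f"CPU-hour cost as a share of VM-hour cost: {cpu_cost / vm_cost:.0%}")
```

Below the break-even utilization, the CPU-hour model wins; above it, the flat VM-hour price is the better deal. Plugging in your own measured utilization is the quickest way to see which side of that line your workload falls on.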
It looks like Microsoft will be following the Amazon payment model. They will need to have a way to bring costs in line with AppEngine. I would prefer to see a model that bills me by CPU/Processor hour instead of VM hour. A VM hour can have very little usage whereas high CPU hours can adversely impact other VMs running on the same machine. Ideally, a cloud box would balance out based on required CPU, not number of machines, so these metrics should be available to those who run Microsoft’s data centers.
Thanks to all of my readers for sending great questions in e-mail (scott@scottseely.com) or via comments on the blog.