Archive for February, 2009
I have found a need to do some research across the various cloud offerings so that I get a good feel for what each has to offer. At this point in my investigations, I am focusing on only three platforms: Amazon Web Services, Microsoft Azure, and Google App Engine. The three have common sets of features: storage through an API, compute resources, and the ability to respond to demand by scaling application instances. The storage APIs encourage scalable patterns over patterns that could cause data contention. Amazon requires that the application handle scaling up and down on its own. Azure and App Engine scale for the user through a combination of configuration and observed demand. These services also offer authentication services as well as the ability to create your own authentication. App Engine integrates with Google logins, Azure works with Windows Live, and Amazon works through security mechanisms in the Amazon Machine Instances.
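To make "a combination of configuration and observed demand" concrete, here is a minimal sketch of how a platform might pick an instance count. None of this comes from Azure or App Engine; the function name, the per-instance capacity, and the bounds are all hypothetical values chosen for illustration.

```python
import math


def target_instances(observed_rps, rps_per_instance, min_instances=1, max_instances=10):
    """Pick an instance count from observed demand, clamped to configured bounds.

    All parameters are hypothetical: observed_rps is the measured request
    rate, rps_per_instance is how much one instance is assumed to handle,
    and min/max come from configuration.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    # Round up so a partially loaded instance still counts as a full one.
    needed = math.ceil(observed_rps / rps_per_instance)
    # Configuration wins over demand at both ends of the range.
    return max(min_instances, min(max_instances, needed))
```

With Amazon, logic like this would have to live in the application itself. With the other two, the platform would own it and the developer would only supply the bounds.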
At this point, the platforms start differentiating. Google App Engine (GAE) requires you to write all code to be executed on the platform in Python. To help you out, they provide a number of Python libraries to build applications. All applications receive input over HTTP. No queuing support exists. Google has some infrastructure to allow for high speed, shared, non-persistent cache as well as facilities to send e-mail and manipulate images.
Azure requires all applications that run on it to use .NET: work happens in worker roles, and pages are served by ASP.NET. Microsoft also includes a variety of ways to store information based on the scenario: Live Mesh for synchronization across devices; Sky Drive for sharing amongst friends; SQL Data Services for scalable application databases; and Azure Storage Service for blobs, queues, and non-relational tables. Coming soon are locks, a caching layer, and other features. Azure integrates with .NET Services (Service Bus, Access Control Service, and Workflow Services) and Live Services (a set of services to help build social and other application types). It appears that the primary integration point is "All offerings support HTTP/REST models."
Finally, there is Amazon Web Services (AWS). AWS is the most mature of all the offerings, having started in 2002. AWS divides their offerings into several groups: Infrastructure, Payments & Billing, On-Demand Workforce, Web Search & Information, and Amazon Fulfillment & Associates. The infrastructure group of services allows one to build scalable applications. It includes services for storage (S3), a scalable non-relational database (SimpleDB), a compute platform (EC2), a queue service (SQS), and a content delivery network (CloudFront). The remaining services allow a small entity to build a big business, track traffic patterns, or get assistance from people to perform tasks.
Of all the services, only Amazon charges money for all usage, though low usage means very little in overall cost. Google intends to not start charging until a site receives 5 million monthly page views. Azure has yet to announce a pricing model. There seems to be consensus amongst Microsoft people I’ve spoken to that the service will be free for experimentation. I would expect that expenses might kick in using different metrics than Google. I fully expect that Microsoft will gate charges based on some threshold for bandwidth, storage, and compute time. There is no reason for this guess other than Amazon has a pricing model that uses these parameters and Microsoft plans on being "competitive."
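To show what I mean by gating charges on thresholds, here is a small sketch of a bill that stays at zero until usage passes a free allowance on bandwidth, storage, or compute time. Every number in it is invented for illustration. Neither the free allowances nor the rates come from Microsoft, Amazon, or Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    bandwidth_gb: float
    storage_gb: float
    compute_hours: float


# Hypothetical free allowances and per-unit dollar rates, made up for this sketch.
FREE_TIER = Usage(bandwidth_gb=10, storage_gb=1, compute_hours=50)
RATES = Usage(bandwidth_gb=0.10, storage_gb=0.15, compute_hours=0.10)


def monthly_charge(used, free=FREE_TIER, rates=RATES):
    """Charge only for the usage above each free threshold."""
    total = 0.0
    for field in ("bandwidth_gb", "storage_gb", "compute_hours"):
        # Usage below the threshold costs nothing; it never earns a credit.
        billable = max(0.0, getattr(used, field) - getattr(free, field))
        total += billable * getattr(rates, field)
    return round(total, 2)
```

Under these made-up numbers, an experimenter who stays inside all three allowances pays nothing, while a site that uses 20 GB of bandwidth and 150 compute hours pays for 10 GB and 100 hours.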
I’m cheap, watch a lot of network programming, and all my favorite cable shows are on Hulu or elsewhere on the web. To save a few bucks, I spent the week between Christmas 2008 and New Year’s 2009 upgrading my house to digital TV and canceling cable for all but my Internet connection. My home’s primary TV is attached to a Windows Vista based Media Center. Something that I’ve hated since switching was the craptacular viewing guide. I had a hard time believing that Microsoft hadn’t put out an update to Media Center that could handle the new channel format. While looking for solutions today, I found out about something called TV Pack 2008 (yeah, I’m a Media Center user, not a fanatic. I’m late to this party…). The more I read, the more aggravated I got that my Windows Vista installation didn’t get this upgrade. You see, the upgrade included an update to the guide that allowed one to find out what was happening on all the local digital channels. Microsoft has put this out for OEMs only. Enthusiasts, like me, weren’t given access to this stuff. While I’m not a fanatic about Media Center, I build all my own PCs (newegg.com loves me!). Part of the upgrade involved installing a digital, over-the-air tuner. It’s a minor note in the articles I’ve read, where most writers focus on the clear QAM cable and over-the-air enhancements for England and Japan. Still, cheap guys like me care about the US digital TV enhancements.
Anyhow, I eventually found a site that contains links to the files so that I too could use the updates: http://digiex.net/guides-tutorials/699-windows-media-center-tv-pack-2008-download-installation-guide.html
First, the good news: this works. The guide shows me everything, with no more "missing data" on channels. The bad news: any scheduled shows will be forgotten. That’s OK, just write down your schedule and then add things back in. If you miss something, well, the web will have it or the show will repeat.
It took me about an hour to install the updates and get my shows scheduled back in. My little ones will be happy that the guide now knows when Arthur is scheduled.
I prefer the term utility computing to cloud computing. People outside of software development understand that electricity, water, and phone service are all utilities. Cloud computing is an attempt to deliver compute resources at utility prices. Before I define what utility computing is, I want to define what utility computing isn’t.
Some companies, such as IBM, are trying to do a "me too" with cloud computing. They use the term cloud computing because that amorphous word, cloud, does not have a clear meaning. They are also confusing the computing public. Why? They are conflating their virtualization products with utility computing in order to confuse the market and continue making sales (note: most big iron and *nix vendors have EXCELLENT virtualization technology). Virtualization of compute resources is an old story that mainframe vendors have had working well since the 1960s. More recently, companies like VMware, Citrix, and Microsoft have made virtualization of compute resources common. Virtualization lets a customer remain ignorant of what the underlying hardware is. With virtualization, one still has control over the operating system that is used to store files, run applications, authenticate users, and communicate with other operating systems. Virtualization lets an entity buy a high powered machine and then load it up with operating system images that get to pretend like they are the only operating system on the hardware. Virtualization presents a simplified view of the hardware, including limited views into memory, CPU, and storage. With virtualization, one worries about access to memory, CPU, and storage. When a vendor like IBM or Sun says you can own your cloud, they mean that you can write applications that run on demand, as in a cloud, but you have to worry about having enough storage, CPU, and memory to get the job done. This space is important for lots of reasons, but it is different from utility computing.
Cloud computing is a mechanism to deliver compute and storage resources where the user does not need to know how those resources are provisioned, who else might be on the same hardware, or what the underlying technology to provide those services happens to be. The closest analogs to utility computing are other utilities: electricity, water, gas, cable TV, Internet, cell phone carriers to name a few. Utilities have a common set of characteristics.
The consumer:
- Picks who provides the utility.
- Is responsible for limiting consumption.
- Views available resources as infinite.
The provider:
- Procures resources to deliver the utility.
- Decides how the utility is delivered.
- Makes sure that one user does not adversely affect other users.
- Dictates the mechanisms used to consume the utility.
Virtualization does allow for most of these items to appear. Virtualization does not allow a user to treat available resources as infinite. Virtualization does require the consumer to also be a provider. If you rent a Windows Server or *nix server as a virtual instance on a machine you can’t see, you have a utility operating system, not a utility compute resource. Your machine still has fixed storage and CPU.
There are many firms saying that they offer cloud computing. Three are very well known: Amazon Web Services, Microsoft Azure, and Google App Engine.
Of the three, Amazon is the odd man out. It offers a hybrid: utility virtual instances, where you can spin up a Windows Server or Linux OS, paired with the infinite Simple Storage Service (S3), which is the only way such an instance can get durable storage.
Utility computing provides a benefit that virtualization does not: utility computing allows you to abstract away the professionals who handle data redundancy, keeping servers running, and adding compute power when needed. When your computing needs indicate that you need more or less, you just take what you need. There is no need to negotiate for extra compute utility or to keep those resources when they are not in use. Virtualization means you have the resources 24/7. Utility computing means you have the resources only when you need them. The rest of the time, you can give back the resources. That’s a strength of utility models. The owner of the resource can be creative around how to handle the demand flow by turning resources on and off. Consumers only have to worry about what they need now and how much they can afford to consume. The rest is automatic and transparent.
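A little arithmetic shows the difference between holding resources 24/7 and taking them only when needed. The sketch below assumes a single flat rate per instance-hour and a made-up demand curve; both are hypothetical, and real pricing is more involved.

```python
def fixed_capacity_cost(hourly_demand, rate_cents):
    """Virtualization-style cost: provision for the peak and keep it every hour."""
    peak = max(hourly_demand)
    return peak * len(hourly_demand) * rate_cents


def utility_cost(hourly_demand, rate_cents):
    """Utility-style cost: pay only for what each hour actually consumed."""
    return sum(hourly_demand) * rate_cents


# Hypothetical demand, in instances needed per hour, with a two-hour spike.
demand = [1, 1, 1, 8, 8, 1]
fixed = fixed_capacity_cost(demand, rate_cents=10)
on_demand = utility_cost(demand, rate_cents=10)
```

The spikier the demand, the wider the gap. With perfectly flat demand the two functions return the same number.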
Utility providers do make decisions about how you can consume: REST to access storage, Python/.NET/something else to write applications, types of databases that work (hint: RDBMS doesn’t scale, and utilities know this), and types of storage. These decisions help you create applications that can take optimal advantage of the utility resources.
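As one illustration of "REST to access storage," the sketch below builds the HTTP requests for storing and fetching a blob, using only the Python standard library. The endpoint, the container/key URL layout, and the function names are assumptions made up for this example; they are not the S3 or Azure Storage API, and a real service would also require authentication headers.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint; ".invalid" guarantees it never resolves.
BASE_URL = "https://storage.example.invalid"


def blob_url(container, key):
    """Map a container and key onto a resource URL (assumed layout)."""
    return "%s/%s/%s" % (BASE_URL, quote(container, safe=""), quote(key, safe="/"))


def put_blob_request(container, key, data, content_type="application/octet-stream"):
    """Build (but do not send) the request that would store a blob."""
    return Request(
        blob_url(container, key),
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def get_blob_request(container, key):
    """Build (but do not send) the request that would fetch a blob."""
    return Request(blob_url(container, key), method="GET")
```

The application never sees a disk or a file system. The URL names the resource and the HTTP verb says what to do with it, which is what lets the provider decide where and how the bytes are actually kept.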
This change in computing will require people to learn a different way to develop applications. People will not like the changes until more successes happen. In the end, utility computing is going to succeed and will work hand in hand with virtualization.