An Intro to SimpleDB

In writing the PhotoWeb application for Amazon Web Services, I took advantage of the fact that an EC2 instance is just a VM running on top of Xen. Because it is just a VM, the entire application can be developed and tested on your local dev box before deploying to EC2. I highly recommend doing as much development and testing as you can on your local machine before deploying to the hosted environment. I decided to start by handling the authentication piece first. Because hosted SQL Server on Amazon costs extra, I used SimpleDB as the datastore.

Before covering how to use SimpleDB, I need to explain what SimpleDB is. As a database, SimpleDB has a lot more in common with old-school ISAM databases than with RDBMSs like Oracle, DB2, and SQL Server. To store data, you first need a domain. A domain is a collection of data. You can add data to a domain, query data within a domain using a SQL-like syntax, and delete data from a domain. The items within a domain may have heterogeneous structure. Data is added by setting a key for the item and passing a collection of name/value pairs, where each value must be able to transform to and from a string. Everything in the database is indexed automatically. Item names need to be unique; all other attributes are simply indexed to make queries efficient.
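To make that data model concrete, here is a rough in-memory sketch of how I think about a domain. This is a mental model only, not Amazon's API; the class and method names here are all made up for illustration.

```python
class Domain:
    """Toy model of a SimpleDB domain: items keyed by a unique name,
    each item a bag of attributes, every value stored as a string."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # item name -> {attribute name: [string values]}

    def put_attributes(self, item_name, attributes):
        # All values go in as strings, mirroring SimpleDB's rule that
        # values must round-trip through a string representation.
        item = self.items.setdefault(item_name, {})
        for key, value in attributes.items():
            item.setdefault(key, []).append(str(value))

    def select(self, predicate):
        # Stand-in for SimpleDB's SQL-like query: return matching item names.
        return [name for name, attrs in self.items.items() if predicate(attrs)]

photos = Domain("photos")
photos.put_attributes("photo-1", {"owner": "scott", "width": 800})
photos.put_attributes("photo-2", {"owner": "scott"})  # heterogeneous: no width
print(photos.select(lambda attrs: attrs.get("owner") == ["scott"]))
# -> ['photo-1', 'photo-2']
```

Note that "photo-2" has a different shape than "photo-1" and the domain doesn't care; that's the heterogeneous structure mentioned above.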

You do have some limits with SimpleDB:

  1. By default, you get 100 domains. If you need more, contact Amazon and ask.
  2. Each domain can contain a maximum of 1 billion attributes.
  3. Each domain can be no larger than 10 GB.
  4. Each item can only have 256 attributes.
  5. An attribute name and value must be less than 1024 bytes each.
  6. Queries must run in less than 5 seconds or they are abandoned.
  7. Each query can only have 10 predicates and 10 comparisons.
  8. A select expression can only specify 20 unique attributes (use * to return all of them).
  9. Responses are limited to 1 MB. (Paging tokens are returned to get the complete result set using many responses.)

Doing the math, that means you can store up to 1 TB across 100 domains before you need to ask for more space. SimpleDB supports eventual consistency: each piece of data is stored as multiple copies. When an add, update, or delete returns Success, you know your data was stored successfully at least once. It takes time for all copies to be updated, though, so an immediate read might not show the updated value. This means you need to design your applications to remember what was stored instead of hitting the database for the up-to-date information.
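The "remember what you stored" advice boils down to a read-your-own-writes pattern. Here's a minimal sketch of the idea; the EventuallyConsistentStore class below is a fake stand-in for SimpleDB, not a real client.

```python
class EventuallyConsistentStore:
    """Fake SimpleDB stand-in: reads may lag behind successful writes."""
    def __init__(self):
        self._committed = {}
        self._pending = {}

    def put(self, key, value):
        self._pending[key] = value  # accepted, but not yet visible to readers

    def get(self, key):
        return self._committed.get(key)  # may return stale or missing data

    def replicate(self):
        self._committed.update(self._pending)  # replication catches up

class ReadYourWritesClient:
    """Remembers its own writes so it never depends on an immediate re-read."""
    def __init__(self, store):
        self._store = store
        self._local = {}

    def put(self, key, value):
        self._store.put(key, value)
        self._local[key] = value

    def get(self, key):
        # Prefer our own recent write; fall back to the (possibly stale) store.
        return self._local.get(key, self._store.get(key))

store = EventuallyConsistentStore()
client = ReadYourWritesClient(store)
client.put("user:1", "Scott")
print(client.get("user:1"))  # "Scott", even before replication finishes
print(store.get("user:1"))   # None: a plain read still sees stale data
```

The local cache here is per-client and never expires, which is fine for a sketch; a real application would scope it to a request or session.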

Every request against SimpleDB requires a key and secret to make sure that only an authenticated identity is touching the data store. You can obtain these by going to http://aws.amazon.com/simpledb/ and clicking Sign up For Amazon SimpleDB. Then, select Resource Center. Under the Your Account menu, click Access Identifiers. From there, you should see something that says Access Key ID and Secret Access Key. Once you have this information, you can access SimpleDB, Simple Storage Service, and EC2 (and other services, but we won't be using them for this application).
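The Secret Access Key never goes over the wire; it is used to sign each request with an HMAC. The sketch below shows the general shape of AWS's Signature Version 2 style signing. The canonicalization details are from memory, so treat this as an illustration and check the SimpleDB documentation for the exact rules; the key values are obviously placeholders.

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(secret_key, host, path, params):
    """Sketch of AWS-style HMAC request signing (Signature Version 2 flavor).
    Canonicalization details are approximate; consult the real AWS docs."""
    # Sort parameters by name and percent-encode them into a canonical query.
    query = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(v, safe='')}"
        for k, v in sorted(params.items())
    )
    string_to_sign = f"GET\n{host}\n{path}\n{query}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

sig = sign_request(
    "my-secret-access-key",  # placeholder Secret Access Key
    "sdb.amazonaws.com", "/",
    {"Action": "ListDomains", "AWSAccessKeyId": "my-access-key-id"},
)
print(sig)  # a deterministic base64 signature string
```

The important property is determinism: the server recomputes the same HMAC from the same canonical string and compares, which is why parameter ordering has to be pinned down.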

Next time, we’ll build something with SimpleDB. In F#. Using a C# library. Oh yeah!


Developing PhotoWeb on Amazon Web Services

Back in February, I walked through the development of a photo storage application. The application originally comes from one of the examples in the REST book Kenn Scribner and I wrote, Effective REST Services via .NET. Photo sharing and uploading lets me present a well-understood application without providing a lot of background: you upload a photo somewhere and store metadata about the photo itself. We already covered Google App Engine in February.

For this application, we will use Amazon Web Services, including SimpleDB, Simple Storage Service, and Elastic Compute Cloud. At the end, I’ll tell you what I thought of the experience. I’ll develop the application in F#. When I presented this code to the Midwest Cloud Computing Users Group for the April 2009 meeting, Amanda Laucher offered up that my use of F# used some idioms she hadn’t seen before. That’s a nice way of saying “You appear to be a n00b.” Please keep that in mind whenever you review my F# code.

I’ll also be showing a few helper libraries that are open source (or similar). A nice thing about AWS is that the C# open source community has done a great job of putting out solid tools.

For EC2, I used the application at https://console.aws.amazon.com as well as Windows Remote Desktop to set up and manage the EC2 instance. I use SimpleDB as the backing store for a custom MembershipProvider for the ASP.NET application. Next time, we will look at how that provider was created.


Cloud Computing User Group and Chicago Code Camp

To everyone who came out to the Midwest Cloud Computing User Group (Chicago/Downers Grove), I want to thank you for coming and listening to how the various cloud platforms work. For those who missed the talk, I’m busy turning the comparisons into more detailed blog postings. I’ve probably got 2-3 months’ worth of posts that will present more details. Those posts should start showing up on Monday (4/13/2009).

I also mentioned that I am helping organize a Code Camp on Saturday, May 30, 2009 at the College of Lake County, Grayslake Campus. This Code Camp will have talks on .NET/Microsoft topics, Test Driven Development, Python, Google App Engine, Ruby, and more. Please visit www.ChicagoCodeCamp.com for details, to submit proposals, and to figure out how to register.


AWS Management Console is nice

The Amazon Web Services Management Console is SIGNIFICANTLY easier to use than the command line applications one must otherwise use to manage an EC2 instance. I had puzzled out all the commands and gotten things working. Tonight, I stumbled across the AWS Management Console: https://console.aws.amazon.com/ec2/. It’s a nice, point-and-click interface to set up an Amazon Machine Image (AMI), bundle that instance into a bucket, figure out connection credentials, set up common security groups, and so on. The interface has the advantage of telling me exactly how far along the system is with each step. This was particularly nice when bundling my AMI. Without the feedback, I wouldn’t have known what was going on for the 20 or so minutes while the instance was paused and packaged. The service even handles storing the image in your S3 account.

I also had to install IIS tonight. The console made it easy to create a volume based on the Windows CD from the list of already stored volumes and attach that volume to my running instance. For something that looked tough, the tool made things easy (OK, easier).

For what it’s worth, if you are building web applications, AWS EC2 has hands down the WORST initial deployment experience (things get a lot better once the VM is up and running). Azure takes about 15 minutes from build to first deployment. App Engine takes about 2 minutes.


Filtering in a sequence

When many mechanisms exist for a task, more than one of them will work. Here is another, much cleaner way to handle yesterday’s Seq.fold task. It’s still neat to see all these workable options! Seq has another function, filter, which returns only those elements for which the supplied function returns true.

#light

open System.Collections.Generic

type MyType(propA:string, propB:int) =
    member this.PropA = propA
    member this.PropB = propB
    override this.ToString() = System.String.Format("{0} {1}", this.PropA, this.PropB)

let typeSeq =
    [|  new MyType("red", 4);
        new MyType("red", 3);
        new MyType("blue", 3);
        new MyType("blue", 4);
        new MyType("blue", 5); |]

let filterVals theSeq propAValue =
    theSeq |> Seq.filter (fun (a:MyType) -> a.PropA = propAValue)

printfn "%A" (filterVals typeSeq "red")
printfn "%A" (filterVals typeSeq "blue")


I’m Presenting at Chicago Cloud Computing User Group

I received some great feedback on my survey of cloud computing platforms at the February Cloud Computing User Group in Downers Grove, so I’ve been asked to bring the show to the downtown Chicago meeting for the April meeting on Thursday, April 9 at 5:30 at Microsoft’s downtown location. If you saw the talk last month, I’ve been asked to beef it up with real world code running on all three major platforms: Google App Engine, Amazon Web Services, and Azure. It’s a simple photo sharing application based on the same code I showed in February for App Engine. Please sign up at https://www.clicktoattend.com/invitation.aspx?code=136727.

The full meeting description:

Join us for the fourth local meeting of the Cloud Computing User Group, this month’s being held at the Microsoft offices in downtown Chicago. Note: this meeting will be a revisit and slight revamp of last month’s content. As before, we will be learning about how Live ID integration works in the Azure cloud computing platform. We’ll demo and dig into the code of an application built in the cloud that integrates directly with the Live ID service and stores information specific to the individual associated with that ID.

Also, as before, Scott Seely, Architect at MySpace, will kick off the meeting with a 1-hour overview of the top three cloud computing offerings available today: Google App Engine, Amazon EC2, and Azure Services. Scott will also demo during this meeting how to actually implement code on each of the three platforms, showcasing application hosting, storage, and data retrieval.

Again, you can register at https://www.clicktoattend.com/invitation.aspx?code=136727. I hope to see you there!


Another use of Seq.fold: filtering

Last night, I had to filter a bunch of entries in a sequence down to only those items that had a particular characteristic. My thought process went like this: “I have a list of objects. When I’m done, I want a list of objects that have property X, discarding everything else. Ah-ha, I want to fold the sequence into a List.” I had some other constraints: the other library I was using was written in C# and used List<T> instead of F# types. I don’t know that this is the best way to do things, but I’ll show it anyhow as an illustration of nifty stuff I’ve done lately.

The code takes an array of a well-defined type and then filters on that type. In this case, the type is a formalized string * int tuple with names to make it usable from C#. You can substitute any well-defined C# data type for equivalent effects. For example, imagine scanning some Person or Order business objects in your F# code looking for people named Fred or orders over $100. In our case, the type MyType is a stand-in. The code filters on one property of the objects and returns only those instances with a matching value.

#light

open System.Collections.Generic

type MyType(propA:string, propB:int) =
    member this.PropA = propA
    member this.PropB = propB
    override this.ToString() = System.String.Format("{0} {1}", this.PropA, this.PropB)

let typeSeq =
    [|  new MyType("red", 4);
        new MyType("red", 3);
        new MyType("blue", 3);
        new MyType("blue", 4);
        new MyType("blue", 5); |]

let filterVals theSeq propAValue =
    theSeq |> Seq.fold (fun (acc:List<MyType>) (a:MyType) ->
                            if (a.PropA = propAValue) then
                                acc.Add(a)
                            acc) (new List<MyType>())

printfn "%A" (filterVals typeSeq "red")
printfn "%A" (filterVals typeSeq "blue")

This code generates the following output in the interactive window:

seq [red 4; red 3]
seq [blue 3; blue 4; blue 5]

I like this syntax a bit better than the LINQ syntax because I can get a typed collection trivially. I’m also enjoying the move away from explicit looping constructs and toward simple functions that just do the right thing for me.
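The fold-versus-filter tradeoff exists in most languages, not just F#. Here's a quick Python sketch of the same idea: both approaches produce identical results, but the fold threads an explicit accumulator while the filter stays state-free.

```python
from functools import reduce

# Same data as the F# example: (PropA, PropB) pairs.
items = [("red", 4), ("red", 3), ("blue", 3), ("blue", 4), ("blue", 5)]

def filter_with_fold(seq, color):
    # Fold: thread an accumulator list through the sequence, appending matches.
    return reduce(lambda acc, item: acc + [item] if item[0] == color else acc,
                  seq, [])

def filter_with_filter(seq, color):
    # Built-in filtering, analogous to Seq.filter: shorter and state-free.
    return [item for item in seq if item[0] == color]

print(filter_with_fold(items, "red"))    # [('red', 4), ('red', 3)]
print(filter_with_filter(items, "red"))  # [('red', 4), ('red', 3)]
```

As in the F# versions, the fold earns its keep only when the accumulator does more than collect matches; for plain filtering, filter wins.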


Full transition off of Comcast

I had a lot of issues with Comcast and their customer service. The last straw came 3 weeks ago when I got a bill indicating that they wanted money for a service I had cancelled on January 5, 2009. I had spent a total of 22 hours on the phone or going to their offices in person to explain that I only wanted Comcast Internet service: no phone, no TV. Their billing system was beyond repair, and I just don’t have the time to fix my bill every month that they f*ck it up. I’m now using AT&T Internet-only (no land line) DSL service. Hookup was painless. Speed is well within the land of good enough: I can watch TV shows on Hulu and listen to online radio stations just like before. Really big downloads (think an MSDN copy of Visual Studio 2008) only move at 275-300 KB/second; Comcast used to give me 1500 KB/second.

Oh yeah, AT&T is also 33% cheaper (or Comcast is 50% more expensive). Since the start of the year, my monthly home communication bill has gone from $390/month to $238/month as of this last switch. Please note that I could be doing better if I didn’t feel the need to be a mobile computer geek:

  • 4 cell phones on a 1400 minute/month plan (1 phone is also the house phone, hooked up to the internal wiring): $89.99 + 2 * $9.99 = $109.97, times 0.8 (the standard discount offered if you belong to an organization that has a discount set up with AT&T Wireless).
  • Unlimited texting on all phones: $30.
  • Tethering/data plan on one phone for mobility (a 3 Mbps 3G connection for when I’m not at home but still need to work): $65.
  • Pro DSL plan from AT&T: $40.
  • 11.3% tax on everything but DSL: $15.26.

This represents a 40% reduction in communication costs. I purchased a fair amount of equipment to drop cable and move to cheaper alternatives:

  • $100 for a bluetooth cell phone->home line connector.
  • $129 for a new dual digital tuner for my HTPC.
  • $100 for a new HDTV roof antenna and two signal boosters.
  • $100 for a 1 TB Hard Drive to hold the larger HD content.

Total cost for the update was $429. I’m saving $152/month, putting payback at about 3 months. The HD spectrum in my area has enough good TV to keep us happy, so we have lost nothing of importance in this switch.

The point: I won’t miss Comcast, and I’m saving a bunch of cash by eliminating things I didn’t need. Also, my home phone number is now fully portable.


SB 1522 From the IL 96th General Assembly Should Be Passed

SB 1522 (http://tinyurl.com/c2b4yx) in Illinois sounds promising for IL-based startups. Call your IL state senator: http://tinyurl.com/dgpmzw.

What does the bill do? I’m just going to grab the text from the May Report. I’m skipping adding links because any Illinois resident who reads my blog doesn’t need more links to click on. Not this time!

  • A capped ($10 million annually) grant-matching program. For example, if a qualified Illinois tech startup obtains a $100,000 federal Small Business Innovative Research (SBIR) grant, the state of Illinois will match that grant.
  • A capped ($15 million annually) investment tax credit for state-registered and qualified early-stage investors. Under appropriate circumstances, such an investor making an early-stage investment in a technology startup would receive a capped credit against his/her Illinois tax bill.

Why we need it:

Illinois’ failure to translate its world-class research capabilities into a vibrant startup community has been documented in depressing detail by numerous studies over the last 10 years. Technology firms created here frequently move to other states, including those in the nearby Midwest, recruited away by SB 1522-like programs. As a result, brains, talent, and Illinois-taxpayer-financed technical discoveries leave our state.

What you can do:

  1. Find your state Senator: http://tinyurl.com/dgpmzw.
  2. Call your Senator and ask him/her to support SB 1522. Stress that:
    • Illinois needs the jobs and tax base that tech firms provide.
    • This bill provides much needed stimulus to our economy, because small companies create 60-80% of new jobs and each tech job creates 3-6 jobs indirectly.
    • The funding is so tiny against the total state budget that it will have little if any impact on the deficit/tax picture.

Please act now! State legislators say that if 3-5 people call them about a bill, they really take notice. You can make sure these much-needed programs are available to our Illinois entrepreneurs. Thanks for your help! Again, the URL to locate your state senator is http://tinyurl.com/dgpmzw.


Moving from Azure Desktop -> Cloud Table Storage Issue

Here’s a small gotcha that I didn’t see covered via the normal Google searching. So, I’m adding the information and the solution so that I can find the answer when I need it again. I’m sharing via my blog to help you out too. If this helps you, click on a link and send some change my way. :)

Symptom: You followed the rules and ran “Create Test Storage Tables” from Visual Studio on your dev box. All your testing locally seems to work. When you deploy, you see an error like this (I’m leaving lots of Google discovery goodness in here; if this saves your bacon, send me a thank you!):


<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
<code>TableNotFound</code>
<message xml:lang="en-US">The table specified does not exist.</message>
</error>

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Data.Services.Client.DataServiceClientException: <?xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
<code>TableNotFound</code>
<message xml:lang="en-US">The table specified does not exist.</message>
</error>
Source Error:

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace:

[DataServiceClientException: <?xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <code>TableNotFound</code>
  <message xml:lang="en-US">The table specified does not exist.</message>
</error>
]
   System.Data.Services.Client.<HandleBatchResponse>d__1d.MoveNext() +1294

[DataServiceRequestException: An error occurred while processing this request.]
   System.Data.Services.Client.SaveAsyncResult.HandleBatchResponse() +391100
   System.Data.Services.Client.DataServiceContext.SaveChanges(SaveChangesOptions options) +177
   Microsoft.Samples.ServiceHosting.StorageClient.<>c__DisplayClass1.<SaveChangesWithRetries>b__0() in C:\Users\Scott Seely\Documents\Visual Studio 2008\Projects\AzureSamples\StorageClientLib\TableStorage.cs:1227
   Microsoft.Samples.ServiceHosting.StorageClient.RetryPolicies.NoRetry(Action action) in C:\Users\Scott Seely\Documents\Visual Studio 2008\Projects\AzureSamples\StorageClientLib\BlobStorage.cs:220
   Microsoft.Samples.ServiceHosting.StorageClient.TableStorageDataServiceContext.SaveChangesWithRetries() in C:\Users\Scott Seely\Documents\Visual Studio 2008\Projects\AzureSamples\StorageClientLib\TableStorage.cs:1215


The critical part of the fix was making sure that my tables were actually initialized prior to running any queries. The class that holds my IQueryable (so that table storage actually works) now has an Init() function. The class now reads:

    public class AzureMembershipDataContext : TableStorageDataServiceContext
    {
        public AzureMembershipDataContext() :
            base(StorageAccountInfo.GetDefaultTableStorageAccountFromConfiguration())
        {
            Init();
        }

        private const string TableName = "AzureUserTable";
        public IQueryable<AzureUserData> AzureUserTable
        {
            get
            {
                return CreateQuery<AzureUserData>(TableName);
            }
        }

        private static bool _initialized = false;
        static void Init()
        {
            if (!_initialized)
            {
                StorageAccountInfo storageAccountInfo =
                    StorageAccountInfo.GetDefaultTableStorageAccountFromConfiguration();
                TableStorage.CreateTablesFromModel(typeof(AzureMembershipDataContext), storageAccountInfo);
                _initialized = true;
            }
        }

        public void Add(AzureUserData data)
        {
            base.AddObject(TableName, data);
        }
    }

And yeah, that call to CreateTablesFromModel in Init() is super important. Without it, you can’t do anything. Make sure the model is created from the type containing your IQueryable; going off of the table type doesn’t work with the API.
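The underlying pattern, run schema creation exactly once before the first query, is worth keeping in your toolbox regardless of platform. A language-neutral sketch in Python, where the create_tables callable stands in for CreateTablesFromModel:

```python
class DataContext:
    """Runs a one-time schema-creation step before any query can execute,
    mirroring the Init() guard in the C# class above."""
    _initialized = False  # shared across all instances, like the static flag

    def __init__(self, create_tables):
        # create_tables stands in for CreateTablesFromModel; the guard
        # ensures it runs exactly once no matter how many contexts we build.
        if not DataContext._initialized:
            create_tables()
            DataContext._initialized = True

calls = []
DataContext(lambda: calls.append("created"))
DataContext(lambda: calls.append("created"))
print(calls)  # ['created']: the second context skips table creation
```

Like the C# version, this sketch is not thread-safe; a lock around the flag check would harden it if multiple requests can construct contexts concurrently.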
