An Intro to SimpleDB


In writing the PhotoWeb application for Amazon Web Services, I took advantage of the fact that an EC2 instance is just a VM running on top of Xen. Because it is just a VM, the entire application can be developed and tested on your local dev box before deploying to EC2. I highly recommend doing as much development and testing as possible on your local machine before deploying to the hosted environment. I decided to start with the authentication piece. Because hosted SQL Server on Amazon costs extra, I used SimpleDB as the datastore.

Before covering how to use SimpleDB, I need to explain what SimpleDB is. As a database, SimpleDB has a lot more in common with old-school ISAM databases than with relational systems like Oracle, DB2, and SQL Server. To store data, you first need a domain. A domain is a collection of data. You can add data to a domain, query data within a domain using a SQL-like syntax, and delete data from a domain. The items within a domain may have heterogeneous structure. To add data, you give the item a name (its key) and pass a collection of name/value pairs; every value must be convertible to and from a string. Everything in the database is indexed automatically. Item names need to be unique. All other attributes are simply indexed to make queries efficient.
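To make the data model concrete, here is a toy in-memory stand-in for a domain. It is not the SimpleDB API: the `ToyDomain` class and its `put`, `select`, and `delete` methods are names I made up for illustration, and it assumes a single value per attribute. It only shows the shape of the data: items keyed by name, each holding its own set of string-valued attributes.

```python
from typing import Dict, List


class ToyDomain:
    """In-memory stand-in for a domain (hypothetical, not the AWS API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # item name -> {attribute name -> string value}
        self.items: Dict[str, Dict[str, str]] = {}

    def put(self, item_name: str, attributes: Dict[str, object]) -> None:
        # Every value is stored as a string, so convert on the way in.
        stored = self.items.setdefault(item_name, {})
        for attr, value in attributes.items():
            stored[attr] = str(value)

    def select(self, attr: str, value: object) -> List[str]:
        # Names of items whose attribute equals the value, compared as strings.
        wanted = str(value)
        return sorted(
            name for name, attrs in self.items.items() if attrs.get(attr) == wanted
        )

    def delete(self, item_name: str) -> None:
        self.items.pop(item_name, None)


users = ToyDomain("users")
# Items in one domain do not need the same attributes.
users.put("alice", {"email": "alice@example.com", "age": 30})
users.put("bob", {"email": "bob@example.com", "role": "admin", "age": 30})
matches = users.select("age", 30)
```

Notice that `age` goes in as a number and comes back as the string "30". Anything you want to compare or sort on has to survive that round trip through a string.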

You do have some limits with SimpleDB:

  1. By default, you get 100 Domains. If you need more, contact Amazon and ask for more.
  2. Each domain can contain a maximum of 1 billion attributes.
  3. Each domain can be no larger than 10 GB.
  4. Each item can only have 256 attributes.
  5. An attribute name and value must be less than 1024 bytes each.
  6. Queries must run in less than 5 seconds or they are abandoned.
  7. Each query can only have 10 predicates and 10 comparisons.
  8. A select expression can only specify 20 unique attributes (use * to get more/all).
  9. Responses are limited to 1 MB. (Paging tokens are returned to get the complete result set using many responses.)
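A couple of these limits are easy to check on the client before sending anything. The sketch below is my own helper, not part of any SimpleDB library: `check_item` is a hypothetical name, it assumes one value per attribute, and it treats exactly 1024 bytes as allowed, which is an assumption you should confirm against the SimpleDB documentation.

```python
from typing import Dict, List

MAX_ATTRIBUTES_PER_ITEM = 256
MAX_NAME_OR_VALUE_BYTES = 1024  # assumed to be an inclusive maximum


def check_item(attributes: Dict[str, str]) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems: List[str] = []
    if len(attributes) > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            f"{len(attributes)} attributes exceeds {MAX_ATTRIBUTES_PER_ITEM}"
        )
    for name, value in attributes.items():
        # The limits are in bytes, so measure the encoded form, not the characters.
        if len(name.encode("utf-8")) > MAX_NAME_OR_VALUE_BYTES:
            problems.append(f"attribute name too long: {name[:20]}...")
        if len(value.encode("utf-8")) > MAX_NAME_OR_VALUE_BYTES:
            problems.append(f"value too long for attribute: {name[:20]}")
    return problems


ok = check_item({"email": "alice@example.com"})
too_long = check_item({"bio": "x" * 1025})
```

Here `ok` is an empty list and `too_long` holds one message about the `bio` value.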

Doing the math, 100 domains at 10 GB each means you can have a database of 1 TB before you need to ask for more space.

SimpleDB offers eventual consistency. Each piece of data is stored as multiple copies. Once an add, update, or delete returns Success, you know that your data was successfully stored at least once. It does take time for all copies to be updated, so an immediate read might not show the updated value. This means you need to design your applications to remember what they stored instead of hitting the database for the up-to-date information.
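One way to "remember what was stored" is to keep a local copy of every value you write and prefer it on reads. The sketch below simulates that idea with two made-up classes: `LaggingStore` is a fake store whose reads lag behind its writes, and `RememberingClient` is the wrapper. Neither is SimpleDB code, and a real application would also need to decide when to drop the local copies.

```python
from typing import Dict, Optional


class LaggingStore:
    """Hypothetical store whose reads lag behind writes until sync() runs."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}
        self._visible: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # The write is accepted but not yet visible to readers.
        self._pending[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._visible.get(key)

    def sync(self) -> None:
        # Simulates all copies catching up.
        self._visible.update(self._pending)
        self._pending.clear()


class RememberingClient:
    """Keeps a local copy of everything it writes and prefers it on reads."""

    def __init__(self, store: LaggingStore) -> None:
        self._store = store
        self._written: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._store.put(key, value)
        self._written[key] = value

    def get(self, key: str) -> Optional[str]:
        if key in self._written:
            return self._written[key]
        return self._store.get(key)


store = LaggingStore()
client = RememberingClient(store)
client.put("alice", "logged-in")
stale = store.get("alice")   # the store has not caught up yet
fresh = client.get("alice")  # served from the client's own memory
```

Reading straight from the store right after the write returns nothing, while the client still answers with the value it just wrote.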

Any request against SimpleDB requires an access key ID and a secret key, so that only an authenticated identity touches the data store. You can obtain both by going to http://aws.amazon.com/simpledb/ and clicking on Sign up For Amazon SimpleDB. Then, select Resource Center. Under the Your Account menu, click on Access Identifiers. From there, you should see entries labeled Access Key ID and Secret Access Key. Once you have this information, you can access SimpleDB, Simple Storage Service, and EC2 (and other services, but we won’t be using them for this application).
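To give a feel for how the two values are used, here is a rough sketch of HMAC-based request signing in the style of AWS Signature Version 2. Treat all of it as assumption: the `sign_request` name, the `sdb.amazonaws.com` host, the parameter names, and the canonicalization details are not taken from this post, so check the SimpleDB documentation for the exact rules. The point is only that the Access Key ID travels with the request while the Secret Access Key never does; it is used to compute a signature.

```python
import base64
import hashlib
import hmac
from typing import Dict
from urllib.parse import quote


def sign_request(
    params: Dict[str, str], secret_key: str, host: str = "sdb.amazonaws.com"
) -> str:
    """Return a base64 HMAC-SHA256 signature over a canonical request string."""
    # Sort parameters by name and percent-encode both names and values.
    canonical = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    # Assumed layout: verb, host, path, and query string, one per line.
    string_to_sign = "\n".join(["GET", host, "/", canonical])
    digest = hmac.new(
        secret_key.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder credentials; never hard-code real ones.
params = {
    "Action": "ListDomains",
    "AWSAccessKeyId": "EXAMPLE-KEY-ID",
    "Timestamp": "2009-01-01T00:00:00Z",
}
signature = sign_request(params, "example-secret")
```

In practice a client library does this for you, which is one good reason to use one.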

Next time, we’ll build something with SimpleDB. In F#. Using a C# library. Oh yeah!
