
Azure Storage is a RESTful service

Today I had to build a demo for Azure and I noticed that I was following a tired old path where one demonstrates Azure storage services (Table/Queue/Blob) via a hosted application. My demo has two key points:

1. Look, there’s a picture that I uploaded!

2. Look, these processes can send messages via the queue!

The kicker was going to be that the messages are exchanged over the Internet, not the demo environment. I wanted a resource, an image of penguins, to be visible via a public URL. That was going to be the cool part. Then I thought: “Well, the Azure SDK is just a bunch of libraries. The libraries should work fine in a console application, right?” Right!

And guess what: the demo actually works pretty slick because I can demo the storage service in isolation. I don’t need to demo it with a deployed application. That helps me out, and gives me some ideas on how I can use Azure differently.

I started out with a little utility class that reads a connection string from a config file and returns a ready-to-use CloudStorageAccount instance.

 

public static class Utility
{
  private static CloudStorageAccount _account;

  public static CloudStorageAccount StorageAccount
  {
    get
    {
      if (_account == null)
      {
        _account = CloudStorageAccount.Parse(
          ConfigurationManager.AppSettings["storageAccount"]);
      }
      return _account;
    }
  }
}

The config is just the following (minus the line breaks in the ‘value’ value):

  <appSettings>
    <add key="storageAccount"
         value="DefaultEndpointsProtocol=https;
                AccountName=[your account name];
                AccountKey=[your account key]"/>
  </appSettings>

My scenario is this: I have a directory with images. I want to force those images to be uploaded to Azure blob storage. This needs to happen from the local machine. I was really surprised how easy this is to do. The Microsoft.WindowsAzure.StorageClient assembly has all the code you need to make this work. To upload the images and make them visible to the public, I just wrote the following:

static void Main()
{
  var client = Utility.StorageAccount.CreateCloudBlobClient();
  var dirInfo = new DirectoryInfo(Environment.CurrentDirectory);
  var cloudContainer = new CloudBlobContainer("friseton", client);
  var permissions = new BlobContainerPermissions
                      {
                        PublicAccess = BlobContainerPublicAccessType.Blob
                      };
  cloudContainer.CreateIfNotExist();
  cloudContainer.SetPermissions(permissions);
  foreach (var fileInfo in dirInfo.EnumerateFiles("*.jpg"))
  {
    var blobRef = cloudContainer.GetBlobReference(fileInfo.Name);
    blobRef.DeleteIfExists();
    blobRef.UploadFile(fileInfo.FullName);
  }
}
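Because the container is given BlobContainerPublicAccessType.Blob access, each uploaded image should be readable by anyone with a plain HTTP GET. Here is a minimal sketch of how that public address is put together, assuming the default blob.core.windows.net endpoint. The account name "myaccount" and the file name "penguins.jpg" are placeholders I made up, not values from the demo:

```csharp
using System;

static class BlobUrlSketch
{
  // Builds the public address of a blob, assuming the default endpoint
  // pattern https://<account>.blob.core.windows.net/<container>/<blob>.
  public static Uri PublicBlobUri(string accountName, string container,
    string blobName)
  {
    // Escape the blob name so spaces and similar characters survive.
    return new Uri(string.Format(
      "https://{0}.blob.core.windows.net/{1}/{2}",
      accountName, container, Uri.EscapeDataString(blobName)));
  }

  static void Main()
  {
    // "myaccount" and "penguins.jpg" are hypothetical placeholders.
    Console.WriteLine(PublicBlobUri("myaccount", "friseton", "penguins.jpg"));
  }
}
```

Paste the resulting address into a browser and the image should come back without any credentials, which is the whole point of the demo.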

In this way, you can use Azure blob storage the same way you use Amazon’s S3. The queue can be used like the Simple Queue Service, and tables can be accessed like SimpleDB.

Yes, I understand that this has been possible for Azure all along. It finally clicked in my noggin that developing for Azure really can mean just picking and choosing the parts you want to use. You don’t need to go all in to use the service. Instead, just pick the parts that make sense and build apps!

For those who are curious, here is the queue demo too. It uses an object, Name, to send messages. This could be ANY object; I just picked something simple for demo purposes.

[DataContract]
public class Name
{
  [DataMember]
  public string FirstName { get; set; }

  [DataMember]
  public string LastName { get; set; }
}

It is sent to the queue with this code:

static void Main()
{
  var client = Utility.StorageAccount.CreateCloudQueueClient();
  var cloudQueue = new CloudQueue(
    Utility.StorageAccount.QueueEndpoint + "friseton",
    client.Credentials);
  cloudQueue.CreateIfNotExist();
  var name = new Name {FirstName = "Scott", LastName = "Seely"};
  var stream = new MemoryStream();
  var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
  var ser = new DataContractSerializer(typeof (Name));
  ser.WriteObject(writer, name);
  writer.Flush();
  var buffer = new byte[stream.Length];
  Array.Copy(stream.GetBuffer(), buffer, stream.Length);
  var message = new CloudQueueMessage(buffer);
  cloudQueue.AddMessage(message, TimeSpan.FromHours(1));
}

And read from the queue thusly:

static void Main()
{
  var client = Utility.StorageAccount.CreateCloudQueueClient();
  var cloudQueue = new CloudQueue(
    Utility.StorageAccount.QueueEndpoint + "friseton",
    client.Credentials);
  cloudQueue.CreateIfNotExist();
  var ser = new DataContractSerializer(typeof(Name));
  var timeToStop = DateTime.Now + TimeSpan.FromMinutes(2);
  while (DateTime.Now < timeToStop)
  {
    if (cloudQueue.RetrieveApproximateMessageCount() > 0)
    {
      // The count is only approximate, so GetMessage can still return null.
      var message = cloudQueue.GetMessage();
      if (message != null)
      {
        var buffer = message.AsBytes;
        var reader = XmlDictionaryReader.CreateBinaryReader(buffer, 0,
          buffer.Length, XmlDictionaryReaderQuotas.Max);
        var name = ser.ReadObject(reader) as Name;
        if (name != null)
        {
          Console.WriteLine("{0} {1} {2}", name.FirstName,
            name.LastName, message.InsertionTime);
        }
        cloudQueue.DeleteMessage(message);
      }
    }
    Thread.Sleep(TimeSpan.FromSeconds(1));
  }
}
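If you want to try the serialization half without a storage account, the same steps run fine on their own. This is a sketch, not part of the demo: the RoundTripSketch class and its ToBytes/FromBytes helpers are names I made up, and the byte array simply stands in for the body of the queue message.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

[DataContract]
public class Name
{
  [DataMember]
  public string FirstName { get; set; }

  [DataMember]
  public string LastName { get; set; }
}

public static class RoundTripSketch
{
  // The same steps the sender performs before building a queue message.
  public static byte[] ToBytes(Name name)
  {
    var ser = new DataContractSerializer(typeof(Name));
    var stream = new MemoryStream();
    var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
    ser.WriteObject(writer, name);
    writer.Flush();
    // ToArray copies only the bytes written, not the spare capacity.
    return stream.ToArray();
  }

  // The same steps the receiver performs on the message bytes.
  public static Name FromBytes(byte[] buffer)
  {
    var ser = new DataContractSerializer(typeof(Name));
    var reader = XmlDictionaryReader.CreateBinaryReader(buffer, 0,
      buffer.Length, XmlDictionaryReaderQuotas.Max);
    return ser.ReadObject(reader) as Name;
  }

  static void Main()
  {
    var bytes = ToBytes(new Name { FirstName = "Scott", LastName = "Seely" });
    var name = FromBytes(bytes);
    Console.WriteLine("{0} {1}", name.FirstName, name.LastName);
  }
}
```

Note that MemoryStream.ToArray does in one call what the new byte[] plus Array.Copy pair does in the sender.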

And yes, the queue code also works from your local machine. The example above does require you to have the Azure SDK installed.


Speaking at Chippewa Valley .NET Users' Group Tomorrow

I’ll be doing a beginner’s talk on WCF tomorrow night at the Chippewa Valley .NET Users’ Group. Details are here: http://cvnug.wi-ineta.org/DesktopDefault.aspx?tabid=73. I’m looking forward to meeting some new people!


Converting a number to an arbitrary Radix

One of the things that is great about integers and longs is that they are easy to remember. However, when using them as identifiers that a human should be able to type in, they leave a lot to be desired. An integer in the billions requires a user to enter 10 digits correctly. That’s hard to read, and it is hard to keep your place. There is a solution to this issue: represent the data using a radix other than 10. For English speakers, a radix of 36 is easily readable and maximizes the density while allowing for case insensitivity (no one wants to remember if they should type z or Z!).

Consider this: the value for int.MaxValue is written out as:

2147483647

As base 36, it is

zik0zj

6 characters instead of 10. Nice!

To do this, I wrote a simple function that converts a long (64 bits!) to any radix between 2 and 36. This is a basic first or second semester CS problem, I know. Still, this code is handy to have when you need it for converting numbers into something a person can type in:

 

static string ConvertToString(long value, int toBase)
{
  if (toBase < 2 || toBase > 36)
  {
    throw new ArgumentOutOfRangeException("toBase",
      "Must be in the range of [2..36]");
  }
  var values = new List<char>();
  for (var val = '0'; val <= '9'; ++val)
  {
    values.Add(val);
  }
  for (var val = 'a'; val <= 'z'; ++val)
  {
    values.Add(val);
  }

  var builder = new StringBuilder();
  bool isNegative = value < 0;
  do
  {
    // Take the absolute value of the remainder rather than negating
    // value up front: -long.MinValue overflows, the remainder cannot.
    long index = Math.Abs(value%toBase);
    builder.Insert(0, values[(int)index]);
    value = value/toBase;
  } while (value != 0);
  if (isNegative)
  {
    builder.Insert(0, '-');
  }
  // The do/while always emits at least one digit, so zero yields "0".
  return builder.ToString();
}

And, to go the other way:

 

static long ConvertToLong(string input, int fromBase)
{
  if (fromBase < 2 || fromBase > 36)
  {
    throw new ArgumentOutOfRangeException("fromBase",
      "Must be in the range of [2..36]");
  }
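  if (string.IsNullOrEmpty(input)) return 0;
  // Trim first so whitespace-only input cannot reach input[0] below, and
  // lower-case so that "ZIK0ZJ" parses the same as "zik0zj".
  input = input.Trim().ToLowerInvariant();
  if (input.Length == 0) return 0;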
  var values = new List<char>();
  for (var val = '0'; val <= '9'; ++val)
  {
    values.Add(val);
  }
  for (var val = 'a'; val <= 'z'; ++val)
  {
    values.Add(val);
  }

  bool isNegative = false;
  int startIndex = 0;
  if (input[0] == '-')
  {
    isNegative = true;
    ++startIndex;
  }
  long retval = 0;
  for(int index = startIndex; index < input.Length; ++index)
  {
    // Look the digit up first so an invalid character stops the parse
    // without scaling the value accumulated so far.
    int number = values.IndexOf(input[index]);
    if (number < 0 || number >= fromBase) break;
    retval = retval*fromBase + number;
  }
  if (isNegative)
  {
    retval = -retval;
  }
  return retval;
}
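To see the two directions working together, here is a compact, self-contained restatement of the same idea with a small round trip. It is a sketch rather than a copy of the functions above: RadixSketch, ToRadix, and FromRadix are names I picked for the example, and a constant digit string replaces the List<char>.

```csharp
using System;
using System.Text;

public static class RadixSketch
{
  const string Digits = "0123456789abcdefghijklmnopqrstuvwxyz";

  public static string ToRadix(long value, int radix)
  {
    if (radix < 2 || radix > 36) throw new ArgumentOutOfRangeException("radix");
    var builder = new StringBuilder();
    bool isNegative = value < 0;
    do
    {
      // The remainder of a negative value is negative, so take its magnitude.
      builder.Insert(0, Digits[(int)Math.Abs(value % radix)]);
      value /= radix;
    } while (value != 0);
    if (isNegative) builder.Insert(0, '-');
    return builder.ToString();
  }

  public static long FromRadix(string input, int radix)
  {
    if (radix < 2 || radix > 36) throw new ArgumentOutOfRangeException("radix");
    input = (input ?? "").Trim().ToLowerInvariant();
    bool isNegative = input.Length > 0 && input[0] == '-';
    long result = 0;
    for (int i = isNegative ? 1 : 0; i < input.Length; ++i)
    {
      // Stop at the first character that is not a digit in this radix.
      int digit = Digits.IndexOf(input[i]);
      if (digit < 0 || digit >= radix) break;
      result = result * radix + digit;
    }
    return isNegative ? -result : result;
  }

  static void Main()
  {
    Console.WriteLine(ToRadix(int.MaxValue, 36));  // zik0zj
    Console.WriteLine(FromRadix("ZIK0ZJ", 36));    // 2147483647
    Console.WriteLine(ToRadix(-255, 16));          // -ff
  }
}
```

The upper-case input in the second call is deliberate: it shows the case insensitivity that makes base 36 pleasant to type.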


Reading a WebResponse into a byte[]

This question came up on Twitter. I’m posting the solution here for posterity. How do you read a non-seekable Stream into a byte[]? Specifically, the stream from an HttpWebResponse? Like this:

 

class Program
{
  static void Main(string[] args)
  {
    var request = WebRequest.Create("http://www.scottseely.com/blog.aspx");
    using (var response = (HttpWebResponse)request.GetResponse())
    using (var stream = response.GetResponseStream())
    {
      // Assumes the server sends Content-Length; a chunked response would not.
      var buffer = new byte[int.Parse(response.Headers["Content-Length"])];
      var totalBytesRead = 0;
      while (totalBytesRead < buffer.Length)
      {
        // Read at the running offset; Read may return fewer bytes than asked.
        var bytesRead = stream.Read(buffer, totalBytesRead,
          buffer.Length - totalBytesRead);
        if (bytesRead == 0) break; // the stream ended early
        totalBytesRead += bytesRead;
      }
      Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, totalBytesRead));
    }
  }
}
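The listing above leans on the Content-Length header to size the buffer. When that header is missing, one common alternative is to read in fixed-size chunks and collect them in a MemoryStream. A minimal sketch, where ReadFully and the 8192-byte chunk size are my own choices and a MemoryStream stands in for the response stream:

```csharp
using System;
using System.IO;

public static class StreamSketch
{
  // Reads any readable stream to the end without knowing its length up front.
  public static byte[] ReadFully(Stream input)
  {
    var chunk = new byte[8192];
    using (var collected = new MemoryStream())
    {
      int bytesRead;
      // Read returns 0 only when the stream has ended.
      while ((bytesRead = input.Read(chunk, 0, chunk.Length)) > 0)
      {
        collected.Write(chunk, 0, bytesRead);
      }
      return collected.ToArray();
    }
  }

  static void Main()
  {
    // A MemoryStream stands in for the response stream here.
    var source = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });
    Console.WriteLine(ReadFully(source).Length); // 5
  }
}
```

On .NET 4 and later, Stream.CopyTo into a MemoryStream does the same job in one call.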


XmlDictionary and Binary Serialization

One of the interesting things that came out of WCF is the improvements in Infoset serialization. In particular, WCF introduced a format for binary serialization which reduces space concerns for objects. One of the keys to saving space is the notion of an XmlDictionary. The WCF serialization folks asked the questions:

How much could we reduce the size of a message if we allowed the parties communicating to exchange metadata about the messages?

What if we could reduce the size of messages by exchanging aliases for the XML Infoset node names?

The result of this what-if experiment is the XmlDictionary and XmlBinaryWriterSession. The mechanism is astonishingly simple. Assume that both ends have a mechanism for exchanging information about what to call the two parts of a QName: the namespace and the name of the node. Instead of sending namespace:element qualified items, send aliases. This works well in WCF messaging and happens whenever you send messages over the binary serializer. You can also use this in your own code that uses a binary serializer. The only requirement is that the serializer and deserializer have to agree on the makeup of the XmlDictionary. Let’s start by looking at some code that does plain old binary serialization.

We start with an object:

[DataContract(Namespace = "http://www.friseton.com/Name/2010/06")]
public class Person
{
  [DataMember]
  public string FirstName { get; set; }

  [DataMember]
  public string LastName { get; set; }

  [DataMember]
  public DateTime Birthday { get; set; }
}

I then have a ‘driver’ program:

 

static void Main(string[] args)
{
  var person = new Person
               {
                 FirstName = "Scott",
                 LastName = "Seely",
                 Birthday = new DateTime(1900, 4, 5)
               };
  var serializer = new DataContractSerializer(typeof (Person));
  Console.WriteLine("Serialize Binary: {0} bytes",
    SerializeBinary(person, serializer).Length);
  Console.WriteLine("Serialize Binary with Dictionary: {0} bytes",
    SerializeBinaryWithDictionary(person, serializer).Length);
}

The application emits the size of the streams when each object is written out. The first, SerializeBinary, does not use a dictionary. As a result, it won’t have access to the aliases and must instead write out the full object.

private static Stream SerializeBinary(Person person,
  DataContractSerializer serializer)
{
  var stream = new MemoryStream();
  var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
  serializer.WriteObject(writer, person);
  writer.Flush();
  return stream;
}

In this case, we get a stream which contains 146 bytes. That’s pretty poor considering that we are interested in 10 characters (28 bytes: each string has a 4 byte length and then 2 bytes/character) and a simple DateTime representation (4 bytes). Can we make this smaller? How close can we get to 32 bytes? The answer: really close!

The version of SerializeBinaryWithDictionary that I wrote is verbose: it contains a number of lines that show what is going on internally. Your own code may be as long, but would include the lines as debug output. Please note that you need to include a reference to the XMLSchema-instance namespace in your dictionary so that both the reader and writer agree on the value of this attribute.

private static Stream SerializeBinaryWithDictionary(Person person,
  DataContractSerializer serializer)
{
  var stream = new MemoryStream();
  var dictionary = new XmlDictionary();
  var session = new XmlBinaryWriterSession();
  var key = 0;
  session.TryAdd(dictionary.Add("FirstName"), out key);
  Console.WriteLine("Added FirstName with key: {0}", key);
  session.TryAdd(dictionary.Add("LastName"), out key);
  Console.WriteLine("Added LastName with key: {0}", key);
  session.TryAdd(dictionary.Add("Birthday"), out key);
  Console.WriteLine("Added Birthday with key: {0}", key);
  session.TryAdd(dictionary.Add("Person"), out key);
  Console.WriteLine("Added Person with key: {0}", key);
  session.TryAdd(dictionary.Add("http://www.friseton.com/Name/2010/06"),
    out key);
  Console.WriteLine("Added xmlns with key: {0}", key);
  session.TryAdd(dictionary.Add("http://www.w3.org/2001/XMLSchema-instance"),
    out key);
  Console.WriteLine("Added xmlns for xsi with key: {0}", key);

  var writer = XmlDictionaryWriter.CreateBinaryWriter(
    stream, dictionary, session);
  serializer.WriteObject(writer, person);
  writer.Flush();
  return stream;
}

The size difference is striking: we shave off 108 bytes by using the dictionary. We are getting close to the same size as the memory footprint of the object data! The cool bit: you can use this in your own code. The dictionary needs to be shared between the reader and writer sessions (there is a corresponding XmlBinaryReaderSession which can also be populated from the common dictionary via the deserialization process). For posterity, the output of the program is:

Serialize Binary: 146 bytes

Added FirstName with key: 0

Added LastName with key: 1

Added Birthday with key: 2

Added Person with key: 3

Added xmlns with key: 4

Added xmlns for xsi with key: 5

Serialize Binary with Dictionary: 38 bytes

A slightly different version that shows both reading and writing with a shared understanding of what the dictionary looks like follows:

private static Stream SerializeBinaryWithDictionary(Person person,
  DataContractSerializer serializer)
{
  var strings = new List<XmlDictionaryString>();
  var stream = new MemoryStream();
  var dictionary = new XmlDictionary();
  var session = new XmlBinaryWriterSession();
  var rdr = new XmlBinaryReaderSession();

  var key = 0;
  strings.Add(dictionary.Add("FirstName"));
  strings.Add(dictionary.Add("LastName"));
  strings.Add(dictionary.Add("Birthday"));
  strings.Add(dictionary.Add("Person"));
  strings.Add(dictionary.Add("http://www.friseton.com/Name/2010/06"));
  strings.Add(dictionary.Add("http://www.w3.org/2001/XMLSchema-instance"));
  // Keys are assigned below, when each string is added to the writer session.

  var writer = XmlDictionaryWriter.CreateBinaryWriter(
    stream, dictionary, session);

  foreach (var val in strings)
  {
    if (session.TryAdd(val, out key))
    {
      rdr.Add(key, val.Value);
    }
  }
  serializer.WriteObject(writer, person);
  writer.Flush();
  stream.Position = 0;
  var reader = XmlDictionaryReader.CreateBinaryReader(stream, dictionary,
    XmlDictionaryReaderQuotas.Max, rdr);
  var per = serializer.ReadObject(reader) as Person;
  if (per != null)
  {
    Console.WriteLine("Read back: {0} {1}", per.FirstName, per.LastName);
  }
  return stream;
}

 

Looking at the above, we can also account for the missing 6 bytes in our serialization: the extra 6 bytes are names of the nodes.


Speaking at Chicago Architects Group May 18

I’ll be speaking at the Chicago Architects Group on May 18 over at the ITA (next to Union Station in Chicago, at the corner of Adams and Wacker). My topic is Azure for Architects. In this talk, I go over how to look at and use Azure from a software architecture point of view. Unlike most Azure talks, this one has no code in it, just concepts. This isn’t the type of talk I normally give, but given the crowd, architecture and slides will work better than whiz bang demos.

The slides are here if you want them. I tend to use slides as guideposts when I present. Please don’t look at these slides as notes. 80% of the presentation is in what I say, not in what you can read. I’ll try to record the presentation as well and will put up the recording if the quality is good enough. There are still some seats open. Register at http://chicagoarchitectsgroup.eventbrite.com.


Interesting Post on Handling Large Data Volumes

Over on the HighScalability blog, there is an interesting post on how Sify.com handles scaling the web site to 3900 requests per second on just 30 VMs (across 4 physical machines). In the Future section of the article, the notion of using Drools for cache invalidation really grabbed my attention. Drools is a rules engine that implements the Rete algorithm to resolve rules. The Rete algorithm emphasizes speed of evaluation over memory consumption. Rules engines that support forward chaining and inference will normally implement Rete in some form. BizTalk (and I would assume Windows Workflow Foundation) also use Rete.

It was the notion of using a rules engine that really grabbed my attention. One of the problems with cache invalidation is that the easy stuff to cache is just that, easy. No thought is required to cache the front page of your web site. But, if your website is “addictive” in any fashion (think Facebook, MySpace, Fidelity.com, Digg, etc.), the personalized data that each user gets is cacheable too. When looking at overall traffic patterns, the data is light on writes and heavy on reads. Individual pieces of data may appear on many pages in the application. When that data changes, you want to invalidate any cached values that use that information. Figuring out and maintaining a list of all the places that consume and cache friend status is tough, especially if the goal is to do so in a centralized fashion. However, if I can add rules that state “I watch Scott’s status. If that changes, invalidate this cache location.” then I can make an interesting system.
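To make the “I watch Scott’s status” rule concrete, here is a toy sketch of the bookkeeping involved. It is a hand-rolled lookup table, not Rete, Drools, or Workflow Foundation, and every name in it (InvalidationRules, Watch, FactChanged, the cache keys) is invented for illustration:

```csharp
using System;
using System.Collections.Generic;

// A stand-in for a rules engine: each "rule" says
// "when this fact changes, invalidate that cache location".
public class InvalidationRules
{
  private readonly Dictionary<string, HashSet<string>> _watchers =
    new Dictionary<string, HashSet<string>>();
  private readonly Dictionary<string, object> _cache =
    new Dictionary<string, object>();

  public void Cache(string cacheKey, object value)
  {
    _cache[cacheKey] = value;
  }

  public bool IsCached(string cacheKey)
  {
    return _cache.ContainsKey(cacheKey);
  }

  // Registers the rule "cacheKey depends on fact".
  public void Watch(string fact, string cacheKey)
  {
    HashSet<string> keys;
    if (!_watchers.TryGetValue(fact, out keys))
    {
      keys = new HashSet<string>();
      _watchers[fact] = keys;
    }
    keys.Add(cacheKey);
  }

  // Fires every rule registered for the fact that changed.
  public void FactChanged(string fact)
  {
    HashSet<string> keys;
    if (!_watchers.TryGetValue(fact, out keys)) return;
    foreach (var key in keys)
    {
      _cache.Remove(key);
    }
  }
}

static class RulesDemo
{
  static void Main()
  {
    var rules = new InvalidationRules();
    rules.Cache("home:alice", "page that shows Scott's status");
    rules.Cache("home:bob", "page that does not");
    rules.Watch("status:scott", "home:alice");

    rules.FactChanged("status:scott");
    Console.WriteLine(rules.IsCached("home:alice")); // False
    Console.WriteLine(rules.IsCached("home:bob"));   // True
  }
}
```

A real rules engine earns its keep when the conditions are richer than a single key lookup, which is where forward chaining comes in.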

I’ve been in a number of .NET shops that seem to stay away from Workflow Foundation. I wonder if products like Windows Server AppFabric and the cache server might finally get folks to look at using Windows Workflow for the rules engine. At the moment, this seems like an idea worth pursuing, just to see how it works out in the end. I also wonder if one could use the rules to do in place updates to the cache, so that instead of invalidation, we get a newly valid copy.

As of now, this idea is up on my white board as something to dig into after I get some other work done. If you hit this idea sooner, please let me know your results (scott@scottseely.com)!


Notes from Software Engineering Talk

I gave a talk at Milwaukee Area Technical College where my friend, Chuck Andersen, teaches a software engineering class. I promised the students to put up some interview study resources. This is the set of things I do to prepare for more in depth interviews so that I clear the algorithm questions when folks do a technical screen. I really hate the idea of being passed over because I haven’t thought about some undergrad algorithms in a few years, so I get these things back into the more recent memory parts of my brain.

My study resources are:

Programming Pearls by Jon Bentley: 256 pages of good review material

The Algorithm Design Manual by Steven Skiena: Amazon has the wrong page count on this one: 486 pages of great review material. Get the Kindle version: the book appears to be out of print and expensive otherwise. I know I didn’t pay $200+ for this book.

Project Euler: Go through 1-2 of these per week, just to stay in shape.


Friseton, LLC is Open for Business

My last day as someone’s employee was Friday, May 7. As of today, I have completely jumped into the world of the self-employed. My wife and I started a company named Friseton, LLC (yes, I married a developer!). What does Friseton, LLC (which is really just me and my wife) do? Well, I’m glad you asked.

We consult on distributed application architecture and development. I personally have worked on architecture for small applications with only a few computers to systems with thousands of cooperating computers. I have worked on architecture in both traditional enterprise applications as well as for one of the five most popular web sites on the planet (circa 2008/9).

We’ve also invested a lot of time into understanding and developing on Azure, Silverlight, and Windows Phone 7. As the firm grows beyond the first two founders, we expect to also invest time in releasing applications on Azure and Windows Phone.

If you are interested in discussing an opportunity, please feel free to contact me: scott.seely@friseton.com.


Custom ChannelFactory Creation

Just the other day, Derik Whitaker ran into some issues setting up his ChannelFactory to handle large object graphs being returned to his clients (post is here). After some back and forth through email, we came up with a solution. Instead of using the default ChannelFactory<T>, we created a new class that inherits from ChannelFactory<T> and sets the DataContractSerializerBehavior to handle int.MaxValue objects in the graph.

The trick is to override the ChannelFactory<T>.OnOpening method. This method is called as the ChannelFactory is opened and allows a derived class to alter the behavior at the last minute. All OperationDescriptions have a DataContractSerializerOperationBehavior attached to them. What we want to do is pull out that behavior and set the MaxItemsInObjectGraph property to int.MaxValue so that it allows all content to be serialized in. Derik’s use case was valid: he owned the client and server and was willing to incur any penalty associated with reading ALL data. If you are in a similar situation and need to remove that safety net/throttle in your code, here is what you need. Note that the constructors aren’t interesting other than that they preserve the signatures made available through ChannelFactory<T> and make them visible in my DerikChannelFactory<T>.

 

public class DerikChannelFactory<T> : ChannelFactory<T>
{
    public DerikChannelFactory(Binding binding) :
        base(binding) { }

    public DerikChannelFactory(ServiceEndpoint endpoint) :
        base(endpoint) { }

    public DerikChannelFactory(string endpointConfigurationName) :
        base(endpointConfigurationName) { }

    public DerikChannelFactory(Binding binding, EndpointAddress remoteAddress) :
        base(binding, remoteAddress) { }

    public DerikChannelFactory(Binding binding, string remoteAddress) :
        base(binding, remoteAddress) { }

    public DerikChannelFactory(string endpointConfigurationName,
        EndpointAddress remoteAddress) :
        base(endpointConfigurationName, remoteAddress) { }

    protected override void OnOpening()
    {
        foreach (var operation in Endpoint.Contract.Operations)
        {
            var behavior =
                operation.Behaviors.
                    Find<DataContractSerializerOperationBehavior>();
            if (behavior != null)
            {
                behavior.MaxItemsInObjectGraph = int.MaxValue;
            }
        }
        base.OnOpening();
    }
}

 

The OnOpening override is also a good place to inject behaviors or other items if you want to make sure that all ChannelFactory instances have the same setup without resorting to configuration or code for each instance.
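For completeness, here is a sketch of how a client might use the factory and confirm that the override ran. It assumes the full .NET Framework with System.ServiceModel referenced. IReportService, its GetEverything operation, and the localhost address are hypothetical, and the factory is trimmed to a single constructor to keep the example short:

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;

[ServiceContract]
public interface IReportService // hypothetical contract for illustration
{
  [OperationContract]
  string[] GetEverything();
}

// Trimmed copy of the factory: one constructor plus the override.
public class DerikChannelFactory<T> : ChannelFactory<T>
{
  public DerikChannelFactory(Binding binding, string remoteAddress) :
    base(binding, remoteAddress) { }

  protected override void OnOpening()
  {
    foreach (var operation in Endpoint.Contract.Operations)
    {
      var behavior =
        operation.Behaviors.Find<DataContractSerializerOperationBehavior>();
      if (behavior != null)
      {
        behavior.MaxItemsInObjectGraph = int.MaxValue;
      }
    }
    base.OnOpening();
  }
}

static class FactoryUsageSketch
{
  static void Main()
  {
    // The address is a placeholder; opening the factory does not connect.
    var factory = new DerikChannelFactory<IReportService>(
      new BasicHttpBinding(), "http://localhost:8080/reports");
    factory.Open(); // triggers OnOpening

    foreach (var operation in factory.Endpoint.Contract.Operations)
    {
      var behavior =
        operation.Behaviors.Find<DataContractSerializerOperationBehavior>();
      if (behavior != null)
      {
        Console.WriteLine("{0}: {1}", operation.Name,
          behavior.MaxItemsInObjectGraph);
      }
    }

    // A real client would now call factory.CreateChannel() and use the proxy.
    factory.Close();
  }
}
```

Opening the factory explicitly is only there so the effect can be inspected; CreateChannel opens the factory on first use anyway.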
