Wed, 07 Sep 2011

Managing Concurrency in Windows Azure with Leases

Concurrency is a concept many developers struggle with, both in the world of multi-threaded applications and in distributed systems such as Windows Azure. Fortunately, Windows Azure provides a number of mechanisms to help developers deal with concurrency around storage:

  1. Blobs and tables use optimistic concurrency (via ETags) to ensure that two concurrent changes won’t clobber each other.
  2. Queues help manage concurrency by distributing messages such that, under normal circumstances, each message is processed by only one consumer.
  3. Blob leases allow a process to gain exclusive write access to a blob on a renewable basis. Windows Azure Drives use blob leases to ensure that only a single VM has a VHD mounted read/write.

These mechanisms generally work transparently around your access to storage. As an example, if you read a table entity and then update it, the update will fail if someone else changed the entity in the meantime.
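To make the ETag idea concrete, here’s a minimal sketch of optimistic concurrency. It uses a hypothetical in-memory InMemoryTable class rather than the real storage client, so the class, its methods, and the integer ETag are all assumptions made for illustration. The point is only that an update carries the ETag the caller read, and it is rejected if the stored ETag has changed since.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a table; this is NOT the Windows Azure storage API.
public sealed class Row
{
    public string Value;
    public int ETag;
}

public sealed class InMemoryTable
{
    private readonly Dictionary<string, Row> rows = new Dictionary<string, Row>();
    private readonly object gate = new object();

    public void Insert(string key, string value)
    {
        lock (gate) { rows[key] = new Row { Value = value, ETag = 1 }; }
    }

    // Returns a snapshot (a copy), so callers can't mutate the stored row directly.
    public Row Read(string key)
    {
        lock (gate)
        {
            Row current = rows[key];
            return new Row { Value = current.Value, ETag = current.ETag };
        }
    }

    // Succeeds only if the caller's ETag still matches the stored one.
    public bool TryUpdate(string key, string newValue, int expectedETag)
    {
        lock (gate)
        {
            Row current = rows[key];
            if (current.ETag != expectedETag) return false; // someone else changed the row
            rows[key] = new Row { Value = newValue, ETag = current.ETag + 1 };
            return true;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var table = new InMemoryTable();
        table.Insert("settings", "v1");

        // Two writers read the same version of the row.
        Row readByA = table.Read("settings");
        Row readByB = table.Read("settings");

        Console.WriteLine(table.TryUpdate("settings", "from A", readByA.ETag)); // True
        Console.WriteLine(table.TryUpdate("settings", "from B", readByB.ETag)); // False: B's ETag is stale
    }
}
```

In this sketch, the second writer would have to read the row again and retry, which is the usual way to recover from a failed optimistic update.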

Sometimes, however, an application needs to control concurrency around a different resource (one not in storage). For example, I see a somewhat regular stream of questions from customers about how to ensure that some initialization routine is performed only once. (Perhaps every time your application is deployed to Windows Azure, you need to download some initial data from a web service and load it into a table.)

Leases for concurrency control

Of the concurrency primitives available in Windows Azure, blob leases are the most general and can be used in much the same way that an object is used for locking in a multi-threaded application. A lease is the distributed equivalent of a lock. Locks are rarely, if ever, used in distributed systems, because if a component or network fails while a lock is held, nothing ever releases it, and it’s easy to leave the entire system deadlocked. Leases alleviate that problem, since there’s a built-in timeout, after which resources will be accessible again.
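Here’s a small sketch of why the timeout matters. InMemoryLease is a hypothetical in-memory class (real blob leases live in the storage service), and the injectable clock is an assumption that lets the example simulate time passing. A holder that crashes never releases the lease, but once the lease duration elapses, another caller can acquire it anyway.

```csharp
using System;

// Hypothetical in-memory lease, for illustration only; NOT the blob lease API.
public sealed class InMemoryLease
{
    private readonly TimeSpan duration;
    private readonly Func<DateTime> clock;
    private readonly object gate = new object();
    private string holder;       // null when nobody holds the lease
    private DateTime expiresAt;

    public InMemoryLease(TimeSpan duration, Func<DateTime> clock)
    {
        this.duration = duration;
        this.clock = clock;
    }

    // Returns a lease ID on success, or null if someone else holds an unexpired lease.
    public string TryAcquire()
    {
        lock (gate)
        {
            DateTime now = clock();
            if (holder != null && now < expiresAt) return null;
            holder = Guid.NewGuid().ToString();
            expiresAt = now + duration;
            return holder;
        }
    }

    // Extends the lease, but only if the caller still holds it and it hasn't expired.
    public bool TryRenew(string leaseId)
    {
        lock (gate)
        {
            DateTime now = clock();
            if (holder == null || holder != leaseId || now >= expiresAt) return false;
            expiresAt = now + duration;
            return true;
        }
    }

    public void Release(string leaseId)
    {
        lock (gate) { if (holder == leaseId) holder = null; }
    }
}

public static class Program
{
    public static void Main()
    {
        DateTime now = new DateTime(2011, 9, 7, 0, 0, 0, DateTimeKind.Utc);
        var lease = new InMemoryLease(TimeSpan.FromSeconds(60), () => now);

        string first = lease.TryAcquire();
        Console.WriteLine(first != null);              // True: the first caller gets the lease
        Console.WriteLine(lease.TryAcquire() == null); // True: a second caller is refused

        now = now.AddSeconds(61);                      // the holder "crashed" and never released
        Console.WriteLine(lease.TryAcquire() != null); // True: the lease timed out, so it's available again
    }
}
```

With a plain lock, that last call would wait forever. The cost of the timeout is that a live holder has to keep renewing.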

Returning to our initialization example, we might create a blob named “initialize” and use a lease on that blob to ensure that only one Windows Azure role instance is performing the initialization.

This pattern of using leases for concurrency control is so powerful that I included a special class called AutoRenewLease in my storage extensions NuGet package (smarx.WazStorageExtensions). AutoRenewLease helps you write these parts of your code similarly to how you might use a lock block in C#. Here’s the basic usage:

using (var arl = new AutoRenewLease(leaseBlob))
{
    if (arl.HasLease)
    {
        // inside here, this instance has exclusive access
    }
} // lease is released here

AutoRenewLease will automatically create the blob if it doesn’t already exist. Note that unlike C#’s lock, this is a non-blocking operation. If the lease can’t be acquired (because the blob has already been leased), execution will continue (with a HasLease value of false).

As the name indicates, AutoRenewLease takes care of renewing the lease, so you can spend as much time as you need inside the using block without worrying about the lease expiring. (Blob leases expire after 60 seconds. AutoRenewLease renews the lease every 40 seconds to be safe.)
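If you’re curious what renewing in the background might look like, here’s a rough sketch of the general technique. This is not the AutoRenewLease source. RenewingScope is a hypothetical class, and the renew and release delegates are placeholders for whatever storage calls you’d actually make. It simply runs a renew callback on a timer until the scope is disposed, then releases.

```csharp
using System;
using System.Threading;

// Hypothetical helper, NOT the AutoRenewLease implementation:
// calls 'renew' on a fixed interval until disposed, then calls 'release'.
public sealed class RenewingScope : IDisposable
{
    private readonly Timer timer;
    private readonly Action release;

    public RenewingScope(Action renew, Action release, TimeSpan interval)
    {
        this.release = release;
        // First renewal fires after one interval, then repeats at that interval.
        timer = new Timer(_ => renew(), null, interval, interval);
    }

    public void Dispose()
    {
        // Stop the timer and wait for any in-flight renewal to finish before releasing.
        using (var stopped = new ManualResetEvent(false))
        {
            timer.Dispose(stopped);
            stopped.WaitOne();
        }
        release();
    }
}

public static class Program
{
    public static void Main()
    {
        int renewals = 0;

        // A short interval so the demo finishes quickly; the post's numbers would be
        // a 40-second interval against a 60-second lease.
        using (new RenewingScope(
            () => Interlocked.Increment(ref renewals),
            () => Console.WriteLine("released"),
            TimeSpan.FromMilliseconds(50)))
        {
            Thread.Sleep(300); // long-running work happens here
        }

        Console.WriteLine("renewed at least once: " + (renewals > 0));
    }
}
```

The renewal interval needs to be comfortably shorter than the lease duration, so that one slow or delayed renewal doesn’t let the lease lapse.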

Revisiting our initialization example once again, we might write a method like this to perform our initialization only once (and have all instances block until initialization is complete):

// blob.Exists has the side effect of calling blob.FetchAttributes, which populates the metadata collection
while (!blob.Exists() || blob.Metadata["progress"] != "done")
{
    using (var arl = new AutoRenewLease(blob))
    {
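        if (arl.HasLease)
        {
            // re-check now that we hold the lease: another instance may have finished
            // between the check in the while condition and our acquiring the lease
            blob.FetchAttributes();
            if (blob.Metadata["progress"] != "done")
            {
                // do our initialization here, and then
                blob.Metadata["progress"] = "done";
                blob.SetMetadata(arl.leaseId);
            }
        }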
        else
        {
            Thread.Sleep(TimeSpan.FromSeconds(5));
        }
    }
}

Here, we’re using the blob for leasing purposes, but we’re also using its metadata to track progress. This gives us a way to have all the instances wait until initialization is actually complete before moving on.

I found this pattern useful enough that I made a convenience method for it and included that in my storage extensions too. Here’s its basic usage:

AutoRenewLease.DoOnce(blob, () =>
{
    // do initialization here
});
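To show the idea behind a helper like this without needing a storage account, here’s a self-contained sketch. FakeBlob and Once are hypothetical names, not part of my package or the storage client, and the in-memory lease has no timeout, to keep the example short. Several threads call DoOnce at the same time. Exactly one runs the action, and the rest poll until it’s marked done.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a leasable blob with a single "done" flag.
public sealed class FakeBlob
{
    private readonly object gate = new object();
    private bool leased;
    private bool done;

    public bool IsDone { get { lock (gate) return done; } }

    public bool TryAcquireLease()
    {
        lock (gate)
        {
            if (leased) return false;
            leased = true;
            return true;
        }
    }

    public void ReleaseLease() { lock (gate) leased = false; }

    public void MarkDone() { lock (gate) done = true; }
}

public static class Once
{
    // Runs 'action' at most once per blob; every caller returns only after it has completed.
    public static void DoOnce(FakeBlob blob, Action action, TimeSpan pollInterval)
    {
        while (!blob.IsDone)
        {
            if (blob.TryAcquireLease())
            {
                try
                {
                    // Re-check under the lease: another caller may have finished
                    // between our IsDone check and our acquiring the lease.
                    if (!blob.IsDone)
                    {
                        action();
                        blob.MarkDone();
                    }
                }
                finally { blob.ReleaseLease(); }
            }
            else
            {
                Thread.Sleep(pollInterval); // someone else is working; wait and look again
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var blob = new FakeBlob();
        int runs = 0;
        var workers = new Task[4];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = Task.Run(() => Once.DoOnce(blob, () =>
            {
                Interlocked.Increment(ref runs);
                Thread.Sleep(100); // simulate slow initialization
            }, TimeSpan.FromMilliseconds(10)));
        }
        Task.WaitAll(workers);
        Console.WriteLine(runs); // 1
    }
}
```

If the action throws, the sketch releases the lease without marking the blob done, so another caller gets a chance to try again.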

To make sure something is done on every new deployment, you might use the deployment ID as the blob name:

AutoRenewLease.DoOnce(container.GetBlobReference(RoleEnvironment.DeploymentId), () =>
{
    // do initialization here
});

Download

The classes and methods mentioned above are all part of smarx.WazStorageExtensions (NuGet package, GitHub repository).

Next Steps

Leases are incredibly useful for managing all sorts of concurrency challenges. In upcoming blog posts, I’ll cover using leases to help with task scheduling and general leader election. Stay tuned!