The four object abstractions Windows Azure Storage provides for application developers are:
The following shows the Windows Azure Storage abstractions and the URIs used for Blobs, Tables and Queues. In this post we will (a) go through each of these concepts, (b) describe how they are partitioned, and (c) talk about the scalability targets for these storage abstractions.
In order to access any of the storage abstractions you first need to create a storage account by going to the Windows Azure Developer Portal. When creating the storage account you can specify what location to place your storage account in. The six locations we currently offer are:
As a best practice, you should choose the same location for your storage account and your hosted services, which you can also do in the Developer Portal. This allows the computation to have high bandwidth and low latency to storage, and the bandwidth is free between computation and storage in the same location.
Also shown in the above slide is the URI used to access each data object:
The first thing to notice is that the storage account name you registered in the Developer Portal is the first part of the hostname. DNS uses it to direct your request to the location that holds all of the storage data for that storage account. Therefore, all of the requests to that storage account (inserts, updates, deletes, and gets) go to that location to access your data. Finally, notice in the above hostnames the keywords “blob”, “table” and “queue”. These direct your request to the appropriate Blob, Table or Queue service in that location. Note that since Blob, Table and Queue are separate services, they each have their own namespace under the storage account. This means that in the same storage account you can have a Blob Container, a Table and a Queue each called “music”.
Now that you have a storage account, you can store all of your blobs, entities and messages in that storage account. A storage account can hold up to 100TB of data. There is no other storage capacity limit for a storage account. In particular, there is no limit on the number of Blob Containers, Blobs, Tables, Entities, Queues or Messages that can be stored in the account, other than that they must all add up to less than 100TB.
The figure below depicts the storage concepts of Windows Azure Blob, where we have a storage account called “cohowinery” and inside of this account we created a Blob Container called “images” and put two pictures in that blob container called “pic01.jpg” and “pic02.jpg”. We also created a second blob container called “videos” and stored a blob called “vid1.avi” there.
The above namespace is used to perform all access to Windows Azure Blobs. The URI for a specific blob is structured as follows:
http://<account>.blob.core.windows.net/<container>/<blobname>
The storage account name is specified as the first part of the hostname followed by the keyword “blob”. This sends the request to the part of Windows Azure Storage that handles blob requests. The host name is followed by the container name, followed by “/”, and then the blob name. Accounts and containers have naming restrictions, for example, the container name cannot contain a “/”.
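The URI structure above can be sketched in a few lines of Python. This is only an illustration: the helper name `blob_uri` is hypothetical, and the only naming restriction it checks is the one mentioned above (no “/” in the container name).

```python
from urllib.parse import quote


def blob_uri(account: str, container: str, blob_name: str) -> str:
    """Build a blob URI of the form
    http://<account>.blob.core.windows.net/<container>/<blobname>."""
    if "/" in container:
        # The one restriction named in the text; real restrictions are stricter.
        raise ValueError("container name cannot contain '/'")
    # Percent-encode the blob name but leave '/' as-is inside it.
    path = f"{container}/{quote(blob_name, safe='/')}"
    return f"http://{account}.blob.core.windows.net/{path}"


print(blob_uri("cohowinery", "images", "pic01.jpg"))
# prints http://cohowinery.blob.core.windows.net/images/pic01.jpg
```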
There are two types of blobs supported:
The figure below depicts the storage concepts for Windows Azure Tables, where we have a storage account called “cohowinery” and inside of this account we created a Table called “customers” and put entities representing customers into that table, where the entities have properties like their “name”, “email”, etc. We also created a table called “winephotos” and the entities stored in that table contain properties of “PhotoID”, “Date”, etc.
The following summarizes the data model for Windows Azure Table:
The above namespace is used to perform all access to Windows Azure Table. The URI for a specific table access is structured as follows:
http://<account>.table.core.windows.net/<TableName>
The storage account name is specified as the first part of the hostname followed by the keyword “table”. This sends the request to the part of Windows Azure Storage that handles table requests. The host name is followed by the table name, and then the rest of the Uri will specify what entity is being operated on or the query string to be looked up.
The figure below depicts the storage concepts of Windows Azure Queues, where we have a storage account called “cohowinery” and inside of this account we created a Queue called “orders”, which stores the orders in messages that are waiting to be processed. The messages in the example contain the customer ID, order ID, a link to order details, etc.
The URI for a specific queue is structured as follows:
http://<account>.queue.core.windows.net/<QueueName>
The storage account name is specified as the first part of the hostname followed by the keyword “queue”. This sends the request to the part of Windows Azure Storage that handles queue requests. The host name is followed by the queue name.
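Since all three services share the same hostname pattern, a URI can be split back into its account, service keyword and resource path. The following is a minimal sketch; `parse_storage_uri` is a hypothetical helper, not part of any storage client library.

```python
from urllib.parse import urlparse

SERVICES = {"blob", "table", "queue"}


def parse_storage_uri(uri: str):
    """Split a storage URI into (account, service, resource)."""
    parsed = urlparse(uri)
    labels = (parsed.hostname or "").split(".")
    # Expected hostname shape: <account>.<service>.core.windows.net
    if (len(labels) != 5
            or labels[1] not in SERVICES
            or labels[2:] != ["core", "windows", "net"]):
        raise ValueError(f"not a recognized storage URI: {uri}")
    account, service = labels[0], labels[1]
    resource = parsed.path.lstrip("/")  # e.g. queue name or container/blob
    return account, service, resource


print(parse_storage_uri("http://cohowinery.queue.core.windows.net/orders"))
# prints ('cohowinery', 'queue', 'orders')
```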
A very important concept to understand about the storage abstractions is their partitioning. Every data object (Blobs, Table Entities and Queue Messages) has a partition key. This is how we locate the objects in our service when accessing them, and how we load balance and partition the objects across our servers to meet the traffic needs of those objects. The following is the partition key used for our three storage abstractions:
Our system automatically load balances these objects across our servers based upon these partitions. All objects with the same partition key value are grouped into the same partition and are accessed from the same partition server (see the upcoming post on Windows Azure Storage Architecture Overview). Grouping objects into partitions allows us to (a) easily perform atomic operations across objects in the same partition, since their access goes to the same server, and (b) have caching locality of objects within the same partition to benefit from data access locality.
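To make the grouping rule concrete, here is a toy sketch of how objects that share a partition key value end up together. It only illustrates the idea; the field names and sample values are made up, and it says nothing about how the service assigns partitions to servers.

```python
from collections import defaultdict


def group_by_partition(objects):
    """Group object names by their partition key value."""
    partitions = defaultdict(list)
    for obj in objects:
        partitions[obj["partition_key"]].append(obj["name"])
    return dict(partitions)


# Hypothetical entities: two share a partition key, one does not.
entities = [
    {"partition_key": "WA", "name": "customer-1"},
    {"partition_key": "WA", "name": "customer-2"},
    {"partition_key": "CA", "name": "customer-3"},
]
print(group_by_partition(entities))
# prints {'WA': ['customer-1', 'customer-2'], 'CA': ['customer-3']}
```

In this picture, "customer-1" and "customer-2" would be served from the same partition, which is what makes atomic operations across them straightforward.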
So what does this mean for Blobs, Entities and Messages with the above partition keys?
Now that we have given a high level description of storage accounts, storage abstractions and how they are grouped into partitions, we want to talk about the scalability targets for storage accounts, objects and their partitions.
The following are the scalability targets for a single storage account:
The 100TB is a strict limit for a storage account, whereas the transactions and bandwidth are the current targets we’ve built the system to for a single storage account. Note that the actual transaction rate and bandwidth achieved by your storage account will very much depend upon the size of objects, access patterns, and the type of workload your application exhibits. To go above these targets, a service should be built to use multiple storage accounts, and partition the blob containers, tables, queues and objects across those storage accounts. By default, a subscription gets 5 storage accounts, and you can contact customer support to get more if you need to store more data than that (e.g., petabytes).
A hosted service is expected to need only as many storage accounts as it takes to meet its performance targets given the above, which is typically a handful of storage accounts, up to tens of storage accounts to store PBs of data. The point here is that a hosted service should not plan on creating a separate storage account for each of its customers. Instead, the hosted service should either represent a customer within a storage account (e.g., each customer could have its own Blob Container), or map/hash the customer’s data across the hosted service’s storage accounts.
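One way to map/hash customers across a fixed set of storage accounts is sketched below. This is an assumption about how an application might do it, not a prescribed method; the function name and account names are hypothetical.

```python
import hashlib
from typing import Sequence


def account_for_customer(customer_id: str, accounts: Sequence[str]) -> str:
    """Pick a storage account for a customer by hashing the customer ID."""
    if not accounts:
        raise ValueError("at least one storage account is required")
    # Use a stable hash (not Python's built-in hash(), which varies per process).
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(accounts)
    return accounts[index]


# Hypothetical account names for illustration only.
accounts = ["cohowinery01", "cohowinery02", "cohowinery03"]
print(account_for_customer("customer-42", accounts))
```

Note that with a simple modulo scheme, changing the number of accounts remaps most customers, so an application that expects to add accounts later may prefer a lookup table or consistent hashing.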
Within a storage account, all of the objects are grouped into partitions as described above. Therefore, it is important to understand the performance targets of a single partition for our storage abstractions, which are:
The above throughputs are the high end targets for the current system. What can be achieved by your application very much depends upon the size of the objects being accessed, the operations (workload) and the access patterns. We encourage all services to test the performance at the partition level for their workload.
When your application reaches the limit of what a partition can handle for your workload, it will start to get back “503 server busy” responses. When this occurs, the application should use exponential backoff for retries. The exponential backoff allows the load on the partition to decrease, and smooths out spikes in traffic to the partition. If this is a regular occurrence, then the application should try to improve its data partitioning and throughput as follows for the different storage abstractions:
See the next set of upcoming posts on best practices for scaling the performance of Blobs, Tables and Queues.
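The exponential backoff recommended above for “503 server busy” responses can be sketched as follows. This is a minimal illustration under stated assumptions: `ServerBusy` is a hypothetical exception standing in for a 503 response, the delay values are arbitrary, and the random jitter is an addition that the text above does not mention.

```python
import random
import time


class ServerBusy(Exception):
    """Hypothetical stand-in for a '503 server busy' response."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on ServerBusy."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServerBusy:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # Double the delay on each retry, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add jitter so many clients do not retry in lockstep (assumption).
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists only so that the wait can be replaced in tests; in normal use the default `time.sleep` applies.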
Finally, one question we get is what the expected latency is for accessing small objects:
Brad Calder
500 transactions per second for the queue is really really low. Classical non-cloud systems are doing much better without even distributing the load (just a mirror for continuous backup).
I would expect at least 100x better scalability. Basically, Twitter should be able to run their short messages on Azure Queues. The situation is even more frustrating because increasing queue throughput through queue sharding is not *that* hard actually (and could be done on the cloud side actually).
@Joannes
Yes, that is a good point. What we recommend is that customers represent multiple work items per message or use multiple queues to get higher throughput out of queues. Note, we have architected the system to support higher throughputs for queues over time, and we need to see where exposing that ranks in terms of customer priorities. Please rate the importance here relative to the other ideas: http://www.mygreatwindowsazureidea.com/forums/34192-windows-azure-feature-voting
Thanks again for the feedback.
Brad
For the transaction and bandwidth account scalability targets, as well as the latency during load balancing, can you quantify "a few"? Some people define it as 2-3, others say 3-5, etc.
Great post, very informative.
One comment. The article talks a lot about upper limits on performance--up to 500 transactions per second, up to 60MB/sec, etc.
I'd be much more interested in lower limits--that's what has a business impact on my application.
I understand there are edge cases where delays can kick in, but these could easily be avoided e.g. by using percentiles.
I also understand that these figures might not form part of a binding SLA (but it would be great if they did). Even so, would it be possible to publish some target figures, or measured statistics?
For Table Storage, you say the throughput target for a single partition is up to 500 transactions per second. How do entity group transactions fit into that? I know for billing purposes an entity group transaction only counts once, but is that true for throughput as well, or will 100 entities in a single entity group transaction contribute 100 transactions towards the throughput target?
@Joe
- For transactions it is around 5,000 transactions per second for a single storage account.
- For bandwidth it is around 3 Gigabits per second for a single storage account.
- In terms of load balancing a partition, if I understood your question correctly, it is on the order of 10-15 seconds.
@Jonathan
It really depends upon the application, its access patterns, size of transactions, mix of transactions, etc. Our goal is to provide the scalability targets listed for as many workloads as possible, but applications need to benchmark their performance to understand what it is. In general, applications should be able to achieve these scalability targets and consistently see those targets if they follow the partitioning best practices we’ve gone over in the PDC talks, this blog, and technical papers.
@Zach
Sorry, should have explained this better.
The 500 transactions per second equates to:
- 500 entities per second for a single Table partition. So if you are doing batch requests with 100 entities per batch request, then that equates to being able to do 5 batch requests per second with 100 entities in each batch request to a single Table partition.
- 500 messages per second for a single Queue
Thanks
Check some code here:
partitioncloudqueue.codeplex.com
for how to scale past 500 messages per second for a single Azure Queue