We get questions about how to estimate how much Windows Azure Storage will cost in order to understand how to best build a cost effective application. In this post we will give some insight into this question for the three types of storage costs – Bandwidth, Transactions and Capacity.
When using Windows Azure Blobs, Tables and Queues, storage costs can consist of:
Note, the content in this posting is subject to change as we add more features to the storage system. This posting is given as a guideline to allow services to be able to estimate their storage bandwidth, transactions and capacity usage before they go to production with their application.
Pricing details can be found here.
The following gives an overview of billing for these three areas:
In the rest of this post we will explain how to understand these three areas for your application.
In order to access Blobs, Tables and Queues, you first need to create a storage account by going to the Windows Azure Developer Portal. When creating the storage account you can specify what location to store your storage account in. The six locations we currently offer are:
All of the data for that storage account will be stored and accessed from the location chosen. Some applications choose the location that is geographically closest to their client base, if possible, to potentially reduce latency to those customers. A key point is that, in the Developer Portal, you will also want to choose the same location for your hosted service as for the storage account that the service needs to access, because bandwidth for data transferred within the same location is free. In contrast, when transferring data in or out of the assigned location for the storage account, the bandwidth charges listed at the start of this post will accrue.
Now it is important to note that bandwidth is free within the same location for access, but the transactions are not free. Every single access to the storage system is counted as a single transaction towards billing. In addition, bandwidth is only charged for transactions that are considered to be billable transactions as defined later in this posting.
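The two rules above (transactions are always counted, bandwidth only when data crosses the location boundary, and neither for non-billable requests) can be sketched as a small tally. This is only an illustration: the `Request` type, its field names and the `tally` function are hypothetical and are not part of any storage API, and the `billable` flag is assumed to come from the billing classification described later in this post.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One REST request to the storage account (hypothetical record)."""
    bytes_transferred: int  # payload size moved by the request
    same_location: bool     # caller runs in the storage account's location
    billable: bool          # outcome of the billing classification


def tally(requests):
    """Return (billable_transactions, billable_bandwidth_bytes)."""
    transactions = 0
    bandwidth = 0
    for r in requests:
        if not r.billable:
            # Non-billable requests add neither a transaction nor bandwidth.
            continue
        transactions += 1
        if not r.same_location:
            # Bandwidth is only charged when leaving or entering the location.
            bandwidth += r.bytes_transferred
    return transactions, bandwidth
```

For instance, two billable requests where only one crosses the location boundary would yield two transactions but bandwidth for just the one crossing request.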
Another concept to touch on in terms of bandwidth is when you use blobs with the Windows Azure Content Delivery Network (CDN). If the blob is not found in the CDN (or the blob’s time-to-live (TTL) has expired) it is read from the storage account (origin) to cache it. When this occurs, the bandwidth consumed to cache the blob (transfer it from the origin to the CDN) is charged against the storage account (as well as a single transaction). This emphasizes that you should use a CDN for blobs that are referenced enough to get cache hits, before they expire in the cache due to the TTL, to offset the additional time and cost of transferring the blob from your storage account to the CDN.
Here are a few examples:
The first thing to cover for transactions is what counts as 1 transaction to Windows Azure Storage. Each and every REST call to Windows Azure Blobs, Tables and Queues counts as 1 transaction (whether that transaction is counted towards billing is determined by the billing classification discussed later in this posting). The REST calls are detailed here:
Each one of the above REST calls counts as 1 transaction. This includes the following types of requests:
Both of these types of batch operations result in a single REST request to the storage service. Therefore, they count as a single transaction for each request.
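As a rough illustration of why batching matters for the transaction count, the sketch below compares inserting entities one request at a time with grouping them into Entity Group Transactions. It assumes the limit of 100 entities per Entity Group Transaction, that all entities share one table and PartitionKey, and that every batch stays under the request size limit. The function names are hypothetical.

```python
MAX_ENTITIES_PER_BATCH = 100  # Entity Group Transaction limit (assumed here)


def single_insert_transactions(entity_count):
    """One REST request, and so one transaction, per entity."""
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    return entity_count


def batch_insert_transactions(entity_count, batch_size=MAX_ENTITIES_PER_BATCH):
    """One REST request, and so one transaction, per batch of entities."""
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    if not 1 <= batch_size <= MAX_ENTITIES_PER_BATCH:
        raise ValueError("batch_size must be between 1 and 100")
    # Ceiling division: a partially filled final batch is still one request.
    return (entity_count + batch_size - 1) // batch_size
```

Under these assumptions, 250 entities cost 250 transactions when inserted individually and 3 when sent as full batches.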
When using the Storage Client Library, there are a few function calls that can result in multiple REST requests to your storage account.
Now that we understand what a transaction is, let’s describe what transactions are counted towards billing and what transactions are not counted.
When a transaction reaches our service, if it falls into one of the following classifications we do not count it towards billable transactions, and no bandwidth is charged for these transactions:
If any of the above conditions apply, then the transaction is not counted towards billable transactions and no bandwidth is charged for the request. All remaining transactions are classified as billable, and they may or may not incur bandwidth charges as described in the bandwidth section.
We categorize the billable transactions into the following buckets:
Customers have asked to understand more details about the cost of storing Blobs, Tables and Queues in order to estimate the amount of storage capacity used by their application before running it in production.
The first thing to understand is how the monthly bill is accumulated for storage capacity. The storage capacity is calculated and aggregated at least once a day, and then averaged over the whole month to arrive at a GB/month charge. For example, if you used 10 GB the first half of the month and 0 GB the second half of the month, the monthly charge would be 5 GB.
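The averaging can be sketched as follows. This is a simplification that assumes exactly one capacity sample per day; the service measures at least once a day, so the real calculation may use more samples. The function name is hypothetical.

```python
def billable_gb_month(daily_gb):
    """Average the daily capacity samples (in GB) over the month."""
    if not daily_gb:
        raise ValueError("need at least one daily sample")
    return sum(daily_gb) / len(daily_gb)


# The example from the text: 10 GB for half of a 30-day month, then 0 GB.
usage = [10] * 15 + [0] * 15
print(billable_gb_month(usage))  # 5.0
```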
The following describes how the storage capacity is calculated for Blobs, Tables and Queues. In the formulas below, Len(X) means the number of characters in the string X.
Blob Containers – The following is how to estimate the amount of storage consumed per blob container:
48 bytes + Len(ContainerName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + For-Each Signed Identifier[512 bytes]
The following is the breakdown:
Blobs – The following is how to estimate the amount of storage consumed per blob:
For Block Blob (base blob or snapshot) we have: 124 bytes + Len(BlobName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + 8 bytes + number of committed and uncommitted blocks * Block ID Size in bytes + SizeInBytes(data in unique committed data blocks stored) + SizeInBytes(data in uncommitted data blocks)
For Page Blob (base blob or snapshot) we have: 124 bytes + Len(BlobName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + number of nonconsecutive page ranges with data * 12 bytes + SizeInBytes(data in unique pages stored)
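The container and blob formulas above translate directly into a few helper functions. The sketch below is only an estimator: the function and parameter names are hypothetical, metadata is assumed to be passed as a dictionary of name/value strings, and the caller is assumed to know the block count, block ID size, page range count and data sizes.

```python
def metadata_bytes(metadata):
    """For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)]."""
    return sum(3 + len(name) + len(value) for name, value in metadata.items())


def container_bytes(name, metadata=None, signed_identifiers=0):
    """Estimated capacity consumed by one blob container."""
    return (48 + len(name) * 2
            + metadata_bytes(metadata or {})
            + signed_identifiers * 512)


def block_blob_bytes(name, metadata=None, block_count=0, block_id_size=0,
                     committed_data_bytes=0, uncommitted_data_bytes=0):
    """Estimated capacity for a block blob (base blob or snapshot).

    block_count covers both committed and uncommitted blocks.
    """
    return (124 + len(name) * 2
            + metadata_bytes(metadata or {})
            + 8
            + block_count * block_id_size
            + committed_data_bytes
            + uncommitted_data_bytes)


def page_blob_bytes(name, metadata=None, page_ranges=0, stored_page_bytes=0):
    """Estimated capacity for a page blob (base blob or snapshot).

    page_ranges is the number of nonconsecutive page ranges with data.
    """
    return (124 + len(name) * 2
            + metadata_bytes(metadata or {})
            + page_ranges * 12
            + stored_page_bytes)
```

Note that the page blob estimate depends only on the pages actually stored, so a page blob with no pages written comes out at just the fixed overhead plus its name, whatever maximum size it was created with.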
Tables – The following is how to estimate the amount of storage consumed per Table:
12 bytes + Len(TableName) * 2 bytes
Entities – The following is how to estimate the amount of storage consumed per entity:
4 bytes + Len (PartitionKey + RowKey) * 2 bytes + For-Each Property(8 bytes + Len(Property Name) * 2 bytes + Sizeof(.Net Property Type))
The Sizeof(.Net Property Type) for the different types is:
Queues – The following is how to estimate the amount of storage consumed per Queue:
24 bytes + Len(QueueName) * 2 + For-Each Metadata(4 bytes + Len(QueueName) * 2 bytes + Len(Value) * 2 bytes)
Messages – The following is how to estimate the amount of storage consumed per message:
12 bytes + Len(Message)
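The table, entity, queue and message formulas can be sketched in the same way. All names below are hypothetical. Because the per-type sizes for Sizeof(.Net Property Type) are not reproduced here, the entity helper expects the caller to supply the size in bytes of each property value. The queue helper follows the queue formula literally as written above, including its use of Len(QueueName) inside the metadata term.

```python
def table_bytes(name):
    """12 bytes + Len(TableName) * 2 bytes."""
    return 12 + len(name) * 2


def entity_bytes(partition_key, row_key, property_sizes):
    """Estimated capacity for one entity.

    property_sizes maps each property name to the size in bytes of its
    value; these sizes are supplied by the caller.
    """
    total = 4 + len(partition_key + row_key) * 2
    for prop_name, value_size in property_sizes.items():
        total += 8 + len(prop_name) * 2 + value_size
    return total


def queue_bytes(name, metadata=None):
    """Estimated capacity for one queue, following the formula as written."""
    total = 24 + len(name) * 2
    for value in (metadata or {}).values():
        # The formula above uses Len(QueueName) in this per-metadata term.
        total += 4 + len(name) * 2 + len(value) * 2
    return total


def message_bytes(message):
    """12 bytes + Len(Message)."""
    return 12 + len(message)
```

As a usage example, `entity_bytes("p1", "r1", {"Count": 4})` estimates an entity with a single property whose value is assumed to occupy 4 bytes.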
Brad Calder
This is exactly the information I was looking for to help me estimate my monthly costs. Thanks for providing so much detail.
Assume that I have a storage account with many containers, each assigned to a member of my group. Is there a way to break down the bandwidth and transaction costs for each container? Thanks in advance for any tips/pointers.
@Zack.Perry. There is no way to get per container statistics from the storage service. We are looking at feature requests to provide more information for storage.
For now, the way some services track information like this is that they send all of their storage requests through their own Windows Azure hosted service, which allows them to track whatever stats they want.
@Brad Calder (MSFT). Thanks for the info. Is there something similar to AWS S3 Server Access Logging (docs.amazonwebservices.com/.../index.html)?
If it's available, we might be able to put together something quick. Thanks!
This is great info and should be in the official package deals. One question though:
I do not understand this:
"Table query doing a single storage request to return 500 entities (with no continuation tokens encountered) = 1 transaction"
after reading this:
"Entity Group Transactions – the ability to perform an atomic transaction across up to 100 entities with any combination of Insert, Update, or Delete for Azure Tables. The requirement is that all of the entities have to be in the same table and have the same PartitionKey value and the total request size must be under 4Mbytes."
I expected the query to cost 5 transactions.
What am I doing wrong?
@Zack
We do not have that feature, but it is a good feature request.
@Erno
A Query (GET) can return up to 1000 entities in a single request, which is why it is only 1 transaction. If it were an Entity Group Transaction (PUT), the request could hold only up to 100 entities, and doing that in one request would also result in 1 transaction.
Thanks
Brad
Are there "behind the scenes" transactions that take place? I understand what you are saying above and have implemented batch options where applicable, but I still get transaction counts orders of magnitude higher than I predict. I have increased the time spans in my queues and that has helped, but it doesn't explain everything.
My latest harebrained theory has to do with the fact that during development on my local machine, I have had to use my Staging storage account. Are there transactions during debugging that Visual Studio sends to storage?
@Jason - The service only charges each transaction to the service once. The way to potentially track this down is to log/trace the requests from your application and see when and how many requests are being sent.
I would like to use the Page blob API, but don't know in advance the size of data.
Hence the following scenario:
1. make a Put Blob request for Page blob with x-ms-blob-content-length equal to 1TB
2. call Put Page for all chunks of data I would need to upload. For example, say that all data is 500GB.
3. call Put Page with 'x-ms-page-write: clear' and Range indicating from 500GB, 1TB
With this scenario I would be billed
- for 1TB for the duration between step 1 and 3
- for 500GB afterwards
Is this correct ?
@gillouxg
Note when you create a Page Blob you specify the target size, but you are not charged for that amount. You are only charged for the pages you have stored (which haven’t been cleared). So if you set the size of a Page Blob to be 1TB when you create it, but haven’t uploaded any data, then the billable size of the pages is 0 bytes.
Now in your example in step 1, since you uploaded 1 TB of data, then yes your analysis is correct. If instead, in step 1, you only created the Page Blob with a size of 1 TB but haven’t put anything in it, then the billable size after that would be 0 bytes for the pages. Then, after you put in 500 GB, the billable size for pages would be 500 GB.
@Brad. Thanks.
Your answer made things clearer. However, from what you said, there seems to be no restriction on always creating a Page Blob with a target size of 1TB and then uploading data as needed (which would make the x-ms-blob-content-length header look somewhat useless), right?
Correct, there is no restriction to setting it to the max size from the start. It can still be useful for some applications/scenarios that want to use it to track the size of the Page Blob they intend to store at that point in time, and/or use it to bounds check the writes/reads ranges to the Page Blob. But if you don’t need to use it in that manner, then you can just set it to the largest size you will need.
Very useful info.
If we have empty queue or container then in that case it will be billed as
24 bytes + Len(QueueName) * 2 +
For-Each Metadata(4 bytes + Len(QueueName) * 2 bytes + Len(Value) * 2 bytes)
and
48 bytes + Len(ContainerName) * 2 bytes +
For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] +
For-Each Signed Identifier[512 bytes]
respectively. Please correct me if I am wrong...
@Trupti Sarang, that is correct. These formulas here are provided to help get an approximate capacity.
Thanks,
Jai
"The first thing to understand is how the monthly bill is accumulated for storage capacity. The storage capacity is calculated and aggregated at least once a day, and then averaged over the whole month to arrive at a GB/month charge. For example, if you used 10 GB the first half of the month and 0 GB the second half of the month, the monthly charge would be 5 GB."
The above paragraph tells us how it is calculated when the month is divided into two 15-day halves. How do you calculate the capacity within a day, given that a day has 24 hours and my storage capacity may increase or decrease during each hour?