We are excited to introduce some changes to the Copy Blob API with 2012-02-12 version that allows you to copy blobs between storage accounts. This enables some interesting scenarios like:
NOTE: To allow cross-account copy, the destination storage account needs to have been created on or after June 7th 2012. This limitation is only for cross-account copy, as accounts created prior can still copy within the same account. If the account is created before June 7th 2012, a copy blob operation across accounts will fail with HTTP Status code 400 (Bad Request) and the storage error code will be “CopyAcrossAccountsNotSupported.”
In this blog, we will go over some of the changes that were made along with some of the best practices to use this API. We will also show some sample code on using the new Copy Blob APIs with SDK 1.7.1 which is available on GitHub.
To enable copying between accounts, we have made the following changes:
In versions prior to 2012-02-12, the source request header was specified as “/<account name>/<fully qualified blob name with container name and snapshot time if applicable >”. With 2012-02-12 version, we now require x-ms-copy-source to be specified as a URL. This is a versioned change, as specifying the old format with this new version will now fail with 400 (Bad Request). The new format allows users to specify a shared access signature or use a custom storage domain name. When specifying a source blob from a different account than the destination, the source blob must either be
A copy operation preserves the type of the blob: a block blob will be copied as a block blob and a page blob will be copied to the destination as a page blob. If the destination blob already exists, it will be overwritten. However, if the destination type (for an existing blob) does not match the source type, the operation fails with HTTP status code 400 (Bad Request).
Note: The source blob could even be a blob outside of Windows Azure, as long as it is publicly accessible or accessible via some form of a Signed URL. For source blobs outside of Windows Azure, they will be copied to block blobs.
Making copy asynchronous is a major change that greatly differs from previous versions. Previously, the Blob service returns a successful response back to the user only when the copy operation has completed. With version 2012-02-12, the Blob service will instead schedule the copy operation to be completed asynchronously: a success response only indicates that the copy operation has been successfully scheduled. As a consequence, a successful response from Copy Blob will now return HTTP status code 202 (Accepted) instead of 201 (Created).
A few important points:
Below we break down the key concepts of the new Copy Blob API.
Copy Blob Scheduling: when the Blob service receives a Copy Blob request, it will first ensure that the source exists and it can be accessed. If source does not exist or cannot be accessed, an HTTP status code 400 (Bad Request) is returned. If any source access conditions are provided, they will be validated too. If conditions do not match, then an HTTP status code 412 (Precondition Failed) error is returned. Once the source is validated, the service then validates any conditions provided for the destination blob (if it exists). If condition checks fail on destination blob, an HTTP status code 412 (Precondition Failed) is returned. If there is already a pending copy operation, then the service returns an HTTP status code 409 (Conflict). Once the validations are completed, the service then initializes the destination blob before scheduling the copy and then returns a success response to the user. If the source is a page blob, the service will create a page blob with the same length as the source blob but all the bytes are zeroed out. If the source blob is a block blob, the service will commit a zero length block blob for the pending copy blob operation. The service maintains a few copy specific properties during the copy operation to allow clients to poll the status and progress of their copy operations.
Copy Blob Response: when a copy blob operation returns success to the client, this indicates the Blob service has successfully scheduled the copy operation to be completed. Two new response headers are introduced:
Polling for Copy Blob properties: we now provide the following additional properties that allow users to track the progress of the copy, using Get Blob Properties, Get Blob, or List Blobs:
These properties can be monitored to track the progress of a copy operation that returns “pending” status. However, it is important to note that except for Put Page, Put Block and Lease Blob operations, any other write operation (i.e., Put Blob, Put Block List, Set Blob Metadata, Set Blob Properties) on the destination blob will remove the properties pertaining to the copy operation.
Asynchronous Copy Blob: for the cases where the Copy Blob response returns with x-ms-copy-status set to “pending”, the copy operation will complete asynchronously.
Copy Blob operations are retried on any intermittent failures such as network failures, server busy etc. but any failures are recorded in x-ms-copy-status-description which would let users know why the copy is still pending.
When the copy operation is pending, any writes to the destination blob is disallowed and the write operation will fail with HTTP status code 409 (Conflict). One would need to abort the copy before writing to the destination.
Data integrity during asynchronous copy: The Blob service will lock onto a version of the source blob by storing the source blob ETag at the time of copy. This is done to ensure that any source blob changes can be detected during the course of the copy operation. If the source blob changes during the copy, the ETag will no longer match its value at the start of the copy, causing the copy operation to fail.
Aborting the Copy Blob operation: To allow canceling a pending copy, we have introduced the Abort Copy Blob operation in the 2012-02-12 version of REST API. The Abort operation takes the copy-id returned by the Copy operation and will cancel the operation if it is in the “pending” state. An HTTP status code 409 (Conflict) is returned if the state is not pending or the copy-id does not match the pending copy. The blob’s metadata is retained but the content is zeroed out on a successful abort.
With asynchronous copy, copying blobs from one account to another is simply as follow:
Once all the blobs are queued for copy, the monitoring component can do the following:
Example: Here is a sample queuing of asynchronous copy. It will ignore snapshots and only copy base blobs. Error handling is excluded for brevity.
public static void CopyBlobs(
// get the SAS token to use for all blobs
string blobToken = srcContainer.GetSharedAccessSignature(
new SharedAccessBlobPolicy(), policyId);
var srcBlobList = srcContainer.ListBlobs(true, BlobListingDetails.None);
foreach (var src in srcBlobList)
var srcBlob = src as CloudBlob;
// Create appropriate destination blob type to match the source blob
if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
destBlob = destContainer.GetPageBlobReference(srcBlob.Name);
// copy using src blob as SAS
destBlob.StartCopyFromBlob(new Uri(srcBlob.Uri.AbsoluteUri + blobToken));
Example: Monitoring code without error handling for brevity. NOTE: This sample assumes that no one else would start a different copy operation on the same destination blob. If such assumption is not valid for your scenario, please see “How do I prevent someone else from starting a new copy operation to overwrite my successful copy?” below.
public static void MonitorCopy(CloudBlobContainer destContainer)
bool pendingCopy = true;
pendingCopy = false;
var destBlobList = destContainer.ListBlobs(
foreach (var dest in destBlobList)
var destBlob = dest as CloudBlob;
if (destBlob.CopyState.Status == CopyStatus.Aborted ||
destBlob.CopyState.Status == CopyStatus.Failed)
// Log the copy status description for diagnostics
// and restart copy
pendingCopy = true;
else if (destBlob.CopyState.Status == CopyStatus.Pending)
// We need to continue waiting for this pending copy
// However, let us log copy state for diagnostics
pendingCopy = true;
// else we completed this pending copy
In an asynchronous copy, once authorization is verified on source, the service locks to that version of the source by using the ETag value. If the source blob is modified when the copy operation is pending, the service will fail the copy operation with HTTP status code 412 (Precondition Failed). To ensure that source blob is not modified, the client can acquire and maintain a lease on the source blob. (See the Lease Blob REST API.)
With 2012-02-12 version, we have introduced the concept of lock (i.e. infinite lease) which makes it easy for a client to hold on to the lease. A good option is for the copy job to acquire an infinite lease on the source blob before issuing the copy operation. The monitor job can then break the lease when the copy completes.
Example: Sample code that acquires a lock (i.e. infinite lease) on source.
// Acquire infinite lease on source blob
// copy using source blob as SAS and with infinite lease id
string cid = destBlob.StartCopyFromBlob(
new Uri(srcBlob.Uri.AbsoluteUri + blobToken),
null /* source access condition */,
null /* destination access condition */,
null /* request options */);
During a pending copy, the blob service ensures that no client requests can write to the destination blob. The copy blob properties are maintained on the blob after a copy is completed (failed/aborted/successful). However, these copy properties are removed when any write command like Put Blob, Put Block List, Set Blob Metadata or Set Blob Properties are issued on the destination blob. The following operations will however retain the copy properties: Lease Blob, Put Page, and Put Block. Hence, a monitoring component which may require providing confirmation that a copy is completed will need these properties to be retained until it verifies the copy. To prevent any writes on destination blob once the copy is completed, the copy job should acquire an infinite lease on destination blob and provide that as destination access condition when starting the copy blob operation. The copy operation only allows infinite leases on the destination blob. This is because the service prevents any writes to the destination blob and any other granular lease would require client to issue Renew Lease on the destination blob. Acquiring a lease on destination blob requires the blob to exist and hence client would need to create an empty blob before the copy operation is issued. To terminate an infinite lease on a destination blob with pending copy operation, you would have to abort the copy operation before issuing the break request on the lease.
Weiping Zhang, Michael Roberson, Jai Haridas, Brad Calder
When is version 1.7.1 going to be released? I am looking to use the StartCopyFromBlob functionality but cannot find this on version 1.7.0.
I've tried downloading the source from git (azure-sdk-for-net.git) but the version in the assembly is 1.7.0 still? Has this functionality been developed but not yet released or pushed to the branch?
Thanks for any help,
@Jon, Are you looking at source @ github.com/.../sdk_1.7.1. github.com/.../AssemblyInfo.cs has 1.7.1.
At this point we have released the source so that developers can compile as part of their project and start using it. It is not part of SDK 1.7 release but we will have it as part of our next SDK release(but we do not have an ETA yet).
One of the issues with the copy blob is the destination blob cannot use a SAS connection string.
This means all my code can work with SAS, except if I want to copy a blob I have to connect to the raw account. Can we please not fix this? I mean if the SAS connection string is the same on the source and destination blob account, what is the issue? Even if they are not the same, with the new above features, as long as we have write permission why can we not do a copy blob?
@Wayne, we will consider this as a feature request for future.
Are you planning similar feature for copying tables? (or maybe there is something already).
I'm looking for ability to create backups of my tables in table storage, so that I can protect my data against accidental damage (best would be scheduled & incremental transfers to another storage account or blob).
Are there any plans to allow existing storage accounts to be upgraded if they were created before June 7th 2012?
@Matthew, We don't have any plans at this time, but we will take that has a feature request.
@Slawomir, thanks for the feature request. We will document it in the list of customer requests.
What happens to the destination during the process of copying? Does the copy complete atomically, so the reader will see a changed destination only all changes are complete? Or can a reader observe partial changes? Similarly, when a copy operation fails, does it fail without making any observable changes? And you mention the possibility of multiple copies or multiple writers to the same destination. Could you explain what happens? Does the copy lock the destination, or is there a possibility of lost writes in some orders?
@TanjB - when copy begins, the destination cannot be written to and only one copy operation is allowed for the destination at a time. When a copy begins, the destination is overwritten.
Page blobs - system will issue a PutBlob request with length = length of source blob and the copy does not complete atomically.
For block blob, the system commits a 0 length blob. It will then proceed to copy blocks and the blob is finally committed when the entire blob is copied. The best way to tell if the blob copy is completed and hence can be read is to poll for copy completion (rely on x-ms-copy-status and x-ms-copy-id response headers). Does this help?
You can find more details @ msdn.microsoft.com/.../dd894037.
Currently is there a way to copy part of a page blob to another page blob? It would be great if we could specify offset, and length of source and destination page blobs for the copy operation.
@Sri, This is currently not supported and we will take this as a feature request.
You could either do a full copy and modify the destination blob, or you would need to do range GET on the source blob and write as needed to the destination blob.
How do you generate the key to use in the command line?
Shouldn't you call FetchAttributes() to refresh CopyState data.
@Arthur, since we list blobs, the response from server contains the state and we need not invoke FetchAttributes again. FetchAttribute is required only if you have a reference using GetBlockBlobReference or GetPageBlobReference (since it does not issue any request to server) and hence the state is not updated.