One of the frequent requests we receive from customers is for a simple way to upload or download files between Windows Azure Blob Storage and their local file system. We’re pleased to release AzCopy (Beta Version), a command line utility that allows Windows Azure Storage customers to do just that. The utility is designed to simplify the task of transferring data into and out of a Windows Azure Storage account. Customers can use it as a standalone tool or incorporate it into an existing application. The utility can be downloaded from GitHub.
The command is analogous to other Microsoft file copy utilities like robocopy that you may already be familiar with. Below is the syntax:
AzCopy <Source> <Destination> [filepattern [filepattern…]] [Options]
In this post we highlight AzCopy’s features without going into implementation details. The help command (AzCopy /?) lists and briefly describes all available system commands and parameters.
Example 1: Copy a directory of locally accessible files to a blob storage container in recursive mode.
AzCopy C:\blob-data https://myaccount.blob.core.windows.net/mycontainer/ /destkey:key /S
The above command copies all files from the “c:\blob-data” directory and all of its subdirectories as block blobs to the container named “mycontainer” in the storage account “myaccount”. The “blob-data” folder contains the following files and one subdirectory named “subfolder1”:
After the copy operation, “mycontainer” blob container will contain the following blobs:
If we do not use recursive mode (that is, if we copy without the “/S” option), the “mycontainer” blob container would contain only the files directly under the “blob-data” folder, and the files under the “subfolder1” folder would be ignored.
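To make the difference between the two modes concrete, here is a minimal Python sketch of how local files could map to blob names. It is not AzCopy's implementation. The function name `blob_names` and the sample file names are hypothetical, and the sketch assumes that a blob name is simply the file's path relative to the source directory, written with forward slashes.

```python
import tempfile
from pathlib import Path


def blob_names(source_dir, recursive):
    """Return the blob names a copy of source_dir might produce (illustrative only)."""
    root = Path(source_dir)
    # rglob("*") descends into subdirectories; glob("*") stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    names = []
    for path in candidates:
        if path.is_file():
            # Assumption: blob name = path relative to the source, with "/" separators.
            names.append(path.relative_to(root).as_posix())
    return sorted(names)


def demo():
    # Build a throwaway directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "subfolder1").mkdir()
        (root / "car1.docx").write_text("a")
        (root / "subfolder1" / "car_sub1.docx").write_text("b")
        without_s = blob_names(root, recursive=False)
        with_s = blob_names(root, recursive=True)
    return without_s, with_s
```

Calling `demo()` returns one list for the non-recursive case and one for the recursive case; only the recursive list includes the file under “subfolder1”.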
Example 2: Copy a set of blobs from blob storage to a locally accessible directory in both verbose and recursive modes.
AzCopy https://myaccount.blob.core.windows.net/mycontainer c:\blob-data /sourceKey:key /S /V
The command will copy all blobs under the “mycontainer” blob container in account “myaccount” to the “c:\blob-data” directory in both verbose and recursive modes.
“mycontainer” blob container contains the following files:
Since we are using verbose mode, the tool will display the following output, which contains the transfer status of each file in addition to the transfer summary. By default, the tool displays only the transfer summary:
Finished Transfer: car1.docx
Finished Transfer: car2.docx
Finished Transfer: car3.docx
Finished Transfer: train1.docx
Finished Transfer: subfolder1/car_sub1.docx
Finished Transfer: subfolder1/car_sub2.docx
Total files transferred: 6
Transfer successfully: 6
Transfer failed: 0
After the copy operation, c:\blob-data folder will contain the files listed below:
Let’s try a slightly different scenario by copying only the blobs whose names start with “subfolder1/”, using the following command:
AzCopy https://myaccount.blob.core.windows.net/mycontainer/subfolder1 c:\blob-data /sourceKey:key /S /V
The above command will only copy blobs which begin with “subfolder1/”, and thus the tool will only copy “subfolder1/car_sub1.docx” and “subfolder1/car_sub2.docx” blobs to “c:\blob-data\” folder. After the copy operation, “C:\blob-data” will contain the following files:
Example 3: Copy a directory of locally accessible files to a blob account in re-startable mode
AzCopy c:\blob-data https://myaccount.blob.core.windows.net/mycontainer /destkey:key /Z:restart.log /S
Restart.log, a journal file, is used to maintain a record of the status of the copy operation so that the operation can restart if interrupted. If no file is specified with the re-startable mode parameter, the journal file defaults to “azcopy.log” in the current working directory.
For instance, suppose the “C:\blob-data” folder contains five large files, each greater than 100 MB in size.
When running with the restart option, AzCopy allows you to restart the process in the case of failure. If the failure occurred while copying “car.docx”, AzCopy will resume the copy from the part of “car.docx” which has not been copied. If the failure occurred after “car.docx” was successfully copied, AzCopy will resume the copy operation from one of the remaining four files which have yet to be copied.
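The general idea of a journal-driven restart can be sketched in a few lines of Python. This is an illustration only: the journal format (a JSON list of completed file names), the function `copy_with_journal`, and the file names other than “car.docx” are assumptions, and the sketch resumes at whole-file granularity, so it does not model resuming partway through a file.

```python
import json
import tempfile
from pathlib import Path


def copy_with_journal(files, journal_path, copy_one):
    """Copy each file once, recording completed names in a journal file."""
    journal = Path(journal_path)
    # Load the names finished by an earlier run, if a journal exists.
    done = set(json.loads(journal.read_text())) if journal.exists() else set()
    for name in files:
        if name in done:
            continue  # already copied in a previous run
        copy_one(name)  # may raise; the journal then still lacks this name
        done.add(name)
        journal.write_text(json.dumps(sorted(done)))
    return done


def demo():
    files = ["car.docx", "train.docx", "bus.docx"]  # hypothetical names
    attempts = []
    fail_once = {"train.docx"}

    def flaky_copy(name):
        # Stand-in for a real transfer; fails the first time it sees train.docx.
        attempts.append(name)
        if name in fail_once:
            fail_once.discard(name)
            raise IOError("simulated network failure")

    with tempfile.TemporaryDirectory() as tmp:
        journal = Path(tmp) / "restart.log"
        try:
            copy_with_journal(files, journal, flaky_copy)
        except IOError:
            pass  # first run was interrupted
        copy_with_journal(files, journal, flaky_copy)  # second run resumes
    return attempts
```

In `demo()`, the second run skips “car.docx” because the journal already lists it, retries the file that failed, and then continues with the remaining file.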
Example 4: Select a number of files in a blob storage container using a file pattern and copy them to a locally accessible directory.
AzCopy https://myaccount.blob.core.windows.net/mycontainer c:\blob-data car /sourceKey:key /Z /S
“mycontainer” contains the following files:
After the copy operation, “c:\blob-data” will contain the files listed below. Because the file pattern “car” was specified, the copy operation copies only the blobs whose names begin with “car”. Note that this prefix is applied to the blob name if the blob is directly in the “mycontainer” container, or to the subdirectory name otherwise.
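One way to picture this rule is as a plain prefix test against the full blob name. The Python sketch below assumes exactly that; it is not taken from AzCopy itself, and the function `match_pattern` and the “carfolder” blob are hypothetical.

```python
def match_pattern(blob_names, pattern):
    """Keep the blobs whose full name starts with the given pattern."""
    return [name for name in blob_names if name.startswith(pattern)]


def demo():
    blobs = [
        "car1.docx",
        "car2.docx",
        "train1.docx",
        "carfolder/train2.docx",     # hypothetical: subdirectory name starts with "car"
        "subfolder1/car_sub1.docx",  # file name starts with "car", but the blob name does not
    ]
    return match_pattern(blobs, "car")
```

Under this assumption, “subfolder1/car_sub1.docx” is not selected even though its file name starts with “car”, because the prefix is compared against the subdirectory name “subfolder1”.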
Within a Windows Azure datacenter (i.e., between a compute instance and a storage account in the same datacenter), users should be able to achieve upload and download speeds of 50 MB/s when transferring large amounts of data using an extra-large compute instance. Transfers to and from a Windows Azure datacenter will be constrained by the bandwidth available to AzCopy.
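As a rough back-of-the-envelope aid, the Python sketch below estimates how long a transfer would take at a given sustained rate. The helper `estimated_seconds` is hypothetical, it assumes 1 MB = 1,000,000 bytes, and it ignores per-request overhead, so treat the result as an idealized lower bound rather than a measurement.

```python
def estimated_seconds(total_bytes, rate_mb_per_s=50):
    """Idealized transfer time: size divided by sustained throughput."""
    bytes_per_second = rate_mb_per_s * 1_000_000  # assumption: decimal megabytes
    return total_bytes / bytes_per_second


def demo():
    # Five files of 100 MB each, as in the restart example.
    total_bytes = 5 * 100 * 1_000_000
    return estimated_seconds(total_bytes)
```

Passing a lower `rate_mb_per_s` gives a feel for transfers from outside the datacenter, where the available bandwidth is the limiting factor.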
Aung Oo, Matthew Hendel
Windows Azure Storage Team
Nice. Any plans to support the following?
1. An option similar to robocopy /MIR which would also delete and not just add/update files at the destination.
2. File comparison by MD5 instead of or in addition to file dates to reliably identify modified files.
3. GZIP for selected file types when uploading to Azure?
Thanks for your feature requests. We will take them into consideration for future releases.
Guys, thanks! Could you maybe link those dlls into the main exe? Also where is the GitHub project URL so that I can star you?
Just found out that this won't work out of box if run on Server 2012, it demands .NET 3.5.
Also, can we have a single exe rather than 3 files?
Thanks for this great tool!
If Source and Dest both are containers, doesn't work.
Please also include a link for the source code... all I see is the binary. I want to understand why the HPC library is used/needed
We are only sharing the binary with this version and are working towards sharing the code in later releases. The current version leverages the HPC library to copy blobs to the blob storage account, and thus it is needed.
What options are available for folks operating behind ISA Client? I'm receiving "The remote server returned an error: (407) Proxy Authentication Required." What method would I use to control proxy settings, such as...
<defaultProxy enabled="true" useDefaultCredentials="true">
<proxy usesystemdefault="True" />
@KevinTag: Please refer to this thread in the networking forum: social.msdn.microsoft.com/.../06a45fcf-56d9-4dda-a9b3-0d46addc357f. Thanks!
Is there any way to retain the modified date information when copying?
This is currently not supported; however, we will take this as a feature request.
You can also use the Blob Transfer Utility to download and upload all your blob files.
It's a GUI tool to handle thousands of blob transfers in an effective way.
Binaries and source code, here: http://bit.ly/blobtransfer
I'm trying to upload data to an HDInsight cluster. Using azcopy was recommended, but I'm getting an error: "Error parsing destination location". I provide my cluster id found on my HDInsight home page. Can you provide an example for HDInsight use?
AzCopy supports uploading data from a local file system to Azure Storage; however, an HDInsight cluster is not a valid destination for AzCopy yet. We will take this into consideration for a later release.
-Jason TANG (Microsoft)
Can I set MIME image/jpg for uploading image files?