Authors: Jason Short, Tony Guidici, Konstantin Dotchkoff

 

Introduction

With the transition of IT resources and applications to the cloud, a very common requirement is transferring data from an on-premises environment or from a client to the cloud.

In this blog post we are going to cover common scenarios for data upload to Windows Azure based on our experience from real-life Azure projects. To provide a better context, here are a couple of examples: "storage server" in Azure, file system replication to Azure, custom backup service, disaster recovery, archival storage in Azure, etc.

All of the example scenarios require a massive upload of data to Windows Azure blob storage. In many cases, existing components need to be modified to extend the solution to a cloud environment. Typical requirements are that the upload be as fast as possible, reliable, and optimized for throughput.

Comparison of Architecture Patterns

For each of the presented patterns we will provide a short description of the pattern along with key advantages and disadvantages (especially in comparison to the other patterns in this article).

Before we cover the recommended patterns based on best practices from existing projects, we'll start with two scenarios (pattern 1 and pattern 2) we have found to be sub-optimal and with limitations in terms of scalability and performance.

Pattern 1 – Upload Using Receiver Agent (Not Recommended)

A high level overview of the pattern is shown in Figure 1:

Figure 1 – Data upload to VHD disks using receiver agent

Description:

A client component sends data to an agent running on Windows Azure Virtual Machine (VM) instance(s), which writes the data to .vhd disks attached to the VM. This pattern is the easiest way to move existing functionality from an on-premises solution to Azure without making any significant changes. However, with this pattern you don't take full advantage of the benefits of a cloud environment.

Advantages:

  • Potential reuse of an existing on-premises component

Disadvantages:

  • Limited Scalability:
    A single VM has limits (based on the VM size) in terms of the number of disks that can be attached, IOPS, and network bandwidth. Scalability can be achieved only by adding more VMs, which can introduce technical challenges and increase the cost of the solution.
  • Cost implications:
    At least two VM instances need to be up and running 24x7 for availability, which potentially increases the cost of the solution.
  • Concurrency management:
    Because you'll need to run at least 2 VM instances for high availability, you probably need to do one of the following in order to avoid concurrency issues during the data transfer:
    • Open a session and keep it open for the duration of the transfer
    • Use client affinity, so that every request from a client goes to the same compute instance
    • Implement an active/passive scenario where only one instance will process the requests and the other is a stand-by instance in case the first one fails.
  • Different runtime behavior than on-premises:
    If you are migrating an existing component, be aware that moving it to a VM running in Azure will introduce very different runtime behavior in terms of latency, performance, and stability of the connections. You will definitely need to adjust your client component and make it more resilient to handle those differences.
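The resilience adjustment mentioned above can start as simply as wrapping the transfer call in a retry loop with exponential back-off. A minimal, self-contained sketch – the `send_chunk` function and its simulated transient failures are hypothetical stand-ins for your actual transfer call to the receiver agent:

```python
import time

def send_with_retry(send_chunk, data, max_attempts=5, base_delay=0.01):
    """Call send_chunk(data), retrying with exponential back-off on transient failures."""
    for attempt in range(max_attempts):
        try:
            return send_chunk(data)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying

# Hypothetical flaky transfer: fails twice with a transient error, then succeeds.
attempts = {"count": 0}
def send_chunk(data):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient network error")
    return len(data)

sent = send_with_retry(send_chunk, b"payload")
```

In a real client you would retry only on errors you know to be transient and cap the total back-off time.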

Pattern 2 – Upload to Blob Using Receiver Agent (Not Recommended)

A high level overview of the pattern is shown in Figure 2:

Figure 2 – Data upload to blob using receiver agent

Description:

A client component sends data to an agent running on Azure worker role or VM instance(s), which receives the data and writes it to Azure blob storage. This pattern provides a relatively easy way to move existing functionality from an on-premises solution to Azure without making significant changes. However, with this pattern you don't take full advantage of the benefits of a cloud environment.

Advantages:

  • Potential reuse of existing code that has been used in an on-premises environment in the past

Disadvantages:

  • Limited Scalability:
    Network throughput is limited by the size of the VM or worker role – up to 800 Mbps. Because the VM also writes to a blob, the effective inbound bandwidth will be no more than 400 Mbps. Scalability can be achieved only by adding more VMs, which can introduce technical challenges and increase the cost of the solution.
  • Cost implications:
    At least two Compute instances need to be up and running 24x7 for availability. This might increase the cost of the solution.
  • Concurrency management:
    Because you'll need to run at least two Compute instances for high availability (typically load-balanced), you probably need to do one of the following in order to avoid concurrency issues during the data transfer:
    • Open a session and keep it open for the duration of the transfer
    • Use client affinity, so that every request from a client goes to the same compute instance
    • Implement an active/passive scenario where only one instance will process the requests and the other is a stand-by instance in case the first one fails.
  • Different runtime behavior than on-premises:
    If you are migrating existing code or components, be aware that moving them to a compute role/VM running in Azure will introduce very different runtime behavior in terms of latency, performance, and stability of the connection. You will definitely need to adjust your client component and make it more resilient to handle those differences.

Pattern 3 – Upload to Blob with Post-Processing (Basic Scenario)

A high level overview of the pattern is shown in Figure 3:

Figure 3 – Data upload to blob with post-processing (basic scenario)

Description:

A client writes directly to a blob in Windows Azure. Background post-processing into a final blob is done by a worker role.
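Conceptually, the client achieves high throughput by splitting the data into blocks, uploading the blocks in parallel, and then committing the block list – this mirrors the Azure block blob model (Put Block / Put Block List). In the sketch below an in-memory dictionary stands in for blob storage so the example stays self-contained; the block size is deliberately tiny for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # tiny for illustration; real uploads typically use multi-MB blocks

staged_blocks = {}  # stand-in for blocks staged in Azure blob storage

def stage_block(block_id, chunk):
    # In practice: one Put Block REST call per chunk
    staged_blocks[block_id] = chunk

def commit_block_list(block_ids):
    # In practice: a Put Block List call assembles staged blocks into the final blob
    return b"".join(staged_blocks[b] for b in block_ids)

data = b"abcdefghij"
chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
block_ids = [f"block-{i:05d}" for i in range(len(chunks))]

# Upload blocks concurrently for throughput; order is restored at commit time.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(stage_block, block_ids, chunks))

blob = commit_block_list(block_ids)
```

Because the commit step names the blocks in order, out-of-order parallel uploads still produce the correct final blob.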

Advantages:

  • High throughput to blob storage:
    The data is transferred to Azure as fast as possible
  • Lower cost solution:
    Depending on the specific solution requirements, post-processing can be deferred – there might be no need to keep the worker role running 24x7 or to run multiple instances of the role.

Disadvantages:

  • Potential post-processing lag
  • Requires code changes for existing on-premises components because of new development paradigm (e.g. Azure blob storage, REST API)

Pattern 4 – Upload to Blob with Post-Processing

A high level overview of the pattern is shown in Figure 4:

Figure 4 – Data upload to blob with post-processing

Description:

A client writes data to blob and also writes a job message to a queue. Post-processing to a final blob is done by worker role instances. The detailed steps as shown in the graphic are:

  1. Client writes data to blob
  2. Client writes a job message to a queue (incl. reference to the blob)
  3. Worker role picks up job message from the queue
  4. Worker role reads data from blob
  5. Worker role performs post-processing and writes the result to the final blob

In this case the client is responsible for transaction management (i.e. for steps 1 and 2).
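The five steps above can be sketched end-to-end as a minimal, self-contained simulation – in-memory structures stand in for blob storage and the queue, and the upper-casing transformation is a hypothetical placeholder for real post-processing:

```python
import json
from collections import deque

blobs = {}       # stand-in for Azure blob storage
queue = deque()  # stand-in for an Azure queue

def client_upload(blob_name, data):
    # Steps 1-2: client writes the blob, then enqueues a job referencing it.
    # The client owns this two-step "transaction".
    blobs[blob_name] = data
    queue.append(json.dumps({"blob": blob_name}))

def worker_step():
    # Steps 3-5: worker dequeues the job, reads the blob, post-processes,
    # and writes the result to the final blob.
    job = json.loads(queue.popleft())
    data = blobs[job["blob"]]
    blobs[job["blob"] + "-final"] = data.upper()  # placeholder post-processing

client_upload("upload-001", "raw payload")
worker_step()
```

Note that if the client crashes between steps 1 and 2, the blob exists but no job message does – which is exactly why the client is responsible for transaction management in this pattern.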

Advantages:

  • High throughput to blob storage – the data is transferred to Azure as fast as possible
  • Worker roles can be fanned out for high-speed post-processing based on workload
  • Lower cost solution – see previous pattern for description

Disadvantages:

  • Requires code changes for existing on-premises components because of new development paradigm (e.g. Azure blob storage, REST API)
  • Client needs to handle transactions

Pattern 5 – Upload Using Request/Response Communication

A high level overview of the pattern is shown in Figure 5:

Figure 5 – Data upload using request/response communication

Description:

A client writes data to a blob and a job message to a queue. Post-processing to a final blob is done by worker role instances, and a response message is sent back to the client via another queue.

The detailed steps as shown in the graphic above are:

  1. Client writes data to blob
  2. Client writes a job message to a queue (incl. reference to the blob)
  3. Worker role picks up job message from the queue
  4. Worker role reads data from blob
  5. Worker role performs post-processing and writes the result to the final blob
  6. Operation response message is written to an outgoing queue
  7. Client reads the response message

As in the previous pattern, the client is responsible for transaction management (i.e. for steps 1 and 2).
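Compared to the previous pattern, the worker additionally writes a response to an 'out' queue, correlated by a job id so the client can match responses to requests. A self-contained sketch with the same in-memory stand-ins and placeholder post-processing:

```python
import json
from collections import deque

blobs, in_queue, out_queue = {}, deque(), deque()

def client_upload(job_id, blob_name, data):
    blobs[blob_name] = data                                          # step 1
    in_queue.append(json.dumps({"id": job_id, "blob": blob_name}))   # step 2

def worker_step():
    job = json.loads(in_queue.popleft())                             # step 3
    result = blobs[job["blob"]].upper()                              # steps 4-5 (placeholder)
    blobs[job["blob"] + "-final"] = result
    out_queue.append(json.dumps({"id": job["id"], "status": "done"}))  # step 6

def client_poll(job_id):
    response = json.loads(out_queue.popleft())                       # step 7
    return response["status"] if response["id"] == job_id else None

client_upload("job-42", "upload-002", "payload")
worker_step()
status = client_poll("job-42")
```

The job id in the response message is what lets many load-balanced clients share the same 'out' queue without confusing each other's results.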

Advantages

  • High throughput to blob storage – the data is transferred to Azure as fast as possible
  • Worker roles can be fanned out for high-speed post-processing based on workload

Disadvantages

  • Potential code changes for existing on-premises components because of new development paradigm (e.g. Azure blob storage, REST API)
  • Client needs to handle transactions

Pattern 6 – Upload Using Communication with Checkpoints

A high level overview of the pattern is shown in Figure 6:

Figure 6 – Data upload using communication with checkpoints

Description:

This pattern is similar to the previous one; however, the client writes a checkpoint to an Azure storage table after each portion of data is uploaded. Similarly, the worker role (responsible for post-processing) writes messages to the 'out'-queue to indicate progress. The detailed steps for the client and the server are as follows:

Client:

  1. Client writes portion of the data to blob
  2. Client writes a checkpoint to a table storage
  3. (optional) Client writes message to 'in'-queue to indicate progress to consumer (server)
  4. Repeat steps 1-3 until all data is uploaded; the checkpoint from step 2 is updated each time with the new status. If the connection is lost or the client crashes, after restart the client can continue the upload from the last checkpoint (and doesn't need to re-upload everything from the beginning). When done:
  5. Client deletes the checkpoint from the table
  6. Client writes a final job message to in-queue

Server (worker role):

  1. (optional) Server reads progress message from queue and can start post-processing for the available portion of the data
  2. Server reads final job message from 'in'-queue and starts the post-processing (or finishes the processing in the case of partial post-processing based on progress messages)
  3. Server writes the result from the post-processing to the final blob
  4. (optional) Server writes message to 'out'-queue to indicate progress to client
  5. Server writes a final message to 'out'-queue to confirm processing has completed

Again, the client is responsible for transaction management.
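The value of the checkpoint is that a restarted client resumes where it left off instead of re-uploading from the beginning. A minimal sketch of the client side – a dict stands in for the table storage row, a list for the blob, and the "crash" is simulated by pre-populating the state before the restarted upload runs:

```python
blob_parts = []   # stand-in for the data portions written to the blob
checkpoints = {}  # stand-in for the checkpoint row in Azure table storage

def upload(upload_id, chunks):
    start = checkpoints.get(upload_id, 0)  # resume from last checkpoint, if any
    for i in range(start, len(chunks)):
        blob_parts.append(chunks[i])       # step 1: write portion of data to blob
        checkpoints[upload_id] = i + 1     # step 2: update the checkpoint
    del checkpoints[upload_id]             # step 5: upload done - delete checkpoint

chunks = ["c0", "c1", "c2", "c3"]

# Simulate a crash: the first two chunks were uploaded and checkpointed,
# then the client died before finishing.
blob_parts.extend(chunks[:2])
checkpoints["u1"] = 2

upload("u1", chunks)  # the restarted client continues from chunk index 2
```

In a real implementation the checkpoint update should happen only after the portion is durably written, so a crash between the two steps causes at worst a re-upload of one portion.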

Advantages

  • High throughput to blob storage – the data is transferred to Azure as fast as possible
  • After failure client can continue upload from the last checkpoint
  • Worker role can restart processing from the last checkpoint as well (in addition, it provides progress info through the queue)
  • Server can start post-processing based on checkpoints, before the entire blob has been uploaded

Disadvantages

  • See previous pattern

Pattern 7 – Upload Using Shared Access Control "Valet Key"

A high level overview of the pattern is shown in Figure 7:

Figure 7 – Upload using shared access control "Valet Key"

Description:

A client requests access to blob storage through a worker role and receives access information. Then, similar to pattern 4, the client writes data to a blob and a job message to a queue. The detailed steps are:

  1. Client sends access request to a worker role
  2. Worker role performs lookup to determine location of destination resource and permissions (shown as 'Config Store' in figure 7 – e.g. table storage)
  3. Worker role returns resource reference and Shared Access Signature (SAS) back to the client
  4. Client writes data to blob using the SAS
  5. Client writes job message to a queue
  6. Processing continues as in previous patterns having another worker role to perform the post-processing. This is not depicted in figure 7 in order to simplify the graphic – it is analogous to the previous patterns.

Also in this pattern the client is responsible for transaction management.
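A Shared Access Signature is, at its core, an HMAC over the resource path and an expiry time, signed with the account key – which is why storage can validate a client's token without calling back to the worker role that issued it. The sketch below illustrates that valet-key idea in a simplified form (this is not the exact Azure string-to-sign format, and the key is a hypothetical placeholder):

```python
import hmac, hashlib, base64, time

ACCOUNT_KEY = b"hypothetical-account-key"  # placeholder for the storage account key

def issue_sas(resource, expiry):
    # Worker role (the "valet") signs resource + expiry with the account key.
    string_to_sign = f"{resource}\n{expiry}".encode()
    sig = base64.b64encode(
        hmac.new(ACCOUNT_KEY, string_to_sign, hashlib.sha256).digest()
    ).decode()
    return {"resource": resource, "expiry": expiry, "sig": sig}

def validate_sas(token, now):
    # Storage recomputes the signature from its copy of the key and checks expiry.
    expected = issue_sas(token["resource"], token["expiry"])["sig"]
    return hmac.compare_digest(token["sig"], expected) and now < token["expiry"]

now = int(time.time())
token = issue_sas("container/upload-003", expiry=now + 3600)  # steps 1-3
valid = validate_sas(token, now)                              # checked on step 4
tampered = dict(token, resource="container/other")            # redirected token fails
```

Because the token carries its own expiry, access is temporary by construction – the worker role does not need to track or revoke individual grants.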

Advantages

  • Shared temporary access control
  • Dynamic permissions control and routing for multi-tenant environments
  • High throughput to blob storage – the data is transferred to Azure as fast as possible

Disadvantages

  • Potential code changes for existing on-premises components because of new development paradigm (e.g. Azure blob storage, REST API)
  • Client needs to handle transactions

Pattern 8 – Upload Using Scalable Receiver

A high level overview of the pattern is shown in Figure 8:

Figure 8 – Upload using scalable receiver

Description:

A client sends data to a receiver running on a worker role; post-processing is done asynchronously by another set of worker role instances.

The detailed steps as shown in the graphic are:

  1. Client sends data to a worker role (using stateless messages or communication with session management)
  2. Worker role writes data to blob
  3. Worker role writes job message to a queue (incl. reference to the blob)
  4. Second worker role picks up job message from the queue
  5. Second worker role reads data from blob
  6. Second worker role performs post-processing and writes the result to the final blob

In this case the first worker role is responsible for transaction management (i.e. for steps 2 and 3).
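Because the receiver owns steps 2 and 3, it must keep them consistent: if the queue write fails after the blob write succeeded, it can compensate by deleting the blob (or by retrying the enqueue). An in-memory sketch of that compensation logic, with hypothetical enqueue functions standing in for the queue client:

```python
from collections import deque

blobs, queue = {}, deque()  # stand-ins for blob storage and the queue

def receive(blob_name, data, enqueue):
    """Receiver writes the blob, then the job message; compensates on failure."""
    blobs[blob_name] = data              # step 2: write data to blob
    try:
        enqueue({"blob": blob_name})     # step 3: write job message to queue
    except ConnectionError:
        del blobs[blob_name]             # compensate: don't leave an orphaned blob
        raise                            # surface the failure to the client

def good_enqueue(msg):
    queue.append(msg)

def broken_enqueue(msg):
    raise ConnectionError("queue unavailable")

receive("upload-004", b"data", good_enqueue)    # happy path
try:
    receive("upload-005", b"data", broken_enqueue)  # queue failure triggers cleanup
except ConnectionError:
    pass
```

Retrying the enqueue is usually preferable to deleting the blob (it avoids re-uploading the data), but either way the decision lives in the receiver, not the client – which is this pattern's main advantage.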

Advantages

  • Receiver responsible for transaction management
  • Reduced need for client code changes
  • High speed post-processing

Disadvantages

  • Additional cost for running receiver worker role instances
  • Potential session management challenges or design using stateless messages

Summary

In this blog post we described several common scenarios for data upload to Windows Azure based on our experience from real-life Azure projects. We introduced each pattern and discussed its advantages and disadvantages in comparison to the others. We hope this gives you some ideas and guidance on possible patterns and variations that may apply to your solution's specific requirements.