A few days ago, an article by former Microsoft employee Michael Washam (http://michaelwasham.com) caught my attention:
Connecting Clouds – Creating a site-to-site Network with Amazon Web Services and Windows Azure
Wow! Today we cannot (yet! :-) ) have an Azure Virtual Network/VPN crossing more than one Azure datacenter, but we can have a Virtual Network/VPN spanning two different Cloud providers…. Awesome!
My mind immediately went to the implications for new high availability and disaster recovery scenarios, such as building a solution that is not tied to a single Cloud Provider. Working with partners on several Azure projects, I have heard this kind of request several times: they want to ensure that at least Disaster Recovery (DR), and maybe also High Availability (HA), can be achieved even if a single Cloud Provider fails completely.
NOTE: OpenSwan is a complete IPsec implementation for Linux. For more information, see this link: https://www.openswan.org/projects/openswan .
The overall configuration process is simple, but there are some caveats:
But wait a moment: why do I have to use OpenSwan and Linux in the Amazon VPC, since it’s not officially supported by Azure? You can use a Windows Server 2012 VM and its RRAS feature, and that’s it! It’s officially supported, as you can read in the link below:
IMPORTANT: To the best of my knowledge, there is no way to make Windows Server 2012 RRAS highly available, so in this case too the proposed solution is more suitable for DR purposes than for HA.
OK, now that you know the whole story, which HA/DR scenarios can we build? Since I’m still a SQL Server guy, let me focus on SQL Server (in Azure IaaS VMs) for the sake of simplicity.
The starting point is provided in the white-paper below, where you can find all the possible HA/DR scenarios, without considering what we are discussing in this blog post:
High Availability and Disaster Recovery for SQL Server in Windows Azure Virtual Machines
Specifically, I’m interested in using SQL Server 2012 AlwaysOn Availability Groups (AG) to implement a DR scenario between AMAZON and AZURE, like the one below:
Here are my considerations:
Now, what will happen in case of a complete AZURE or AMAZON failure?
In the scenario shown in the picture, a complete AMAZON failure does not affect the AZURE side of the architecture at all, and SQL Server remains up and available. Conversely, after a complete AZURE failure, the Windows Cluster no longer has the quorum it needs to stay online, so it shuts down and SQL Server becomes unavailable. This is expected in a DR scenario: manual intervention is required to force the surviving AMAZON-side node to start and to make the SQL Server AG perform a forced failover (with potential data loss).
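To make the quorum reasoning concrete, here is a minimal Python sketch of the simple majority rule. It is only an illustration: the vote layout (two votes on the AZURE side, one on the AMAZON side) and the node names are my assumptions, not a description of the actual picture, and a real Windows Cluster has more quorum options than this model covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    name: str  # hypothetical node name, for illustration only
    site: str  # "azure" or "amazon"


def has_quorum(voters, failed_site):
    """Majority rule: the surviving votes must be more than half of all votes."""
    surviving = [v for v in voters if v.site != failed_site]
    return len(surviving) > len(voters) / 2


# ASSUMED layout: the majority of votes lives on the AZURE side.
voters = [
    Voter("sql-azure-1", "azure"),
    Voter("sql-azure-2", "azure"),
    Voter("sql-amazon-1", "amazon"),
]

for failed_site in ("amazon", "azure"):
    if has_quorum(voters, failed_site):
        outcome = "cluster keeps quorum and stays online"
    else:
        outcome = "cluster loses quorum; manual forced start is needed"
    print(f"Complete {failed_site.upper()} failure: {outcome}")
```

With any layout where AZURE holds the majority of the votes, the result is the same asymmetry described above: losing AMAZON is harmless, while losing AZURE leaves the surviving side below the majority threshold.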
If you are interested in the recovery steps for the Windows Server Cluster and the SQL Server 2012 AG, look at the white-paper below (section “Recovering from a Disaster”):
AlwaysOn Architecture Guide: Building a High Availability and Disaster Recovery Solution by Using Failover Cluster Instances and Availability Groups
That’s all, folks! I would like to know your opinion and, if you have any, your experience implementing this kind of scenario.