As the TechNet article (Understanding Datacenter Activation Coordination Mode) explains, you can’t enable DAC mode in Exchange Server 2010 for a DAG whose members are all in the same AD site. So what happens if you have a single DAG spanning 2 data centres, with all members in the same AD site, and you lose the primary data centre where your Witness Server is located?
First, work out whether the loss is permanent. If it’s not, it might be worth waiting until the data centre is back: that way you can probably avoid the risk of split brain, since you can shut down the remaining DAG members and wait for a managed recovery. If it is permanent, you have a bit of work to do. It won’t take you too long, but it’s not as simple as running a couple of PowerShell cmdlets. You also have to consider what happens if you cannot manage the recovery of the lost DAG members: you will probably need a full seed rather than an incremental reseed, because at a minimum there is likely to be divergence that the store may not be able to recover from. The steps that worked for me in our test rig are as follows:
“DAC mode has been extended to support DAGs that have all members deployed in a single Active Directory site, including Active Directory sites that have been extended to multiple locations.” (http://blogs.technet.com/scottschnoll/archive/2010/04/10/new-high-availability-features-in-exchange-2010-sp1.aspx)
Take notice of the note that accompanies the blog though:
“But a quick note: everything in this post is based on pre-release software and preliminary information that is subject to change. These are things we are working on or are about to work on. The feature names, behaviors and descriptions used below might not be the final names, behaviors and descriptions. The behaviors described may or may not make it into the final shipping version of SP1 or a future version of the product. Standard disclaimers apply regarding pre-Beta software and content.”
The other idea I was given to avoid split brain, and the need for a full seed on failback, is to mark the databases not to mount at startup. That way, if you cannot manage the startup of the lost DAG members, their databases will not mount. Unfortunately this also prevents automated failover in cases where there is data loss but the number of lost logs is within AutoDatabaseMountDial: the replicas are not activated and are left in a failed state. Nice idea, but it didn’t work in my testing.
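To make that trade-off concrete, here is a rough Python sketch of the behaviour I saw. It is a toy model only, not how Exchange makes the decision internally. The function name and the `mount_at_startup` parameter are made up for illustration, and the log thresholds (0, 6 and 12) are my understanding of the documented AutoDatabaseMountDial settings, so check them against your own environment.

```python
# Toy model of the observed behaviour; NOT Exchange's actual logic.
# Thresholds are assumed from the documented AutoDatabaseMountDial
# values: the maximum number of lost logs tolerated for an automatic mount.
MOUNT_DIAL_LIMITS = {
    "Lossless": 0,
    "GoodAvailability": 6,
    "BestAvailability": 12,
}


def can_automount(lost_logs: int, dial: str, mount_at_startup: bool) -> bool:
    """Return True if a passive copy would be mounted automatically.

    mount_at_startup is a hypothetical flag standing in for "database is
    marked to mount at startup". In my testing, clearing it blocked
    automatic activation even when the loss was within the dial setting.
    """
    if not mount_at_startup:
        return False
    return lost_logs <= MOUNT_DIAL_LIMITS[dial]


# Four lost logs are within GoodAvailability, so the copy can mount...
print(can_automount(4, "GoodAvailability", mount_at_startup=True))   # True
# ...but not once the database is marked not to mount at startup.
print(can_automount(4, "GoodAvailability", mount_at_startup=False))  # False
```

The second call is the case that caught me out: the loss is acceptable by the dial setting, yet the copy stays unmounted.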
Thanks for the nice article.
I have a small doubt.
From what I read about database availability groups, if the DAG members are in the same AD site then failover to another database should be automatic.
There should/would not be any manual intervention.
I am sorry if I am missing something.
Thanks
In the example I've used here the Witness Server is in the failed data centre, so there is no automatic recovery: the remaining nodes cannot gain a majority and achieve quorum.
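The majority arithmetic behind that can be sketched in a few lines of Python. This is a simplified illustration, assuming a DAG with an even number of members uses the Witness Server as one extra vote. The function names and the four-member layout (two members per data centre) are hypothetical and not taken from the test rig.

```python
# Simplified quorum arithmetic; function names and the example layout
# are illustrative assumptions, not Exchange or cluster APIs.

def total_voters(dag_members: int) -> int:
    """With an even member count the witness supplies a tie-breaking vote."""
    return dag_members + 1 if dag_members % 2 == 0 else dag_members


def has_quorum(votes_available: int, voters: int) -> bool:
    """Quorum needs a strict majority of all configured voters."""
    return votes_available > voters // 2


# Hypothetical 4-member DAG: 2 members in each data centre, plus a witness.
voters = total_voters(4)          # 5 voters in total

# Primary data centre lost along with the witness: only 2 votes remain.
print(has_quorum(2, voters))      # False

# Had the witness been in the surviving data centre: 2 members + witness.
print(has_quorum(3, voters))      # True
```

With the witness gone, the two surviving members hold 2 of 5 votes, which is short of a majority, and that is why the recovery has to be done by hand.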
I have a query that might be out of context here. In a DAG with, say, 2 nodes, if the disk holding the active database becomes corrupt beyond repair, will the copy automatically stand in? What happens internally in Exchange?