Hyper-V Program Manager
Power outages are not infrequent where I live (something I find quite confounding – to be honest). Earlier this week we had an extended power outage, and my Hyper-V servers were powered off uncleanly. When the power returned I had to sit down and make sure that everything came back correctly.
At first glance – everything looked good. My Hyper-V servers powered up happily and started up all the virtual machines. Hyper-V Replica reported that it was in a critical state – but it automatically scheduled resynchronization for all of my virtual machines. But as I was going through the virtual machines – I found a problem.
Something had gone wrong with my firewall.
I could not figure out what exactly was wrong – but it was using 100% CPU and not allowing any network traffic through. I shut down the virtual machine cleanly and restarted it – but no dice. It still would not work. Thankfully, there was a simple solution.
I shut down the misbehaving virtual machine and started up the replica version of it. This came up with no problems and started functioning correctly – as it had not been powered off uncleanly. Yay!
Now, there is a key point to make here: if I had performed a planned failover in Hyper-V (select the primary virtual machine and perform a planned failover) this would not have worked. Hyper-V would have copied across the outstanding (bad) changes and would have broken my replica virtual machine too. What I actually did was go straight to the replica virtual machine and choose to perform a failover (not a planned one). Because I started the failover from the replica side, Hyper-V did not copy across the latest data, and everything worked.
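To make the difference concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not Hyper-V code: `ToyVm`, `planned_failover`, and `unplanned_failover` are hypothetical names made up for illustration, and the assumption (taken from my description above) is that the damage lives in the changes that had not yet been replicated.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVm:
    """Hypothetical stand-in for a virtual machine's replicated state."""
    blocks: list = field(default_factory=list)


def planned_failover(replica: ToyVm, pending: list) -> ToyVm:
    # Planned failover: the outstanding changes are sent to the replica
    # first, so the replica ends up matching the primary exactly.
    replica.blocks.extend(pending)
    pending.clear()
    return replica


def unplanned_failover(replica: ToyVm) -> ToyVm:
    # Failover started on the replica: it comes up with whatever was
    # already replicated; nothing more is fetched from the primary.
    return replica
```

In this model, a replica holding `["good"]` with a pending list of `["bad"]` stays at `["good"]` after `unplanned_failover`, but becomes `["good", "bad"]` after `planned_failover`. That is the whole reason I went straight to the replica.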
At the end of this process I reversed the replication relationship and was good to go.
Quick question: Is there any way to dedicate a NIC for Hyper-V Replication on Windows Server 2012 R2 (tested/validated by Microsoft)? I've seen quite a few 3rd party blogs talking about how to do it using the HOSTS file, but I don't like the idea, since it can be hard to troubleshoot network issues if you don't know how the environment was set up (tech support folks will understand that). It is also susceptible to human error (configuring the HOSTS file properly on one node and missing the other one, for example - never seen that before, right!?). :-)