Backing up and Recovering the Failover Cluster Configuration Database

For my first post here, I thought I’d talk about backing up and recovering the Failover Cluster configuration database.  For Windows Server 2008, Failover Clustering’s backup and restore code was revamped to fit into the Volume Shadow Copy Service (VSS) framework.  Now Failover Clusters can be backed up just like any other application that supports VSS.

To get us started, I created a cluster of two file servers.  I have a 2-node cluster with a node and disk majority quorum mode, so I can sustain the failure of one node or the quorum disk.  Let’s open up the cluster administrator and see what we’ve got.  Figure 1 shows the first file share.

 

Figure 1: The first file share on my cluster.  Look at that heavy load of clients!

 

Now that we have set up our cluster, let’s keep backups of our cluster configuration to make sure that we don’t lose this setup.  I set up a backup schedule to back up all the critical volumes every 30 minutes, as you can see in Figure 2.  You will, of course, want to think carefully about the backup schedule of your own server and volumes.

 

Figure 2: My scheduled backups.  It doesn't show it, but the Windows Server Backup utility includes the Clustering application by default.
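
For readers who prefer the command line to the Windows Server Backup UI, here is a minimal sketch that builds the comma-separated list of HH:MM times for a 30-minute schedule and drops it into a `wbadmin enable backup` command line. The helper names and the `<disk-identifier>` placeholder are assumptions for illustration, not part of the setup shown in Figure 2; check `wbadmin enable backup /?` on your build before running the resulting command.

```python
def schedule_every(minutes):
    """Return times of day as 'HH:MM,HH:MM,...' spaced `minutes` apart."""
    if minutes <= 0 or 1440 % minutes != 0:
        raise ValueError("interval must divide the 1440 minutes in a day")
    return ",".join(f"{m // 60:02d}:{m % 60:02d}" for m in range(0, 1440, minutes))


def enable_backup_command(target, minutes):
    """Build (but do not run) a wbadmin command for all critical volumes.

    `target` is a placeholder for your own backup target (hypothetical here).
    """
    return (f"wbadmin enable backup -addtarget:{target} "
            f"-schedule:{schedule_every(minutes)} -allCritical")


if __name__ == "__main__":
    print(enable_backup_command("<disk-identifier>", 30))
```

The script only prints the command, so you can review it before pasting it into an elevated prompt.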

Safety at last!  But you know what?  I think I want some of those disks in my other file share.  Let’s go and do that. 

 

Figure 3: Finally, a nice simple file share with only one disk.

 

Ok, so I’ve moved the disks out of the other file server, and all is well.  The next day rolls in and I see an email from my manager.  Uh-oh, maybe I should have talked to someone about that change.  No matter, I can fix this with just a few minutes of downtime.  Let’s open an elevated command window (Start -> right-click Command Prompt, select Run as administrator).  At the prompt, type wbadmin get versions.  This lists all the backups available on this machine.

Here’s the output on mine:

 

C:\Windows\system32>wbadmin get versions
wbadmin 1.0 - Backup command-line tool
(C) Copyright 2004 Microsoft Corp.

Backup time: 12/31/2007 12:18 PM
Backup target: Network Share labeled \\mattkur-stor\ClusterBackups
Version identifier: 12/31/2007-20:18
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State

Backup time: 12/31/2007 3:30 PM
Backup target: Fixed Disk labeled mattkur 2007_12_31 15:11 DISK_01(\\?\Volume{e028e25b-b000-11dc-8ea1-0011114b1b2e})
Version identifier: 12/31/2007-23:30
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State

Backup time: 12/31/2007 4:00 PM
Backup target: Fixed Disk labeled mattkur 2007_12_31 15:11 DISK_01(\\?\Volume{e028e25b-b000-11dc-8ea1-0011114b1b2e})
Version identifier: 01/01/2008-00:00
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State

 

Figure 4: The backup versions available to me at this time.  Note the version identifier, as this is the key string we need to identify this version in future commands.
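
If you have many backups, picking the right version identifier by eye gets tedious. The following is a minimal sketch that parses text shaped like the `wbadmin get versions` output above and returns the most recent backup taken before a given time. The function names, the abridged sample text, and the cutoff time are assumptions for illustration; the parser relies only on the `Backup time:`, `Version identifier:`, and `Can Recover:` labels seen in the output above.

```python
from datetime import datetime

# Abridged copy of the output shown above (backup target lines omitted).
SAMPLE_VERSIONS = r"""
Backup time: 12/31/2007 12:18 PM
Version identifier: 12/31/2007-20:18
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State

Backup time: 12/31/2007 3:30 PM
Version identifier: 12/31/2007-23:30
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State

Backup time: 12/31/2007 4:00 PM
Version identifier: 01/01/2008-00:00
Can Recover: Volume(s), File(s), Application(s), Bare Metal Recovery, System State
"""


def parse_versions(text):
    """Return one dict per backup, keyed by the labels in the output."""
    versions, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner lines and blanks carry no "label: value" pair
        key, value = key.strip().lower(), value.strip()
        if key == "backup time":
            # Assumes the US-style date format shown in the sample output.
            current = {"backup_time": datetime.strptime(value, "%m/%d/%Y %I:%M %p")}
            versions.append(current)
        elif key == "version identifier" and current is not None:
            current["version"] = value
        elif key == "can recover" and current is not None:
            current["can_recover"] = [item.strip() for item in value.split(",")]
    return versions


def latest_before(versions, cutoff, needs="Application(s)"):
    """Pick the newest backup at or before `cutoff` that can recover `needs`."""
    candidates = [v for v in versions
                  if v["backup_time"] <= cutoff and needs in v.get("can_recover", [])]
    return max(candidates, key=lambda v: v["backup_time"], default=None)


if __name__ == "__main__":
    change_made_at = datetime(2007, 12, 31, 16, 30)  # hypothetical time of the change
    chosen = latest_before(parse_versions(SAMPLE_VERSIONS), change_made_at)
    print(chosen["version"])
```

The value it returns is the string to pass as `-version:` in the commands that follow.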

 

The 4PM backup happened just before I made the change, so let’s use that one.  Take note of the string after “Version identifier: ”.  We pass this string to the following commands as the parameter “-version:XXXXXX” to refer to that specific backup.  To make sure that we can restore cluster data, let’s take a look at what was backed up.  For this, use the command “wbadmin get items”:

 

 

C:\Windows\system32>wbadmin get items -version:01/01/2008-00:00
wbadmin 1.0 - Backup command-line tool
(C) Copyright 2004 Microsoft Corp.

Volume Id = {33e841b0-affa-11dc-baba-806e6f6e6963}
Volume '<Unlabeled Volume>', mounted at D:

Volume Id = {33e841b2-affa-11dc-baba-806e6f6e6963}
Volume 'Longhorn', mounted at C:

Application = Cluster
Component = Cluster Database (\Cluster Database)

Application = Registry
Component = Registry (\Registry)

 

Figure 5: The items included in that backup.  Note the Cluster application - this is what we're looking for!
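
The same check can be automated before attempting a restore. This is a minimal sketch, assuming the `Application = ...` and `Component = ...` lines look like the ones in the output above; the function name and the embedded sample text are illustrative only.

```python
# Copy of the application lines from the output shown above.
SAMPLE_ITEMS = r"""
Volume Id = {33e841b2-affa-11dc-baba-806e6f6e6963}
Volume 'Longhorn', mounted at C:

Application = Cluster
Component = Cluster Database (\Cluster Database)

Application = Registry
Component = Registry (\Registry)
"""


def backed_up_applications(text):
    """Map each application name to the list of its components."""
    apps, current = {}, None
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # lines without "=" are not application/component entries
        key, value = key.strip(), value.strip()
        if key == "Application":
            current = value
            apps[current] = []
        elif key == "Component" and current is not None:
            apps[current].append(value)
    return apps


if __name__ == "__main__":
    apps = backed_up_applications(SAMPLE_ITEMS)
    if "Cluster" in apps:
        print("Cluster application found:", apps["Cluster"])
    else:
        print("This backup does not contain the Cluster application.")
```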

 

Now let’s restore the cluster.  We always advise that an administrator take all applications in the cluster offline prior to recovering the cluster configuration.  While we’re at the command line, enter “cluster group <group-name> /off” for each application (to see a list of the application names, just run the command “cluster group”).  This will take care of the applications.  To start the recovery, use the “wbadmin start recovery” command.  I specify that I want to perform an Application level recovery (-itemType:App) and restore the Cluster application (-items:Cluster).  Again, here is my output:

 

C:\Windows\system32>wbadmin start recovery -itemType:App -items:Cluster -version:01/01/2008-00:00
wbadmin 1.0 - Backup command-line tool
(C) Copyright 2004 Microsoft Corp.

You have chosen to restore the application Cluster.
The following components will be restored.
Component = Cluster Database (\Cluster Database)

WARNING:  This operation will perform an authoritative restore of your cluster. After restoring the cluster database, the Cluster service will be stopped and then started, which may take a few minutes. Please be patient.

Do you want to continue with an authoritative restore of your cluster?
[Y] Yes [N] No y

Preparing the component Cluster Database for restore.
Restoring the files for the component Cluster Database, copied (100%).
Restoring the component Cluster Database.
Restored the component Cluster Database successfully.

Recovery operation completed.

Log of files successfully restored
'C:\Windows\Logs\WindowsServerBackup\ApplicationRestore 31-12-2007 17-25-08.log'

Summary of recovery:
--------------------

Restored the component Cluster Database successfully.

NOTE: In order to COMPLETE the restoration of cluster associated with this node,
1.  the cluster service must be started on this node.
2.  After that, cluster service needs to be started on the nodes identified in the restored cluster database. To see the list of nodes, type the following command in a command window:
        Command:: cluster.exe node

 

Figure 6: I did it!  I restored my cluster configuration and this node was restarted.
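
The offline-then-recover sequence above can be sketched as a small helper that builds the command lines in order. It uses only the two commands from this walkthrough; the group names `FileServer1` and `FileServer2` are hypothetical stand-ins for whatever “cluster group” lists on your cluster, and the helper prints the commands rather than running them.

```python
def recovery_commands(groups, version):
    """Return the commands in the order used above: groups offline, then restore."""
    commands = [f'cluster group "{group}" /off' for group in groups]
    commands.append(
        f"wbadmin start recovery -itemType:App -items:Cluster -version:{version}"
    )
    return commands


if __name__ == "__main__":
    # Hypothetical group names; the version identifier is the one chosen earlier.
    for command in recovery_commands(["FileServer1", "FileServer2"], "01/01/2008-00:00"):
        print(command)
```

Keeping the restore as the last entry makes it harder to start the recovery while an application is still online.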

 

I fixed the problem!  The cluster is back in the same state as before I made the changes.  Reconnect the cluster administrator and take a look at our handiwork:

 

 

Figure 7: Uh-oh, one of our nodes is still down.

 

Take a look at the nodes and you’ll notice that one of them is down.  During the restore process, the cluster is taken completely offline.  The cluster configuration database is recovered from the backup store on one node (the node that we just ran our recovery from).  To ensure that this is the copy of the cluster configuration that the cluster uses, this node must be started first.  Wbadmin is kind enough to do this for us, but we need to start the Cluster service on the other nodes ourselves.  Do that and we’re 100% operational.
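
One way to start the remaining nodes from the restored node is the standard `sc` tool pointed at each remote machine. This is a minimal sketch, assuming the Cluster service is registered under the service name `clussvc` and that you have administrative rights on the other nodes; the node names `node1` and `node2` are hypothetical, and the helper prints the commands instead of running them.

```python
def start_remaining_nodes(nodes, restored_node):
    """Build one 'sc' command per node, skipping the node that ran the restore."""
    return [
        f"sc \\\\{node} start clussvc"
        for node in nodes
        if node.lower() != restored_node.lower()
    ]


if __name__ == "__main__":
    # Hypothetical node names; "cluster.exe node" lists the real ones.
    for command in start_remaining_nodes(["node1", "node2"], restored_node="node1"):
        print(command)
```

The restored node is skipped because its Cluster service has already been restarted by the recovery.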

 

-Matt Kurjanowicz