Understanding Quorum in a Failover Cluster

Hi Cluster Fans,

This blog post will clarify planning considerations around quorum in a Failover Cluster and answer some of the most common questions we hear. 

The quorum configuration in a failover cluster determines the number of failures that the cluster can sustain while still remaining online.  If an additional failure occurs beyond this threshold, the cluster will stop running.  A common perception is that the cluster stops running after too many failures in order to prevent the remaining nodes from taking on too many workloads and having the hosts be overcommitted.  In fact, the cluster does not know your capacity limitations or whether you would be willing to take a performance hit in order to keep it online.  Rather, quorum is designed to handle the scenario where there is a problem with communication between sets of cluster nodes, so that two servers do not try to simultaneously host a resource group and write to the same disk at the same time.  This is known as a “split brain”, and we want to prevent it to avoid any potential corruption to a disk by having two simultaneous group owners.  With this concept of quorum, the cluster forces the cluster service to stop in one of the subsets of nodes to ensure that there is only one true owner of a particular resource group.  Once the nodes which have been stopped can once again communicate with the main group of nodes, they will automatically rejoin the cluster and start their cluster service.

For more information about quorum in a cluster, visit: http://technet.microsoft.com/en-us/library/cc731739.aspx.
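
A quick way to see which nodes are currently participating in the cluster (that is, which ones still have a running cluster service) is the FailoverClusters PowerShell module.  This is a minimal sketch, assuming the Failover Clustering tools are installed and using a placeholder cluster name of Cluster1:

    # Load the Failover Clustering module (included with the Failover Clustering tools)
    Import-Module FailoverClusters

    # List each node and whether it is currently participating in the cluster (Up/Down/Paused)
    Get-ClusterNode -Cluster Cluster1 | Format-Table Name, State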

Voting Towards Quorum

Having ‘quorum’, or a majority of voters, is based on a voting algorithm in which more than half of the voters must be online and able to communicate with each other.  Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster knows how many "votes" constitute a majority of votes, or quorum.  If the number of voters drops below the majority, the cluster service will stop on the nodes in that group.  These nodes will still listen for the presence of other nodes, in case another node appears again on the network, but they will not begin to function as a cluster until quorum exists again.

It is important to realize that the cluster requires more than half of the total votes to achieve quorum.  This avoids having a ‘tie’ in the number of votes in a partition, since a majority always means that the other partition has less than half the votes.  In a 5-node cluster, 3 voters must be online; yet in a 4-node cluster, 3 voters must also be online to have a majority.  Because of this logic, it is recommended to always have an odd number of total voters in the cluster.  This does not necessarily mean an odd number of nodes is needed, since either a disk or a file share can also contribute a vote, depending on the quorum model.
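
To make the arithmetic concrete, the number of votes required for quorum is simply more than half of the total number of voters.  A minimal PowerShell sketch of that calculation (no cluster cmdlets needed):

    # Quorum requires more than half of the total votes, so 4 voters and 5 voters
    # both need 3 votes online -- hence the advice to keep the total number of voters odd.
    foreach ($totalVotes in 3..6) {
        $votesNeeded = [math]::Floor($totalVotes / 2) + 1
        "$totalVotes total votes -> $votesNeeded votes required for quorum"
    }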

A voter can be:

  • A node
    • 1 Vote
    • Every node in the cluster has 1 vote
  • A “Disk Witness” or “File Share Witness”
    • 1 Vote
    • Either 1 Disk Witness or 1 File Share Witness may have a vote in the cluster, but not multiple disks, multiple file shares nor any combination of the two 
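
To see how the votes are distributed in an existing cluster, the cluster PowerShell cmdlets can list the nodes and show which witness, if any, is configured.  A minimal sketch, assuming the FailoverClusters module is available (the NodeWeight property only appears on versions with the hotfix from KB 2494036 or on Windows Server 2012 and later):

    Import-Module FailoverClusters

    # Each node normally carries one vote; NodeWeight shows it where the property is supported
    Get-ClusterNode | Format-Table Name, State, NodeWeight

    # Shows the quorum type and which disk or file share (if any) acts as the witness
    Get-ClusterQuorum | Format-Table Cluster, QuorumType, QuorumResource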

Quorum Types

There are four quorum types.  This information is also available here: http://technet.microsoft.com/en-us/library/cc731739.aspx#BKMK_choices.

Node Majority

This is the easiest quorum type to understand and is recommended for clusters with an odd number of nodes (3 nodes, 5 nodes, etc.).  In this configuration, every node has 1 vote, so there is an odd number of total votes in the cluster.  If there is a partition between two subsets of nodes, the subset with more than half the nodes will maintain quorum.  For example, if a 5-node cluster partitions into a 3-node subset and a 2-node subset, the 3-node subset will stay online and the 2-node subset will go offline until it can reconnect with the other 3 nodes.
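
If the cluster did not default to this mode, it can be switched to Node Majority from PowerShell as well as from the wizard.  A minimal sketch using the quorum cmdlet, run on one of the cluster nodes:

    # Configure Node Majority (no witness) -- best suited to an odd number of nodes
    Set-ClusterQuorum -NodeMajority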

Node & Disk Majority

This quorum configuration is most commonly used since it works well with 2-node and 4-node clusters, which are the most common deployments.  This configuration is used when there is an even number of nodes in the cluster.  In this configuration, every node gets 1 vote, and additionally 1 disk gets 1 vote, so there is generally an odd number of total votes.

This disk is called the Disk Witness (sometimes referred to as the ‘quorum disk’) and is simply a small clustered disk which is in the Cluster Available Storage group.  This disk is highly available and can fail over between nodes.  It is considered part of the Cluster Core Resources group; however, it is generally hidden from view in Failover Cluster Manager since it does not need to be interacted with.

Since there are an even number of nodes and 1 additional Disk Witness vote, in total there will be an odd number of votes.  If there is a partition between two subsets of nodes, the subset with more than half the votes will maintain quorum.  For example, if a 4-node cluster with a Disk Witness partitions into a 2-node subset and another 2-node subset, one of those subsets will also own the Disk Witness, so it will have 3 total votes and will stay online.  The other 2-node subset will go offline until it can reconnect with the other 3 voters.  This means that the cluster can lose communication with any two voters, whether they are 2 nodes, or 1 node and the Disk Witness.
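
Switching to this mode from PowerShell is a one-liner; the disk name below ("Cluster Disk 1") is only a placeholder for whichever small clustered disk you want to use as the witness:

    # Configure Node and Disk Majority, using a small clustered disk as the Disk Witness
    Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"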

Node & File Share Majority

This quorum configuration is usually used in multi-site clusters.  This configuration is used when there is an even number of nodes in the cluster, so it can be used interchangeably with the Node and Disk Majority quorum mode.  In this configuration every node gets 1 vote, and additionally 1 remote file share gets 1 vote. 

This file share is called the File Share Witness (FSW) and is simply a file share on any server in the same AD forest which all the cluster nodes have access to.  One node in the cluster will place a lock on the file share to consider itself the ‘owner’ of that file share, and another node will grab the lock if the original owning node fails.  On a standalone server, the file share by itself is not highly available; however, the file share can also be placed on a clustered file share on an independent cluster, making the FSW clustered and giving it the ability to fail over between nodes.  It is important that you do not put this vote on a node in the same cluster, nor within a VM on the same cluster, because losing that node would cause you to lose the FSW vote, causing two votes to be lost on a single failure.  A single file server can host multiple FSWs for multiple clusters.

Generally multi-site clusters have two sites with an equal number of nodes at each site, giving an even number of nodes.  By adding this additional vote at a 3rd site, there is an odd number of votes in the cluster, at very little expense compared to deploying a 3rd site with an active cluster node and a writable DC.  This means that either site or the FSW can be lost and the cluster can still maintain quorum.  For example, in a multi-site cluster with 2 nodes at Site1, 2 nodes at Site2 and a FSW at Site3, there are 5 total votes.  If there is a partition between the sites, one of the nodes at a site will own the lock to the FSW, so that site will have 3 total votes and will stay online.  The 2-node site will go offline until it can reconnect with the other 3 voters.
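
This mode can also be configured with the quorum cmdlet; the UNC path below is only a placeholder for a share at the third site that every node can reach:

    # Configure Node and File Share Majority, pointing at the witness share at the 3rd site
    Set-ClusterQuorum -NodeAndFileShareMajority "\\FileServer1\ClusterFSW"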

Legacy: Disk Only

Important: This quorum type is not recommended as it has a single point of failure.

The Disk Only quorum type was available in Windows Server 2003 and has been maintained for compatibility reasons; however, it is strongly recommended to never use this mode unless directed by a storage vendor.  In this mode, only the Disk Witness contains a vote and there are no other voters in the cluster.  This means that if the disk becomes unavailable, the entire cluster will go offline, so this is considered a single point of failure.  However, some customers choose to deploy this configuration to get a “last man standing” configuration where the cluster remains online as long as any one node is still operational and can access the cluster disk.  However, with this deployment objective, it is important to consider whether that last remaining node can even handle the capacity of all the workloads that have moved to it from other nodes.
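
For completeness, the legacy mode can also be set from PowerShell, though as noted above it is not recommended; the disk name is again a placeholder:

    # Legacy Disk Only quorum -- the Disk Witness is a single point of failure,
    # so use this only if directed by your storage vendor
    Set-ClusterQuorum -DiskOnly "Cluster Disk 1"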

Default Quorum Selection

When the cluster is created using Failover Cluster Manager, Cluster.exe or PowerShell, the cluster will automatically select the best quorum type for you to simplify the deployment.  This choice is based on the number of nodes and available storage.  The logic is as follows:

  • Odd Number of Nodes – use Node Majority
  • Even Number of Nodes
    • Available Cluster Disks – use Node & Disk Majority
    • No Available Cluster Disk – use Node Majority

The cluster will never select Node and File Share Majority or Legacy: Disk Only.  The quorum type is still fully configurable by the admin if the default selections are not preferred.
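
To confirm which default was chosen, you can create the cluster and then query the quorum configuration.  A minimal sketch with placeholder cluster and node names (on networks without DHCP, New-Cluster will typically also need a -StaticAddress):

    # Create a cluster (placeholder names) and let it pick the default quorum type
    New-Cluster -Name Cluster1 -Node Node1, Node2, Node3

    # Check which quorum type was selected automatically
    Get-ClusterQuorum -Cluster Cluster1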

Changing Quorum Types

Changing the quorum type is easy through Failover Cluster Manager.  Right-click on the name of the cluster, select More Actions…, then select Configure Cluster Quorum Settings… to launch the Configure Cluster Quorum Wizard.  From the wizard it is possible to configure all 4 quorum types and to change the Disk Witness or File Share Witness.  The wizard will even tell you the number of failures that can be sustained based on your configuration.

For a step-by-step guide of configuring quorum, visit: http://technet.microsoft.com/en-us/library/cc733130.aspx.

Thanks!
Symon Perriman
Technical Evangelist
Private Cloud Technologies
Microsoft

Comments
  • Hi, can you shed some light on the new switch to start the cluster (/PQ) and the NodeWeight concept described in KB 2494036?

    Thanks

  • Hi, I would like to know how to migrate the Quorum disk.  Also, is downtime required for the migration?

  • Hi, can I configure 2 FSWs in a cluster (from a 3rd and a 4th location)?  This is not to increase the votes but just to have high availability at the share level: if the connection to Site3 is lost, the Site4 FSW provides a vote, and if the connection to Site4 is lost, the Site3 FSW will provide the vote.

  • You can only configure 1 FSW per Cluster.

    Please look at the 'Node & File Share Majority' above to understand how the Quorum Votes are calculated.

    If you have your Cluster nodes up and running and you lose connectivity to the File Share Witness (3rd site), then the cluster will continue to run, provided you have enough Cluster nodes up and running.

    Thanks,

    Amitabh

  • None of the MS documentation makes this clear to me.  If I have a two-node SQL cluster and it's set as Node & Disk Majority, then if the disk goes offline, the cluster stays up because the nodes are still running?  How is this supposed to work?  If the disk is offline then most likely your data drive is too, in which case SQL isn't going to run well.  Can somebody clear this up for me?

  • The quorum disk is a witness disk which holds an extra copy of the cluster database.  This helps the cluster's availability.  Normally this disk is not used for any other purpose.  It doesn't mean the other disks will be offline/failed if the quorum disk fails.  What/how many disks are in your cluster?

  • Excellent.  Made my understanding of quorums so simple.

  • Hi Symon,

    How do I re-create the 0.hive file in a Windows 2008 cluster?

    I have got a 3-node cluster, and I noticed my Q:\cluster doesn't have a 0.hive file.

    Also I don't see 0.hive getting backed up using the TSM backup client.

    Is there a command to re-create the 0.hive file in Q:\cluster in Windows 2008?

  • Two questions. I had two (Primary and Replica) Server 2012 R2 physical Hyper-V hosts running VM replication.  This was working perfectly... even did a planned and unplanned failover... awesome.  I then proceeded to install Failover Clustering, added the two hosts to a cluster, validated them and created the cluster, so far so good.  It was the end of the day so I stopped right there, basically at the end of the Create Cluster Wizard.  A day later I cannot see the cluster in Failover Cluster Manager; I try to connect to it and I get "The RPC Server is Unavailable - Exception from HRESULT: 0x800706BA".  Any ideas?  Also I can only Remote Desktop into one of the two VMs at a time, the other loses connectivity; is this to avoid the split brain problems?

  • Hi, I'm a noob with this quorum and clustering concepts.

    Our current setup is: we have 3 hosts (nodes) and a SAN in our cluster.  The setup will involve Hyper-V virtualization.  On the SAN, we defined one LUN as a Witness and another one for VM storage.  We configured the cluster to use Node Majority since we have an odd number of nodes.

    I just want to clarify the concept stated for Node Majority.

    1. Does it mean that Node Majority only counts the nodes' votes?

    2. Does Node Majority still involve the Witness Disk or not?

    3. If it does, how come the setup did not ask where our Witness Disk is, and what is its function in the setup?

      If not, where or how does it store the quorum configuration?

    Hoping for some enlightenment on this.

    Thanks in advance.

  • Is there any requirement for a FSW to reside on a Windows server?  Can I use a Netapp SMB share instead, as long as the permissions are correct?

  • Braden, we support FSW only on Windows Servers, i.e. that is the only scenario we test internally.  However, there are 3rd-party FSWs (including on NetApp SMB shares) that have been known to work; we just do not test them internally.

  • If you have a 4-node cluster with Node and Disk Majority, what happens if you lose two of your nodes simultaneously and one of them is the "owner" of the witness disk?  The actual storage underpinning the witness disk is still online, so it could theoretically change owner OK.  However, it seems to me there could potentially be a race condition as the witness disk tries to find a new owner in a cluster that would not otherwise have quorum?  Or does the "owner" not matter that much?

  • Doesn't matter... the surviving nodes will arbitrate and win the witness and the cluster will stay up.  All goodness

  • @Elden will those remaining nodes not already be in the process of shutting down because quorum has been lost? See what I mean about a race condition, or has this case specifically been designed for so they will wait to see if they can grab the witness disk before they decide to stop participating due to quorum issues?
