Windows Server 2008 R2 Failover Clustering changes how groups are placed and how load is balanced across nodes, and adds new features in this area.  These changes make it easier to manage clustered Hyper-V virtual machines and other clustered roles so that they are hosted on the best nodes for high availability.  For virtual machines or cluster roles with lower priority, the administrator can in some cases keep them offline to reserve system resources for higher-priority virtual machines or roles.  This first post in the series looks at two changes that improve group distribution and resource load balancing among nodes, so that each node is more likely to host an equal number of groups after events such as a move, cluster start, or failover.

 

 

Group Load Balancing

When a group is moved to a node, it should ideally go to the node hosting the fewest groups, so that groups are distributed evenly among the nodes.  This basic load-balancing logic treats all groups equally and does not consider group state, the number of resources in each group, or other factors that affect the actual load or CPU resources a group will consume.

This change applies to an administrator-initiated untargeted group move, a node failure, and a cluster cold start.  In each of these cases, groups were previously moved to a random node.
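The placement rule can be illustrated with a short Python sketch. It is not the cluster service's actual implementation: the function name `pick_target_node`, the dictionary layout, and the tie-break by node name are all assumptions made for illustration.

```python
def pick_target_node(group_counts, exclude=None):
    """Return the candidate node that currently hosts the fewest groups.

    group_counts maps node name -> number of groups hosted (hypothetical layout).
    exclude is the node the group is leaving, if any.
    Ties are broken by node name here; the real tie-break is not documented
    in this post, so this choice is an assumption.
    """
    candidates = {n: c for n, c in group_counts.items() if n != exclude}
    if not candidates:
        raise ValueError("no candidate node available")
    # Every group counts as 1, regardless of its state or resource count.
    return min(candidates, key=lambda n: (candidates[n], n))


counts = {"Node1": 3, "Node2": 1, "Node3": 2}
print(pick_target_node(counts, exclude="Node1"))  # Node2
```

Note that the sketch counts groups only, which mirrors the point above: a node hosting one heavily loaded group still looks "emptier" than a node hosting two idle ones.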

Only a “best-effort” attempt is made to balance groups on nodes, and results are affected by cluster states, administrator policies or actions.  For example:

1.  Groups with preferred owners set will move to their preferred owners first.  If a number of groups have preferred owners and there are not enough remaining groups to balance group placement, groups will not be balanced.

2.  When a number of groups fail over or move with no other group policies set, they generally target the node with the fewest groups.  However, if the group counts of two nodes differ by more than the number of groups being moved, there are not enough groups in motion to even out the difference.  Except during failover or a user-initiated move, the cluster does not automatically move groups from nodes with a higher group count to nodes with a lower group count.

3.  Multiple users moving different groups simultaneously could cause these groups to move to the same node having the lowest group count, causing that node to host a larger group count than expected.
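The second example can be made concrete with a small simulation. This is a sketch under simplifying assumptions, not the cluster's own logic: groups are moved one at a time, no preferred owners or other policies are set, and the helper name `fail_over_groups` and the node names are hypothetical.

```python
def fail_over_groups(group_counts, failed_node):
    """Move every group off failed_node, one at a time, to the surviving
    node with the fewest groups. Returns the counts on the surviving nodes.

    Groups already hosted on the survivors are never moved, which matches
    the "no automatic rebalancing" behavior described above.
    """
    survivors = {n: c for n, c in group_counts.items() if n != failed_node}
    if not survivors:
        raise ValueError("no surviving node")
    for _ in range(group_counts[failed_node]):
        target = min(survivors, key=lambda n: (survivors[n], n))
        survivors[target] += 1
    return survivors


before = {"Node1": 10, "Node2": 2, "Node3": 3}
after = fail_over_groups(before, "Node3")
print(after)  # {'Node1': 10, 'Node2': 5}
```

Node1 and Node2 start 8 groups apart, but only 3 groups are moving, so the best possible outcome still leaves a gap of 5.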

 

System Center Virtual Machine Manager (SCVMM) 2008 and 2008 R2 also have advanced load-balancing functionality called “Intelligent Placement”.  For more information, please see http://blogs.technet.com/chengw/archive/2008/05/13/intelligent-placement-in-scvmm-2008.aspx.

 

 

Different Failover Sequence between Groups

In some cases, a group is moved according to its own failover list, which is a specific permutation of all nodes.  Each group’s failover list is determined when the group is created and does not change during the lifetime of the group.  Groups generally have different failover lists from each other, but when there are enough groups relative to the number of nodes in the cluster, some groups will share the same failover list.  This change is intended to reduce the likelihood of many groups failing over to the same node at once.

This change applies during node or resource failures, in cases where groups were previously moved to the next node by node ID.  Preferred owner nodes, if defined, always remain at the front of the failover list.  Nodes that are down, paused, or not part of the cluster are ignored.
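A rough Python sketch of a per-group failover list follows. How the cluster actually derives each group's permutation is not described here, so seeding a shuffle with the group name is purely an assumption, as are the function names and the wrap-around interpretation of "next node". The sketch also assumes every preferred owner is a member of `all_nodes`.

```python
import random


def build_failover_list(group_name, all_nodes, preferred_owners=()):
    """Preferred owners first (in the given order), then the remaining
    nodes in an order that is fixed for this group.

    Seeding the shuffle with the group name is an illustrative assumption;
    it only guarantees the order is stable for the lifetime of the group.
    """
    rest = sorted(n for n in all_nodes if n not in preferred_owners)
    random.Random(group_name).shuffle(rest)
    return list(preferred_owners) + rest


def next_failover_target(failover_list, current_node, available):
    """Walk the list starting after current_node, wrapping around, and
    return the first node in `available` (up, not paused, cluster member).
    Returns None if no other node qualifies."""
    if current_node in failover_list:
        start = failover_list.index(current_node) + 1
    else:
        start = 0
    for offset in range(len(failover_list)):
        node = failover_list[(start + offset) % len(failover_list)]
        if node != current_node and node in available:
            return node
    return None


nodes = ["N1", "N2", "N3", "N4"]
vm_a = build_failover_list("VM-A", nodes, preferred_owners=["N3"])
vm_b = build_failover_list("VM-B", nodes)
print(vm_a, vm_b)
print(next_failover_target(vm_a, "N3", available={"N1", "N2", "N4"}))
```

Because each group walks its own order, groups that lose the same host are less likely to all land on the same surviving node than they would be when every group simply moves to the next node ID.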

 

Assuming no other group or resource policies such as anti-affinity or possible owners are set (these automatically prevent a group from failing over to, or being hosted on, specific nodes), the table below gives an overview of the changes to existing behavior.  For more information about preferred owners, possible owners, and anti-affinity, see http://blogs.msdn.com/clustering/archive/2008/10/14/9000092.aspx.  The scenarios listed under Windows Server 2008 refer to the scenarios described in the following KB article: http://support.microsoft.com/kb/299631.

 

| Scenario/Behavior | Windows Server 2008: groups with preferred owners | Windows Server 2008: groups without preferred owners | Windows Server 2008 R2: groups with preferred owners | Windows Server 2008 R2: groups without preferred owners |
| --- | --- | --- | --- | --- |
| Node failure | Group is moved to the next node in its failover list, which includes its preferred owners first, then other nodes by ID (KB 299631, Scenario 1). | Group is moved to a random node (KB 299631, Scenario 2B). | Group is moved to the next node in its failover list, which first includes its preferred owners, then other nodes in a unique order for the group, as described in change #2 above. | Group is moved to a node with the fewest groups, as described in change #1 above. |
| Resource failure-triggered group failover | Group is moved to the next node by node ID (KB 299631, Scenario 2A). | Group is moved to the next node by node ID (KB 299631, Scenario 2A). | Group is moved to the next node in its unique failover list described in change #2 above. | Group is moved to the next node in its unique failover list described in change #2 above. |
| Administrator-initiated group move | Group is moved to the first available preferred owner; or if no preferred owner node is available, to a random node (KB 299631, Scenario 3). | Group is moved to a random node (KB 299631, Scenario 4). | Group is moved to the first available preferred owner; or if no preferred owner node is available, to a node with the fewest groups, as described in change #1 above. | Group is moved to a node with the fewest groups, as described in change #1 above. |
| Cluster cold start | Group is initially placed on a random node. | Group is initially placed on a random node. | Group is initially placed on a preferred or default owner that joins within “ClusterGroupWaitDelay” time, default 30 seconds. (See information below about default owners and ClusterGroupWaitDelay) | Group is initially placed on a node such that groups are as balanced between nodes as possible, as described in change #1 above. |
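For quick reference, the Windows Server 2008 R2 columns of the table above can be condensed into a small lookup function. This is only a restatement of the table in Python. The scenario strings, the `Rule` enum, and the function name are hypothetical, and the sketch assumes the resource-failure behavior is the same with or without preferred owners.

```python
from enum import Enum


class Rule(Enum):
    FAILOVER_LIST = "next node in the group's failover list (change #2)"
    FEWEST_GROUPS = "node with the fewest groups (change #1)"
    PREFERRED_OWNER = "first available preferred owner"
    PREFERRED_OR_DEFAULT_OWNER = (
        "preferred or default owner that joins within ClusterGroupWaitDelay"
    )


# Hypothetical scenario labels; they are not cluster API names.
SCENARIOS = {"node_failure", "resource_failure", "admin_move", "cold_start"}


def r2_placement_rule(scenario, has_preferred_owners,
                      preferred_owner_available=True):
    """Return the 2008 R2 placement rule for one table cell."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    if scenario == "resource_failure":
        return Rule.FAILOVER_LIST
    if not has_preferred_owners:
        return Rule.FEWEST_GROUPS
    if scenario == "node_failure":
        return Rule.FAILOVER_LIST
    if scenario == "admin_move":
        if preferred_owner_available:
            return Rule.PREFERRED_OWNER
        return Rule.FEWEST_GROUPS
    return Rule.PREFERRED_OR_DEFAULT_OWNER  # cold_start


print(r2_placement_rule("node_failure", has_preferred_owners=False).value)
```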

 

 

The second post in this series will look at some of the new resource group management features in Windows Server 2008 R2 Failover Clustering.

 

Thanks,

Howard Sun
Software Development Engineer in Test
Clustering & High-Availability
Microsoft