Problem:

Monitoring of clustered virtual machines (guests) is unreliable in Operations Manager 2007: some Virtual Machine instances are not monitored, for no apparent reason.

Scenario:

The setup is a simple "wolfpack" cluster (implemented through Microsoft Cluster Services, MSCS) with just a quorum cluster resource group and a clustered Virtual Server 2005 R2 (following this guide). Operations Manager 2007 is installed and the health service is pushed to every cluster node. Agents are also pushed to every virtual machine, so those computers become agent-managed computers. Finally, once OpsMgr has been installed and deployed successfully, the Server Virtualization Management Pack for Microsoft System Center Operations Manager 2007 is imported.

 

The following issues can be observed with this setup:

1.       The virtual machine (guest) instance, or the instance representing its Windows computer, is NEVER monitored when the virtual machine's cluster resource group is active on the same cluster node as the quorum resource group.

2.       The virtual machine (guest) instance, or the instance representing its Windows computer, is monitored fine when its corresponding cluster resource group is not active on the same cluster node as the quorum resource group.

3.       Everything works as expected when the Server Virtualization MP is not present!

 

Root cause:

 

Based on investigation, all instances that should be monitored by the health service running on the virtual machine (guest) agent are disabled. It appears that the Server Virtualization MP was not designed for cluster scenarios, mainly because it associates the Virtual Machine Host with a virtual computer (the cluster).

 

Possible solution:

 

Let’s quickly review some MP implementation details. The main discovery targets Microsoft.Windows.Server.Computer, and a cluster is discovered as an instance of Microsoft.Windows.Cluster.VirtualServer, a type that extends the Windows Server type. This means that the discovery running on the cluster nodes discovers VMHost instances while their virtual machines are active on a particular node, but it also means that the discovery runs against the quorum cluster resource group (the virtual computer).

 

As mentioned in the “Root cause” section, I believe this is where the MP author may have made a mistake in the health model: there is no need to associate the quorum with the virtual machine group, because the two can coexist independently. I base this on the fact that a virtual machine resource group stays online when the quorum moves to a different cluster node. It also remains fully functional and accessible through the VS Administration Website using the name of the physical computer (cluster node).

 

Translating this into a real-life example, with VM1 on node1 and VM2 plus the quorum on node2, the following appears to be discovered:

·         Instance of VM host for VM1 associated with physical computer 1.

·         Instance of VM host for VM2 associated with physical computer 2.

·         Instance of VM host for VM2 associated with the virtual computer (cluster = quorum).

 

Investigation showed that avoiding the association of the VM host with the virtual computer fixes the issue. This can be done with a DiscoveryPropertyOverride that disables VM host discovery in the context of Microsoft.Windows.Cluster.VirtualServer. (The solution was verified while investigating the issue.)

 

<DiscoveryPropertyOverride ID="VirtualServer.2005R2.Discovery.Override" Context="Cluster!Microsoft.Windows.Cluster.VirtualServer" Enforced="false" Discovery="VirtualServer!Microsoft.Virtualization.VirtualServer.2005R2.DiscoveryRule" Property="Enabled">

  <Value>false</Value>

</DiscoveryPropertyOverride>
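
For context, an override like the one above has to live inside a management pack that references both the cluster MP and the Virtual Server MP. The following is only a rough sketch of how such an unsealed override MP might be laid out, not the actual attached MP. The ID and name "Custom.VirtualServer.Cluster.Override" are hypothetical, the version numbers and public key token are assumptions that would need to match what is installed in your management group, and Microsoft.Windows.Cluster.Library is assumed to be the MP that defines Microsoft.Windows.Cluster.VirtualServer. The Virtual Server MP ID is inferred from the .mp file name used in the import steps.

```xml
<ManagementPack ContentReadable="true" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Manifest>
    <Identity>
      <!-- Hypothetical ID and version for the override MP -->
      <ID>Custom.VirtualServer.Cluster.Override</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>Custom.VirtualServer.Cluster.Override</Name>
    <References>
      <!-- Alias must match the "Cluster!" prefix used in the override.
           MP ID, version and token are assumptions; check your installed MPs. -->
      <Reference Alias="Cluster">
        <ID>Microsoft.Windows.Cluster.Library</ID>
        <Version>6.0.5000.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <!-- Alias must match the "VirtualServer!" prefix used in the override.
           The version is a placeholder; use the version you imported. -->
      <Reference Alias="VirtualServer">
        <ID>Microsoft.Virtualization.VirtualServer.2005R2</ID>
        <Version>1.0.0.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <Monitoring>
    <Overrides>
      <!-- The override from the text: disable VM host discovery
           when the target is a cluster virtual server -->
      <DiscoveryPropertyOverride ID="VirtualServer.2005R2.Discovery.Override" Context="Cluster!Microsoft.Windows.Cluster.VirtualServer" Enforced="false" Discovery="VirtualServer!Microsoft.Virtualization.VirtualServer.2005R2.DiscoveryRule" Property="Enabled">
        <Value>false</Value>
      </DiscoveryPropertyOverride>
    </Overrides>
  </Monitoring>
</ManagementPack>
```

The aliases in the References section are what the "Cluster!" and "VirtualServer!" prefixes in the override resolve against, so they have to stay in sync with the Context and Discovery attributes.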

 

Implementing workaround:

 

It really should be as simple as importing the attached MP. However, in my environment the following steps had to be taken:

 

1.       Open Operations Console

2.       Go to “Administration” -> “Management Packs”

3.       Select “Microsoft Virtualization Reports” -> right click -> delete

4.       Select “Microsoft Virtual Server 2005 R2” -> right click -> delete. If other dependent management packs exist, delete them as well (please be careful if such a dependency is the default MP, as different steps need to be taken first)

5.       Right click in the “Administration” pane -> Import management packs

6.       Add Microsoft.Virtualization.Reports.mp

7.       Add Microsoft.Virtualization.VirtualServer.2005R2.mp

8.       Add attached MP

 

When all management packs are imported at the same time, the override is applied properly. This allowed my test environment to work and monitor the virtual machines.

 

DISCLAIMER:

Please evaluate in your test environment first! As expected, this solution is provided AS-IS, with no warranties and confers no rights. Use is subject to the terms specified at Microsoft.