Tuesday, June 26, 2007 10:44 AM
Komal
AEM: Changing Threshold Values (via MP)
Various error groups and applications that report crashes over a period of time can be seen in the AEM views.
The state of each error group is controlled by three thresholds: error/crash/hit count, unique users affected, and unique computers affected. By default, the threshold value for each of these three parameters is 50. In other words, if any one of an error group's counts (error count, unique users affected, or unique computers affected) exceeds 50, the health of that error group goes red, which affects its availability aspect.
The state of each application is also controlled by three direct thresholds: error/crash/hit count, unique users affected, and unique computers affected. It is additionally affected by the state of any of the error groups it hosts. The default value for each of the three direct thresholds is 50 here too.
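The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not code from AEM or the sample MP: the class and function names are made up for the example, and the assumption that an application turns red as soon as any hosted error group is red is a simplification of "affected by the state of its error groups".

```python
from dataclasses import dataclass

DEFAULT_THRESHOLD = 50  # default stated in the post for all three counters


@dataclass
class Counts:
    hits: int = 0              # error/crash/hit count
    unique_users: int = 0      # unique users affected
    unique_computers: int = 0  # unique computers affected


@dataclass
class Thresholds:
    hits: int = DEFAULT_THRESHOLD
    unique_users: int = DEFAULT_THRESHOLD
    unique_computers: int = DEFAULT_THRESHOLD


def exceeds(counts: Counts, limits: Thresholds) -> bool:
    """True if any one of the three counters exceeds its threshold."""
    return (
        counts.hits > limits.hits
        or counts.unique_users > limits.unique_users
        or counts.unique_computers > limits.unique_computers
    )


def application_is_red(app_counts, app_limits, error_groups):
    """error_groups: iterable of (Counts, Thresholds) pairs hosted by the app.

    Assumption for this sketch: one red error group is enough to turn the
    application red.
    """
    return exceeds(app_counts, app_limits) or any(
        exceeds(counts, limits) for counts, limits in error_groups
    )
```

For example, `exceeds(Counts(hits=51), Thresholds())` is true with the defaults, while the same counts against `Thresholds(hits=100)` are not, which is the effect an override is meant to have.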
These basic threshold values can be changed by submitting an instance of Microsoft.SystemCenter.CM.AEM.MonitorOverride for each error group/application whose threshold values need to be changed. This class has four properties: ManagedEntityId and a new value for each of the three thresholds. The ManagedEntityId can be either the ID of a Watson bucket (application error group) or the ID of an application.
Attached is a sample/demo MP that shows how to change the threshold values. Let's analyze various parts of the MP.
<Discovery ID="ChangeMyAppThresholdValue" Target="AEMLib!Microsoft.SystemCenter.CM.AEM.CrashListener">
  <Category>Discovery</Category>
  <DiscoveryTypes>
    <DiscoveryClass TypeID="AEMLib!Microsoft.SystemCenter.CM.AEM.MonitorOverride" />
  </DiscoveryTypes>
  <DataSource ID="DS" TypeID="ApplicationThresholdDefiner">
    <!-- Properties for Application(s) whose threshold value needs to be changed -->
    <ApplicationName>= 'Foo.exe'</ApplicationName>
    <ApplicationVersion>like '%'</ApplicationVersion>
    <HitCountThresholdValue><!-- the new value -->100</HitCountThresholdValue>
    <UniqueUserThresholdValue><!-- the new value -->100</UniqueUserThresholdValue>
    <UniqueComputerThresholdValue><!-- the new value -->100</UniqueComputerThresholdValue>
  </DataSource>
</Discovery>
The above discovery tells us that the new threshold value for each of the three basic monitors is going to be 100 for all applications whose ApplicationName is Foo.exe, irrespective of their ApplicationVersion.
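The two criteria in the discovery look like comparison fragments (`= 'Foo.exe'` and `like '%'`). Assuming they end up inside the WHERE clause of the query the module runs (an assumption, the post does not show the query), the following Python sketch shows why `like '%'` selects every version. It uses an in-memory SQLite database and a hypothetical `app` table, not the OpsMgr schema.

```python
import sqlite3

# Hypothetical table standing in for whatever the sample MP actually queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app (id TEXT, name TEXT, version TEXT)")
conn.executemany(
    "INSERT INTO app VALUES (?, ?, ?)",
    [
        ("a1", "Foo.exe", "1.0.0.0"),
        ("a2", "Foo.exe", "2.1.0.0"),
        ("a3", "Bar.exe", "1.0.0.0"),
    ],
)


def matching_ids(name_criterion, version_criterion):
    """Return IDs of rows matching both comparison fragments.

    The fragments are spliced in verbatim, mirroring how the discovery
    configuration supplies whole comparisons. Only do this with trusted,
    author-supplied configuration, never with user input.
    """
    sql = (
        "SELECT id FROM app "
        f"WHERE name {name_criterion} AND version {version_criterion} "
        "ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql)]


# Same criteria as in the discovery: every version of Foo.exe matches.
print(matching_ids("= 'Foo.exe'", "like '%'"))
```

Narrowing the version criterion, for example to `like '1.%'`, would restrict the override to the matching versions only.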
Let's look at the composition of the module type ApplicationThresholdDefiner used above.
<Composition>
  <Node ID="DiscoveryMapper">
    <Node ID="Filter">
      <Node ID="OleDbProbe">
        <Node ID="Scheduler" />
      </Node>
    </Node>
  </Node>
</Composition>
The composition uses four modules, and data flows from the innermost node outward: Scheduler (System.Scheduler from the System.Library MP), followed by OleDbProbe, followed by Filter (System.ExpressionFilter from the System.Library MP), and finally DiscoveryMapper.
The Scheduler module is configured with a 15-minute interval in the MP, and this can be changed. What does this mean? If the system discovers new applications that meet the criteria defined by the user, a threshold override for them is submitted to the system within 15 minutes (worst case).
The OleDbProbe module retrieves the managed entity ID for one or more applications meeting the given criteria.
The Filter module ensures that there is at least one application known to OpsMgr via AEM that meets the given criteria.
The DiscoveryMapper module is responsible for submitting override data to the system in the form of instances of the class Microsoft.SystemCenter.CM.AEM.MonitorOverride, defined in the Microsoft.SystemCenter.ClientMonitoring.Library MP.
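The division of labor between the four modules can be mimicked with a small Python sketch. It is a conceptual model only: the functions are stand-ins for the OpsMgr modules, the input is a plain list rather than a database, and the dictionary keys simply reuse ManagedEntityId and the element names from the discovery configuration for illustration.

```python
def ole_db_probe(known_apps, matches):
    """Stand-in for the probe: IDs of applications meeting the criteria."""
    return [app["id"] for app in known_apps if matches(app)]


def expression_filter(entity_ids):
    """Stand-in for the filter: pass data on only if there is a match."""
    return entity_ids if entity_ids else None


def discovery_mapper(entity_ids, hits, users, computers):
    """Stand-in for the mapper: one override record per matched entity."""
    return [
        {
            "ManagedEntityId": entity_id,
            "HitCountThresholdValue": hits,
            "UniqueUserThresholdValue": users,
            "UniqueComputerThresholdValue": computers,
        }
        for entity_id in entity_ids
    ]


def run_once(known_apps, matches, hits=100, users=100, computers=100):
    """What one scheduler tick triggers: probe, then filter, then mapper."""
    ids = expression_filter(ole_db_probe(known_apps, matches))
    if ids is None:
        return []  # nothing matched, so no override is submitted
    return discovery_mapper(ids, hits, users, computers)


apps = [{"id": "a1", "name": "Foo.exe"}, {"id": "a2", "name": "Bar.exe"}]
print(run_once(apps, lambda app: app["name"] == "Foo.exe"))
```

In the real MP the scheduler repeats this chain on every interval, which is how applications discovered later pick up the override.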
A similar composition is provided in the attached MP to change threshold values for one or more error groups.
Note that the state of all error groups and applications is computed every 15 minutes. As a result, with the attached sample MP, it can take up to 30 minutes for override changes to be reflected in the OpsMgr console.
This maximum can be brought down to roughly 15 minutes by reducing the scheduler module's interval from 15 minutes to, say, 30 seconds. However, a shorter interval has an impact on performance, because it determines how frequently the OpsMgr DB is hit with the queries defined in the compositions in the sample MP.
Conversely, depending on user requirements, the interval can be increased, or the scheduler can be switched to a schedule-based configuration.
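The timing figures above follow from simple addition. As a rough sketch, assuming the worst case is one full scheduler interval followed by one full state computation interval, and ignoring processing time, the delay can be estimated as follows (the function name is made up for this example):

```python
STATE_COMPUTE_INTERVAL_S = 15 * 60  # state is recomputed every 15 minutes


def worst_case_delay_minutes(scheduler_interval_s,
                             state_interval_s=STATE_COMPUTE_INTERVAL_S):
    """Worst-case minutes before an override shows up in the console.

    Assumption: the override just misses a scheduler tick and then just
    misses a state computation, so the two intervals add up.
    """
    return (scheduler_interval_s + state_interval_s) / 60


print(worst_case_delay_minutes(15 * 60))  # sample MP as shipped: 30.0
print(worst_case_delay_minutes(30))       # 30-second scheduler: 15.5
```

The state computation interval is the floor here: however small the scheduler interval gets, the estimate does not drop below 15 minutes.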