And if you thought the material portion of the TOC was getting expensive, wait until you try managing MOM.
MOM allows an engineer to separate the Agents into Processing Rule Groups (PRGs) based on many different criteria, such as Machine Name or a user-defined attribute (SQL Server, IIS Server, HP Server, OS Version, etc.). This makes collecting specific counters and events for a specific application type a breeze.
Each PRG can contain any number of Event Processing Rules, Alert Processing Rules or Performance Collection Rules. An Event Rule watches an event source (such as the Application Log or a WMI provider) for a particular event (based on Event ID, Event Source, etc.) and, when a match is found, processes that event (by launching an application, raising an Alert, executing a script, etc.). If an Alert is raised, an Alert Processing Rule can be defined to process it further. And of course you can collect any number of performance counters.
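To make the match-then-respond flow concrete, here is a minimal sketch in Python. It is not MOM's API or its rule format; the `Event`, `EventRule` and `process` names are hypothetical and exist only to illustrate how a rule with optional criteria might be evaluated against an incoming event.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    """A simplified event record (hypothetical, not MOM's schema)."""
    log: str          # e.g. "Application"
    source: str       # the event source name
    event_id: int
    message: str = ""


@dataclass
class EventRule:
    """A rule that matches events and names a response (hypothetical)."""
    name: str
    log: str
    source: Optional[str]               # None means "any source"
    event_id: Optional[int]             # None means "any Event ID"
    response: Callable[[Event], str]    # e.g. raise an alert, run a script
    enabled: bool = True

    def matches(self, event: Event) -> bool:
        # Disabled rules never match; the log must always agree.
        if not self.enabled or event.log != self.log:
            return False
        if self.source is not None and event.source != self.source:
            return False
        if self.event_id is not None and event.event_id != self.event_id:
            return False
        return True


def process(event: Event, rules: List[EventRule]) -> List[str]:
    """Run the response of every enabled rule that matches the event."""
    return [rule.response(event) for rule in rules if rule.matches(event)]
```

A rule group in this sketch is just a list of `EventRule` objects; passing an event through `process` returns one result per matching rule, in rule order.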
But be wary: the amount of collection you perform directly impacts the “cache” limitation of the Agent Managers. In our environment we have 1425 events and 180 counters (collecting every 45 seconds) configured over approximately 20 rule groups.
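To get a feel for the volume, a quick back-of-the-envelope calculation helps. The sketch below assumes, purely for illustration, a single agent that collects all 180 counters at the 45-second interval; in practice an agent only collects the counters of the groups it belongs to, so treat the result as an upper bound per agent.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def samples_per_day(counters: int, interval_seconds: int) -> int:
    """Samples collected per day, counting whole intervals only."""
    return counters * (SECONDS_PER_DAY // interval_seconds)


# 86,400 / 45 = 1,920 samples per counter per day
# 180 counters * 1,920 = 345,600 samples per day
print(samples_per_day(180, 45))
```

Multiply that by the number of agents reporting in and it becomes clear why the collection settings deserve attention.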
Now, some of these PRGs have a large number of rules in them. We have over 70 event rules for Compaq Insight Manager (CIM) and over 100 for SQL Server. Trouble is, if you want to change a number of rules globally, you can’t. The UI simply does not allow you to multi-select rules. Worse yet, the rule definition is stored in the database as a binary field, so you can’t even update the table directly to effect a mass change. Need to disable 60 of 70 events? Then you need to double-click each rule, uncheck the “Enabled” checkbox and click OK. Repeat 60 times.
Rules can also perform a number of actions. Say you have an IIS application that reads and writes a large XML file. The application is written so that an event is generated whenever there is a problem reading or writing this file. An engineer could create a MOM script (VBScript style) that deletes the original, faulty XML file and copies a master of that XML file from a remote location. Very cool stuff.
But what if it is done wrong and instead of deleting that one file it accidentally deletes all files? And what if the engineer puts that Processing Rule into an Uber group for ALL servers in the environment?
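As an illustration of how such a response could be written defensively, here is a minimal sketch in Python rather than in MOM's own scripting language. The function name `restore_from_master` and its parameters are hypothetical. The idea is to refuse anything that is not a single XML file in one expected folder, and to replace that file in a single step rather than deleting first and copying afterwards.

```python
import os
import shutil
from pathlib import Path


def restore_from_master(faulty: Path, master: Path, allowed_dir: Path) -> Path:
    """Replace one faulty XML file with a copy of the master (sketch).

    Assumptions: `allowed_dir` is the only folder the script may touch,
    and `master` is a readable copy of the known-good file.
    """
    faulty = faulty.resolve()
    allowed_dir = allowed_dir.resolve()

    # Guard 1: the target must sit directly inside the expected folder.
    if faulty.parent != allowed_dir:
        raise ValueError(f"{faulty} is outside {allowed_dir}")
    # Guard 2: the target must be a single XML file, never a directory.
    if faulty.suffix.lower() != ".xml" or faulty.is_dir():
        raise ValueError(f"{faulty} is not a single XML file")
    # Guard 3: never remove anything unless the master actually exists.
    if not master.is_file():
        raise FileNotFoundError(f"master copy not found: {master}")

    # Copy the master next to the target first, then swap it into place,
    # so the application never ends up with no file at all.
    staged = faulty.with_name(faulty.name + ".restoring")
    shutil.copy2(master, staged)
    os.replace(staged, faulty)
    return faulty
```

With guards like these, a mistyped path or a wildcard raises an error instead of emptying a folder, which matters a great deal once the rule is deployed to every server.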
So do you really want engineers making their own scripts and creating their own rules and groups? No. But you do want them to manage the alerts for their spheres of influence, and to do that you must give them access to the MOM UI. To some degree, UI access and management can be restricted through the security groups created by MOM, but we have found this method largely ineffectual.