MS.COM Operations Tools Team WebLog

Hey - What does this button do?

My own curious nature

Without going too deep into the gory details: as I've mentioned before, my team writes tools to support the Operations team for Microsoft.com. My team and I have worked in this space for quite a while and have a pretty good idea of what our engineers need to run the site(s). And believe me, they are pretty vocal when they need something different. One of the things we know is that there are a lot of tools out there for managing parts of an enterprise operation. We also know that some of these tools are pretty good for the segment they are trying to address. The problem is...there are a lot of tools out there for managing parts...well, you get the idea. We are diligently working to bring all of these parts together and make some sense of it all.

Ultimately, my goal is to make what we are doing available to all of you, but I'm a realist. While I can bank on certain facts like, oh I don't know, all of our servers running Windows, that's not something all of you can guarantee. Likewise, even with the size of our enterprise, I can't say that we will run into every scenario you have in your environments. So my own curious nature is getting the better of me. In your operations teams, what are the issues that are the bane of your existence? Please limit your responses to technical challenges that you run into on a continual basis or that are significant enough to be really painful. While I respect all of your opinions about choice of OS, platform, database, etc., I'm not trying to start a debate. I'm really just trying to find out how we can help. Maybe we've already written something that can help. Or maybe we are missing something in our own operations that we don't know we're missing. We're by no means perfect, but we've really learned a lot. Now I want to learn more...from you.

Looking forward to hearing from you!

Scott

Published Tuesday, June 29, 2004 9:45 PM by mscomts

Comments

 

Phil Renouf said:

Well, one of the guys that I work with just came up and asked if I knew of any way to easily report on which MOM rules we are sending Alerts for. Basically we only send alerts for a subset of the rules that we collect (anything that will cause an outage, etc.), and we need a way to report on that easily (and regularly) without going through each rule and writing it down.

I haven't started looking into it yet. He and I are thinking this might be something we can pull out of the MOM database, but maybe we're missing a feature in MOM that does this easily. If not, it'd be nice to see this as a new feature, a small util, etc.

Just stumbled across this blog, I think this is great. Would love to see the other Ops groups blogging on here!

Phil
July 20, 2004 11:28 AM
 

Will said:

Hey Phil,

If I read this correctly it looks like you are asking for a list of Event Processing Rules that generate an alert as part of the response. I’m truly hoping this is not the case, but I suspect that it is.

Here’s the rub. The ProcessRule table in the MOM database contains a number of columns including two columns of type Image. These two image columns contain all of the Properties and Criteria for a rule including whether or not to generate an alert in response to an Event. I’m sure I don’t have to tell you how exceedingly annoying this is.

Storing the meat of a rule in these image columns prevents any form of mass change to a rule or set of rules, any processing of this data, and any reporting on it. The only alternative is to write a processor function that decompiles the data in these columns.

And of course, there is no way to get this data through the UI either. At least, none that I know of.

One of the things we have done in the past is to track the events that have already generated an alert. For us, we have found it more useful to know which Events actually generate alerts rather than which Events could generate alerts.

One of the common queries I use is:

SELECT DISTINCT Eventno, ProviderInstance, Source, IsAlerted
FROM EventReportView
WHERE IsAlerted = 1
AND Evtime > DATEADD(hour, -24, GETUTCDATE())

This gives me a list of all of the Events that generated an Alert within the last 24 hours.

Sorry I couldn’t give you better news.

Will
July 21, 2004 11:31 AM
 

Phil Renouf said:

Thanks for the info Will!

Unfortunately you were right, we need to report on which rules we've configured to generate an alert. It's a pain that you can't get at the information directly in SQL; going through each rule individually is a long, long process. Hopefully that's changed in the upcoming version. At least this explains why we couldn't find a way to get the data from SQL. I thought I might have been blind ;)

Thanks for the alternate suggestion, I'll have to ask if getting a report of events that actually did trigger an alert will be good enough. We've got to do this as a part of the Sarbanes-Oxley reporting requirements so I'm not sure if that will suffice. Hopefully!

Thanks again!
Phil
July 23, 2004 7:52 AM
 

Will said:

Unfortunately this situation has not changed in the next version. All changes to the database for MOM 2005 are oriented around performance and scale. However, MOM 2005 will be shipping a raft of scripts and tools that will make an administrator's life easier and will include facilities to export Event Processing Rules, and the properties of those rules, to a file.

I have made several unsuccessful attempts at deciphering the data in these image columns over the years and I’m fully convinced that it is only meaningful to the MOM Interface.

Here is what I can tell you in case you decide to invest the time into developing a solution:

The image column that contains the relevant data is the Properties column. In most cases a user-created (or custom, as it is sometimes called) Event Processing Rule will begin with ‘0xFFFFF’. For Event Processing Rules (custom or otherwise) that are configured to generate an alert (the checkbox is checked in the UI) you should see a ‘101’ toward the end of the string of glop. You can confirm this by manually checking the changes to the field. Pick an EPR, find it in the UI, and get the guid for the EPR from the ProcessRule table. Copy the contents of the Properties field for this EPR to Notepad or some other editor. Now change the state of the checkbox and apply the change. Re-query the table. Copy the contents of the Properties field to the same Notepad and manually compare the two strings.
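The manual comparison described above can be tedious on long strings, so here is a minimal Python sketch of that one step. It assumes you have pasted the two Properties values into text as hex strings; the function name `diff_positions` and the sample values are made up for illustration and are not real MOM data.

```python
def diff_positions(before, after):
    """Return (index, before_char, after_char) for every position that differs.

    If the strings have different lengths, one extra entry holds the
    leftover tail of each string, starting at the shorter length.
    """
    before = before.strip()
    after = after.strip()
    diffs = [
        (i, b, a)
        for i, (b, a) in enumerate(zip(before, after))
        if b != a
    ]
    if len(before) != len(after):
        shorter = min(len(before), len(after))
        diffs.append((shorter, before[shorter:], after[shorter:]))
    return diffs


# Hypothetical values: the same rule copied before and after toggling
# the "Generate Alert" checkbox. Real Properties values are far longer.
checked = "0xFFFFF0A1B2C3101"
unchecked = "0xFFFFF0A1B2C3001"

for index, old, new in diff_positions(checked, unchecked):
    print(f"offset {index}: {old!r} -> {new!r}")
```

Running it on your own before and after copies shows exactly which offsets changed, which is handy if the flag sits deeper inside the string than expected.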

In MOM 2000, I think the last 3 characters of an EPR are for alerting. In MOM 2005 there is a bunch of additional data and the flag is buried deeper inside the string.

What I think you will see is that the string where the “Generate Alert” check box was checked has the characters “101” toward the end, and the string where it was unchecked has “001”.

What you could do from here is craft a SQL Query to find only the records that contain the ‘101’ string. It’s not easy and it’s not pretty but you might make it work.
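If the Properties values are exported to text first, the same filter can be prototyped outside the database. The sketch below is only a crude heuristic built on the observation in this thread. It assumes rows arrive as (guid, hex string) pairs and that the flag sits within the last 32 characters; the window size, the function names, and the sample rows are all assumptions, so check them against a rule whose setting you know.

```python
def alert_flag(properties_hex, tail_length=32):
    """Guess the "Generate Alert" setting from the tail of a Properties string.

    Returns True for '101', False for '001', and None if neither marker
    appears in the tail. When both appear, the rightmost one wins.
    """
    tail = properties_hex.strip()[-tail_length:]
    pos_on = tail.rfind("101")
    pos_off = tail.rfind("001")
    if pos_on == -1 and pos_off == -1:
        return None
    return pos_on > pos_off


def rules_with_alerts(rows):
    """Return the guids of rows whose Properties tail looks like alerting is on."""
    return [guid for guid, properties in rows if alert_flag(properties) is True]


# Hypothetical export; real guids and Properties values will look different.
rows = [
    ("rule-a", "0xFFFFF0A1B2C3101"),
    ("rule-b", "0xFFFFF0A1B2C3001"),
    ("rule-c", "0xFFFFF0A1B2C3"),
]
print(rules_with_alerts(rows))
```

Rules that come back as None deserve a manual look in the UI, since the marker may sit outside the assumed window.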

Good luck,

Will
July 23, 2004 10:58 AM
 

Phil Renouf said:

Thanks again, that is a great amount of information to get started on sorting out a script or SQL query to make this work. I'll let you know how it goes.

Phil
July 23, 2004 11:20 AM
 


All opinions posted here are those of the author(s) and are in no way intended to represent the opinions of our employer. This is provided "AS IS" with no warranties, and confers no rights. Use of included code samples is subject to the terms specified in the Terms of Use.
