Creating Rules and Monitors with a Schedule/Understanding XML internals!

The ability to build rules and monitors that have an associated schedule for operation was part of MOM 2005 and easy to implement.  As several other blog posts have pointed out, it is also possible to introduce such a schedule for rules and monitors in OpsMgr 2007. 

Rules - http://blogs.msdn.com/boris_yanushpolsky/archive/2008/09/19/configuring-rules-to-run-during-business-hours-only.aspx
Monitors - http://nocentdocent.wordpress.com/2009/01/20/running-a-monitor-during-business-hours/

The challenge with OpsMgr 2007 has been that it is not possible to introduce such a schedule directly on a rule or monitor in a sealed management pack – and there is no UI component in the OpsMgr console that exposes this functionality.  If there is a default rule or monitor where an operational schedule is desired the only option is to disable the rule or monitor and recreate it in a custom management pack with the desired schedule. 

Although there are other posts that address this topic none of them go in depth to explain the various elements that are part of a rule and monitor to create a schedule and none I’ve seen attempt to relate the authoring console back to the underlying XML.  A full understanding of how this works requires understanding both the authoring console components and the XML generated by the authoring console.  The goal of this post then will be to present the graphical means of creating a schedule while pulling in the XML to show how everything links together.  That said, let’s get started.

If you haven’t worked with the authoring console in R2 yet – get started.  It is a very flexible tool for authoring and I can think of very little need or argument that would justify continuing to author in the OpsMgr UI <OK, off soap box>…..

For our examples we will create 2 rules and 2 monitors. 

Rule 1 – Standard event rule with no schedule
Rule 2 – Standard event rule with a schedule
Monitor 1 – Standard event monitor with no schedule
Monitor 2 – Standard event monitor with a schedule

We will use standard event rules/monitors because they are common and easy to create.  The general principles presented here, however, apply to all of the various rule and monitor options. 

To get started, we launch the authoring console.

clip_image002
You will notice that there isn’t much that can be done with the authoring console until either a new management pack is created or an existing management pack is opened.  We will create a new one.  To do so, select File > New.  Supply an identity for the management pack, which will be its file name, select the option to create an empty management pack and click next.

clip_image004
Enter a display name for the management pack – this will be the name of the management pack as seen in the OpsMgr UI.  Once entered, select empty management pack and then select create.  This will create the new management pack and make it ready for editing. 

clip_image006
Creating the management pack should have placed the focus of the authoring console on the ‘Health Model’ node.  If it didn’t, select that now.  Before continuing let’s make one adjustment to the authoring console that will help us keep track of management pack versions.  From the menu select Tools > Options. This will bring up the screen below.  Select to ‘auto-increment’ the management pack version.  When importing management packs with changes you should ensure the version number reflects that there is a revision, but it’s easy to forget to do so.  Setting this option in the authoring console ensures you don’t forget!

clip_image008
OK, on to authoring.  We will start with rules.  In the Health Model pane click on rules and then in the center rules pane, right-click and select New > Alerting > Windows Event

clip_image010
On the general tab of the Windows Event wizard that comes up, supply the element ID – which is the internal ID of the rule, the Display Name – which will be displayed in the OpsMgr UI, the target and the category.  Click next to proceed.

For this example I’ve chosen the Windows Server Operating System class as my target.  This class may not always be appropriate – make sure you understand how to properly target.  I have written an article in Technet magazine that describes targeting – available here – and have also written a chapter on targeting in the upcoming OpsMgr R2 Unleashed ebook.

clip_image012
On the Event Log Type page of the wizard, choose which event log contains the event of interest.  For our example we will use the application log – select it and click next.

clip_image014
On the Build Event Expression page of the wizard we configure the event of interest.  The default options are to choose an Event ID and also an Event Source.  For our example all we need is the Event ID so delete the line for Event Source, enter 1000 for the Event ID and click next.

clip_image016
The final page of the wizard allows configuration of alerts.  Input an alert name and choose finish.
clip_image018
With this complete we now have a simple event rule that will operate 24/7 to scan for an event 1000 on all Windows servers. 

Next we need to create our rule that will be modified to run on a schedule.  To do so, work back through the steps above but this time use different values as follows:

Element ID:  Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Rule.with.Alert.WITH.Schedule
Display Name:  Standard Event Rule with Alert – WITH schedule
Target:  Microsoft.Windows.Server.OperatingSystem
Category:  Alert
Log Name:  Application
Event ID:  1001
Alert Name:  Simple Alerting Event Rule - With Schedule

After completing your second rule the Authoring Console should appear as follows:
clip_image020
Now, let’s modify the second rule to only be active during certain times of the day.  To do so, select the second rule, right-click and select properties and then select the module tab.

In case it’s not obvious from the screen shot – note that there are no tabs to allow editing the event log, event ID or alert properties.  What the wizard did was create MODULES and plug them into the rule where required.  The section of the wizard where we configured to use the application log and to look for event 1001 (remember, we are looking at the properties of rule 2) actually created a data source (DS) module leveraging the built in Microsoft.Windows.EventProvider component.  There are lots of other Data Sources but the one selected is the one needed to pull information from the event log.  If you were to edit the Data Sources section you could edit these settings.  Further, creating the alert in the wizard caused an Actions module to get created and plugged in as shown leveraging the built in System.Health.GenerateAlert component. 

clip_image022
Notably vacant is the condition detection section.  This is the section that allows us to leverage the schedule based component of OpsMgr.  Let’s create it.  In the condition detection section select Create.  This action pulls up a list of the condition detection components available.  There is a long list and all of these should be self-explanatory.

clip_image024
The one we are interested in is the scheduler so put in schedule in the look for box, select the only entry that remains in the list and in the Module ID section label this as CD (short for condition detection – you can use whatever you want in this box but CD is common). Select Ok when finished.

clip_image026
With this complete our modules screen now has a condition detection.  Now we have to edit our condition detection to specify its settings.  Click on edit under the condition detection section.
clip_image028

The edit button brings up the configuration screen as shown.  From here, select configure.

clip_image030
The configure page is where the schedule options are set.  The schedule shown means that this event rule will only be PROCESSED from 6:00 AM to 11:00 PM on Mondays, Wednesdays and Fridays. 

I capitalize PROCESSED for a reason.  When we create a schedule it’s natural to think that the rule or monitor is being turned off outside of these time frames.  This is not correct.  What we are creating is a condition detection module which means that the event will still be detected 24/7 on the agent and sent to the condition detection module.  Based on our settings and assuming  the event is picked up outside our schedule, the module will return that the data should not be handled and discard it – but the workflow will still fire up initially!  A minor distinction but one worth understanding.
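To make that distinction concrete, here is a minimal Python sketch of the filtering behavior just described. This is not how OpsMgr implements its scheduler filter; the `Window` type, the `should_process` function and the sample timestamps are all made up for illustration, and treating the end time as exclusive is my assumption. The only point is that every event reaches the filter, and the filter then keeps it or drops it.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Window:
    """Hypothetical stand-in for one schedule window (not an OpsMgr type)."""
    start: time
    end: time
    weekdays: frozenset  # Python weekday numbers: Monday=0 ... Sunday=6


def should_process(event_time: datetime, window: Window) -> bool:
    """Return True when the event falls inside the window and should be passed on."""
    in_day = event_time.weekday() in window.weekdays
    # Assumption: the end of the window is treated as exclusive.
    in_hours = window.start <= event_time.time() < window.end
    return in_day and in_hours


# Monday, Wednesday, Friday from 06:00 to 23:00, as in the example schedule.
business = Window(time(6, 0), time(23, 0), frozenset({0, 2, 4}))

# The data source hands every event to the filter; the filter decides.
monday_noon = datetime(2009, 6, 1, 12, 0)   # a Monday
tuesday_noon = datetime(2009, 6, 2, 12, 0)  # a Tuesday
print(should_process(monday_noon, business))   # inside the window: passed on
print(should_process(tuesday_noon, business))  # outside the window: discarded
```

Notice that `should_process` is called for both events. Nothing is switched off outside the window; the data is simply dropped at the filter.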

Note also that the wizard allows for choosing times that the rule should be processed as well as when it should not.

clip_image032

There is one more thing we need to do for our scheduled rule in order for it to work.  Boris points this out on his blog which is where I discovered it!  There is a bug in the authoring console where a needed snip of XML is not added.  Back at the main schedulefilter window, select edit to pull up the XML section we need to modify.

clip_image030
After setting the schedule and clicking edit, the XML section that shows up for editing will be as follows. 

<Configuration p1:noNamespaceSchemaLocation="C:\Documents and Settings\Administrator.STARTREKNG\Local Settings\Temp\1\CD - System.SchedulerFilter.xsd" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance">
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <Schedule>
      <WeeklySchedule>
        <Windows>
          <Daily>
            <Start>06:00</Start>
            <End>23:00</End>
            <DaysOfWeekMask>42</DaysOfWeekMask>
          </Daily>
        </Windows>
      </WeeklySchedule>
      <ExcludeDates></ExcludeDates>
    </Schedule>
    <TimeXPathQuery>TimeXPathQuery</TimeXPathQuery>
  </SchedulerFilter>
</Configuration>

All that is needed is a simple change: replace TimeXPathQuery with UseCurrentTime as shown.  Note that XML edited for use in the authoring console is case sensitive.

<Configuration p1:noNamespaceSchemaLocation="C:\Documents and Settings\Administrator.STARTREKNG\Local Settings\Temp\1\CD - System.SchedulerFilter.xsd" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance">
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <Schedule>
      <WeeklySchedule>
        <Windows>
          <Daily>
            <Start>06:00</Start>
            <End>23:00</End>
            <DaysOfWeekMask>42</DaysOfWeekMask>
          </Daily>
        </Windows>
      </WeeklySchedule>
      <ExcludeDates></ExcludeDates>
    </Schedule>
    <UseCurrentTime>true</UseCurrentTime>
  </SchedulerFilter>
</Configuration>

You only need to change this section for the RULE – not the monitor!
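If you have several scheduled rules to fix, the same edit can be scripted. The following is a rough Python sketch, assuming you have the configuration block as a string; the `use_current_time` function name is mine, and the sample is a trimmed copy of the generated XML with the schema attributes left out to keep it short.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the generated configuration (schema attributes omitted).
GENERATED = """<Configuration>
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <TimeXPathQuery>TimeXPathQuery</TimeXPathQuery>
  </SchedulerFilter>
</Configuration>"""


def use_current_time(config_xml: str) -> str:
    """Swap the placeholder TimeXPathQuery element for UseCurrentTime=true."""
    root = ET.fromstring(config_xml)
    # Collect matches first so renaming does not disturb the iteration.
    for element in list(root.iter("TimeXPathQuery")):
        element.tag = "UseCurrentTime"  # element names are case sensitive
        element.text = "true"
    return ET.tostring(root, encoding="unicode")


patched = use_current_time(GENERATED)
print(patched)
```

Renaming the element in place keeps it in the same position inside SchedulerFilter, which matches the hand edit shown above.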

Save the XML section – if there are errors you should get a warning. 
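One value in the XML above deserves a closer look: DaysOfWeekMask.  It is a bitmask with one bit per day, and the value 42 is how Monday, Wednesday and Friday end up encoded.  The small Python sketch below decodes and encodes the mask; the bit values are the commonly documented ones (Sunday as the lowest bit) and they agree with the 42 generated for our schedule, but the helper names are mine.

```python
# Bit values for DaysOfWeekMask: Sunday is the lowest bit.
DAY_BITS = {
    "Sunday": 1,
    "Monday": 2,
    "Tuesday": 4,
    "Wednesday": 8,
    "Thursday": 16,
    "Friday": 32,
    "Saturday": 64,
}


def decode_mask(mask: int) -> list:
    """Return the day names whose bits are set in the mask."""
    return [day for day, bit in DAY_BITS.items() if mask & bit]


def encode_days(days) -> int:
    """Combine day names into a single mask value."""
    mask = 0
    for day in days:
        mask |= DAY_BITS[day]
    return mask


print(decode_mask(42))  # ['Monday', 'Wednesday', 'Friday']
print(encode_days(["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]))  # 62
```

This is handy when you want to adjust the days directly in XML rather than going back through the configure dialog.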

With our edits complete, select OK back to the main authoring console screen and save the current version of our management pack.  Now that we’ve built the schedule in the UI it’s time to look at the resulting XML.  It’s not that bad so don’t be nervous!

With the management pack saved, open it in your favorite XML editor – I use Notepad++.  This is a simple management pack – just two rules so far.  When you open the XML it will likely not be collapsed – to me it always helps to start with the XML collapsed to its major sections and start to drill in from there.  Depending on the editor, you may not be able to do this.  The collapsed view in Notepad++ makes our ‘scary XML’ management pack look quite simple – and it really is.

clip_image034

The two rules we are interested in are in the Monitoring section.  Let’s expand just the monitoring section and take a look at our two rules one at a time.  The first thing we need to understand is that every rule and monitor MUST have a data source.  It may not be obvious in the case of monitors, but there is always a datasource.

The XML below is for rule 1.  It is a bit busy at first glance but the goal of this diagram is to demonstrate how the UI elements (both OpsMgr UI and Authoring Console UI) are represented in XML.  I also include the expanded DisplayStrings section of the XML.  If searching the XML for the name of a UI element (rule/monitor/view/task/etc) you likely will first find a hit in the display strings section and will need to track back from here.  Each UI element is represented by a unique ID that does not appear in the console.  The DisplayStrings section ties together the console name and the element ID name.  Once you discover the element ID name, use it for further searching and you will find the UI element of interest.
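The "search the display strings, then track back" step can also be done with a few lines of Python.  The sketch below assumes the usual unsealed management pack layout, where each DisplayString carries an ElementID attribute and a Name child; the sample XML is heavily trimmed and the element ID in it is shortened for readability, so treat both as illustrative.

```python
import xml.etree.ElementTree as ET

# Heavily trimmed management pack: only the display strings are kept.
SAMPLE = """<ManagementPack>
  <LanguagePacks>
    <LanguagePack ID="ENU">
      <DisplayStrings>
        <DisplayString ElementID="Custom.Example.Rule.WITH.Schedule">
          <Name>Standard Event Rule with Alert - WITH schedule</Name>
        </DisplayString>
        <DisplayString ElementID="Custom.Example.Rule.NO.Schedule">
          <Name>Standard Event Rule with Alert - NO schedule</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>"""


def element_ids_for(display_name: str, mp_xml: str) -> list:
    """Return the element IDs whose display name matches exactly."""
    root = ET.fromstring(mp_xml)
    return [
        ds.get("ElementID")
        for ds in root.iter("DisplayString")
        if (ds.findtext("Name") or "").strip() == display_name
    ]


print(element_ids_for("Standard Event Rule with Alert - WITH schedule", SAMPLE))
```

Once you have the element ID, search the rest of the XML for it to land on the rule, monitor, view or task definition itself.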

clip_image036
In the example above the initial line of the Rule definition runs off the screen as indicated by the line trying to correlate the UI target element to where target is shown in XML.  The full line is below for reference.

<Rule ID="Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Rule.with.Alert.NO.Schedule" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">

OK, so hopefully that helps explain how the XML fits in to the picture – but that was just rule 1.  We haven’t seen the XML for rule 2 yet – our scheduled rule.  The only thing different about rule 2 is that it adds a condition detection section.  The XML is below – since we have so many elements in common between rule 1 and 2 I’ve avoided commenting the common elements again and instead highlight only the schedule specific elements.

image
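To see the difference between the two rules without reading the XML line by line, the sketch below lists the modules each rule plugs in.  The sample is a simplified version of the two rules (configuration blocks removed, rule IDs shortened), and the module ID "Alert" is my placeholder; the module type names are the ones discussed above.

```python
import xml.etree.ElementTree as ET

# Simplified rules: configuration omitted, IDs shortened for readability.
SAMPLE = """<Rules>
  <Rule ID="Example.Rule.NO.Schedule">
    <DataSources>
      <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider"/>
    </DataSources>
    <WriteActions>
      <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert"/>
    </WriteActions>
  </Rule>
  <Rule ID="Example.Rule.WITH.Schedule">
    <DataSources>
      <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider"/>
    </DataSources>
    <ConditionDetection ID="CD" TypeID="System!System.SchedulerFilter"/>
    <WriteActions>
      <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert"/>
    </WriteActions>
  </Rule>
</Rules>"""


def module_summary(rules_xml: str) -> dict:
    """Map each rule ID to the (module kind, TypeID) pairs it contains."""
    summary = {}
    for rule in ET.fromstring(rules_xml).iter("Rule"):
        # Anything inside the rule that carries a TypeID is a module.
        summary[rule.get("ID")] = [
            (module.tag, module.get("TypeID"))
            for module in rule.iter()
            if module.get("TypeID")
        ]
    return summary


for rule_id, modules in module_summary(SAMPLE).items():
    print(rule_id, modules)
```

The scheduled rule shows one extra entry, the ConditionDetection module, which is exactly the piece we added in the authoring console.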
So completes our example with rules – hopefully you have a good understanding about the structure of the XML from this example.  On to monitors.  Building the monitors is also easy but there are more complexities both in the UI and in the XML.  Like our rule example, let’s start by building a simple event monitor in the UI that operates 24/7.  In the authoring console, select the Health Model node, select monitors and in the center monitors section, right-click and select new > Windows Events > Simple > Event Reset as shown

clip_image040
On the general tab enter values for the monitor element ID, the display name, choose a target – Windows Server Operating System in this example, choose System.Health.AvailabilityState as the parent monitor and then choose a category for the monitor – Availability Health in this example, and click next.

clip_image042
On the next two screens, configure an event log and event for our unhealthy event – in this case the application log and event 1003 and click next.

clip_image044

clip_image046
On the next two screens, configure the event log and event that will trigger state back to healthy.  In this case, the application log and event 1004 and click finish.
clip_image048

clip_image050
We stated that this should be an alerting monitor – and there was no UI to configure alerting so once the monitor is saved go into properties of the monitor, select the alerting tab and configure the monitor to generate alerts and to generate an alert when the monitor is in the warning health state.  Once complete, select OK to close the monitor.

clip_image052

With this complete we have a simple event monitor that will operate 24/7 and detect an event 1003 (Unhealthy) and 1004 (Healthy), adjusting state accordingly.

Next we need to create our monitor that will be modified to run on a schedule.  To do so, work back through the steps above but this time use different values as follows:

Element ID:  Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Monitor.with.Alert.WITH.Schedule
Display Name:  Standard Event Monitor with Alert WITH Schedule
Target:  Microsoft.Windows.Server.OperatingSystem
Parent Monitor:  System.Health.AvailabilityState
Category:  AvailabilityHealth
Unhealthy Event Log:  Application
Unhealthy Event:  1005
Healthy Event Log:  Application
Healthy Event:  1006

Once complete, remember to go back to properties and configure the new scheduled monitor for alerting on warning state.  Once the two monitors are built the authoring console UI should appear as follows:

clip_image054

OK – now let’s add our schedule to the second monitor – are you excited?  :)  Remember earlier that I mentioned that every rule and monitor has a data source and that the data source can vary depending on where we are obtaining data – it can even be custom.  For monitors the data source is contained in a section called the monitortype.  So, in order to introduce a schedule (and other modifications – but that’s beyond the scope of this blog entry) we are REALLY interested in the monitor type because that module is where we introduce our customizations. 

The first thing you need to know is the name of the monitor type that your monitor is referencing.  This is found on the definition line for the monitor and generally refers back to a system management pack.  We will look at the full XML in a minute but for now I’ve copied the monitor definition line.  We can tell from the TypeID attribute that the monitortype is Microsoft.Windows.2SingleEventLog2StateMonitorType.  The Windows! prefix on the TypeID (the part originally highlighted in red) tells us that this monitortype is defined in the Microsoft.Windows.Library system library.  That prefix is an alias that is defined in the references section of the XML.  I’ll show you in the XML shortly how all of this links up.

<UnitMonitor ID="Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Monitor.with.Alert.WITH.Schedule" Accessibility="Internal" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Windows!Microsoft.Windows.2SingleEventLog2StateMonitorType" ConfirmDelivery="true">

Our event monitor works because it references the stated monitortype in the sealed system management pack.  As mentioned, the monitortype section is the place where we want to introduce our customizations – but since it’s in a sealed management pack we can’t adjust it there (we wouldn’t want to anyway).  The solution is to find this monitortype in the system management pack and copy it to our management pack and make an adjustment to use the local copy rather than the system copy.
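Resolving an alias by hand means looking it up in the references section of the management pack.  Here is a small Python sketch of that lookup; the manifest is trimmed (real Reference entries also carry a version and a public key token, which I’ve left out), and the `resolve` function name is mine.

```python
import xml.etree.ElementTree as ET

# Trimmed manifest: real Reference entries carry more child elements.
MANIFEST = """<Manifest>
  <References>
    <Reference Alias="Windows"><ID>Microsoft.Windows.Library</ID></Reference>
    <Reference Alias="Health"><ID>System.Health.Library</ID></Reference>
    <Reference Alias="System"><ID>System.Library</ID></Reference>
  </References>
</Manifest>"""


def resolve(qualified_id: str, manifest_xml: str):
    """Split 'Alias!Element.ID' and look the alias up in the references."""
    alias, separator, element = qualified_id.partition("!")
    if not separator:
        # No alias: the element is expected in the same management pack.
        return None, qualified_id
    references = {
        ref.get("Alias"): ref.findtext("ID")
        for ref in ET.fromstring(manifest_xml).iter("Reference")
    }
    return references[alias], element


print(resolve("Windows!Microsoft.Windows.2SingleEventLog2StateMonitorType", MANIFEST))
print(resolve("Microsoft.Windows.BaseEventProvider", MANIFEST))
```

The second call shows what an ID without an alias means: there is nothing to look up, so the definition has to live in the management pack you are already in.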

The needed monitor type is in the microsoft.windows.library management pack (we know this because of our references) – open that system management pack, which will require you to convert it to readable XML (Kevin has a good blog on how to export management packs here).  The Monitor Type of interest, Microsoft.Windows.2SingleEventLog2StateMonitorType, is shown from the microsoft.windows.library management pack.

clip_image056

All we need to do is paste this section inside our current management pack as shown.  Note that only enough of the management pack XML is shown to demonstrate where to place the section correctly.

clip_image058

If you jump ahead of me and try to save and import the management pack at this stage, it will fail!

Note that there are two more curve balls we have to deal with as shown in the XML.  These are dependencies for this monitortype – one is internal to the microsoft.windows.library management pack (note no alias) and the other is in the system (system.library) management pack. Dealing with these issues is very easy.  Let’s take them one at a time. 

Microsoft.Windows.BaseEventProvider – note that there is no alias on these two entries.  Why?  Because the expectation is that the definition for this provider is in the same management pack as was the monitortype we copied.  Remember that the only component we are interested in is the monitortype module so it’s completely fair to simply make an edit to add an alias to refer back to the microsoft.windows.library management pack. 

System!System.ExpressionFilter – In truth these entries wouldn’t cause an error.  Why?  Note that there is already an alias (system) defined on this entry that indicates these filters are in the system.library management pack.   Since the alias is already defined in our management pack it should work fine.
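The edit for the first case is mechanical: any TypeID in the copied section that has no alias gets the alias of the management pack it came from.  A rough Python sketch follows; the fragment is a cut-down stand-in for the copied member modules, and `qualify_type_ids` is a name I made up.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the member modules of the copied monitor type.
COPIED = """<MemberModules>
  <DataSource ID="FirstDataSource" TypeID="Microsoft.Windows.BaseEventProvider"/>
  <ConditionDetection ID="FirstFilterCondition" TypeID="System!System.ExpressionFilter"/>
</MemberModules>"""


def qualify_type_ids(fragment_xml: str, alias: str) -> ET.Element:
    """Prefix every TypeID that has no alias with the given alias."""
    root = ET.fromstring(fragment_xml)
    for element in root.iter():
        type_id = element.get("TypeID")
        if type_id and "!" not in type_id:
            element.set("TypeID", f"{alias}!{type_id}")
    return root


fixed = qualify_type_ids(COPIED, "Windows")
print(ET.tostring(fixed, encoding="unicode"))
```

The entry that already had the System alias is left alone, which is the behavior we want given that the alias is already defined in our management pack.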

clip_image060

I’ve made the required changes to the XML below (circled in red) and now, if you try, the management pack will import – but we aren’t done with it yet!

clip_image062

OK, with all of the changes made there is one VERY IMPORTANT change we haven’t made yet.  We have to configure our scheduled monitor to use the monitor type we copied into our management pack.  To set this monitortype apart we will rename it and we will also reconfigure the unit monitor to use it.  The changes required are below.

clip_image064
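In script form the two changes look like this.  The sketch is only illustrative: the sample XML is flattened and trimmed, the monitor ID is shortened, and the new monitor type name is a hypothetical one I picked for the example, not the name used in the attached management pack.

```python
import xml.etree.ElementTree as ET

# Flattened, trimmed sample; a real management pack nests these elements deeper.
SAMPLE = """<ManagementPack>
  <UnitMonitorType ID="Microsoft.Windows.2SingleEventLog2StateMonitorType"/>
  <UnitMonitor ID="Example.Monitor.WITH.Schedule"
               TypeID="Windows!Microsoft.Windows.2SingleEventLog2StateMonitorType"/>
</ManagementPack>"""


def use_local_type(mp_xml: str, monitor_id: str, old_type: str, new_type: str) -> ET.Element:
    """Rename the copied monitor type and point one unit monitor at it."""
    root = ET.fromstring(mp_xml)
    for monitor_type in root.iter("UnitMonitorType"):
        if monitor_type.get("ID") == old_type:
            monitor_type.set("ID", new_type)
    for monitor in root.iter("UnitMonitor"):
        if monitor.get("ID") == monitor_id:
            # No alias: the type now lives in this management pack.
            monitor.set("TypeID", new_type)
    return root


updated = use_local_type(
    SAMPLE,
    "Example.Monitor.WITH.Schedule",
    "Microsoft.Windows.2SingleEventLog2StateMonitorType",
    "Example.Scheduled.2SingleEventLog2StateMonitorType",  # hypothetical name
)
print(ET.tostring(updated, encoding="unicode"))
```

Note that the unit monitor’s TypeID loses its Windows! prefix, since the monitor type is no longer being pulled in by reference.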

Drum roll please – NOW we can open the XML in the authoring console to see what we’ve actually done.  Make sure you save the management pack with all of the changes and then open it up.  Go to the type library node under monitor types and we see that we have a monitor type defined!

clip_image066

Now, once we know how all the pieces fit together we could have just as easily built the monitor type totally in the UI – but remember I said it helps to know the UI and XML?  This is why – often you will go ‘XML diving’ to come up with an example of how an item should be configured!

So is it always going to be this involved to add a schedule?  No – the more you do this you will find shortcuts – and you can build monitortypes by hand once you have experience – but this is the easiest way to demonstrate how everything comes together.

Let’s see what changes resulted in the monitor type as a result of our XML editing.  Select properties on the monitortype.  First, notice on the General tab that we have an ID for our monitortype but there is no assigned name.  Why?  Monitortypes are not visible in the OpsMgr UI so there is no need for the name field to be complete in XML.  The authoring console, however, requires that this field be complete or it won’t allow changes made in this section to be saved.    Also notice an option to select what runas account this monitortype should use.  If specific credentials are required, select an appropriate runas account.  Our example uses the default.

clip_image068
On the states tab we see  that this monitortype is defined as a two state monitor.  The ID’s listed can be customized but doing so would require additional edits to the monitor itself which references these values.

clip_image070
The member module tab is where things get interesting. This is where all of the modules that make up our monitoring workflow are defined.  We can see that our datasource and filter modules are defined.  Let’s stop there for a minute.  Notice that there are two data sources and two filter modules.  Why?  Because this monitor type is a 2 state event monitor – meaning that there are two event logs and two event ID’s that need to be evaluated.  Thus, two data sources and two filters! 

clip_image072
Staying with the member module tab, this is where we need to add a module to handle scheduling of our monitor.  Select add and enter schedule in the look for box.  Notice that 4 options match the schedule filter.  Which one should we pick?  Looking at the role for each option the answer is quickly clear – we are building a condition detection schedule filter so we will choose the condition detection filter – and there is only one.  Once selected, enter OperationalScheduleCD as the module ID and select OK.

clip_image074
After selecting OK a screen should appear allowing you to configure the new filter.  This is now very similar to what we did with our rule.  Select configure and add a schedule where our monitor operates daily from 6:00 am to 11:00 pm as shown.

clip_image076
clip_image078
Once complete, the member module tab will look as follows

clip_image080
On the Regular tab we pull our modules just created together to form our workflow.  Notice that there are two items we have to handle – a condition where the first event is raised and a condition where the second event is raised.  This brings up an interesting point.  The goal here is to schedule our monitor so that it is only active during certain hours.  But, the REAL goal very likely is to prevent teams from getting alerts/pages when the monitor is supposed to be off.  If, however, the monitor were to detect a condition where a healthy state resumed while monitoring is disabled, we likely would want to process that change.  It all boils down to how the schedule module is implemented. 

clip_image082

The first event raised is the event that causes state to change to unhealthy and alerts (and potentially pages) to fire.  We definitely want to include our schedule filter in the first event workflow.  To do so, put a check mark by the member module labeled OperationalScheduleCD (be sure FirstEventRaised is highlighted) and adjust the workflow to pass information as follows

FirstDataSource ---> FirstFilterCondition
FirstFilterCondition ---> OperationalScheduleCD
OperationalScheduleCD ---> Monitor State Output

You could also have setup the flow differently so that the operational schedule came immediately after FirstDataSource.

With these settings the FirstEventRaised workflow will only complete and output data during scheduled hours.
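When you look at this flow in XML later, it helps to know how a chain like the one above is written down.  As I understand the management pack schema, the regular detection is expressed as nested Node elements with the data source innermost and the last module in the flow outermost; treat that layout as an assumption and compare it with what the authoring console actually saves.  The Python sketch below builds such a nesting from a simple list, using a helper name of my own.

```python
import xml.etree.ElementTree as ET


def build_detection(state_id: str, chain: list) -> ET.Element:
    """Nest Node elements so the first item in `chain` ends up innermost."""
    detection = ET.Element("RegularDetection", MonitorTypeStateID=state_id)
    parent = detection
    # The last module in the data flow is the outermost node.
    for module_id in reversed(chain):
        parent = ET.SubElement(parent, "Node", ID=module_id)
    return detection


first = build_detection(
    "FirstEventRaised",
    ["FirstDataSource", "FirstFilterCondition", "OperationalScheduleCD"],
)
print(ET.tostring(first, encoding="unicode"))
```

Swapping the order of the last two items in the list gives the alternative flow mentioned above, with the schedule check immediately after the data source.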

clip_image084
There is also our SecondEventRaised workflow.  This is the one that will detect an event to put the monitor back in a healthy state so, to me, it doesn’t make sense to put this workflow on a schedule since we want to know that health has returned regardless of when and also because if we miss the healthy event there is no guarantee it will be produced again!  Based on that, no modifications are required for the SecondEventRaised workflow. 

clip_image086
The remaining tabs are not relevant to building our scheduled monitor so we will pass them for now but the On Demand is interesting and deserves some comment.  In the OpsMgr UI health explorer section there is an option to recalculate health.  While this option is present and can be selected for every monitor in health explorer it does not work with every monitor – only those that are built to support On Demand health recalculation!  Ah, so THAT is what the On Demand node in our monitortype is for!  Correct.  IF you want your monitor to support On Demand health recalculation you have to configure the workflow to allow for it on the On Demand tab.  Our monitortype doesn’t support On Demand recalculation so we are passing it here.  A further wrinkle – not every monitortype CAN support recalculation – only those that use a probe based data source. 

OK – BIG EXHALE!!!!  We have now completed building our scheduled monitor!  WooHoo, celebration and excitement!  Before getting too out of hand, save your work in the authoring console and we will take a look at what we’ve done in the XML and see how all of our UI configurations map out.  Like before, I’ll start with our monitor that doesn’t have a schedule to show the core UI components used to build a monitor and how they map to XML.  I’ll then show the XML specific to the monitortype we built, scheduling and how all of those changes link into the monitor.  Also, I will only be showing the portions of XML that are applicable to the subject at hand – some areas will be collapsed or omitted.  Lastly, for completeness I turned on word wrap so we will be able to see the full text rather than having some scroll off the screen.

clip_image088
OK, that’s enough to numb the senses – let’s dig in a bit deeper!  Now that we understand how to correlate a basic event monitor built in the UI to its resulting XML, let’s look at the XML that specifically ties a monitor to its monitortype and associated configuration.  As shown above, monitortypes generally are defined in the system management packs and accessed by reference.  For our scheduling example, we copied the needed monitor type and made some adjustments.  Here’s how it all fits together.  Note again that certain sections that aren’t applicable or that we have already covered are collapsed.

clip_image090
And that’s it!  Hopefully this helps illustrate how XML links together and correlates to the UI.  The sample management pack that was built as part of this blog post is attached for reference.  Happy authoring/scheduling!

SCCM: Forcing a Task Sequence to Rerun

There are well known methods to force an advertisement to rerun – including several add-on tools available for the SMS or SCCM console.  To date, however, there are no equivalent methods to force a task sequence to rerun.  Part of this may be because task sequences are typically thought of as focused on Operating System Deployment (OSD) and rerunning these types of distributions is not as common as rerunning advertisements.

While task sequences are the best solution out there for OS Deployments they are much more flexible than just that – including distributing software in very complex scenarios including support of dynamic decisions during execution, handling reboots, enabling specific sequencing of application deployment, etc.  With this kind of power many organizations are using task sequences for software deployment and the ability to force a sequence to rerun on a selective basis and without having to manually logon to individual clients is crucial.  The process to make this happen is very easy.

First, identify your task sequence by ID.  My test sequence is CEN00027.
clip_image002

Note that in my lab this sequence has already run in the past.
clip_image002[7]

The advertisement for the sequence is set with a mandatory execution time – which resulted in the first run.  No other mandatory times have been added.  Further, the advertisement is set to allow rerunning.  If you had an advertisement set to not rerun you should be able to force it to rerun but this would likely require additional WMI and registry edits.  I haven’t tested that specific scenario.
clip_image002[12]

From here, open WMI on the client system of interest and connect to the root\ccm\scheduler namespace.
clip_image002[14]
Click ‘Enum Classes’, select Recursive and then scroll to the bottom and double click on CCM_Scheduler_History() and then click instances.
clip_image002[16]
clip_image002[18]
clip_image002[20]
In the list that shows up, find the entry that corresponds to your task sequence ID and delete it.
clip_image002[22]
With the deletion made, restart the SMS Agent Host service (CCMExec) on the target client.
clip_image002[24]
In a few minutes, the program balloon will pop up indicating the sequence is about to run again.
clip_image002[26]
The process described is manual but could be automated if desired. 
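As a starting point for automation, the one piece of logic that needs care is picking the right history entries to delete.  The Python sketch below covers only that selection step.  It assumes you have already pulled the identifiers of the scheduler history instances from the client (by whatever WMI tooling you prefer) and that the identifier for a task sequence entry contains the task sequence ID; the sample identifiers are made up, and the WMI query, the delete and the service restart are deliberately not shown.

```python
def entries_for_task_sequence(schedule_ids, task_sequence_id: str) -> list:
    """Return the history entry identifiers that mention the task sequence ID."""
    wanted = task_sequence_id.upper()
    return [entry for entry in schedule_ids if wanted in entry.upper()]


# Made-up identifiers, only meant to show the matching step.
history = [
    "CEN20010-CEN00027-AAAA1111",
    "CEN20011-CEN00031-BBBB2222",
]
print(entries_for_task_sequence(history, "CEN00027"))
```

Review the matches before deleting anything; a substring match is a blunt instrument and is only as good as the assumption about how the identifiers are built.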

BING 411

Completely unrelated to anything SCCM or OpsMgr but still worth a post – do you find yourself avoiding dialing information to get phone numbers because it is too expensive?  Check out BING 411.  This service is completely free, will provide detailed information on whatever you are searching for and will also connect the call for you – all free.  And, to top all of that, the system is actually good – voice recognition is very accurate.  Try it out – 1-800-BING-411.

Posted by steverac | 0 Comments

Understanding Monitors in OpsMgr 2007 part III – Dependency Monitors

This is part 3 (and last) of my series of posts describing monitors.  The first post, found here, discussed unit monitors, which are the engine of monitoring. The second post, found here, discussed aggregate monitors, the ‘umbrella’ monitor sitting above unit monitors and reflecting their collective health forward – ultimately all the way to the class level. 

The dependency monitor is used to link classes that are in a hosting or containment relationship together, allowing health state from one class to affect the health of another class higher in the relationship structure than itself.  This type of monitor is confusing for many an OpsMgr admin.  Looking at the relationship structure of classes within OpsMgr one might think that the rollup of class health happens by default – it doesn’t.  Health will only roll up to the individual class level (by using unit and aggregate monitors) unless a dependency monitor is configured between classes.  An illustration from the OpsMgr Authoring Guide might help and is shown below.  In this scenario we have two objects, the SQL Server 2005 object (SQL Server 2005 DB Engine class) and the SQL Server Database object (SQL 2005 DB class).  Using the unit and aggregate monitors, both objects can be monitored individually – but there is nothing that will allow problems with a database (SQL Server 2005 DB class) to reflect on the SQL Server 2005 object itself (SQL Server 2005 DB Engine). If we do need the ability to link health between the objects, the dependency monitor is the mechanism to do so.

image

The diagram shows a dependency monitor that takes the health state of the availability aggregate on the SQL Server Database object (SQL 2005 DB class) and rolls it up under the availability aggregate of the SQL Server 2005 object (SQL 2005 DB Engine class).  You could link between the other aggregate categories or on the aggregate for the class itself; it’s up to you.

So how do we build this in the OpsMgr UI?  Let’s walk through it.

First, find the two classes you want to link together and evaluate what monitors are already in place.  In this case, the classes are SQL 2005 DB and SQL 2005 DB Engine.

image

We see that there is a Database Status unit monitor already configured.  It will roll up its health to the availability aggregate and ultimately the entity aggregate, but there is no dependency to roll the state up further.  Knowing that this view in the UI isn’t always as complete as we would like (more on that in a minute) and knowing that dependencies are created at the parent class in the relationship, in this case SQL 2005 DB Engine, we take a look at the SQL 2005 DB Engine objects in discovered inventory, open health explorer for these objects and confirm there is no dependency.

Note:  Here we are specifically looking at the availability aggregate since the unit monitor we want to ‘link’ into is underneath this aggregate in our target class.

image

image 
Note:  Here I’m looking at health explorer for my test computer object but it really doesn’t make a difference which object you choose out of discovered inventory since the rollup, when created, will be at the class level and will be displayed on all objects as a result.

From this we know that there is no dependency, so we will build one.  Back in authoring we select to create a dependency monitor.

image

Note:  You could also create the dependency monitor (or any other monitor) directly on the node of interest, which will fill in some of the information for you in the wizard.

image

As shown above, the target for the monitor will be the class where we want health to ROLL UP and not the class that is reporting the health.  In this case, SQL 2005 DB Engine will be our target.

The next page is where we link our classes.

image

In the above screen we are configuring our dependency monitor to link into the SQL Database child object and consider the health state of this child object when calculating health of the hosting class.  Notice that in this window we are actually linking to the more generic SQL Database object (the only option) rather than to the specific SQL 2005 Database object. 

OK, wait a minute – this is confusing.  So we want to create a dependency to link the health that results from the ‘database status’ unit monitor that is specifically created to monitor objects in the SQL 2005 DB class – but our dependency can’t link directly to the SQL 2005 DB class but, instead, has to link to the SQL DB class?  Come on Steve – I look directly at the SQL DB class, the one we can link to, and it doesn’t have ANY monitors defined on it.  You also said earlier that without a dependency monitor health rolls up to the top of the CLASS and stops.  In this example that means health would not go past the SQL 2005 DB class so how can I get ANY health rollup by building a dependency to the SQL DB class which has no monitors?  Yes, I understand the confusion but, trust me, it works.  All of what I mentioned earlier is true.  Health doesn’t roll up from class to class unless there is a dependency.  You might call me a liar if you were to cause the database status monitor to go red and then look at the SQL 2005 DB class and the SQL DB class in health explorer.  The screenshots below seem to contradict what I’m saying because when the Database Status monitor is healthy, both classes are healthy and when the Database Status monitor is unhealthy, both classes are unhealthy.

SQL 2005 DB Class                                   SQL Database Class

image image

image image

While my statement that health will NOT roll up past the individual class level without a dependency is true, it is also true that in some cases health generated on one class will reflect, not roll up, on its parent class.  The reason for this is a bit complicated to explain but a good way to predict when this will be the case is where one class specializes another class but what is ultimately being described is the same.  In this case, the SQL 2005 DB class is a more specialized form of the SQL DB class so the type of special relationship we are talking about exists.  OK, let’s continue building our dependency monitor!

On the next screen we simply configure how health should be considered.

image

So what impact did this change have?  Take a look back at our two classes and we will now see the dependency monitor.  While the UI doesn’t clearly reflect it we now know based on our configuration and earlier discussion that the new dependency monitor will evaluate the health state of the availability aggregate from our child class when evaluating health.

So with the dependency monitor in place here is how health rollup really happens.  The health state changes on our unit monitor, rolling up to the availability aggregate in the SQL 2005 DB class.  Because of our ‘special’ situation with these classes, health from the SQL 2005 DB availability aggregate will reflect on the availability aggregate of the SQL Database class.  Our dependency monitor, created on the SQL 2005 DB Engine availability aggregate, will ‘link’ into the availability aggregate defined on the SQL Database class and ‘roll up’ health according to the rules defined on the dependency.
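
For readers who think better in code, the chain above can be modeled in a few lines of Python.  This is a conceptual sketch only, not OpsMgr code or any OpsMgr API.  The class names, the three-state health scale and the worst-of rollup are assumptions made for illustration, and the reflection between the specialized and generic database classes is collapsed into a single Database class.

```python
from enum import IntEnum


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def worst_of(states):
    """Return the most severe state, or HEALTHY when nothing is being watched."""
    return max(states, default=Health.HEALTHY)


class Database:
    """Stands in for one database object with a single unit monitor."""

    def __init__(self, name, database_status=Health.HEALTHY):
        self.name = name
        self.database_status = database_status  # the 'Database Status' unit monitor

    def availability(self):
        # The unit monitor rolls up to the availability aggregate of its own class.
        return worst_of([self.database_status])


class DbEngine:
    """Stands in for one DB Engine object that hosts databases."""

    def __init__(self, databases, has_dependency_monitor):
        self.databases = databases
        self.has_dependency_monitor = has_dependency_monitor

    def availability(self):
        watched = []  # no unit monitors are modeled on the engine itself
        if self.has_dependency_monitor:
            # The dependency monitor watches the availability aggregate
            # of every hosted database and contributes one rolled-up state.
            watched.append(worst_of(db.availability() for db in self.databases))
        return worst_of(watched)


dbs = [Database("Sales"), Database("HR", Health.CRITICAL)]
print(DbEngine(dbs, has_dependency_monitor=False).availability().name)  # HEALTHY
print(DbEngine(dbs, has_dependency_monitor=True).availability().name)   # CRITICAL
```

Without the dependency the engine stays healthy no matter what the databases report, which is the 'buck stops at the class' behavior.  With it, the critical database shows up on the engine.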

image

For another view of this rollup taking place we look at health explorer for the SQL 2005 DB Engine class.  Health explorer is generally the clearest way to see these rollups and make sense of them.

image

Earlier I mentioned that using the authoring section of the UI to try and follow a dependency monitor and understand the classes it links together is potentially problematic.  This is illustrated by the above diagrams.  The dependency is created at the SQL 2005 DB Engine level – and the dependency is visible at this level.  But looking only at the class the dependency links to gives no indication that a dependency is actually in operation.  Health explorer is the right way to visualize the total picture of the health structure.  It still may be difficult to see at a glance which monitors are aggregates or dependencies but by working with the structure a bit, it isn’t that difficult to follow.

Posted by steverac | 0 Comments

Categories Filter Missing in OpsMgr R2 Notifications

In OpsMgr SP1, the option to filter notifications by category was available in the UI. 

image

In R2, many changes were made to improve the flexibility of notification subscriptions but the category option was not included. 

image

Does that mean you can no longer create notifications filtered by category?  In the UI, yes.  But it is possible to include the category filter if you edit the XML itself.  All notification detail in OpsMgr is stored in the ‘Notifications Internal Library’ management pack.  Just export this management pack and you are able to add back your category filters – an example is shown below.  The section specifying category criteria (in italics in the original post) is the <Or> block whose expressions test the Category property.  Note that if you choose to make edits to re-enable category filtering you should no longer plan to edit your subscriptions in the UI.  If you do and save a subscription setting, the category sections will be overwritten.

<Monitoring>
  <Rules>
    <Rule ID="Subscription04bafdd7_653e_46c1_aa0f_202d186150dd" Enabled="true" Target="Notification!Microsoft.SystemCenter.AlertNotificationSubscriptionServer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
      <Category>Notification</Category>
      <DataSources>
        <DataSource ID="DS1" RunAs="SystemCenter!Microsoft.SystemCenter.DatabaseWriteActionAccount" TypeID="SystemCenter!Microsoft.SystemCenter.SubscribedAlertProvider">
          <AlertChangedSubscription Property="Any">
            <Criteria>
              <Expression>
                <And>
                  <Expression>
                    <SimpleExpression>
                      <ValueExpression>
                        <Property>Severity</Property>
                      </ValueExpression>
                      <Operator>Equal</Operator>
                      <ValueExpression>
                        <Value>2</Value>
                      </ValueExpression>
                    </SimpleExpression>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Priority</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>2</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Priority</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>1</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>PerformanceCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Operations</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>EventCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>StateCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>SoftwareAndUpdates</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Alert</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>System</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Custom</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>AvailabilityHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>PerformanceHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>ConfigurationHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>SecurityHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Discovery</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Notification</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Maintenance</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>ResolutionState</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>0</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>ResolutionState</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>255</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                </And>
              </Expression>
            </Criteria>
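
The category block is very repetitive, so typing it by hand invites mistakes.  Below is a small Python sketch that generates the same shape of <Or> expression for a list of categories.  The element names are copied from the sample above; the function name category_filter is made up for this example, and the output is only a fragment that you would still paste into the exported management pack yourself.  The sketch does not check what the schema expects when only one category is supplied.

```python
import xml.etree.ElementTree as ET


def category_filter(categories):
    """Build <Expression><Or>...</Or></Expression> with one Category test per value."""
    outer = ET.Element("Expression")
    or_node = ET.SubElement(outer, "Or")
    for category in categories:
        simple = ET.SubElement(ET.SubElement(or_node, "Expression"), "SimpleExpression")
        # Left side: the alert property being tested.
        ET.SubElement(ET.SubElement(simple, "ValueExpression"), "Property").text = "Category"
        ET.SubElement(simple, "Operator").text = "Equal"
        # Right side: the category value to match.
        ET.SubElement(ET.SubElement(simple, "ValueExpression"), "Value").text = category
    return outer


fragment = category_filter(["AvailabilityHealth", "PerformanceHealth"])
if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9 and later
    ET.indent(fragment, space="  ")
print(ET.tostring(fragment, encoding="unicode"))
```

The printed fragment goes inside the outer <And>, next to the Severity, Priority and ResolutionState expressions.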

Posted by steverac | 0 Comments

Understanding Monitors in OpsMgr 2007 part II – Aggregate Monitors

This is part 2 of my series of posts describing monitors.  The first post discussed unit monitors and can be found here.  This post discusses aggregate monitors.  As mentioned in the first post, unit monitors can be thought of as the workhorse of monitoring.  Unit monitors are just that – a unit of monitoring: a self-contained engine to monitor a specific item and reflect the result in terms of health state, alerting and diagnostic/recovery.  Aggregates act as a collector and consolidator of information and ultimately reflect the collective result of unit monitors.  For any defined class in OpsMgr there are 5 defined aggregate monitors – Entity Health, Availability, Configuration, Performance and Security.  This is shown below for the Windows Computer object.

image

Until now you might have just thought of these as categories you can use for grouping similar unit monitors together – and they are useful for this – but these are much more than categories.  If we look at the properties of the availability aggregate as an example we quickly see that this monitor itself can be an engine for alerting and is configurable to reflect the health of its contained unit monitors.  We even have diagnostic and recovery options available for an aggregate monitor.

image

image

So when we configure a unit monitor we now understand that the setting to specify a parent monitor isn’t just cosmetic – it’s important.  This setting directly dictates where an unhealthy unit monitor will have an impact.  If we choose availability, the health state of our unit monitor (and all others under the availability aggregate) will be ‘watched’ and their collective health ultimately ‘rolled up’ to and reflected on the aggregate itself.  This offers some interesting possibilities.  If, for example, you don’t want to generate an alert based on a single unit monitor but would prefer to alert only when all of the unit monitors are unhealthy, the aggregate allows you to do just that!
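
The difference between 'unhealthy when any unit monitor is unhealthy' and 'unhealthy only when all of them are' comes down to the rollup policy chosen on the aggregate.  Here is a minimal Python sketch of the two policies.  The monitor names and the three-state scale are invented for the example, and none of this is OpsMgr code.

```python
from enum import IntEnum


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def worst_state_of(states):
    """Aggregate turns unhealthy as soon as any member is unhealthy."""
    return max(states)


def best_state_of(states):
    """Aggregate turns unhealthy only when every member is unhealthy."""
    return min(states)


# Hypothetical unit monitors sitting under one aggregate.
unit_monitors = {
    "Service A running": Health.CRITICAL,
    "Service B running": Health.HEALTHY,
    "Service C running": Health.HEALTHY,
}

print(worst_state_of(unit_monitors.values()).name)  # CRITICAL
print(best_state_of(unit_monitors.values()).name)   # HEALTHY
```

With the best-state policy, an alert configured on the aggregate would fire only once all three monitors have gone unhealthy.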

So we have the availability, configuration, performance and security aggregates and we understand that these are default aggregates that are part of every monitoring object and we understand that unit monitors that are configured under each directly ‘roll up’ their collective health to the aggregates.  In addition to this, we can create our own aggregates and plug them into the monitoring structure.  So, for example, if we wanted to subdivide our unit monitors under the availability aggregate we could create another aggregate inline as shown.

image

Here we have 6 unit monitors that operate and generate state that is collected by the FileSystem Monitors aggregate.  So, the lower six unit monitors do not of their own operation have direct impact on availability health.  In this example, if any of the 6 unit monitors under FileSystem Monitors goes unhealthy that state will be reflected on the FileSystem Monitors aggregate itself, and the health of the FileSystem Monitors aggregate will be reflected forward to the general Availability aggregate.

OK, so we have the four categories of unit monitors and we know we can add our own aggregate monitors and specify where they should plug into the category model.

image 

image

But, even these four top level categories of aggregate monitors themselves combine and are rolled up to the top most aggregate for a class – the entity aggregate.

image

So, ultimately, the overall health of the availability, configuration, performance, security and whatever other custom aggregates are in place on an object are taken into account based on their ‘watched’ unit monitors and rolled forward to the Entity aggregate.  The Entity aggregate will reflect the overall health of the object in health explorer so at a glance we can tell if the monitored object, in this case Windows Computer, has any issue causing it to be unhealthy.  I’ve put health explorer for a Windows Computer object next to the same object in the authoring > monitors view to show how they map together.
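
The whole structure is a tree, and health explorer is essentially a rendering of that tree.  The Python sketch below walks such a tree from the unit monitors up to the Entity aggregate.  It is an illustration under stated assumptions: the unit monitor names are hypothetical, every aggregate uses a worst-of rollup, and an aggregate with nothing beneath it simply reports healthy.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Monitor:
    name: str
    state: Health = Health.HEALTHY  # used as-is when nothing sits beneath this monitor
    children: List["Monitor"] = field(default_factory=list)

    def health(self) -> Health:
        if not self.children:
            return self.state
        # Aggregate: worst-of rollup over whatever sits directly beneath it.
        return max(child.health() for child in self.children)


def show(monitor: Monitor, depth: int = 0) -> None:
    """Print the tree with each monitor's rolled-up state, indented by level."""
    print("  " * depth + f"{monitor.name}: {monitor.health().name}")
    for child in monitor.children:
        show(child, depth + 1)


entity = Monitor("Entity Health", children=[
    Monitor("Availability", children=[
        Monitor("FileSystem Monitors", children=[
            Monitor("Free space check", state=Health.WARNING),  # hypothetical unit monitor
            Monitor("Fragmentation check"),                      # hypothetical unit monitor
        ]),
    ]),
    Monitor("Configuration"),
    Monitor("Performance"),
    Monitor("Security"),
])

show(entity)
```

The warning on one unit monitor travels through the custom aggregate and Availability to Entity Health, while the other three top-level aggregates stay healthy.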

image

So we have the health of all unit monitors targeted to a particular class rolling their state forward to finally be shown in the health of the class itself.  So once we have the health reflected on the class, where do we go from there?  A common thought is that the health just continues to roll up along the relationship chain between objects.  That is not correct.  In terms of health state, the buck stops at the class itself.  But, wait – you are about to scream that you have seen an unhealthy class roll up to and impact the health of a class higher up in the relationship chain….yes you have – but that isn’t done automatically.  It requires dependency monitors and they will be the focus of part III of our series.

Posted by steverac | 0 Comments

SCCM AMT Provisioning Flowcharts

Support for AMT provisioning through SCCM was introduced in SCCM SP1.  Being able to manage systems through AMT is extremely useful and there is a good amount of documentation already about using SCCM and AMT.  Working through this technology I thought it would be helpful to have flowcharts that document how provisioning works for both SCCM natively supported AMT systems (AMT firmware version 3.2.1 forward) and legacy provisioning (firmware less than 3.2.1).  There are two provisioning options – in band and out of band.  The flowchart goes into depth on each and includes sample logs and step by step processing during the provisioning process.  I posted another set of flowcharts on my blog previously detailing the function of other SCCM components such as NAP, SUM, DCM, etc.  I released those in an HTML format and have since been asked many times for the Visio source files I used to create the HTML.  So, this time I’m posting the Visio source but understand that these flows are designed to be used in web format so you can dynamically navigate from page to page.  When you download the attached Visio file, open it and save it as a web page.  From there you will be able to navigate as if it were hosted on a web site but don’t have to hassle with actually posting it to a website.  Some sample screenshots are below.

image

image

 image

Posted by steverac | 0 Comments

Attachment(s): SCCMAMTProvisioningFlows.zip

Understanding Monitors in OpsMgr 2007 Part I – Unit Monitors

At MMS 2009 I presented a topic on understanding monitors in OpsMgr 2007.  According to the feedback, that session was well received so I thought I would convert some of it to a series of blog entries.

There are two key ways of delivering monitoring in OpsMgr 2007 – rules and monitors.  At first glance, rules appear to deliver much the same monitoring as monitors.

image image

There are some similarities for sure but rules and monitors are actually very different things.  There are two major things to understand about rules compared to monitors.  Rules have zero impact on measuring the health of the object being monitored.  In addition, rules can collect data and monitors don’t.  An example is instructive.  If you have a need to both collect performance data and also have the measurement of the same performance data impact the total health of the monitored object, you need both a rule and a monitor.  Why?  Again, monitors don’t collect anything – they just evaluate the data live and reflect back what is found in terms of health state changes.  Rules collect data but have no impact on health state.  So, in this example scenario, you need both.
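
A tiny Python sketch of that split, using made-up CPU samples and a made-up threshold of 90.  The function names are hypothetical and this is not how OpsMgr implements either workflow; it only shows which side stores data and which side changes health.

```python
collected = []  # stands in for the performance data a collection rule stores


def collection_rule(sample):
    """A rule stores the sample; it never touches health state."""
    collected.append(sample)


def threshold_monitor(sample, threshold=90.0):
    """A monitor evaluates the sample live and returns a health state; it stores nothing."""
    return "Critical" if sample > threshold else "Healthy"


health = "Healthy"
for cpu_percent in [35.0, 72.5, 96.0]:
    collection_rule(cpu_percent)             # data ends up available for reporting
    health = threshold_monitor(cpu_percent)  # state reflects only the latest evaluation

print(collected)  # [35.0, 72.5, 96.0]
print(health)     # Critical
```

Drop the rule and there is nothing to report on; drop the monitor and the object never changes health.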

Notice that in the preceding paragraph I made reference to monitors and rules being associated with an object.  What I really am talking about here is the class (aka, object) against which a rule or monitor is targeted.  Understanding targeting is pivotal to understanding OpsMgr 2007.  If targeting is a confusing topic for you or you want to refresh yourself on proper targeting techniques, take a look at the article I published in TechNet Magazine discussing this topic in detail - http://technet.microsoft.com/en-us/magazine/2008.11.targeting.aspx?pr=blog

Enough about rules – the topic at hand is to discuss monitors!  Monitors are where you see the power of OpsMgr 2007 and, from the list above, you can see that there is substantially more flexibility when using monitors vs. rules.  Remember, monitors are all about health and that is the goal of OpsMgr 2007.  To restate, monitors watch whatever they are monitoring – performance data, WMI, event logs, whatever – and tell administrators about the results of monitoring by changing the health state of the object being monitored.  One point here – on the rules node you see a category for alert generating rules.  Monitors can definitely generate an alert as well so don’t think you are missing out on that ability by choosing a monitor!

There are three categories of monitors – unit monitors, aggregate monitors and dependency monitors.

Unit Monitors
Unit monitors can be thought of as the workhorse of monitoring and unit monitors drive health detection in OpsMgr.  Without unit monitors you would never know a problem exists!  The best way to get to know unit monitors is to work with them.  A caution here – make sure you do your testing in a lab environment because, in production, any changes made take effect right away.  If building multiple monitors for testing this could cause notable churn in the production environment – unexpected churn is far easier to absorb in a test lab!

There are lots of options for unit monitors – ranging from very straightforward to very complex.  Discussing each and every unit monitor is beyond the scope of this blog entry but there are a couple that are particularly interesting.

Simple Event Detection – Detecting a simple event is easy with most monitoring solutions – including OpsMgr.  The Simple Event Detection monitor is, well, simple.  I describe it here as a starting point and because it will provide some good discussion on building monitors in general that is applicable to any monitor.

From the create monitoring wizard select to create a Simple Event Detection monitor.  For our example we will use a Windows Event Reset as the type – more about that in a minute.  Make sure you choose a management pack other than the default management pack to store this monitor!

image

On the general properties screen, choose a target and parent monitor.  For our example let’s assume we will be delivering additional monitoring to SQL 2005 Servers.  If the SQL 2005 Management Pack is installed we will have a target called SQL 2005 DB Engine.  Select that target.  The next choice is which Parent Monitor should ‘contain’ our unit monitor.  For monitors, there are 4 general parent monitors that may be an option for you – availability, configuration, performance and security.  You may also see one for Backwards Compatibility but that isn’t a category that you should use when authoring a monitor yourself.  These categories allow grouping of unit monitors according to their general intended purpose.  If, for example, our monitor will be looking for events that could impact the general availability of the monitored object, it should be placed under availability.  If our monitor will be looking for events that could impact the general performance of the monitored object, it should be placed under performance.  We will discuss these categories in greater detail in part II of this series because each category is actually itself a monitor – an aggregate monitor.

image

We are building a simple event detection monitor so the next screen will ask for the event log OpsMgr should look in for the event.  We will leave the default of application but note that this could be any event log in place on the system.

image

So we’ve chosen the application log, now we have to specify the event to look for with our monitor.

image

Click next and – we are being asked again for the event log we want to use?  What’s going on here?  This is an excellent opportunity to discuss the option we first selected when we started building our monitor.  We chose that we wanted a Simple Event Detection monitor and we chose that it should be a windows event reset monitor.  Remember that monitors are all about health.  Monitors detect when a healthy condition goes unhealthy and can ALSO detect when an unhealthy condition goes back to healthy.  That is, in fact, the holy grail of monitors – to detect when an unhealthy situation takes place and then automatically detect when the unhealthy condition goes away!  And that is exactly what is being done here.  One event, the first one with Event ID 1234, will indicate an unhealthy condition has taken place and now this second event – in the same or different event log – will indicate that the unhealthy condition has been resolved – completely automatic!  Of course, not all monitoring scenarios lend themselves to that, which is why, in addition to windows event reset, we also have options for timer reset and manual reset.  Timer reset is a situation where we detect the unhealthy condition and immediately start a timer.  If the unhealthy condition has not been detected again during our timing period (defined on the monitor) then we revert the health back to a healthy state.  If we detect the unhealthy condition again during the timing period, the timing period starts over.  The manual reset monitor means that once the unhealthy condition is noted it will not be reset until either manually touched or reset by some scripting method.  The manual reset monitor should not be in wide use and, when used, should be for very specific scenarios.
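
The three reset styles are easiest to compare side by side as tiny state machines.  The Python sketch below is a conceptual model only and not OpsMgr code.  Event ID 1234 comes from the example above; the healthy event ID 1235, the 300 second timing period and the class names are assumptions picked for illustration, and time is passed in as plain seconds so the behavior is easy to follow.

```python
class EventResetMonitor:
    """Unhealthy on one event ID, healthy again on a second event ID."""

    def __init__(self, unhealthy_id, healthy_id):
        self.unhealthy_id = unhealthy_id
        self.healthy_id = healthy_id
        self.state = "Healthy"

    def on_event(self, event_id):
        if event_id == self.unhealthy_id:
            self.state = "Critical"
        elif event_id == self.healthy_id:
            self.state = "Healthy"
        return self.state


class TimerResetMonitor:
    """Unhealthy on an event; healthy again once a quiet timing period has passed."""

    def __init__(self, unhealthy_id, reset_after_seconds):
        self.unhealthy_id = unhealthy_id
        self.reset_after = reset_after_seconds
        self.last_seen = None

    def on_event(self, event_id, now):
        if event_id == self.unhealthy_id:
            self.last_seen = now  # every new occurrence starts the timing period over

    def state_at(self, now):
        if self.last_seen is not None and now - self.last_seen < self.reset_after:
            return "Critical"
        return "Healthy"


class ManualResetMonitor:
    """Unhealthy on an event; stays that way until a person or script resets it."""

    def __init__(self, unhealthy_id):
        self.unhealthy_id = unhealthy_id
        self.state = "Healthy"

    def on_event(self, event_id):
        if event_id == self.unhealthy_id:
            self.state = "Critical"

    def reset(self):
        self.state = "Healthy"


event_reset = EventResetMonitor(unhealthy_id=1234, healthy_id=1235)
print(event_reset.on_event(1234))  # Critical
print(event_reset.on_event(1235))  # Healthy

timer_reset = TimerResetMonitor(unhealthy_id=1234, reset_after_seconds=300)
timer_reset.on_event(1234, now=0)
timer_reset.on_event(1234, now=200)    # seen again, so the timing period restarts
print(timer_reset.state_at(now=400))   # Critical, only 200 seconds of quiet
print(timer_reset.state_at(now=500))   # Healthy, 300 seconds of quiet

manual_reset = ManualResetMonitor(unhealthy_id=1234)
manual_reset.on_event(1234)
print(manual_reset.state)              # Critical until reset() is called
```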

We will again select the Application event log.

image

And select the event criteria that will indicate the monitored condition is healthy again.

image

Next we pull our two event IDs together and select whether the first event will raise a warning or a critical health state (there is a drop down that lets you select warning or critical, but you can’t see it until you click on warning to change it). The second event will return us to a healthy state.

image

The final screen of the wizard allows selection of whether or not this monitor will generate an alert.

image

Select to create and the monitor is saved. If we go find our new monitor and select properties on it, we see that there are additional items we can configure that are not part of the initial wizard, such as product knowledge, diagnostic and recovery, and overrides. Product knowledge allows information about the monitor and how to resolve detected problems to be recorded. Diagnostic and Recovery allows specific steps to be configured as a response to the monitor changing state, which may aid in diagnosing or fixing the problem. Overrides are where you specify any conditions other than the default that should be in place for all monitored objects or a subset thereof. It is only possible to override values that have been authored to allow overriding.

image

image

image

image

We’ve gone screen by screen for our first example to illustrate a few key concepts that generally apply to all monitors. For our next examples of interesting unit monitors we won’t go screen by screen but will only show the relevant screens.

WMI Performance Monitors
In some cases it would be helpful to have a performance monitor style method of collecting information about an object, but there is no performance counter available. An example of this might be file size. Suppose you have a particular log file that needs to be monitored for an increase in size. You look in the performance monitor counters and there is no counter available. What option do you have? Certainly a script would be a workable choice, but you may not be comfortable scripting. Is there a way to take file size information and convert it to performance data that can be used in OpsMgr? Absolutely, and there are lots of other examples beyond file size. I have previously described exactly how to set up such a scenario here so I will avoid repeating it, but do take a look at this example because this ability really is powerful.

Log File Monitoring
Another monitor type that is little used but very useful is the log file monitor. Many applications write log files to record application processing, error conditions, etc. SQL has its error log, SCCM/SMS is chock full of logged information, and there are lots of other examples. Over time there will be conditions in your environment that you know to be problems and that arise when specific log entries show up. In many cases the provided SQL and/or SCCM management packs will handle the errors from those systems for you, but in cases where they don’t, being able to craft your own log monitor is useful. And it’s not difficult. Here are the relevant configuration screens.

On the Application Log Data Source screen we need to configure the directory where our log(s) are stored and then a pattern that will specify how to search.  Note that wildcards are supported so it would be possible to search multiple log files of similar name.

image

Next, we need to configure what information in the log file we want to detect.  In this case we are looking for the text bummer – but it could also be a string of text rather than a single word.  Also, the parameter name will be the same regardless – this is the syntax to specify we are defining the first parameter of interest which is all I’ve ever needed. 
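As a rough illustration of what the directory, pattern and text settings above amount to, here is a small sketch that picks log files by wildcard and then finds the lines containing the text of interest. It is only a conceptual model, not what the OpsMgr log file module does internally. The function names, the sample file names and the simple "contains" comparison are assumptions made for this example.

```python
import fnmatch


def matching_logs(file_names, pattern):
    """Return the file names matching a wildcard pattern such as 'app*.log'.

    Names are compared case-insensitively, as Windows file names usually are.
    """
    lowered = pattern.lower()
    return [name for name in file_names
            if fnmatch.fnmatchcase(name.lower(), lowered)]


def lines_containing(lines, text):
    """Return the log lines that contain the text we want to detect."""
    return [line for line in lines if text in line]


# Hypothetical file names and log lines, for illustration only.
print(matching_logs(["app1.log", "APP2.LOG", "other.txt"], "app*.log"))
print(lines_containing(["all good", "what a bummer"], "bummer"))
```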

image

The next step is to configure how long we will let the unhealthy state remain before automatically resetting it. This configuration is a timer based monitor. I chose this option just to illustrate another example. Note that you could just as easily have set up an event reset monitor here; all that would be needed is to define another parameter that would flag the healthy condition. Also note that my timer reset is set to an hour. This means that if the problem condition is not detected again for an hour the monitor will reset but, if the problem condition is detected again, the timer restarts and another full hour will be required before the problem is considered cleared up.

image

The rest of the screens are similar to what has already been seen.

Repeated Event Detection
I’ve already described how to configure detection of a simple event. With such a configuration, when the configured event comes in it is detected and the configured action, such as raising an alert, takes place. There could be situations, though, where a single event may not indicate a problem. However, if 3 of the same events happen within a 15 minute window, that may indicate a problem that we need to investigate. Taking what we already know from the simple event monitor, it’s easy to configure the repeat monitor. Just choose the repeated event monitor, configure the event we care about, and you will get to the screen shown below to configure the repeat settings.

There are a few choices for counting mode, but the one I find simple and useful is Trigger on count, sliding. Basically this means that when the first event shows up we start a timer for, in this case, 15 minutes. If enough further events show up within that 15 minutes to reach the configured count, we consider this a problem and move on to take action. You can configure the repeat count however you like by adjusting the compare count settings. There are other options here you can explore as you like.
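The counting idea can be sketched in a few lines. This is a simplified model of a sliding count and not the OpsMgr consolidator itself. The class name is made up, and clearing the window after a match is an assumption about how counting restarts.

```python
from collections import deque


class SlidingRepeatDetector:
    """Toy model: report a problem when `compare_count` events fall
    inside a window of `interval_seconds`."""

    def __init__(self, compare_count, interval_seconds):
        self.compare_count = compare_count
        self.interval_seconds = interval_seconds
        self._times = deque()

    def on_event(self, now):
        self._times.append(now)
        # Drop events that have slid out of the window ending at `now`.
        while now - self._times[0] > self.interval_seconds:
            self._times.popleft()
        if len(self._times) >= self.compare_count:
            # Assumption: start counting afresh once the count is reached.
            self._times.clear()
            return True
        return False


# 3 events within 15 minutes (900 seconds), timestamps in seconds.
detector = SlidingRepeatDetector(compare_count=3, interval_seconds=900)
for timestamp in (0, 300, 600):
    print(timestamp, detector.on_event(timestamp))
```

Because the window slides, three events spread over more than 15 minutes never trigger. Only a burst that fits inside the interval does.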

image

Correlated Event Monitor
A more complex but very useful monitor is the correlated event detection monitor. Using this monitor it is possible to configure OpsMgr to watch for complex event patterns, whether in the same event log or in different event logs, and alert only when the specified pattern is detected. For our example I’ve chosen a Windows event reset monitor, which means that OpsMgr will watch for a specific Windows event to trigger the reset of health after a problem occurs. I won’t go through configuring every event screen because it’s all similar to what we’ve already seen in the simple event discussion. Note on the screenshot below, however, that there were 3 events to configure. The first event is the reset event while events A and B are the correlated events. The screen below also shows options for how events A and B correlate to one another.

On the correlation screen we have a few options to discuss. First is the correlation interval. As with repeated event detection, this interval specifies how long to watch for the event pattern after receiving the first event. If the pattern doesn’t manifest in the configured time then there will be no change to health. Also, there are multiple options for the correlation settings, as shown below. The wording here may be confusing at first, but the graphic shown on the configuration page will change as you move from option to option to illustrate what each option does. Finally, note the Occurrence and Expression options. With these options you can increase the complexity of the filter by configuring how often the pattern should occur, as well as specific expression information that is beyond what we are discussing here.
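For one of the simpler patterns, event A followed by event B inside the correlation interval, the logic can be sketched as below. This is a simplified model under stated assumptions: the function name and the event IDs 100 and 200 are hypothetical, and the real monitor offers several correlation orders plus occurrence and expression settings that are not modeled here.

```python
def correlated(events, first_id, second_id, interval_seconds):
    """Return True if `first_id` is followed by `second_id` within the interval.

    `events` is an iterable of (timestamp_seconds, event_id) sorted by time.
    """
    first_seen_at = None
    for timestamp, event_id in events:
        # Forget event A once the correlation interval has passed.
        if first_seen_at is not None and timestamp - first_seen_at > interval_seconds:
            first_seen_at = None
        if event_id == first_id and first_seen_at is None:
            first_seen_at = timestamp
        elif event_id == second_id and first_seen_at is not None:
            return True
    return False


# Hypothetical event IDs: A = 100, B = 200, interval of 5 minutes.
print(correlated([(0, 100), (200, 200)], 100, 200, 300))
print(correlated([(0, 100), (400, 200)], 100, 200, 300))
```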

image

image

Finally, we have the health screen that pulls this all together. Note that the event raised refers to the first event configured, which will trigger a healthy condition, whereas a match on the correlated event logic we just configured will result in a warning state (you can change it to critical if you like).

image

Part II of our discussion will turn to the aggregate rollup monitor and how it is used in OpsMgr.

Posted by steverac | 0 Comments

Monitoring File Size with Custom WMI Performance Counter

Have you ever needed to monitor the size of a file to ensure you were alerted when its size changed beyond certain values? I had to help address this recently and, at first glance, this may seem to be a good job for a script, and a script definitely could be used. Scripts are great but I like to avoid them when I can, and this is a perfect scenario for using a custom WMI performance counter. After all, tracking a threshold is ultimately what we are after. So how can this be done? I wrote a sample monitor and rule to demonstrate. Let’s walk through the monitor since both are similar. The reason for having both is that a monitor will track live data but will not collect it; a rule collects performance data.

The first thing is to configure the WMI query to specify which file we want to monitor and which properties. We can do this through a very simple WMI query to the root\cimv2 namespace as shown below. All files on a system can be accessed using this same query. Just change the filename and modify the properties being monitored if you are interested in something other than file size. For this example, I chose pagefile.sys.
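The exact query lives in the screenshot, so the sketch below is only an assumption about what such a query can look like: it builds a WQL string against the standard CIM_DataFile class in root\cimv2, using its Name and FileSize properties. The helper name and the C:\pagefile.sys path are chosen for illustration. The one detail worth showing is that WQL string literals need each backslash in the path doubled.

```python
def file_size_wql(path):
    """Build a WQL query returning the size of one file via CIM_DataFile.

    Assumes `path` contains no single quotes; those would need escaping too.
    """
    # WQL treats the backslash as an escape character, so double each one.
    escaped = path.replace("\\", "\\\\")
    return f"SELECT FileSize FROM CIM_DataFile WHERE Name='{escaped}'"


# Hypothetical path; prints the query text to paste into the WMI query field.
print(file_size_wql("C:\\pagefile.sys"))
```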

image

So we have the data through the WMI query, now we need to map this data and use it to create a custom performance counter.  We do that on the performance mapper screen as shown below.  Note that the Counter name and value name match.

image

For a rule, that’s all you need to configure. For a monitor, we need to specify the thresholds where we want to generate alerts. This is done on the Threshold Range page.

image

We also need to map our thresholds to the appropriate health state.
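Conceptually, the threshold and health mapping pages boil down to a comparison like the one sketched here. The function name, the two threshold values and the choice of three states are assumptions for illustration; the real values are whatever you enter in the wizard.

```python
def health_state(file_size_bytes, warning_at, critical_at):
    """Map a sampled file size onto a health state using two thresholds."""
    if file_size_bytes >= critical_at:
        return "Critical"
    if file_size_bytes >= warning_at:
        return "Warning"
    return "Healthy"


# Hypothetical thresholds: warn at 1 GB, go critical at 2 GB.
ONE_GB = 1024 ** 3
for size in (ONE_GB // 2, ONE_GB, 3 * ONE_GB):
    print(size, health_state(size, warning_at=ONE_GB, critical_at=2 * ONE_GB))
```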

image

And decide whether or not to alert.  With all of this configured, we are in business!  After our rule starts to run for a while we will see performance data start to show up for the custom counter.

image

Posted by steverac | 0 Comments

OpsMgr Agent Discovery Hanging after Enabling Broker Service?

When OpsMgr discovery hangs it’s usually because the SQL Broker service has not been enabled in the operationsmanager database. A quick Bing search will show you tons of hits about this issue, and I had never seen a case where that wasn’t the cause, until now. I just finished working an issue where OpsMgr discovery would hang but the SQL Broker service was enabled. I didn’t know much about the Broker service and definitely had never had to troubleshoot it before.

What is the SQL Broker service?  MSDN offers the following statement to describe what Broker does.  A more complete discussion can be found at http://msdn.microsoft.com/en-us/library/ms166043(SQL.90).aspx

          This new technology, a part of the Database Engine, provides a message-based communication platform that enables
          independent application components to perform as a functioning whole. Service Broker includes infrastructure for   
          asynchronous programming that can be used for applications within a single database or a single instance as well as for
          distributed applications.

So we need to troubleshoot the Broker service – cool, where can we find out information about it?  The first stop is on the Broker enabled database itself.  Under the Service Broker folder we can access various information about the service.  But I didn’t find anything here to really help me understand what might be broken.  Searching a bit I came across two very useful SQL queries – one to help me troubleshoot and one to show me data processing through the Broker.

select * from sys.transmission_queue

select * from sys.conversation_endpoints

From my limited troubleshooting it seems that when everything is running normally the first query should return no results, whereas the second query will show data in process or queued by the system. Looking at the results of the first query, in the transmission_status column I found this error: ‘An exception occurred while enqueueing a message in the target queue. Error: 15404, State: 19. Could not obtain information about Windows NT group/user 'Domain\User', error code 0x52e.’ Obviously I’ve replaced the real user information with Domain\User for confidentiality reasons. Regardless, with this information I now have a clue what is going on. We have an account in play somewhere that we can’t use to authenticate. Error 52e confirms that and translates to ERROR_LOGON_FAILURE. Digging a bit we found the culprit: the owner of the operationsmanager database was an account that was incorrect. It’s easy to see who the database owner is. Just select the database, click properties and, on the general page, the owner information is displayed. Here is a screenshot from my lab. Once we set the database owner to the correct value, discovery worked like a champ!

image
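As a small aside on reading the error code in the transmission_status message: 0x52e is hexadecimal, and converting it to decimal gives the Win32 error number you would look up. The sketch below does that conversion against a tiny hand-built table. The function name and the table are illustrative only and cover just two well-known codes.

```python
# Small subset of Win32 error codes, keyed by decimal value.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED",
    1326: "ERROR_LOGON_FAILURE",
}


def decode_win32_error(hex_code):
    """Convert a hex error code such as '0x52e' to (decimal, symbolic name)."""
    value = int(hex_code, 16)
    return value, WIN32_ERRORS.get(value, "unknown")


print(decode_win32_error("0x52e"))
```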

What ports REALLY need to be open for AMT in SCCM?

The question came up recently about what ports are really needed for AMT in SCCM. The documentation (http://technet.microsoft.com/en-us/library/bb632618.aspx) lists the ports used for the communication that takes place between the various AMT components during management. The documentation is correct but is also potentially confusing.

One thing to understand is that on the client, AMT function is handled in the firmware, and the operating system really is secondary. It’s possible to fully manage a system via AMT even when it is powered down or the OS is failing to load. Why is this important? The Windows firewall. Based on the documentation we provide listing the required ports, a natural thought is that these ports need to be opened on the Windows firewall. This generally isn’t the case, but there is one exception and that is on the Out of Band service point (OOBSP). The OOBSP, an SCCM site system role, listens for incoming hello messages from AMT systems on port 9971. Since this component IS part of the running operating system, we do need to have port 9971 opened in the Windows firewall or hello packets will be blocked and out of band provisioning will not work. So are the ports documented in the link above relevant? Absolutely. If there are firewalls in place other than the Windows client firewall between AMT components, you do need to ensure the ports listed are opened and accessible. Hope this clears up any confusion.
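If you want a quick sanity check that the hello port is reachable from another machine on the network, a generic TCP connection test is enough. This sketch assumes the hello messages arrive over TCP, and the host name oobsp.example.com is a placeholder for your own Out of Band service point. It uses only the Python standard library and is not an SCCM or AMT tool.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out or unresolvable all count as "not open".
        return False


if __name__ == "__main__":
    # Placeholder host name; replace with your own OOBSP server.
    print(tcp_port_open("oobsp.example.com", 9971))
```

A False result only says the connection failed from where you ran the test. It cannot tell you which firewall along the path blocked it.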

Posted by steverac | 0 Comments

SCCM Unleashed is Unleashed

A bit of a self-serving post I suppose, and I know I'm a bit late in posting this (couldn't decide if I would do it or not), but SCCM Unleashed has been published and it is a very complete book and an easy read. I served as tech reviewer for the book so you can guarantee there are no mistakes! Now please don't shatter my world and point out everything I missed!  :)
Posted by steverac | 1 Comments

Updated SQL Management Pack posted to catalog!

If you haven’t seen it, the updated SQL Management Pack, which resolves issues monitoring clustered SQL servers, has been posted to the management pack catalog.

 

http://www.microsoft.com/downloads/details.aspx?FamilyId=8C0F970E-C653-4C15-9E51-6A6CADFCA363&displaylang=en

Posted by steverac | 0 Comments

Understanding the R2 Process Monitoring Management Pack Templates

Management pack templates have been around since RTM of OpsMgr 2007 and R2 introduced a new one specifically for process monitoring.  This is a cool template and allows the ability to setup some useful process monitoring, but what is the wizard actually building behind the scenes, and is this what you want?  Let’s walk through an example.

In the authoring node of the OpsMgr console, right click on Management Pack Templates and select add monitoring wizard.

image

Select the Process Monitoring template and select next.

image

Title your monitor and select a management pack where it should be stored and click next.

image

Choose to monitor the notepad.exe process and select a group that will be used to populate the resulting target.  Let’s stop here for a minute.  Choose a group for targeting?  Sounds very MOM 2005ish.  Well, what really happens here is that the membership of the group you select is copied into the new target class that the wizard creates.  We will see that in a minute.  Click next.

For our demo we will accept the defaults on the next two pages.

image

image

So we finish up the wizard by clicking Create on the Summary page and that’s it. Easy enough? Using this wizard we have now created monitoring that will look for notepad on just the SQL servers that were in my group. Very easy and targeted appropriately. So what pieces were actually created? Let’s take a look.

First, if we review our list of Process Monitors we have the one just created and two others.  The other two will come into the discussion shortly.

image

As we explore the results of the wizard, remember that one of the items requested by the wizard is the group name. We chose SQL 2005 Computers. The membership of this group is as follows:

image

As mentioned, this group is not the target for our monitoring but is used to seed the class created by the wizard. If we take a look at discovered inventory and choose the class that was created, we see that the class membership is exactly the same as our group. So, essentially, the membership is a direct copy of the group.

image

One thing to bear in mind here: if an agent is listed in the target group but that agent is having problems, it may not run the discovery and, as a result, may not populate into the new target class that the wizard creates.

At this point you might be thinking that you just stumbled across a cool way to build a new target class without having to use the authoring console. Why not just create a group that has all of the systems in it that you want to target for monitoring, use the process monitor wizard to specify that group as the target group, and so convert it to a new class that you can use for targeting something completely unrelated? After all, you can just delete the process monitor when done and you are good to go. Well, good thought, but it fails on a couple of fronts. First, when the process monitor is deleted, so is the target class that was created. There is another potential problem, as you will see in a minute.

OK, continuing on.  What monitoring objects were created by the wizard?  To find out just right click on the new process monitor template created in the authoring node and select to view management pack objects.  To populate the new target class we have to have an object discovery.  Click on object discoveries from the right-click menu and you will see the new discovery created.

image

From the same right-click menu select monitors to see the associated monitors that were created.

image

In addition to monitors, a couple of rules were created.  Choose rules from the right-click menu to see them.

image

Notice that all of the monitors and rules are targeted at the newly created target class Test Notepad.exe Process Monitor.

So what do we know so far?  We know that the wizard created a new target class that is populated by the newly created object discovery and we know that both rules and monitors are targeted to this new target class.  So where does this new target class fit into the overall health structure?  If we take a look in the add component menu from distributed application designer and do a bit of searching we find our new target class.

image

Interesting.  So what the wizard actually does is create our new process monitor target class as a child class of the Base Monitored Process Class which is related to Windows Local Application and so on.  So anytime our monitors detect an issue the resulting health will be reflected on Windows Local Application and on up the chain. 

image

But, wait, I’m monitoring a process on my SQL server.  What if that process were a critical piece of SQL server?  Shouldn’t a failure of that process impact the health state of SQL server?  And now we have arrived at the second issue I mentioned above.  If the process is critical to SQL server and you want the health state of SQL server to be impacted when the process fails then this wizard isn’t the answer for you.  Very likely you will need to build other monitoring to meet your need.  Further evidence of this in the diagram is the fact that I created three total process monitors, two targeted to my SQL server group and one targeted to Windows computer, yet they all reside in the same node illustrating that using this wizard will result in proper monitoring but may not meet your needs in terms of properly fitting in your health structure.  For that, creating customized monitoring is your answer.

Posted by steverac | 0 Comments