
SCCM Patching and OpsMgr Maintenance Mode

Customers using OpsMgr for monitoring and SCCM for patching commonly select the option in SCCM to put MOM/OpsMgr agents in maintenance mode during patching – or even with standard software distribution.

image

This option works great if running MOM 2005 but does not operate as you might think with OpsMgr 2007.  When you set this option and SCCM attempts ‘maintenance mode’ on an OpsMgr agent the result is that the health service is paused.  While the health service is paused you may get heartbeat failure errors and when the health service is resumed, all queued up actions will process at that time resulting in potential alerts.  For this reason and others, I don’t recommend using this SCCM option for OpsMgr agents.

OK, if this isn’t an option then how should we go about putting the OpsMgr agent in maintenance mode during patching?  The approach I describe here assumes you have the OpsMgr agent on every SCCM client and are using maintenance windows in SCCM to dictate when patching is allowed to take place on SCCM agents.  With maintenance windows defined we can use that information in OpsMgr to build groups of agents and then use any number of available utilities to schedule these groups to enter maintenance mode at the appropriate time.  Let’s see how this can be configured.

First, we need to get a list of configured maintenance windows in place on SCCM.  To do this, just run the query below on your SCCM database.

select * from v_servicewindow

In my lab I get the following result.

image

The two maintenance windows shown are the two that I have configured for patching specifically.  OK, so how do I find all of the SCCM clients/OpsMgr agents that have this maintenance window?  For that all we have to do is query WMI on each SCCM client/OpsMgr agent as shown.

image

If you look at the results you see that this client has one of my two patching maintenance windows defined – all you have to do is match up the GUIDs returned from WMI with the GUIDs returned from SQL – and in SQL you can see the friendly name associated with each GUID so we know which maintenance window we are dealing with.

Now that we have this information it is very easy to pull this into OpsMgr – just add an attribute to a class – such as the Windows Computer class – to query WMI and pull the maintenance windows GUID from WMI of the client and then build groups for systems with each GUID.  Once you have the groups you can use any number of tools to schedule OpsMgr maintenance mode to correspond with the SCCM maintenance windows!
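To make the matching step a little more concrete, here is a minimal Python sketch. It assumes the rows from v_servicewindow have already been read into a GUID-to-name mapping and that the GUIDs found through WMI have been collected per client. The function names, window names, server names and GUIDs are all made up for illustration; none of this is an OpsMgr or SCCM API.

```python
from collections import defaultdict

def normalize(guid):
    # Compare GUIDs without surrounding braces and without regard to case.
    return guid.strip().strip("{}").lower()

def group_clients_by_window(window_names, client_windows):
    """Return {friendly window name: sorted list of clients}.

    window_names   -- {guid: friendly name}, e.g. built from the SQL results
    client_windows -- {client name: [guids reported by WMI on that client]}
    """
    names = {normalize(g): n for g, n in window_names.items()}
    groups = defaultdict(list)
    for client, guids in client_windows.items():
        for guid in guids:
            name = names.get(normalize(guid))
            if name is not None:  # ignore windows that are not in the SQL list
                groups[name].append(client)
    return {name: sorted(clients) for name, clients in groups.items()}

# Hypothetical sample data.
windows = {
    "{11111111-1111-1111-1111-111111111111}": "Patching - Saturday",
    "{22222222-2222-2222-2222-222222222222}": "Patching - Sunday",
}
clients = {
    "SERVER01": ["{11111111-1111-1111-1111-111111111111}"],
    "SERVER02": ["22222222-2222-2222-2222-222222222222"],
    "SERVER03": ["{11111111-1111-1111-1111-111111111111}"],
    "SERVER04": [],
}
groups = group_clients_by_window(windows, clients)
print(groups)
```

Each resulting list corresponds to one OpsMgr group that would be scheduled for maintenance mode alongside its SCCM maintenance window.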

Configuring SCCM and Branch Cache

Branch cache is a feature introduced with Windows 2008 R2 that allows systems within the same subnet and separated from a content source (such as a WSUS server) to share downloaded content locally rather than each system having to traverse a latent network link back to the content source.  Branch cache has two modes of operation – distributed cache mode and hosted cache mode.  SCCM only supports distributed cache mode.

As mentioned, branch cache is a function of the OS and not of SCCM but, with the release of SCCM SP2, we are now able to leverage branch cache on enabled servers.  When we find branch cache enabled we use it – there are no special SCCM configuration requirements for branch cache other than what is likely in use in most environments anyway.  We will highlight those shortly but, to start, a diagram of a sample configuration might help explain how this is all set up.

image

In the diagram we have 4 systems.  The SCCM SP2 site server, an SCCM distribution point being hosted on a Windows 2008 R2 server that has branch cache enabled and two client systems – one server and one workstation.  We have mentioned already that to support branch cache we need to use a Windows 2008 R2 system – but what about clients?  From the diagram note that branch cache is supported natively on Windows 2008 R2 and Windows 7 – and with the addition of BITS 4.0 clients can also take advantage of branch cache on Vista SP1 and Windows 2008 SP1/SP2. 

The diagram shows how SCCM works with branch cache.  The first time content is requested from the distribution point the client has no choice but to retrieve it across the latent network.  Once retrieved the content is stored by branch cache.  When the second client attempts to retrieve the content it detects a latent network and sends out a request for any peers with the content.  The first client responds and serves up the content thus avoiding the need to pull it across from the distribution point.
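The flow just described can be modeled in a few lines of Python. This is a deliberately simplified sketch of the behavior as described in this post, not of the actual branch cache protocol: caches are plain sets, `fetch` is a made-up function, and the only real numbers are the 80 ms default threshold and the 20 ms / 30 ms values used in the lab test later in this post.

```python
def fetch(content_id, local_cache, peer_caches, latency_ms, threshold_ms=80):
    """Return where the content came from: 'local', 'peer' or 'server'."""
    if content_id in local_cache:
        return "local"
    # Peers are only consulted when the link to the content source is latent.
    if latency_ms > threshold_ms:
        for peer in peer_caches:
            if content_id in peer:
                local_cache.add(content_id)
                return "peer"
    # Nobody nearby has it (or the link is fast): go to the distribution point.
    local_cache.add(content_id)
    return "server"

first_client, second_client = set(), set()

# 30 ms of latency against a 20 ms threshold, as in the lab test.
r1 = fetch("package-1", first_client, [second_client], latency_ms=30, threshold_ms=20)
r2 = fetch("package-1", second_client, [first_client], latency_ms=30, threshold_ms=20)
print(r1, r2)  # server peer
```

The first request has to cross the latent link; the second one is answered by the peer that already holds the content.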

OK, so we have the general concept of how it works – so how do we set it up?  It’s fairly easy.  First, what are the requirements in SCCM to make use of branch cache?  We need our distribution point to be installed on Windows 2008 R2 and to be BITS enabled.

image

The only other requirement is that your advertisement be set to download and execute locally.

image

Both of these settings are common in most environments and are all that is required on the SCCM side.  There is no selection in SCCM to ‘enable’ branch cache – with the settings above it just happens. 

So how do we configure branch cache on the OS side?  Just a few settings to configure.  First we need to enable branch cache – let’s start with the distribution point.  In Server Manager, select features and add the branch cache feature.  There is no configuration here – the feature is either on or off.  Once installed you will see it in the feature list.

image

Next, we have to configure branch cache on our clients.  This can be done through local policy on each system or – much easier – through domain policy.  On my domain controller I open the group policy management console.  In my lab I created an OU for just the systems I want to enable for branch cache and I created a GPO to enable branch cache.

image

Editing the properties of the GPO we have two places to configure.  First, is branch cache policy.  Here we enable branch cache, set distributed cache mode, define the speed that we consider to be a latent network (default 80 ms) and configure the percentage of disk space to allocate to branch cache activity.

image

image

image

image

image

Note that the one option we did not configure was to enable hosted cache mode.  Why?  Remember, SCCM only supports distributed cache mode so, for SCCM, there is no need to enable hosted cache mode.

A word about hosted cache mode.  The OS documentation that discusses branch cache covers both modes of operation and discusses steps to configure SMB file transfers through branch cache.  To me, the documentation is a bit fuzzy.  From what I can tell, configuring branch cache to support SMB requires some additional configuration on the content server – such as enabling a role service called ‘branch cache for network files’ and configuring specific shares to support branch cache using the ‘share and storage management tool’.  Since SCCM uses branch cache via BITS there is no need to do this additional configuration to support SMB.

In addition to the above we must configure the windows firewall to allow branch cache specific communication as follows:

image

image

Both the inbound and outbound firewall rules are accessible on the predefined list – so configuring them is as easy as selecting from the drop down menu when configuring a new rule.

image

OK, so all of this is configured – how do we know if it’s working?  To test, I set up a package and advertisement.  When testing it’s important to understand that once the package is downloaded and cached, you have to either modify the content or set up a new package because, unless there is a change, clients will always pull the content from the cache after the first download due to the latent network conditions.

For the test, I start with a fresh package.  The client that will download the content first will have to go to the distribution point – nothing will come from the cache since no content has yet been downloaded.

image

With all of this I’m ready to start the download but to simulate a latent network in my lab I must first start a tool that will ‘inject’ latency into the network connection between the distribution point and my first client.  I’ll use a tool we have internally to introduce a 30 ms latency – I will maintain this latency throughout my testing.  Note that my policy settings above consider any network more latent than 20 ms to be slow.

image

With all of this configured, I start my advertisement which initiates a download of content.

image

At the end of the download I run the advertisement and then move on to my second machine.  This second machine is where we expect to see branch cache in action.

I prepare to run the same advertisement on the second machine – note you can tell that this advertisement has never been run before on my system – which is important when testing branch cache.

image

I’m about to hit run but how can I tell whether branch cache actually fires up?  Performance monitor.  With performance monitor you can determine whether content is being transferred from the local branch cache or from a branch cache partner vs. the network.  The chart below shows the perfmon counters that you get from the branch cache object.  The two most interesting are Bytes from cache and Bytes from server.  If content is being retrieved from either a system’s local branch cache or a peer system’s branch cache, the Bytes from cache counter will show activity.  In our test case and on the second system, no content for this advertisement has ever been downloaded before so we expect content to be retrieved from our peer client that is branch cache enabled.

image

We initiate the advertisement as before and downloading begins. 

image

We allow the download to finish, complete the advertisement and then look back at our performance counter numbers to see what happened.

image

By looking at our counters we can see that we did pull the content from our peer caching partner – so branch cache worked.
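If you want a single number out of those two counters, the share of content served from cache is simple arithmetic. The sketch below is just that calculation in Python; the function name and the byte counts are made up for illustration and are not values from my test.

```python
def cache_share(bytes_from_cache, bytes_from_server):
    """Fraction of transferred bytes that came from a local or peer cache."""
    total = bytes_from_cache + bytes_from_server
    if total == 0:
        return 0.0  # nothing transferred yet
    return bytes_from_cache / total

# Hypothetical counter readings taken after a download completes.
share = cache_share(bytes_from_cache=9_000_000, bytes_from_server=1_000_000)
print(f"{share:.0%} of the bytes came from cache")
```

A value close to 100% on the second client is what you would hope to see when a peer already holds the content.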

Branch cache is just one more option available to administrators to help deliver content to clients efficiently and reduce load on the network.  It’s not difficult to set up and offers a notable performance advantage – definitely worth taking a look.

Presenting at MMS 2010

MMS 2010 is quickly approaching – if you haven’t registered yet, don’t wait too long!  This year I will be presenting two sessions at MMS:

SCCM – Configuration Manager and SQL Server Reporting Services – custom Reports ROCK!

OpsMgr – Troubleshooting Using the Operations Manager 2007 Resource kit Tools

Should be fun and hope to see you all there!

SCCM and SQL Reporting Services

With the R2 release, SCCM now supports the use of SQL Reporting Services as a reporting solution.  This will be the direction for reporting going forward and offers a good number of exciting possibilities for building impactful custom reports.  In the January edition of Technet magazine I published an article discussing this feature and showing a sample method for building custom reports - Create a Robust, Integrated Reporting Solution

Posted by steverac | 0 Comments

Three Cheers for the Authoring Resource Kit Tools! Part III – The Cookdown Analyzer

For the third and final post in the Authoring Resource Kit Tools series we introduce the Cookdown Analyzer.  Before a discussion of the analyzer makes sense a brief discussion of the cookdown process itself is in order.

In my previous two posts I discussed the Workflow Analyzer - available here - and the Workflow Simulator - available here.  Both of those tools are useful across a broad spectrum of scenarios.  The cookdown scenario is specifically focused on workflows that use scripts and target classes with multiple objects.  A good example of this would be targeting the logical disk class where a single system may have multiple disk drives that need to be analyzed.  Classes like this are potentially tricky even for non-script monitoring because the way you normally think about targeting a monitor or rule may not be appropriate for a class containing multiple instances.  If you don’t target these classes correctly you can end up with monitoring that constantly flips between healthy and unhealthy - causing churn, inaccurate results and overhead.  Kevin has a good blog post discussing how to properly target in such scenarios available here.  When you add a script into the mix it is very easy to break cookdown.  OK, let’s stop here – so what the heck is cookdown?

In OpsMgr, cookdown is a process that specifically applies to scripts targeted at classes with multiple instances.  Think of a system with 10 hard disks.  If you want to target a script to monitor these disks you would likely end up running one instance of the script per hard disk on the system – or 10 instances of the script executing at once - just to get the job done.  Script execution is the most labor intensive way to run a workflow so when we have to use scripts we want to ensure that we run as few instances as possible.  Cookdown helps optimize scripts in these scenarios.  When written properly  cookdown will analyze the workflow and allow the script (the data source for the workflow) to only execute one time (not one time per disk, one time total) and supply resulting data to the 10 workflows.  Yes, there will still be one workflow per disk but only one instance of the script execution that feeds information to all 10 workflows.  This results in a significant improvement in workflow processing.  I’ll avoid further discussion of cookdown itself as it is a topic beyond the scope of this blog post but if you find yourself writing scripts in scenarios targeting classes with multiple instances, cookdown is something you definitely want to understand and to use!
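One way to picture this is as result sharing keyed on the data source configuration. The Python sketch below is a toy model only and makes no claim about how the health service is actually implemented; `DataSourceHost`, `disk_script` and the `IntervalSeconds` / `DeviceID` parameter names are all invented for the illustration.

```python
class DataSourceHost:
    """Toy model: run a data source once per distinct configuration."""

    def __init__(self, script):
        self.script = script
        self.results = {}
        self.executions = 0

    def get(self, config):
        key = tuple(sorted(config.items()))
        if key not in self.results:
            # Only a configuration we have not seen yet costs a script run.
            self.executions += 1
            self.results[key] = self.script(config)
        return self.results[key]

def disk_script(config):
    # Stand-in for an expensive script that reports on every disk.
    return {f"Disk{i}": 10 * i for i in range(10)}

disks = [f"Disk{i}" for i in range(10)]

# Same configuration for all 10 workflows: one execution feeds them all.
shared = DataSourceHost(disk_script)
for disk in disks:
    shared.get({"IntervalSeconds": 900})

# Instance-specific configuration: every workflow forces its own execution.
per_disk = DataSourceHost(disk_script)
for disk in disks:
    per_disk.get({"IntervalSeconds": 900, "DeviceID": disk})

print(shared.executions, per_disk.executions)  # 1 10
```

The moment any per-instance value ends up in the configuration, the 10 workflows no longer look identical and the sharing is lost.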

That said, let’s take a look at the cookdown analyzer tool.  This tool is available in the authoring console and runs against the management pack as a whole with the ability to exclude classes with just a few instances.  Those classes that only have single instances are excluded automatically.  We will look at the logical disk free space monitor from the windows 2008 management pack and compare it against a customized version created to address some specific monitoring needs. 

To get started, load up the Windows 2008 monitoring management pack in the authoring console and from the tools menu, select perform cookdown analysis.  This will bring up the screen below. 

image

Note the text on the right that discusses single-instance classes being excluded and also the ability to choose which multi-instance classes to include in the analysis.  For our sample analysis I’m only interested in the logical disk class so I exclude all the others.  To start the analysis select generate report and you will see the report below.

image

I’m particularly interested in the logical disk free space monitor since that is the one I’ve customized so scrolling down the report I find that monitor and see the results – shown below.

image

OK, well and good – the default logical disk free space monitor passes the cookdown analysis without problem (I would hope so, it is one that we built after all!)

But what about the custom monitor that was built to modify logical disk free space monitoring?  Before I show the analysis of that monitor – a brief comment.  If you need to customize a default rule/monitor/discovery the guidance is to override the default monitor and disable it and recreate that monitor in a custom management pack.  OK, a bit painful but not that big of a deal….at least for most situations.  The logical disk free space monitor is one of the exceptions because this particular monitor cannot be built in the opsmgr UI – and also cannot be built using the authoring console.  Instead, this monitor is built directly in XML – so to truly replicate the monitor and all of the moving parts requires XML diving.  XML diving actually isn’t that bad – if you want to understand how all of the pieces of XML fit together, take a look at my very detailed (took a LOT of work) blog post available here.  OK, on with the analysis.

In the customized example, the author didn’t want to mess with the XML so did the work as best possible in the UI – and that’s cool as long as the end result functions as needed.  The result works but fails cookdown analysis as shown.  Note that some identifying information has been omitted from the screen snap – but nothing relevant to the analysis.

image

Why does the customized version fail cookdown while the default does not?  We see similar variables being used for each.  We do note that different modules are listed as part of the workflow but that’s not surprising since the customized version was built in the UI while the default was built directly in XML.

The key problem here is the text in red (you can get more detail by clicking on the red or green text).  If you compare the first and second examples you see that the same value of $Target/Property/Type….$ is used in each but the positioning is different.  OK Steve – BRILLIANT analysis!  So what’s causing this to be red?  One of THE KEY requirements to avoid breaking cookdown is to NOT pass any specific instance into the script being executed.  Rather, let the script handle doing the discovery of the different instances internally as part of processing.  I can hear you asking – isn’t this a second discovery?  Kind of, but not really.  Doing it this way allows a single iteration of the script to operate against all instances of interest (in this case, disk drives).  So, if we look at the properties of the monitor and, specifically, the parameters being passed to the script we see that the specific device ID (disk drive) is being passed to the script explicitly – causing cookdown to break.

image

If we simply remove the argument and adjust the script internally to handle the disk drive instances, the problems with cookdown go away.  Note that the screenshot below was taken after removing the offending argument – the script was not adjusted.
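What "adjust the script internally" could look like is sketched below in Python. This is not the script from the monitor in question: `list_disks` is a made-up stand-in for whatever enumeration the real script would perform, and the sizes are invented. The point is only that the script takes no device argument, returns one item per disk, and each per-disk workflow then picks out its own item.

```python
def list_disks():
    # Hypothetical stand-in for the enumeration the script does itself:
    # (device ID, size in GB, free space in GB)
    return [("C:", 100, 40), ("D:", 500, 25), ("E:", 200, 150)]

def free_space_script():
    """No device argument: one run returns an item for every disk found."""
    return [
        {"DeviceID": dev, "PercentFree": round(100 * free / size, 1)}
        for dev, size, free in list_disks()
    ]

def item_for(device_id, items):
    """What each per-disk workflow does: keep only the item for its disk."""
    return next(item for item in items if item["DeviceID"] == device_id)

items = free_space_script()  # a single execution
for dev in ("C:", "D:", "E:"):
    print(item_for(dev, items))
```

Because the script's inputs are now identical for every disk, one execution can feed all of the per-disk workflows.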

image

There you have it – the cookdown analyzer is really useful in specific scenarios – and it’s a good idea to run it against your custom MPs – just to be sure.

To wrap up let me state again - this blog was not intended to be a deep dive into the internals of cookdown but, rather, a demonstration of using the cookdown analyzer to find problems with cookdown in custom management packs and help demonstrate how to use that information to fix these issues which can cause notable overhead and churn in an environment – particularly those with a large instance space.


Three Cheers for the Authoring Resource Kit Tools! Part II – The Workflow Simulator

You’ve just built out several custom rules/monitors/discoveries – it’s late, you think you are almost done – just a bit of testing to go and….they don’t work.  You look them over again and don’t see anything wrong – wouldn’t it be really cool if there was a way to see the ‘internals’ that are happening when the workflow runs?  Introducing – the workflow simulator which is part of the Authoring Resource Kit Tools available here.

Remember in MOM 2005 days when we had the ability to configure script tracing to actually watch the execution of a script running on an agent?  You could also cause the script to open in a debugger to walk through execution line by line.  The only caveat is that you had to have the script running through the agent.  The workflow simulator will do all of the stuff MOM 2005 did and more.  Essentially, any workflow that you have configured can be executed in the simulator without any need to have the workflow actually deployed to the agent.  One requirement – you do have to have the agent installed on the system where you will be using the simulator so that the required binaries are present.

So where is this simulator and how is it used?  Once you have the authoring console resource kit tools installed the simulator is available in the authoring console itself.  We will take a look at the simulator and walk through using it with three sample discoveries that are in a custom BackITupNOW! management pack that I created when authoring the targeting chapter in the upcoming OpsMgr R2 Unleashed ebook.  We will start with a simple registry discovery and use that to also talk about the configuration of the simulator and then move on to a WMI discovery and finally a simple script based discovery.  For each example I will show a working discovery and then show the results with the same discovery when it doesn’t return data.

Workflow Simulator
We start by opening the BackITupNOW! management pack in the authoring console and then navigating to the discoveries node and our three sample discoveries as shown.

image

OK, so where is the simulator!  It’s a bit hidden but if you simply right-click on any of the three discoveries you will note the simulate option in the displayed menu.  If an authoring console element doesn’t support simulation then the option will appear grey.

image

Selecting ‘Simulate’ launches the simulator tied to the specific workflow from which the simulator was launched – as shown.

image

I’ve expanded several sections of the simulator to show the various configuration options.  Let’s walk through the specific sections.  The first section displays the name of the workflow and its target – no configuration to be done here – the fields won’t allow editing.

image 
Next, the Target Expressions options.  There are a couple of settings we can tap into here.  First, note whether there is a green check mark or a yellow exclamation mark here.  If the yellow exclamation is seen that means some of the variables/values required by the workflow cannot be resolved and either need to be configured manually or, if the workflow in question has been imported to your management group, you can select to connect to the RMS and resolve the values.

image

I resolved my expressions from my RMS.  Doing so presents the dialog below allowing connection information to be specified.

image

If no RMS is available to auto-resolve the variables then it’s easy enough to resolve them manually, either by typing in a value or allowing the simulator to auto-generate a GUID for fields that require them.  Remember that this is a simulation – the results are accurate but the data doesn’t have to be accurate (such as with a GUID) – it just needs to be in the correct format and enough to give the workflow values that will work.

image

The next field is the override values.  The options here will vary depending on the workflow but for the simulation you might consider changing values such as frequency, etc. to allow the workflow to run more quickly or have a different timeout, etc.

image

With all of the above configured you are almost ready to start the simulation.  First though you need to decide whether to resolve any $MPElement/…$ expressions (I always leave this option selected) and whether to debug scripts.  The debug script option only works when running a workflow that contains a script and will also only work if you have a script debugger registered.  A good simple script debugger is the Microsoft Script Debugger – you can download it here and I will show it in action when we get to our script based discovery example.

With these options configured, start the simulation.  Once you get the simulation started and the first elements of the simulation appear we have a few additional options we can configure.  If you right-click on a module you will see additional options.  I tend to choose to enable tracing for the whole workflow which will launch the workflow analyzer when the simulation is running so you can see even more detail (I discuss the workflow analyzer in part I of this series – available here).  You will also note that XML output is available for review from each of the running modules.  By reviewing the XML output of the simulator and the workflow analyzer together you can generally put together whether the workflow is running as expected or not and the reasons why. 

image

Registry Discovery - Good
As mentioned earlier, I will show both a good and a bad simulation for each of my three discovery workflows.  Let’s look at the good simulation for my registry discovery.  First, let’s take a look at the configuration of the registry discovery.  As shown below, we are looking for 4 registry values – Device, GroupName, InstallDate and InstallDirectory.  We are specifically trying to find systems that have the following values for these entries:

Device – Tape
GroupName – Group
InstallDate – 09
InstallDirectory – c:\backITupNOW

If the discovery doesn’t find these values it will not return a match. 
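The all-or-nothing nature of that match can be sketched in a few lines of Python. This assumes plain equality comparisons; the actual operators used in the discovery expression are only visible in the screenshots, so treat the comparison as an assumption. The `matches` function is invented for the illustration and does not read the real registry.

```python
EXPECTED = {
    "Device": "Tape",
    "GroupName": "Group",
    "InstallDate": "09",
    "InstallDirectory": r"c:\backITupNOW",
}

def matches(probed, expected=EXPECTED):
    """True only when every expected value was found with the expected data."""
    return all(probed.get(name) == value for name, value in expected.items())

# Values as the registry probe might return them (hypothetical).
good = dict(EXPECTED)
bad = dict(EXPECTED, InstallDirectory=r"d:\backITupNOW")  # one value changed

print(matches(good), matches(bad))  # True False
```

A single differing value is enough to stop the discovery from returning data, which is the case walked through in the 'bad' registry example further down.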

image

image

image

Running this workflow through the simulator we get the output below.  The first module, the scheduler, doesn’t tell us much except which healthserviceid we are operating against and the time when the schedule fired – which could be useful if trying to diagnose a workflow that is not operating on time.

image

The probe module shows the attempt to read the registry and the values it found.  This module maps to the registry probe configuration settings configured on the discovery.

image

The same XML data is seen for the filter module but here we are evaluating to ensure we have a match – this section maps to the expression settings configured on the discovery.

image

Finally, the mapper module pulls it all together and takes the discovered data, which passed our filter, and submits it as discovery data.  The screens below show the total XML and are modified a bit to get as much of the XML in the display as possible.

image

image

So the simulator has given us a great deal of good information.  Now add the data from the workflow analyzer and the detail is even richer. I won’t add much since the data speaks for itself but note that for each registry value we can see what is determined to be a match.

image

image

Registry Discovery - Bad
So that was a working discovery – now let’s change just a single value in the registry to something other than what is expected and see the difference.  Notice I just changed the install dir from c:\ to d:\

image

Run the simulator again – notice the probe information is the same but the filter does not generate any data since we don’t have a match and there is no discovery data returned in the mapper.

image

image

image

And from the workflow analyzer we can see the mismatch and all subsequent attempts to match stop.

image

WMI Discovery – Good
We’ve seen the registry discovery – what about a WMI discovery?  Here is the detail of what happens in the simulator and workflow analyzer.

The discovery will only match if the countrycode value is equal to 1.  This likely is not a value you would use in a real discovery but it allows an easy demonstration of a good vs. bad WMI discovery.

image
In the analyzer we only have three modules that return data – the first is our scheduler module.

image
The probe module actually reads the WMI class and shows all of the associated data.  Note that country code does equal 1.

image

image

image
The mapper module shows the discovery data being submitted.

image

The same results in the workflow analyzer are not as detailed as for the registry discovery but one key item we can see is that one data item is listed as returned as part of the workflow.  That means discovery was successful and data was submitted.
image

WMI Discovery – Bad
Now that we’ve seen a sample of a good WMI discovery let’s run one that won’t return discovery data.  To do that I simply change the country code to a value of 2 so I will get no match.

image

I still have the same three modules that display but note for the probe that I have no data returned and for the mapper I return no class instance for my discovery.
image

image

I also see in my analyzer that there are no data items returned meaning the discovery was not successful.  Actually, a word on that for a minute.  In the registry example and in this example – and in the next example, I refer to discoveries not being successful.  That’s not really true – the discovery is always successful meaning that it does run and look to see if the system matches the discovery criteria – if it does we return a data item and if it doesn’t we return nothing.  So the discovery does, in fact, work – but just returns no data.  Just wanted to clear up that potential confusion.
image

Script Discovery – Good
We’ve seen registry and WMI discoveries – now let’s look at a script discovery.  Notice the yellow highlight in the script.  When the script runs it will specifically look for a FOLDER called flagfile.txt.  If it doesn’t find a folder by this name, the script simply exits.
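The folder-versus-file distinction is easy to trip over, so here is the same check sketched in Python. This is not the discovery script from the management pack (that one is only shown in the screenshot); `discover` and the returned dictionary are made up to illustrate the logic, and a temporary directory stands in for the real location.

```python
import os
import tempfile

def discover(base_dir):
    """Return discovery data if a FOLDER named flagfile.txt exists, else None."""
    flag = os.path.join(base_dir, "flagfile.txt")
    if not os.path.isdir(flag):
        return None  # the script simply exits; nothing is submitted
    return {"Discovered": True, "Path": flag}

with tempfile.TemporaryDirectory() as base:
    flag = os.path.join(base, "flagfile.txt")

    missing = discover(base)    # nothing there yet

    open(flag, "w").close()     # a FILE with that name is not enough
    as_file = discover(base)

    os.remove(flag)
    os.mkdir(flag)              # a FOLDER with that name is a match
    as_folder = discover(base)

print(missing, as_file, as_folder is not None)
```

Only the last call returns data; in the other two cases the check fails quietly and nothing is submitted.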
 image

Running this through the analyzer we see two modules – our familiar scheduler module and the script module.  You can see in the XML for the script module that the script does run and does return data meaning the discovery was successful.
image

image

Looking in the analyzer we can get even more useful information – such as the command line used for the script, the XML blob containing discovery information that is submitted, etc.
image

Script Discovery – Bad (with script debug enabled)
To give an example of a discovery that doesn’t submit data, I simply delete the flagfile.txt folder and rerun the simulation.  Note that this time I selected the option to debug the script.  This is very useful if you are seeing problems with the script where you expect data to be returned but it isn’t, etc.  In the simulator I see my scheduler module but there is no script module – since nothing is returned from the script no data is submitted.
 image

My script does attempt to run because my debugger pops up.  I trace the script execution to the highlighted line and then the script simply exits.  Why?  Because there is no folder named flagfile.txt.
image

Looking at the analyzer I see that a script error is encountered showing data loss but no message is displayed.  This may be misleading since no error really occurred – my script simply exited because a condition wasn’t met.
image

And there you have it – a brief walk through of the workflow simulator.  I find this to be an immensely useful tool.  In the examples we used discoveries from a custom and unsealed management pack – but the simulator works just fine with workflows from sealed management packs too.  Note also that there are some limitations to the simulator so be sure to check out the help file documentation and review them.


Three Cheers for the Authoring Resource Kit Tools! Part I - The Workflow Analyzer

Have you ever been faced with troubleshooting OpsMgr and needed a way to trace the flow of a particular component – maybe a rule or monitor – or a discovery – to see what it was actually doing and what information was being submitted?  Have you gotten frustrated with the ETL logs and the native OpsMgr events?  Join the crowd!  :) 

I’m not saying that ETL or the OpsMgr event logs are bad – quite the contrary…there is a great deal of information in both – and in a troubleshooting scenario it’s often helpful to take logs or traces from a healthy agent and compare them to a broken agent – that’s troubleshooting 101.  In addition, the ETL tracing and event logging abilities of OpsMgr have just gotten better as we’ve progressed from RTM to R2!

But, what if we need that extra bit of deeper understanding of what’s happening ‘behind the scenes’?  Enter the Authoring Resource Kit Tools – available here.  The list of tools includes an MP Spell Checker, the cookdown analyzer, a workflow analyzer, a workflow simulator, a Visio add in for MP visualization, an MP difference tool, an updated Best Practice Analyser, etc.

In this series of blog posts I’ll highlight three of the tools that are of particular interest to me – the workflow analyzer, the workflow simulator and the cookdown analyzer.

Workflow Analyzer
As mentioned above, ETL tracing is available in OpsMgr and can be used to solve/diagnose many issues – and the ETL traces can be quite detailed – which is also their challenge!  Because these traces can be so busy and detailed it can take some time to get comfortable reading/interpreting them.  Additionally challenging is the fact that different levels of tracing can be configured as well as different output formats.  Wouldn’t it be cool if you could pick out the workflow of interest and focus tracing on that component only?  The Workflow Analyzer does just that! 

Launching the Workflow Analyzer requires two inputs – the name of your RMS and the health service you want to analyze.

image

Note that the analyzer can be run on the RMS itself – or it can be configured to analyze an agent workflow.  If you run the trace tool on the RMS, start a new analysis session and choose the RMS health service, tracing starts immediately since the workflow of interest is running on the RMS.  If you launch the Analyzer on the RMS and choose a remote health service then all of the configurations will be made to start tracing on the remote health service but to actually see the tracing output you will need to launch another instance of the Analyzer on the remote workstation and select to ‘connect to an existing Workflow Analysis’ as shown.

image

When a new analysis session is started and the RMS/target health service are selected, all of the workflows that the target health service knows about will be listed.  Those that are running will be shown in green, those that are not running will be grey and those with a problem condition will be shown in red.  The status column will give the state of each workflow.  As you can imagine, showing an example of every conceivable workflow would be a big undertaking – but we will show a couple of samples to demonstrate the power of the analyzer.

clip_image002

Discovery Workflow
In the example above we select a discovery workflow.  Right-clicking on the workflow gives two options – trace and analysis.  The analysis option gives a detail screen with relevant information about the workflow and any configurations in place – such as overrides applied, the MP storing the workflow, etc.

clip_image002[9]

The Trace option begins a detailed trace of all actions taken by the chosen workflow.  From the screenshot below note that before tracing begins an override has to be configured on the workflow to allow tracing to start.  When tracing is selected on a workflow the override is introduced in a management pack called the WorkflowTraceOverrideMP (which only exists for the duration of the tracing) that can be seen in the Health Service State\Management Packs folder on the agent where tracing is taking place. If you catch the WorkflowTraceOverrideMP during tracing and open it you will see the simple XML introduced to apply the tracing override.  I show a sample of the WorkflowTraceOverrideMP below.  Note that if you open this MP directly the XML will likely not be formatted nicely.  This is fairly easy to fix since the MP is so simple.

image

The use of an override to initiate tracing is important to understand because before any testing can be traced the override must make it down to the target health service.  This override will come down as part of a standard configuration update.  If your environment is experiencing delayed configuration updates, getting the override down to the target health service may also be delayed.  Make sure you see an event 1210 indicating that the new configuration has become active.

image

Once the event 1210 has been received, begin reproducing the activity you wish to trace.  In the case of our discovery workflow I introduced an override on it temporarily so that it runs every minute – which makes capturing the workflow activity much quicker.  The trace output from the workflow is below.

clip_image002[11]

Note the detail received as the workflow runs – first we see a WMI query being executed along with the output from the query being sent as a dataitem.  If we want to see a particular line of data in more detail, just double-click on it and the XML representation of the data will be displayed as shown.

clip_image002[13]

clip_image002[15]

Event Workflow 
Another common example might be tracing a workflow that should match and act on an event.  The trace output below is a simple example of the kind of tracing you would see from such a workflow.  Note that as the trace proceeds and events are seen we specifically call out whether we consider each event a match or not.

clip_image002[17]

I’ll stop here – and I know there is a limited number of examples – but take some time to get familiar with this tool – it can really help you understand how things are working inside of a workflow.


“Ccmsetup is being restarted due to an administrative action. Installation files will be reset and downloaded again”

Ever see this statement in the ccmsetup log?  Ever wonder what it means?  When you run CCMSetup again after a failed install there will be a registry entry in place to flag that the previous install was a failure.  It is located at HKLM/Software/Microsoft/ccmsetup.repair.  When this key is present we know the previous install was a failure and restart setup from the beginning, including redownloading all files.  Didn’t see much documentation on this and I found it interesting so thought I’d post it out.


Creating Rules and Monitors with a Schedule/Understanding XML internals!

The ability to build rules and monitors that have an associated schedule for operation was part of MOM 2005 and easy to implement.  As several other blog posts have pointed out, it is also possible to introduce such a schedule for rules and monitors in OpsMgr 2007. 

Rules - http://blogs.msdn.com/boris_yanushpolsky/archive/2008/09/19/configuring-rules-to-run-during-business-hours-only.aspx
Monitors - http://nocentdocent.wordpress.com/2009/01/20/running-a-monitor-during-business-hours/

The challenge with OpsMgr 2007 has been that it is not possible to introduce such a schedule directly on a rule or monitor in a sealed management pack – and there is no UI component in the OpsMgr console that exposes this functionality.  If there is a default rule or monitor where an operational schedule is desired the only option is to disable the rule or monitor and recreate it in a custom management pack with the desired schedule. 

Although there are other posts that address this topic none of them go in depth to explain the various elements that are part of a rule and monitor to create a schedule and none I’ve seen attempt to relate the authoring console back to the underlying XML.  A full understanding of how this works requires understanding both the authoring console components and the XML generated by the authoring console.  The goal of this post then will be to present the graphical means of creating a schedule while pulling in the XML to show how everything links together.  That said, let’s get started.

If you haven’t worked with the authoring console in R2 yet – get started.  It is a very flexible tool for authoring and I can think of very little need or argument that would justify continuing to author in the OpsMgr UI <OK, off soap box>…..

For our examples we will create 2 rules and 2 monitors. 

Rule 1 – Standard event rule with no schedule
Rule 2 – Standard event rule with a schedule
Monitor 1 – Standard event monitor with no schedule
Monitor 2 – Standard event monitor with a schedule

We will use standard event rules/monitors because they are common and easy to create.  The general principles presented here, however, apply to all of the various rule and monitor options. 

To get started, we launch the authoring console.

clip_image002
You will notice that there isn’t much that can be done with the authoring console until either a new management pack is created or an existing management pack is opened.  We will create a new one.  To do so, select File > New.  Supply an identity for the management pack, which will be its filename, select to create an empty management pack and click next.

clip_image004
Enter a display name for the management pack – this will be the name of the management pack as seen in the OpsMgr UI.  Once entered, select empty management pack and then select create.  This will create the new management pack and make it ready for editing. 

clip_image006
Creating the management pack should have placed the focus of the authoring console on the ‘Health Model’ node.  If it didn’t, select that now.  Before continuing let’s make one adjustment to the authoring console that will help us keep track of management pack versions.  From the menu select Tools > Options. This will bring up the screen below.  Select to ‘auto-increment’ the management pack version.  When importing management packs with changes you should ensure the version number reflects that there is a revision but it’s easy to forget to do so.  Setting this option in the authoring console ensures you don’t forget!

clip_image008
OK, on to authoring.  We will start with rules.  In the Health Model pane click on rules and then in the center rules pane, right-click and select New > Alerting > Windows Event

clip_image010
On the general tab of the Windows Event wizard that comes up, supply the element ID – which is the internal ID of the rule, the Display Name – which will be displayed in the OpsMgr UI, the target and the category.  Click next to proceed.

For this example I’ve chosen the Windows Server Operating System class as my target.  This class may not always be appropriate – make sure you understand how to properly target.  I have written an article in Technet magazine that describes targeting – available here – and have also written a chapter on targeting in the upcoming OpsMgr R2 Unleashed ebook.

clip_image012
On the Event Log Type page of the wizard, choose which event log contains the event of interest.  For our example we will use the application log – select it and click next.

clip_image014
On the Build Event Expression page of the wizard we configure the event of interest.  The default options are to choose an Event ID and also an Event Source.  For our example all we need is the Event ID so delete the line for Event Source, enter 1000 for the Event ID and click next.

clip_image016
The final page of the wizard allows configuration of alerts.  Input an alert name and choose finish.
clip_image018
With this complete we now have a simple event rule that will operate 24/7 to scan for an event 1000 on all Windows servers. 

Next we need to create our rule that will be modified to run on a schedule.  To do so, work back through the steps above but this time use different values as follows:

Element ID:  Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Rule.with.Alert.WITH.Schedule
Display Name:  Standard Event Rule with Alert – WITH schedule
Target:  Microsoft.Windows.Server.OperatingSystem
Category:  Alert
Log Name:  Application
Event ID:  1001
Alert Name:  Simple Alerting Event Rule - With Schedule

After completing your second rule the Authoring Console should appear as follows:
clip_image020
Now, let’s modify the second rule to only be active during certain times of the day.  To do so, select the second rule, right-click and select properties and then select the module tab.

In case it’s not obvious from the screen shot – note that there are no tabs to allow editing the event log, event ID or alert properties.  What the wizard did was create MODULES and plug them into the rule where required.  The section of the wizard where we configured to use the application log and to look for event 1001 (remember, we are looking at the properties of rule 2) actually created a data source (DS) module leveraging the built in Microsoft.Windows.EventProvider component.  There are lots of other Data Sources but the one selected is the one needed to pull information from the event log.  If you were to edit the Data Sources section you could edit these settings.  Further, creating the alert in the wizard caused an Actions module to get created and plugged in as shown leveraging the built in System.Health.GenerateAlert component. 

clip_image022
Notably vacant is the condition detection section.  This is the section that allows us to leverage the schedule based component of OpsMgr.  Let’s create it.  On the condition detection section select Create.  This action pulls up a list of condition detection components available.  There is a long list and all of these should be self-explanatory.

clip_image024
The one we are interested in is the scheduler so put in schedule in the look for box, select the only entry that remains in the list and in the Module ID section label this as CD (short for condition detection – you can use whatever you want in this box but CD is common). Select Ok when finished.

clip_image026
With this complete our modules screen now has a condition detection.  Now we have to edit our condition detection to specify its settings.  Click on edit under the condition detection section.
clip_image028

The edit button brings up the configuration screen as shown.  From here, select configure.

clip_image030
The configure page is where the schedule options are set.  The schedule shown means that this event rule will only be PROCESSED from 6:00 AM to 11:00 PM on Mondays, Wednesdays and Fridays. 

I capitalize PROCESSED for a reason.  When we create a schedule it’s natural to think that the rule or monitor is being turned off outside of these time frames.  This is not correct.  What we are creating is a condition detection module which means that the event will still be detected 24/7 on the agent and sent to the condition detection module.  Based on our settings and assuming  the event is picked up outside our schedule, the module will return that the data should not be handled and discard it – but the workflow will still fire up initially!  A minor distinction but one worth understanding.
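To make the "detected but not processed" distinction concrete, here is a minimal Python sketch of the decision the schedule filter makes.  It is only a model of the behavior described above, not OpsMgr code.  The function names are made up for illustration, the bit values per day (Sunday = 1 through Saturday = 64) are the ones under which Monday + Wednesday + Friday add up to the DaysOfWeekMask of 42 that the authoring console writes for this schedule, and treating the 11:00 PM end time as exclusive is an assumption.

```python
from datetime import datetime, time

# Bit value per day; Monday (2) + Wednesday (8) + Friday (32) = 42.
DAY_BITS = {"Sunday": 1, "Monday": 2, "Tuesday": 4, "Wednesday": 8,
            "Thursday": 16, "Friday": 32, "Saturday": 64}


def days_in_mask(mask):
    """Return the day names whose bit is set in the mask."""
    return [day for day, bit in DAY_BITS.items() if mask & bit]


def in_window(when, start=time(6, 0), end=time(23, 0), mask=42):
    """True if a data item arriving at 'when' falls inside the schedule."""
    # datetime.weekday() is Monday=0 .. Sunday=6; shift so Sunday maps to bit 1.
    day_bit = 1 << ((when.weekday() + 1) % 7)
    # Assumption: the end of the window is treated as exclusive.
    return bool(mask & day_bit) and start <= when.time() < end


def handle_event(when):
    """The event is always detected; the filter only decides what happens next."""
    return "processed" if in_window(when) else "discarded"


print(days_in_mask(42))  # ['Monday', 'Wednesday', 'Friday']
print(handle_event(datetime(2009, 6, 1, 7, 0)))   # a Monday morning
print(handle_event(datetime(2009, 6, 2, 7, 0)))   # a Tuesday morning
```

Note that handle_event is called for every event, in or out of schedule, which mirrors the point above: the workflow still fires and the filter is what drops the data.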

Note also that the wizard allows for choosing times that the rule should be processed as well as when it should not.

clip_image032

There is one more thing we need to do for our scheduled rule in order for it to work.  Boris points this out on his blog which is where I discovered it!  There is a bug in the authoring console where a needed snip of XML is not added.  Back at the main schedulefilter window, select edit to pull up the XML section we need to modify.

clip_image030
After setting the schedule and clicking edit, the XML section that shows up for editing will be as follows. 

<Configuration p1:noNamespaceSchemaLocation="C:\Documents and Settings\Administrator.STARTREKNG\Local Settings\Temp\1\CD - System.SchedulerFilter.xsd" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance">
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <Schedule>
      <WeeklySchedule>
        <Windows>
          <Daily>
            <Start>06:00</Start>
            <End>23:00</End>
            <DaysOfWeekMask>42</DaysOfWeekMask>
          </Daily>
        </Windows>
      </WeeklySchedule>
      <ExcludeDates></ExcludeDates>
    </Schedule>
    <TimeXPathQuery>TimeXPathQuery</TimeXPathQuery>
  </SchedulerFilter>
</Configuration>

All that is needed is a simple change: replace the TimeXPathQuery element with a UseCurrentTime element set to true, as shown.  Note that XML edited for use in the authoring console is case sensitive.

<Configuration p1:noNamespaceSchemaLocation="C:\Documents and Settings\Administrator.STARTREKNG\Local Settings\Temp\1\CD - System.SchedulerFilter.xsd" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance">
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <Schedule>
      <WeeklySchedule>
        <Windows>
          <Daily>
            <Start>06:00</Start>
            <End>23:00</End>
            <DaysOfWeekMask>42</DaysOfWeekMask>
          </Daily>
        </Windows>
      </WeeklySchedule>
      <ExcludeDates></ExcludeDates>
    </Schedule>
    <UseCurrentTime>true</UseCurrentTime>
  </SchedulerFilter>
</Configuration>

You only need to change this section for the RULE – not the monitor!

Save the XML section – if there are errors you should get a warning. 
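If you would rather script this edit than make it by hand, the change can be sketched in a few lines of Python.  This is just an illustration using the standard library, not something the authoring console provides.  The helper name use_current_time is made up, and the sample is a trimmed copy of the configuration above with the p1 schema attributes left out to keep it short.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration the authoring console generates.
CONFIG = """<Configuration>
  <SchedulerFilter>
    <ProcessDataMode>OnSchedule</ProcessDataMode>
    <Schedule>
      <WeeklySchedule>
        <Windows>
          <Daily>
            <Start>06:00</Start>
            <End>23:00</End>
            <DaysOfWeekMask>42</DaysOfWeekMask>
          </Daily>
        </Windows>
      </WeeklySchedule>
      <ExcludeDates></ExcludeDates>
    </Schedule>
    <TimeXPathQuery>TimeXPathQuery</TimeXPathQuery>
  </SchedulerFilter>
</Configuration>"""


def use_current_time(config_xml):
    """Swap the TimeXPathQuery element for UseCurrentTime set to true."""
    root = ET.fromstring(config_xml)
    for element in root.iter("TimeXPathQuery"):
        element.tag = "UseCurrentTime"  # element names are case sensitive
        element.text = "true"
    return ET.tostring(root, encoding="unicode")


print(use_current_time(CONFIG))
```

The rest of the schedule is left untouched; only the one element is renamed and given a new value.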

With our edits complete, select OK back to the main authoring console screen and save the current version of our management pack.  Now that we’ve built the schedule in the UI it’s time to look at the resulting XML.  It’s not that bad so don’t be nervous!

With the management pack saved, open it in your favorite XML editor – I use Notepad++.  This is a simple management pack – just two rules so far.  When you open the XML it will likely not be collapsed – to me it always helps to start with the XML collapsed to its major sections and start to drill in from there.  Depending on the editor, you may not be able to do this.  The collapsed view in Notepad++ makes our ‘scary XML’ management pack look quite simple – and it really is.

clip_image034

The two rules we are interested in are in the Monitoring Section.  Let’s expand just the monitoring section and take a look at our two rules one at a time.  The first thing we need to understand is that every rule and monitor MUST have a data source.  It may not be obvious in the case of monitors, but there is always a datasource.

The XML below is for rule 1.  It is a bit busy at first glance but the goal of this diagram is to demonstrate how the UI elements (both OpsMgr UI and Authoring Console UI) are represented in XML.  I also include the expanded DisplayStrings section of the XML.  If searching the XML for the name of a UI element (rule/monitor/view/task/etc) you likely will first find a hit in the display strings section and will need to track back from here.  Each UI element is represented by a unique ID that does not appear in the console.  The DisplayStrings section ties together the console name and the element ID name.  Once you discover the element ID name, use it for further searching and you will find the UI element of interest.
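The "search the display strings, then track back" approach can also be sketched in Python.  This is only an illustration: the sample below is a heavily trimmed stand-in for a management pack, the display name used for rule 1 is an assumption (only its element ID appears in this post), and the two helper functions are made up for the example.

```python
import xml.etree.ElementTree as ET

RULE_ID = ("Custom.Event.Rule_Monitor.with.Schedule."
           "Standard.Event.Rule.with.Alert.NO.Schedule")

# Heavily trimmed stand-in for a management pack; the display name is assumed.
SAMPLE_MP = f"""<ManagementPack>
  <Monitoring>
    <Rules>
      <Rule ID="{RULE_ID}" Enabled="true" />
    </Rules>
  </Monitoring>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="true">
      <DisplayStrings>
        <DisplayString ElementID="{RULE_ID}">
          <Name>Standard Event Rule with Alert - NO schedule</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>"""


def element_id_for(root, display_name):
    """Step 1: find the element ID behind a name seen in the console."""
    for display_string in root.iter("DisplayString"):
        if display_string.findtext("Name") == display_name:
            return display_string.get("ElementID")
    return None


def find_element(root, element_id):
    """Step 2: track back to the element that carries that ID."""
    for element in root.iter():
        if element.get("ID") == element_id:
            return element
    return None


root = ET.fromstring(SAMPLE_MP)
element_id = element_id_for(root, "Standard Event Rule with Alert - NO schedule")
print(element_id)
print(find_element(root, element_id).tag)
```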

clip_image036
In the example above the initial line of the Rule definition runs off the screen as indicated by the line trying to correlate the UI target element to where target is shown in XML.  The full line is below for reference.

<Rule ID="Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Rule.with.Alert.NO.Schedule" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">

OK, so hopefully that helps explain how the XML fits in to the picture – but that was just rule 1.  We haven’t seen the XML for rule 2 yet – our scheduled rule.  The only thing different about rule 2 is that it adds a condition detection section.  The XML is below – since we have so many elements in common between rule 1 and 2 I’ve avoided commenting the common elements again and instead highlight only the schedule specific elements.

image
So completes our example with rules – hopefully you have a good understanding about the structure of the XML from this example.  On to monitors.  Building the monitors is also easy but there are more complexities both in the UI and in the XML.  Like our rule example, let’s start by building a simple event monitor in the UI that operates 24/7.  In the authoring console, select the Health Model node, select monitors and in the center monitors section, right-click and select new > Windows Events > Simple > Event Reset as shown

clip_image040
On the general tab enter values for the monitor element ID, the display name, choose a target – Windows Server Operating System in this example, choose System.Health.AvailabilityState as the parent monitor and then choose a category for the monitor – Availability Health in this example, and click next.

clip_image042
On the next two screens, configure an event log and event for our unhealthy event – in this case the application log and event 1003 and click next.

clip_image044

clip_image046
On the next two screens, configure the event log and event that will trigger state back to healthy.  In this case, the application log and event 1004 and click finish.
clip_image048

clip_image050
We stated that this should be an alerting monitor – and there was no UI to configure alerting so once the monitor is saved go into properties of the monitor, select the alerting tab and configure the monitor to generate alerts and to generate an alert when the monitor is in the warning health state.  Once complete, select OK to close the monitor.

clip_image052

With this complete we have a simple event monitor that will operate 24/7 and detect an event 1003 (Unhealthy) and 1004 (Healthy), adjusting state accordingly.

Next we need to create our monitor that will be modified to run on a schedule.  To do so, work back through the steps above but this time use different values as follows:

Element ID:  Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Monitor.with.Alert.WITH.Schedule
Display Name:  Standard Event Monitor with Alert WITH Schedule
Target:  Microsoft.Windows.Server.OperatingSystem
Parent Monitor:  System.Health.AvailabilityState
Category:  AvailabilityHealth
Unhealthy Event Log:  Application
Unhealthy Event:  1005
Healthy Event Log:  Application
Healthy Event:  1006

Once complete, remember to go back to properties and configure the new scheduled monitor for alerting on warning state.  Once the two monitors are built the authoring console UI should appear as follows:

clip_image054

OK – now let’s add our schedule to the second monitor – are you excited?  :)  Remember earlier that I mentioned that every rule and monitor has a data source and that the data source can vary depending on where we are obtaining data – it can even be custom.  For monitors the data source is contained in a section called the monitortype.  So, in order to introduce a schedule (and other modifications – but that’s beyond the scope of this blog entry) we are REALLY interested in the monitor type because that module is where we introduce our customizations. 

The first thing you need to know is the name of the monitor type that your monitor is referencing.  This is found on the definition line for the monitor and generally refers back to a system management pack.  We will look at the full XML in a minute but for now I’ve copied the monitor definition line.  We can tell that the monitortype (TypeID) is the Microsoft.Windows.2SingleEventLog2StateMonitorType.  We also see that the red section, the Windows prefix in front of the exclamation point, tells us that this monitortype is defined in the Microsoft.Windows.Library system library.  The red part of the monitortype definition is an alias that is defined in the references section of the XML.  I’ll show you in the XML shortly how all of this links up.

<UnitMonitor ID="Custom.Event.Rule_Monitor.with.Schedule.Standard.Event.Monitor.with.Alert.WITH.Schedule" Accessibility="Internal" Enabled="true" Target="Windows!Microsoft.Windows.Server.OperatingSystem" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Windows!Microsoft.Windows.2SingleEventLog2StateMonitorType" ConfirmDelivery="true">

Our event monitor works because it references the stated monitortype in the sealed system management pack.  As mentioned, the monitortype section is the place where we want to introduce our customizations – but since it’s in a sealed management pack we can’t adjust it there (we wouldn’t want to anyway).  The solution is to find this monitortype in the system management pack and copy it to our management pack and make an adjustment to use the local copy rather than the system copy.
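As a rough illustration of how an alias in front of the exclamation point leads back to a management pack, here is a small Python sketch.  The References fragment is trimmed (a real reference also carries version and public key token details), and the resolve function is a made-up helper, not part of any OpsMgr tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed references section: only the alias and the management pack ID.
REFERENCES = """<References>
  <Reference Alias="Windows">
    <ID>Microsoft.Windows.Library</ID>
  </Reference>
  <Reference Alias="Health">
    <ID>System.Health.Library</ID>
  </Reference>
</References>"""


def resolve(qualified_id, references_xml):
    """Split 'Alias!Name' and look the alias up in the references section."""
    aliases = {ref.get("Alias"): ref.findtext("ID")
               for ref in ET.fromstring(references_xml).iter("Reference")}
    if "!" not in qualified_id:
        # No alias: the item is expected in the same management pack.
        return None, qualified_id
    alias, name = qualified_id.split("!", 1)
    return aliases[alias], name


print(resolve("Windows!Microsoft.Windows.2SingleEventLog2StateMonitorType",
              REFERENCES))
print(resolve("Health!System.Health.AvailabilityState", REFERENCES))
```

The first call returns Microsoft.Windows.Library as the management pack to open, which is the same conclusion reached above by reading the definition line.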

The needed monitor type is in the microsoft.windows.library management pack (we know this because of our references) – open that system management pack, which will require you to convert it to readable XML (Kevin has a good blog on how to export management packs here).  The Monitor Type of interest, Microsoft.Windows.2SingleEventLog2StateMonitorType, is shown from the microsoft.windows.library management pack.

clip_image056

All we need to do is paste this section inside our current management pack as shown.  Note that only enough of the management pack XML is shown to demonstrate where to place the section correctly.

clip_image058

If you jump ahead of me and try to save and import the management pack at this stage, it will fail!

Note that there are two more curve balls we have to deal with as shown in the XML.  These are dependencies for this moduletype – one is internal to the microsoft.windows.library management pack (note no alias) and the other is in the system (system.library) management pack. Dealing with these issues is very easy.  Let’s take them one at a time. 

Microsoft.Windows.BaseEventProvider – note that there is no alias on these two entries.  Why?  Because the expectation is that the definition for this provider is in the same management pack as was the monitortype we copied.  Remember that the only component we are interested in is the monitortype module so it’s completely fair to simply make an edit to add an alias to refer back to the microsoft.windows.library management pack. 

System!System.ExpressionFilter – In truth these entries wouldn’t cause an error.  Why?  Note that there is already an alias (system) defined on this entry that indicates these filters are in the system.library management pack.   Since the alias is already defined in our management pack it should work fine.
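The alias fix for the copied monitortype can be sketched in Python as well.  Treat this purely as an illustration of the edit: the MemberModules fragment is trimmed to IDs and TypeIDs, the second data source and filter IDs are assumed names, and add_alias is a made-up helper.  It assumes our management pack already references microsoft.windows.library under the alias Windows, as the monitor definition line above does.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the member modules from the copied monitortype.
COPIED_MODULES = """<MemberModules>
  <DataSource ID="FirstDataSource" TypeID="Microsoft.Windows.BaseEventProvider" />
  <DataSource ID="SecondDataSource" TypeID="Microsoft.Windows.BaseEventProvider" />
  <ConditionDetection ID="FirstFilterCondition" TypeID="System!System.ExpressionFilter" />
  <ConditionDetection ID="SecondFilterCondition" TypeID="System!System.ExpressionFilter" />
</MemberModules>"""


def add_alias(modules_xml, alias="Windows"):
    """Prefix every TypeID that has no alias; leave aliased ones alone."""
    root = ET.fromstring(modules_xml)
    for module in root:
        type_id = module.get("TypeID")
        if "!" not in type_id:
            module.set("TypeID", f"{alias}!{type_id}")
    return root


for module in add_alias(COPIED_MODULES):
    print(module.get("ID"), "->", module.get("TypeID"))
```

Only the two BaseEventProvider entries change; the System!System.ExpressionFilter entries already carry an alias and pass through as they are.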

clip_image060

I’ve made the required changes to the XML below (circled in red) and now, if you try, the management pack will import – but we aren’t done with it yet!

clip_image062

OK, with all of the changes made there is one VERY IMPORTANT change we haven’t made yet.  We have to configure our scheduled monitor to use the monitor type we copied into our management pack.  To set this monitortype apart we will rename it and we will also reconfigure the unit monitor to use it.  The changes required are below.

clip_image064

Drum roll please – NOW we can open the XML in the authoring console to see what we’ve actually done.  Make sure you save the management pack with all of the changes and then open it up.  Go to the type library node under monitor types and we see that we have a monitor type defined!

clip_image066

Now, once we know how all the pieces fit together we could have just as easily built the monitor type totally in the UI – but remember I said it helps to know the UI and XML?  This is why – often you will go ‘XML diving’ to come up with an example of how an item should be configured!

So is it always going to be this involved to add a schedule?  No – the more you do this you will find shortcuts – and you can build monitortypes by hand once you have experience – but this is the easiest way to demonstrate how everything comes together.

Let’s see what changes resulted in the monitor type as a result of our XML editing.  Select properties on the monitortype.  First, notice on the General tab that we have an ID for our monitortype but there is no assigned name.  Why?  Monitortypes are not visible in the OpsMgr UI so there is no need for the name field to be complete in XML.  The authoring console, however, requires that this field be complete or it won’t allow changes made in this section to be saved.    Also notice an option to select what runas account this monitortype should use.  If specific credentials are required, select an appropriate runas account.  Our example uses the default.

clip_image068
On the states tab we see  that this monitortype is defined as a two state monitor.  The ID’s listed can be customized but doing so would require additional edits to the monitor itself which references these values.

clip_image070
The member module tab is where things get interesting. This is where all of the modules that make up our monitoring workflow are defined.  We can see that our datasource and filter modules are defined.  Let’s stop there for a minute.  Notice that there are two data sources and two filter modules.  Why?  Because this monitor type is a 2 state event monitor – meaning that there are two event logs and two event ID’s that need to be evaluated.  Thus, two data sources and two filters! 

clip_image072
Staying with the member module tab, this is where we need to add a module to handle scheduling of our monitor.  Select add and enter schedule in the look for box.  Notice that 4 options match the schedule filter.  Which one should we pick?  Looking at the role for each option the answer is quickly clear – we are building a condition detection schedule filter so we will choose the condition detection filter – and there is only one.  Once selected, enter OperationalScheduleCD as the module ID and select OK.

clip_image074
After selecting OK a screen should appear allowing you to configure the new filter.  This is now very similar to what we did with our rule.  Select configure and add a schedule where our monitor operates daily from 6:00 am to 11:00 pm as shown.

clip_image076
clip_image078
Once complete, the member module tab will look as follows

clip_image080
On the Regular tab we pull our modules just created together to form our workflow.  Notice that there are two items we have to handle – a condition where the first event is raised and a condition where the second event is raised.  This brings up an interesting point.  The goal here is to schedule our monitor so that it is only active during certain hours.  But, the REAL goal very likely is to prevent teams from getting alerts/pages when the monitor is supposed to be off.  If, however, the monitor were to detect a condition where a healthy state resumed while monitoring is disabled, we likely would want to process that change.  It all boils down to how the schedule module is implemented. 

clip_image082

The first event raised is the event that causes state to change to unhealthy and alerts (and potentially pages) to fire.  We definitely want to include our schedule filter in the first event workflow.  To do so, put a check mark by the member module labeled OperationalScheduleCD (be sure FirstEventRaised is highlighted) and adjust the workflow to pass information as follows

FirstDataSource ---> FirstFilterCondition
FirstFilterCondition ---> OperationalScheduleCD
OperationalScheduleCD ---> Monitor State Output

You could also have set up the flow differently so that the operational schedule came immediately after FirstDataSource.

With these settings the FirstEventRaised workflow will only complete and output data during scheduled hours.
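In the management pack XML a flow like this is stored as nested Node elements, with the last module in the chain on the outside and the data source innermost.  The Python sketch below builds that nesting from the ordered list of modules and reads it back.  The two helper functions are made up for illustration and the output is trimmed to the detection element only.

```python
import xml.etree.ElementTree as ET


def build_detection(state_id, flow):
    """Nest Node elements so the last module in the flow is outermost."""
    detection = ET.Element("RegularDetection", MonitorTypeStateID=state_id)
    parent = detection
    for module_id in reversed(flow):
        parent = ET.SubElement(parent, "Node", ID=module_id)
    return detection


def flow_order(detection):
    """Read the nesting back as data source first, final module last."""
    order = []
    node = detection.find("Node")
    while node is not None:
        order.append(node.get("ID"))
        node = node.find("Node")
    return list(reversed(order))


first_event = build_detection(
    "FirstEventRaised",
    ["FirstDataSource", "FirstFilterCondition", "OperationalScheduleCD"],
)
print(ET.tostring(first_event, encoding="unicode"))
print(" ---> ".join(flow_order(first_event)))
```

Putting the schedule right after FirstDataSource instead would just mean swapping the last two entries in the list passed to build_detection.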

clip_image084
There is also our SecondEventRaised workflow.  This is the one that will detect an event to put the monitor back in a healthy state so, to me, it doesn’t make sense to put this workflow on a schedule since we want to know that health has returned regardless of when and also because if we miss the healthy event there is no guarantee it will be produced again!  Based on that, no modifications are required for the SecondEventRaised workflow. 
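The effect of scheduling only the first workflow can be shown with a tiny Python model.  This is a simplification for illustration and not how the health service is implemented: the function names are made up, events 1005 and 1006 are the unhealthy and healthy events from the scheduled monitor above, and using Warning as the unhealthy state is an assumption based on the alert configuration.

```python
def next_state(state, event_id, on_schedule):
    """Return the monitor state after a single event arrives."""
    if event_id == 1005:
        # Unhealthy path runs through OperationalScheduleCD,
        # so it only changes state during scheduled hours.
        return "Warning" if on_schedule else state
    if event_id == 1006:
        # Healthy path has no schedule filter and always applies.
        return "Healthy"
    return state


def replay(events, state="Healthy"):
    """Apply a sequence of (event_id, on_schedule) pairs in order."""
    for event_id, on_schedule in events:
        state = next_state(state, event_id, on_schedule)
    return state


# Goes unhealthy in hours, recovers out of hours: the recovery is not lost.
print(replay([(1005, True), (1006, False)]))   # Healthy
# Unhealthy event out of hours: no state change, so no alert.
print(replay([(1005, False)]))                 # Healthy
```

One consequence worth noticing in the second case: an unhealthy event that arrives outside the schedule is dropped for good, so the monitor will not turn unhealthy later when the schedule window opens unless the event is logged again.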

clip_image086
The remaining tabs are not relevant to building our scheduled monitor so we will pass them for now but the On Demand is interesting and deserves some comment.  In the OpsMgr UI health explorer section there is an option to recalculate health.  While this option is present and can be selected for every monitor in health explorer it does not work with every monitor – only those that are built to support On Demand health recalculation!  Ah, so THAT is what the On Demand node in our monitortype is for!  Correct.  IF you want your monitor to support On Demand health recalculation you have to configure the workflow to allow for it on the On Demand tab.  Our monitortype doesn’t support On Demand recalculation so we are passing it here.  A further wrinkle – not every monitortype CAN support recalculation – only those that use a probe based data source. 

OK – BIG EXHALE!!!!  We have now completed building our scheduled monitor!  WooHoo, celebration and excitement!  Before getting too out of hand, save your work in the authoring console and we will take a look at what we’ve done in the XML and see how all of our UI configurations map out.  Like before, I’ll start with our monitor that doesn’t have a schedule to show the core UI components used to build a monitor and how they map to XML.  I’ll then show the XML specific to the monitortype we built, scheduling and how all of those changes link into the monitor.  Also, I will only be showing the portions of XML that are applicable to the subject at hand – some areas will be collapsed or omitted.  Lastly, for completeness I turned on word wrap so we will be able to see the full text rather than having some scroll off the screen.

clip_image088
OK, that’s enough to numb the senses – let’s dig in a bit deeper!  Now that we understand how to correlate a basic event monitor built in the UI to its resulting XML, let’s look at the XML that specifically ties a monitor to its monitortype and associated configuration.  As shown above, monitortypes generally are defined in the system management packs and accessed by reference.  For our scheduling example, we copied the needed monitor type and made some adjustments.  Here’s how it all fits together.  Note again that certain sections that aren’t applicable or that we have already covered are collapsed.

clip_image090
And that’s it!  Hopefully this helps illustrate how XML links together and correlates to the UI.  The sample management pack that was built as part of this blog post is attached for reference.  Happy authoring/scheduling!

SCCM: Forcing a Task Sequence to Rerun

There are well known methods to force an advertisement to rerun – including several add-on tools available for the SMS or SCCM console.  To date, however, there are not equivalent methods to force a task sequence to rerun.  Part of this may be because task sequences are typically thought of as focused on Operating System Deployment (OSD) and rerunning these types of distributions are not as common as rerunning advertisements.

While task sequences are the best solution out there for OS deployments they are much more flexible than just that.  They can distribute software in very complex scenarios, including support for dynamic decisions during execution, handling reboots, enforcing a specific sequence of application deployments, etc.  With this kind of power many organizations are using task sequences for software deployment, and the ability to force a sequence to rerun on a selective basis without having to manually log on to individual clients is crucial.  The process to make this happen is very easy.

First, identify your task sequence by ID.  My test sequence is CEN00027.
clip_image002

Note that in my lab this sequence has already run in the past.
clip_image002[7]

The advertisement for the sequence is set with a mandatory execution time – which resulted in the first run.  No other mandatory times have been added.  Further, the advertisement is set to allow rerunning.  If you had an advertisement set to not rerun you should be able to force it to rerun but this would likely require additional WMI and registry edits.  I haven’t tested that specific scenario.
clip_image002[12]

From here, open WMI on the client system of interest and connect to the root\ccm\scheduler namespace.
clip_image002[14]
Click ‘Enum Classes’, select Recursive and then scroll to the bottom and double click on CCM_Scheduler_History() and then click instances.
clip_image002[16]
clip_image002[18]
clip_image002[20]
In the list that shows up, find the entry that corresponds to your task sequence ID and delete it.
clip_image002[22]
With the deletion made, restart the SMS Agent Host service (CCMExec) on the target client.
clip_image002[24]
In a few minutes, the program balloon will pop up indicating the sequence is about to run again.
clip_image002[26]
The process described is manual but could be automated if desired. 
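
As a starting point for automation, here is a minimal Python sketch of just the selection step: given the IDs of the CCM_Scheduler_History instances (read with whatever WMI tooling you prefer), pick out the ones that mention the task sequence ID.  The helper name and the sample ID strings are hypothetical and made up for illustration; check the actual instance IDs on your own client before relying on a substring match.  Deleting the matching instances and restarting CCMExec would still follow the steps above.

```python
def history_entries_for(task_sequence_id, instance_ids):
    """Return the history instance IDs that mention the given task sequence ID."""
    wanted = task_sequence_id.upper()
    return [i for i in instance_ids if wanted in i.upper()]

# Hypothetical instance IDs, invented for this example only.
sample_ids = [
    "CEN20010-CEN00027-6F6BCC28",
    "CEN20011-CEN00031-6F6BCC28",
]

to_delete = history_entries_for("CEN00027", sample_ids)
print(to_delete)
```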

BING 411

Completely unrelated to anything SCCM or OpsMgr but still worth a post – do you find yourself avoiding dialing information to get phone numbers because it is too expensive?  Check out BING 411.  This service is completely free, will provide detailed information on whatever you are searching for and will also connect the call for you.  And, to top all of that, the system is actually good – voice recognition is very accurate.  Try it out – 1-800-BING-411.

Posted by steverac | 0 Comments

Understanding Monitors in OpsMgr 2007 part III – Dependency Monitors

This is part 3 (and last) of my series of posts describing monitors.  The first post, found here, discussed unit monitors, which are the engine of monitoring. The second post, found here, discussed aggregate monitors, the ‘umbrella’ monitor sitting above unit monitors and reflecting their collective health forward – ultimately all the way to the class level. 

The dependency monitor is used to link classes that are in a hosting or containment relationship together, allowing health state from one class to affect the health of another class higher in the relationship structure than itself.  This type of monitor is confusing for many an OpsMgr admin.  Looking at the relationship structure of classes within OpsMgr one might think that the rollup of class health happens by default – it doesn’t.  Health will only roll up to the individual class level (by using unit and aggregate monitors) unless a dependency monitor is configured between classes.  An illustration from the OpsMgr Authoring Guide might help and is shown below.  In this scenario we have two objects, the SQL Server 2005 object (SQL Server 2005 DB Engine class) and the SQL Server Database object (SQL 2005 DB class).  Using the unit and aggregate monitors, both objects can be monitored individually – but there is nothing that will allow problems with a database (SQL Server 2005 DB class) to reflect on the SQL Server 2005 object itself (SQL Server 2005 DB Engine class).  If we do need the ability to link health between the objects, the dependency monitor is the mechanism to do so.

image

The diagram shows a dependency monitor that takes the health state of the availability aggregate of the SQL Server Database object (SQL 2005 DB class) and rolls it up under the availability aggregate of the SQL Server 2005 object (SQL 2005 DB Engine class).  You could link between the other aggregate categories or on the aggregate for the class itself; it’s up to you.

So how do we build this in the OpsMgr UI?  Let’s walk through it.

First, find the two classes you want to link together and evaluate what monitors are already in place.  In this case, these are SQL 2005 DB and SQL 2005 DB Engine.

image

We see that there is a Database Status unit monitor already configured and it will roll up its health to the availability aggregate and ultimately the entity aggregate but there is no dependency to roll the state up further.  Knowing that this view in the UI isn’t always as complete as we would like (more on that in a minute) and knowing that dependencies are created at the parent class in the relationship, in this case SQL 2005 DB Engine, we take a look at the SQL 2005 DB Engine objects in discovered inventory and look at health explorer for these objects and confirm there is no dependency.

Note:  Here we are specifically looking at the availability aggregate since the unit monitor we want to ‘link’ into is underneath this aggregate in our target class.

image

image 
      Note:  Here I’m looking at health explorer for my test computer object but it really   
      doesn’t make a difference which object you choose out of discovered inventory since   
      the rollup, when created, will be at the class level and will be displayed on all objects as
      a result.

From this we know that there is no dependency, so we will build one.  Back in authoring we select to create a dependency monitor.

image

     Note:  You could also create the dependency monitor (or any other monitor) directly on 
     the node of interest, which will fill in some of the information for you in the wizard.

image

As shown above, the target for the monitor will be the class where we want health to ROLL UP and not the class that is reporting the health.  In this case, SQL 2005 DB Engine will be our target.

The next page is where we link our classes.

image

In the above screen we are configuring our dependency monitor to link into the SQL Database child object and consider the health state of this child object when calculating health of the hosting class.  Notice that in this window we are actually linking to the more generic SQL Database object (the only option) rather than to the specific SQL 2005 Database object. 

OK, wait a minute – this is confusing.  So we want to create a dependency to link the health that results from the ‘database status’ unit monitor that is specifically created to monitor objects in the SQL 2005 DB class – but our dependency can’t link directly to the SQL 2005 DB class but, instead, has to link to the SQL DB class?  Come on Steve – I look directly at the SQL DB class, the one we can link to, and it doesn’t have ANY monitors defined on it.  You also said earlier that without a dependency monitor health rolls up to the top of the CLASS and stops.  In this example that means health would not go past the SQL 2005 DB class so how can I get ANY health rollup by building a dependency to the SQL DB class which has no monitors?  Yes, I understand the confusion but, trust me, it works.  All of what I mentioned earlier is true.  Health doesn’t roll up from class to class unless there is a dependency.  You might call me a liar if you were to cause the database status monitor to go red and then look at the SQL 2005 DB class and the SQL DB class in health explorer.  The screenshots below seem to contradict what I’m saying because when the Database Status monitor is healthy, both classes are healthy and when the Database Status monitor is unhealthy, both classes are unhealthy.

SQL 2005 DB Class                                   SQL Database Class

image image

image image

While my statement that health will NOT roll up past the individual class level without a dependency is true it is also true that in some cases health generated on one class will reflect, not roll up, on its parent class.  The reason for this is a bit complicated to explain but a good way to predict when this will be the case is where one class specializes another class but what is ultimately being described is the same.  In this case, the SQL 2005 DB class is a more specialized form of the SQL DB class so the type of special relationship we are talking about exists.  OK, let’s continue building our dependency monitor!

On the next screen we simply configure how health should be considered.

image

So what impact did this change have?  Take a look back at our two classes and we will now see the dependency monitor.  While the UI doesn’t clearly reflect it we now know based on our configuration and earlier discussion that the new dependency monitor will evaluate the health state of the availability aggregate from our child class when evaluating health.

So with the dependency monitor in place here is how health rollup really happens.  The health state changes on our unit monitor, rolling up to the availability aggregate in the SQL 2005 DB class.  Because of our ‘special’ situation with these classes, health from the SQL 2005 DB availability aggregate will reflect on the availability aggregate of the SQL Database class.  Our dependency monitor, created on the SQL 2005 DB Engine availability aggregate, will ‘link’ into the availability aggregate defined on the SQL Database class and ‘roll up’ health according to the rules defined on the dependency.
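
The chain described here can be modeled in a few lines of Python.  This is a deliberately simplified sketch, not OpsMgr code: the MonitoredClass type, the numeric state values and the worst-of rule are all assumptions chosen for illustration.  It shows unit monitor state rolling up inside a class, reflecting on the class it specializes, and only reaching the DB Engine class once a dependency is added.

```python
HEALTHY, WARNING, CRITICAL = 0, 1, 2

def worst_of(states):
    """Worst state wins; an empty list counts as healthy."""
    return max(states, default=HEALTHY)

class MonitoredClass:
    def __init__(self, name, unit_states=(), base=None):
        self.name = name
        self.unit_states = list(unit_states)  # unit monitors under availability
        self.specializations = []             # classes that specialize this one
        self.dependencies = []                # classes linked by dependency monitors
        if base is not None:
            base.specializations.append(self)

    def availability(self):
        states = list(self.unit_states)
        # Health of a specialized class "reflects" on the class it specializes.
        states += [c.availability() for c in self.specializations]
        # Health "rolls up" across classes only through a dependency monitor.
        states += [c.availability() for c in self.dependencies]
        return worst_of(states)

sql_db = MonitoredClass("SQL Database")
sql_2005_db = MonitoredClass("SQL 2005 DB", unit_states=[CRITICAL], base=sql_db)
engine = MonitoredClass("SQL 2005 DB Engine")

before = engine.availability()      # no dependency monitor yet
engine.dependencies.append(sql_db)  # dependency links to the generic class
after = engine.availability()
print(before, after)
```

Note that the dependency is attached to the generic SQL Database class, just as in the wizard, and the unhealthy state still arrives because of the reflection step.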

image

For another view of this rollup taking place we look at health explorer for the SQL 2005 DB Engine class.  Health explorer is generally the clearest way to see these rollups and make sense of them.

image

Earlier I mentioned that using the authoring section of the UI to try and follow a dependency monitor and understand the classes it links together is potentially problematic.  This is illustrated by the above diagrams.  The dependency is created at the SQL 2005 DB Engine level – and the dependency is visible at this level.  But looking only at the class the dependency links to gives no indication that a dependency is actually in operation.  Health explorer is the right way to visualize the total picture of the health structure.  It still may be difficult to see at a glance which monitors are aggregates or dependencies but by working with the structure a bit, it isn’t that difficult to follow.

Posted by steverac | 0 Comments

Categories Filter Missing in OpsMgr R2 Notifications

In OpsMgr SP1, the option to filter notifications by category was available in the UI. 

image

In R2, many changes were made to improve the flexibility of notification subscriptions but the category option was not included. 

image

Does that mean you can no longer create notifications filtered by category?  In the UI, yes.  But it is possible to include the category filter if you edit the XML itself.  All notification detail in OpsMgr is stored in the ‘Notifications Internal Library’ management pack.  Just export this management pack and you are able to add back your category filters – an example is shown below.  The section specifying category criteria is the <Or> group of expressions on the Category property.  Note that if you choose to make edits to re-enable category filtering you should no longer plan to edit your subscriptions in the UI.  If you do and save a subscription setting, the category sections will be overwritten.

<Monitoring>
  <Rules>
    <Rule ID="Subscription04bafdd7_653e_46c1_aa0f_202d186150dd" Enabled="true" Target="Notification!Microsoft.SystemCenter.AlertNotificationSubscriptionServer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
      <Category>Notification</Category>
      <DataSources>
        <DataSource ID="DS1" RunAs="SystemCenter!Microsoft.SystemCenter.DatabaseWriteActionAccount" TypeID="SystemCenter!Microsoft.SystemCenter.SubscribedAlertProvider">
          <AlertChangedSubscription Property="Any">
            <Criteria>
              <Expression>
                <And>
                  <Expression>
                    <SimpleExpression>
                      <ValueExpression>
                        <Property>Severity</Property>
                      </ValueExpression>
                      <Operator>Equal</Operator>
                      <ValueExpression>
                        <Value>2</Value>
                      </ValueExpression>
                    </SimpleExpression>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Priority</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>2</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Priority</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>1</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>PerformanceCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Operations</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>EventCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>StateCollection</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>SoftwareAndUpdates</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Alert</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>System</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Custom</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>AvailabilityHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>PerformanceHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>ConfigurationHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>SecurityHealth</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Discovery</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Notification</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>Category</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>Maintenance</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                  <Expression>
                    <Or>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>ResolutionState</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>0</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                      <Expression>
                        <SimpleExpression>
                          <ValueExpression>
                            <Property>ResolutionState</Property>
                          </ValueExpression>
                          <Operator>Equal</Operator>
                          <ValueExpression>
                            <Value>255</Value>
                          </ValueExpression>
                        </SimpleExpression>
                      </Expression>
                    </Or>
                  </Expression>
                </And>
              </Expression>
            </Criteria>
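
If you have a long list of categories to add back, typing each SimpleExpression by hand gets tedious.  Here is a minimal Python sketch that builds the category group with the standard xml.etree.ElementTree module.  The category_filter helper is my own name, and the element layout simply copies the export above; you would still paste the result into the exported management pack by hand and re-import it.

```python
import xml.etree.ElementTree as ET

def category_filter(categories):
    """Build <Expression><Or>...</Or></Expression> with one test per category."""
    outer = ET.Element("Expression")
    or_node = ET.SubElement(outer, "Or")
    for name in categories:
        expr = ET.SubElement(or_node, "Expression")
        simple = ET.SubElement(expr, "SimpleExpression")
        # Left side: the alert property being tested.
        ET.SubElement(ET.SubElement(simple, "ValueExpression"), "Property").text = "Category"
        ET.SubElement(simple, "Operator").text = "Equal"
        # Right side: the category value to match.
        ET.SubElement(ET.SubElement(simple, "ValueExpression"), "Value").text = name
    return outer

block = category_filter(["AvailabilityHealth", "PerformanceHealth"])
ET.indent(block, space="  ")  # pretty-print; needs Python 3.9 or later
print(ET.tostring(block, encoding="unicode"))
```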

Posted by steverac | 0 Comments

Understanding Monitors in Opsmgr 2007 part II Aggregate Monitors

This is part 2 of my series of posts describing monitors.  The first post discussed unit monitors and can be found here.  This post discusses aggregate monitors.  As mentioned in the first post, unit monitors can be thought of as the workhorse of monitoring.  Unit monitors are just that – a unit of monitoring.  A self contained engine to monitor a specific item and reflect the result in terms of health state, alerting and diagnostic/recovery.  Aggregates act as a collector and consolidator of information and ultimately reflect the collective result of unit monitors.  For any defined class in OpsMgr there are 5 defined aggregate monitors – Entity Health, Availability, Configuration, Performance and Security.  This is shown below for the Windows Computer object.

image

Until now you might have just thought of these as categories you can use for grouping similar unit monitors together – and they are useful for this – but these are much more than categories.  If we look at the properties of the availability aggregate as an example we quickly see that this monitor itself can be an engine for alerting and is configurable to reflect the health of its contained unit monitors.  We even have diagnostic and recovery options available for an aggregate monitor.

image

image

So when we configure a unit monitor we now understand that the setting to specify a parent monitor isn’t just cosmetic – it’s important.  This setting directly dictates where an unhealthy unit monitor will have an impact.  If we choose availability, the health state of our unit monitor (and all others under the availability aggregate) will be ‘watched’ and their collective health ultimately ‘rolled up’ to and reflected on the aggregate itself.  This offers some interesting possibilities.  If, for example, you don’t want to generate an alert based on a single unit monitor but would prefer to alert only when all of the unit monitors are unhealthy, the aggregate allows you to do just that!
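
The difference between the two rollup choices is easy to see in a small Python sketch.  The function names and the numeric states are assumptions for illustration only; they just model "worst state of any member" versus "best state of any member".

```python
HEALTHY, WARNING, CRITICAL = 0, 1, 2

def worst_state_rollup(unit_states):
    """Aggregate is as unhealthy as its least healthy unit monitor."""
    return max(unit_states)

def best_state_rollup(unit_states):
    """Aggregate only goes unhealthy once every unit monitor is unhealthy."""
    return min(unit_states)

units = [CRITICAL, HEALTHY, HEALTHY]
print(worst_state_rollup(units), best_state_rollup(units))
```

With one critical unit monitor out of three, the worst-state rollup turns the aggregate critical while the best-state rollup stays healthy, which is the "alert only when all are unhealthy" behavior.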

So we have the availability, configuration, performance and security aggregates and we understand that these are default aggregates that are part of every monitoring object and we understand that unit monitors that are configured under each directly ‘roll up’ their collective health to the aggregates.  In addition to this, we can create our own aggregates and plug them into the monitoring structure.  So, for example, if we wanted to subdivide our unit monitors under the availability aggregate we could create another aggregate inline as shown.

image

Here we have 6 unit monitors that operate and generate state that is collected by the FileSystem Monitors aggregate.  So, the lower six unit monitors do not, on their own, have a direct impact on availability health.  In this example, if any of the 6 unit monitors under FileSystem Monitors go unhealthy that state will be reflected on the FileSystem Monitors aggregate itself and the health of the FileSystem Monitors aggregate will be reflected forward to the general Availability aggregate.

OK, so we have the four categories of unit monitors and we know we can add our own aggregate monitors and specify where they should plug into the category model.

image 

image

But, even these four top level categories of aggregate monitors themselves combine and are rolled up to the top most aggregate for a class – the entity aggregate.

image

So, ultimately, the overall health of the availability, configuration, performance, security and whatever other custom aggregates are in place on an object are taken into account based on their ‘watched’ unit monitors and rolled forward to the Entity aggregate.  The Entity aggregate will reflect the overall health of the object in health explorer so at a glance we can tell if the monitored object, in this case Windows Computer, has any issue causing it to be unhealthy.  I’ve put health explorer for a Windows Computer object next to the same object in the authoring > monitors view to show how they map together.

image

So we have the health of all unit monitors targeted to a particular class rolling their state forward to finally be shown in the health of the class itself.  So once we have the health reflected on the class, where do we go from there?  A common thought is that the health just continues to roll up along the relationship chain between objects.  That is not correct.  In terms of health state, the buck stops at the class itself.  But, wait – you are about to scream that you have seen an unhealthy class roll up to and impact the health of a class higher up in the relationship chain….yes you have – but that isn’t done automatically.  It requires dependency monitors and they will be the focus of part III of our series.

Posted by steverac | 2 Comments