Azure Operational Insights Search How To: Part VII – Measure Sum() and Where command

    This is the seventh installment of a Series that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators.

    In the second post I built upon the concepts of the first one, and introduced some more complex flavors of filters that are possible. Now you should know all you need to extract the data set you need.

    In the third post I introduced the use of the pipeline symbol “|” and how to shape your results with search commands.

    In the fourth post I introduced our most powerful command – measure – and used it with just the simplest of the statistical functions: count().

    In the fifth post I expanded on the measure command and showed the Max() statistical function.

In the sixth post I continued with measure’s statistical functions – I showed how Avg() is useful with Performance data among other things.

     

By now you should have grasped how ‘measure’ works, so I will not actually spend time on the Sum() function, other than this mention: if you want to see an example of its use, refer to this blog post http://blogs.msdn.com/b/dmuscett/archive/2014/09/20/w3c-iis-logs-search-in-system-center-advisor-limited-preview.aspx where I already showed how to use it to get the aggregate amount of traffic downloaded by a given client IP from a webserver (IIS logs):

    Type=W3CIISLog | Measure Sum(scBytes) by cIP

One additional, interesting thing to notice before we move on: you can use Max() and Min() with numbers, datetimes and strings. With strings, the values basically get sorted alphabetically and you get the first and the last.

You cannot, however, use Sum() – which does a REAL calculation – with anything but numerical fields. The same applies to Avg().
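For instance, here is a quick sketch of Max() over a string field (assuming the Type=Event data set from the earlier posts, where ‘Computer’ is a string field): this would return the alphabetically last computer name for each event log:

Type=Event | Measure Max(Computer) by EventLog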

     

    Where

    The last command we have is Where.

    Where works like a filter, but it can be applied in the pipeline to further filter ‘aggregated’ results that have been produced by a Measure command – as opposed to ‘raw’ results that get filtered at the beginning of a query.

    Given this query:

    Type=PerfHourly  CounterName="% Processor Time"  InstanceName="_Total" | Measure Avg(SampleValue) as AVGCPU by Computer

I can add another pipe “|” character and the Where command to only get the computers whose average CPU is above 80%:

    Type=PerfHourly  CounterName="% Processor Time"  InstanceName="_Total" | Measure Avg(SampleValue) as AVGCPU by Computer | Where AVGCPU>80
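And, as with the other commands, you can keep piping – a sketch that additionally sorts the offenders by their average CPU, reusing the Sort command from the third post:

Type=PerfHourly  CounterName="% Processor Time"  InstanceName="_Total" | Measure Avg(SampleValue) as AVGCPU by Computer | Where AVGCPU>80 | Sort AVGCPU desc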

    You see what this represents?

If you are a System Center person, talking in ‘management pack terms’: if this was a rule, the first part would be the ‘data source’, while the Where command becomes the ‘condition detection’. (See my other dissertation on the similarities between ‘rules’ and ‘searches’ in this other post http://blogs.msdn.com/b/dmuscett/archive/2014/11/05/iis-mp-event-alerting-rules-s-opinsights-searches-equivalents.aspx )

    Where’s the write action, you might ask?

Well, if you bind this to a tile in ‘My Dashboard’, this essentially is a ‘monitor’ that lets you see in real time if machines are under CPU pressure – soon even on your phone – as shown below in the bottom two tiles, both as a list and as a number: you basically always want the number to be zero and the list to be empty. Otherwise it essentially indicates an alert condition, and you can peek at which machines are under pressure:

    Azure Operational Insights Mobile App - My Dashboard

     

    Welcome to monitoring in the modern world!

For other styles of ‘write actions’, consider this idea in our feedback forum for the actual creation of alerts and other notifications (email, etc.) whenever those ‘where’ criteria (really, if you think about them, those are ‘thresholds’) are met http://feedback.azure.com/forums/267889-azure-operational-insights/suggestions/6519198-run-saved-search-on-a-schedule-raise-alert-and-or or this other one to kick off an automation runbook in response to a query matching results http://feedback.azure.com/forums/267889-azure-operational-insights/suggestions/6658045-add-ability-to-execute-runbook-automation-from-aoi

This concludes the planned ‘HowTo’ series on the search syntax. I hope it was useful and enjoyable. The idea was to let new users read one post a day and, having learned the syntax, be productive with it within a week!

    As we add more capabilities and commands I will continue it. Please remember the official syntax reference can be found here http://technet.microsoft.com/en-us/library/dn500940.aspx and that Stefan Roth has also produced a handy cheat sheet on his blog http://stefanroth.net/2014/11/05/microsoft-azure-operational-insights-search-data-explorer-cheat-sheet/

I will also likely blog some more scenario-focused posts, i.e. take a use case and walk thru more examples of ‘searches in action’.

    I also keep updating this other post as I come up with some new useful queries http://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-searches-collection.aspx

    Remember the feedback forum is always open for you to vote and ask what you would like us to improve in the service at http://feedback.azure.com/forums/267889-azure-operational-insights/

    Happy searching!

IIS MP Event-Alerting Rules’ OpInsights Searches Equivalents

Just for kicks, the other day I ran an Excel export of the IIS8 MP with MPViewer, applied a bunch of filters, removed several columns, and copied all the key info for the Alerting Rules that are based on Windows Events out to Notepad. Then I added field names and constructed equivalent search queries. The whole thing took me about 20 minutes of find&replace-Fu…

    MPViewer's Excel Export of the IIS2012R2MP - Alerting Rules

Why did I do this, you might be asking? Well, you must have realized that, with Azure Operational Insights’ Log Management and Search capabilities, you can collect the same event logs (mostly the System and Application event logs is what IIS uses) and, using the Search query syntax, get those same results – wait, if those were alerting rules, this means that when one of those events was seen, it would generate an alert, right?

Well, you must have noticed that in a dashboard we feature a couple of styles of tiles; in particular, you can see a time-based distribution of occurrences of search results (those events) – so you can keep under control if and WHEN those errors happen, in an easy-to-reach way:

    Time distribution of ASP Errors - Tile in 'My Dashboard' @OpInsights

or – if you don’t want those events to ever occur (after all, if they occur, you would alert, right?) – you can keep them under control with the numeric tile and set a threshold: if more than ZERO (or more, if you want to tolerate a few – and you can add a time filter to the query to make it act on the most recent time period… this would be your ‘repeated event’ criteria, in ‘management pack’ terms…) then you can color the tile:

    Total ASP Errors in the observed time range - Tile in 'My Dashboard' @OpInsights
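As a sketch of that ‘repeated event’ idea, take one of the searches from the list below and add a time filter to it – the one-hour window here is just an example of a threshold period:

EventID=2216 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application TimeGenerated>NOW-1HOUR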

     

     

In the future, we’d like to enable long-running or scheduled searches that would produce ‘actions’ – i.e. produce an alert, notify via email, kick off a runbook… – and to let you visualize your dashboard on your smartphone! Check out and vote for those ideas, and let’s enable monitoring in a modern hybrid world!

     

And now, here are the searches I produced by running this small experiment. Try them out, and have fun searching! (And you can try repeating the experiment yourself with other MPs, too.)

     

    A script has not responded within the configured timeout period
    EventID=2216 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    A server side include file has included itself or the maximum depth of server side includes has been exceeded
    EventID=2221 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    ASP application error occurred
    EventID=500 OR EventID=499 OR EventID=23 OR EventID=22 OR EventID=21 OR EventID=20 OR EventID=19 OR EventID=18 OR EventID=17 OR EventID=16 OR EventID=9 OR EventID=8 OR EventID=7 OR EventID=6 OR EventID=5 Source="Active Server Pages" EventLog=Application

    HTTP control channel for the WWW Service did not open
    EventID=1037 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    HTTP Server could not create a client connection object for user
    EventID=2208 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    HTTP Server could not create the main connection socket
    EventID=2206 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    HTTP Server could not initialize its security
    EventID=2201 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    HTTP Server could not initialize the socket library
    EventID=2203 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    HTTP Server was unable to initialize due to a shortage of available memory
    EventID=2204 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    ISAPI application error detected
    EventID=2274 OR EventID=2268 OR EventID=2220 OR EventID=2219 OR EventID=2214 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    Module has an invalid precondition
    EventID=2296 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    Module registration error detected (failed to find RegisterModule entrypoint)
    EventID=2295 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    Module registration error detected (module returned an error during registration)
    EventID=2293 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    Only one type of logging can be enabled at a time
    EventID=1133 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    SF_NOTIFY_READ_RAW_DATA filter notification is not supported in IIS 8
    EventID=2261 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The configuration manager for WAS did not initialize
    EventID=5036 Source="Microsoft-Windows-WAS" EventLog=System

    The directory specified for caching compressed content is invalid
    EventID=2264 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The Global Modules list is empty
    EventID=2298 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The HTTP server encountered an error processing the server side include file
    EventID=2218 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The server failed to close client connections to URLs during shutdown
    EventID=2258 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The server was unable to acquire a license for a SSL connection
    EventID=2227 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The server was unable to allocate a buffer to read a file
    EventID=2233 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    The server was unable to read a file
    EventID=2226 OR EventID=2230 OR EventID=2231 OR EventID=2232 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    WAS detected invalid configuration data
    EventID=5174 OR EventID=5179 OR EventID=5180 Source="Microsoft-Windows-WAS" EventLog=System

    WAS encountered a failure requesting IIS configuration store change notifications
    EventID=5063 Source="Microsoft-Windows-WAS" EventLog=System

    WAS encountered an error attempting to configure centralized logging
    EventID=5066 Source="Microsoft-Windows-WAS" EventLog=System

    WAS encountered an error attempting to look up the built in IIS_IUSRS group
    EventID=5153 Source="Microsoft-Windows-WAS" EventLog=System

    WAS encountered an error trying to read configuration
    EventID=5172 OR EventID=5173 Source="Microsoft-Windows-WAS" EventLog=System

    WAS is stopping because it encountered an error
    EventID=5005 Source="Microsoft-Windows-WAS" EventLog=System

    WAS received a change notification, but was unable to process it correctly
    EventID=5053 Source="Microsoft-Windows-WAS" EventLog=System

    WAS terminated unexpectedly and the system was not configured to restart it
    EventID=5030 Source="Microsoft-Windows-WAS" EventLog=System

    Worker process failed to initialize communication with the W3SVC service and therefore could not be started
    EventID=2281 Source="Microsoft-Windows-IIS-W3SVC-WP" EventLog=Application

    WWW Service did not initialize the HTTP driver and was unable to start
    EventID=1173 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW service failed to configure the centralized W3C logging properties
    EventID=1135 OR EventID=1134 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW Service failed to configure the HTTP.SYS control channel property
    EventID=1020 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW service failed to configure the logging properties for the HTTP control channel
    EventID=1062 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW Service failed to copy a change notification for processing
    EventID=1126 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW Service failed to enable end point sharing for the HTTP control channel
EventID=1175 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW service failed to enable global bandwidth throttling
    EventID=1071 OR EventID=1073 Source="Microsoft-Windows-IIS-W3SVC" EventLog=System

    WWW service property failed range validation
    EventID=5067 Source="Microsoft-Windows-WAS" EventLog=System

Azure Operational Insights Search How To: Part VI – Measure Avg(), and an exploration of Type=PerfHourly

    This is the sixth installment of a Series that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators.

    In the second post I built upon the concepts of the first one, and introduced some more complex flavors of filters that are possible. Now you should know all you need to extract the data set you need.

    In the third post I introduced the use of the pipeline symbol “|” and how to shape your results with search commands.

    In the fourth post I introduced our most powerful command – measure – and used it with just the simplest of the statistical functions: count().

In the fifth post I expanded on the measure command and showed the Max() statistical function, and gave ‘homework’ to try Min() on your own.

     

This time, I will show you another favorite of mine to use with measure: the Avg() statistical function. As you can imagine, this allows you to calculate the average value of some field, and to group results (as usual with Measure – we discussed this in the 4th post) by some (other, or the same) field.

     

    This is useful in a variety of cases, my favorite one being Performance data, but there are other interesting use cases.

    Let’s start with Performance data, anyhow, as it probably is the easiest to understand.

    Before I do that, let me remind you that – at the time of this writing – we only collect specific fabric-related performance counters for VMM and HyperV hosts as part of the ‘Capacity’ Intelligence Pack.

    We are anyhow tracking ideas to collect Custom-defined Performance Counters on our feedback forum – go ahead and vote on them if you would like us to enable that functionality!

For these ideas, we would also like to understand the granularity of collection you expect to see – the performance data currently collected by the Capacity Intelligence Pack is only indexed in pre-computed hourly aggregations. What does this mean? It means that for each Computer\Performance Object\Performance Counter Name\Instance Name combination (as in perfmon) we currently index ONE RECORD in search for each hourly interval (24/day).
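To put some purely hypothetical numbers on that: 20 computers each reporting 10 counter instances would index 20 × 10 × 24 = 4,800 PerfHourly records per day, regardless of how frequently the underlying counters were sampled.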

Let’s see what this looks like in practice. If I search for ALL performance data, the most basic query would be

    Type=PerfHourly

The first thing you notice is that OpInsights also shows you charts of those performance counters.

This is convenient, of course, but let’s scroll to the bottom and look at the actual records behind those charts, and at what they look like:

    PerfHourly record in OpInsights Search

     

Look at the screenshot above and the two sets of fields I circled in red, and notice a few things:

• the first set lets you identify the Windows Performance Counter Name, Object Name and Instance Name to use in your query filter. These are the fields you will probably most commonly use as facets/filters.
• ‘SampleValue’ is the actual value of the counter (more on this in a second…)
• Type=PerfHourly: this is an hourly aggregate
• TimeGenerated: it reads 21:00 – this record is the aggregation for the hourly bucket from 20:00 to 21:00
• SampleCount: the aggregation was computed using 12 samples (one every 5 minutes)
• the minimum, maximum and 95th percentile for the hourly bucket were – in this case, for memory on one of my VMs – always 6144 (megabytes)
• ‘SampleValue’ – back to it – since this record is an hourly aggregate, ‘SampleValue’ is populated with the AVERAGE for the hourly bucket.
  • this field is what is used to plot the performance charts
  • we kept this name (rather than a more explicit ‘Avg’) because this way we can later introduce ‘raw’ (non-pre-aggregated) performance data (where ‘SampleValue’ will be the actual value picked from perfmon at a specific time, unprocessed), and by keeping the field name the same, both the perf charts and the most common queries you’d write will continue to work, unchanged, across ‘raw’ and ‘aggregated’ data.

So, after all this explanation of the PerfHourly record ‘shape’, and having read the previous blog posts, you can now understand how to use measure Avg() to aggregate this very ‘numerical’ type of data.

    Simple example:

    Type=PerfHourly  ObjectName:Processor  InstanceName:_Total  CounterName:"% Processor Time" | Measure Avg(SampleValue) by Computer


By now, with all the previous blog posts and examples on filtering and on using measure with other functions, this should be clear – we select the CPU Total Time performance counter and we ask for an Average by Computer. Easy.

You might notice that, since ‘SampleValue’ is already an average, you are really asking for an average of averages. That’s correct with Type=PerfHourly at the moment, but you can see how this will be more precise and useful when we’ll have a raw Type=Perf, or similar, non-pre-aggregated type. For the time being, we suggest always throwing a filter on TimeGenerated into the query (see the 2nd post, where we talked about time filters) to restrict the operation to a small/recent dataset – i.e. the last 4 hours, not 7 days!

    So our query above becomes

    Type=PerfHourly  ObjectName:Processor  InstanceName:_Total  CounterName:"% Processor Time" TimeGenerated>NOW-4HOURS | Measure Avg(SampleValue) by Computer

    Try it now. You will see this recent average will generally be higher.

    Or with a twist, you could calculate the average of the Maximum hourly values, i.e.

    Type=PerfHourly  ObjectName:Processor  InstanceName:_Total  CounterName:"% Processor Time" TimeGenerated>NOW-4HOURS | Measure Avg(Max) by Computer
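Symmetrically, you could average the hourly minimums instead – a sketch, assuming the hourly minimum is stored in a ‘Min’ field alongside the ‘Max’ field we saw on the PerfHourly record:

Type=PerfHourly  ObjectName:Processor  InstanceName:_Total  CounterName:"% Processor Time" TimeGenerated>NOW-4HOURS | Measure Avg(Min) by Computer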

     

Even more interesting – one of my absolute favorite scenarios – is aggregating/correlating data ACROSS machines, something that was VERY HARD to do with System Center Operations Manager and that I really wanted to simplify.

Let’s imagine we have a set of hosts in some sort of farm, where each node is equal to any other one, they all do the same type of work, and load should be roughly balanced… I could get their counters all in one go with the following query and get averages for the entire farm! Let’s start with choosing the computers:

Type=PerfHourly AND (Computer="SERVER1.contoso.com" OR Computer="SERVER2.contoso.com" OR Computer="SERVER3.contoso.com")

(Note that today you have to do an OR – but this could become a subsearch in the future, which could be used to express a ‘dynamic group’ – check out and vote for this idea on our feedback forum in this regard http://feedback.azure.com/forums/267889-azure-operational-insights/suggestions/6519209-allow-subqueries-in-the-search-language-in-not )

     

In any case, now that we have the computers, we also only want to select two KPIs – % CPU Usage and % Free Disk Space. This part of the query becomes:

    Type=PerfHourly  InstanceName:_Total  ((ObjectName:Processor AND CounterName:"% Processor Time") OR (ObjectName="LogicalDisk" AND CounterName="% Free Space")) AND TimeGenerated>NOW-4HOURS

If we now put it all together – machines and counters:

Type=PerfHourly  InstanceName:_Total  ((ObjectName:Processor AND CounterName:"% Processor Time") OR (ObjectName="LogicalDisk" AND CounterName="% Free Space")) AND TimeGenerated>NOW-4HOURS AND (Computer="SERVER1.contoso.com" OR Computer="SERVER2.contoso.com" OR Computer="SERVER3.contoso.com")

And now that we have this very specific selection, the measure Avg() command can tell us the average not by computer, but across the farm, simply by grouping by CounterName:

Type=PerfHourly  InstanceName:_Total  ((ObjectName:Processor AND CounterName:"% Processor Time") OR (ObjectName="LogicalDisk" AND CounterName="% Free Space")) AND TimeGenerated>NOW-4HOURS AND (Computer="SERVER1.contoso.com" OR Computer="SERVER2.contoso.com" OR Computer="SERVER3.contoso.com") | Measure Avg(SampleValue) by CounterName

which gives me this beautiful, compact view over a couple of my farm’s KPIs, at a glance:

    | measure avg(SampleValue) by CounterName

and this, by the way, looks great in a dashboard.

I hope that with these examples you are starting to see how easy it is to quickly look at your environment’s data by using the search filters and commands.

    Till next time, happy searching!

Azure Operational Insights Search HowTo: Part V – Max() and Min() Statistical functions with Measure command

This is the fifth installment of a Series (I don’t know yet how many posts there will be in the end, but I had at least 5 in mind at this point… and as I am writing I realize I won’t be done with this one…) that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators.

    In the second post I built upon the concepts of the first one, and introduced some more complex flavors of filters that are possible. Now you should know all you need to extract the data set you need.

    In the third post I introduced the use of the pipeline symbol “|” and how to shape your results with search commands.

    In the fourth post I introduced our most powerful command – measure – and used it with just the simplest of the statistical functions: count().

     

So let’s continue from where we left off, and explore some of the other statistical functions you can use with measure.

    Measure Max() and Measure Min()

There are various scenarios where these are useful. I will only illustrate Max() and will leave Min() as an exercise for the reader, since it does the exact opposite.

Let’s start with a simple example. If I query for ‘Advisor’ Configuration Assessment Alerts, they have a ‘Severity’ property whose value is either 0, 1 or 2 (meaning info/warning/critical):

    Type=Alert

If I want to see the HIGHEST severity value across all of the alerts sharing a common ‘Computer’ (the ‘group by’ field), I can write

    Type=Alert | Measure Max(Severity) by Computer

and it will show me that – for the computers that had ‘Alert’ records – most of them have at least one critical alert, while the ‘BaconSCOM’ machine has a warning as its ‘worst’ severity.

This of course works well with NUMBERs, but it can also work with DateTime fields; for example, it is very useful to check the last/most recent timestamp of any piece of data indexed for each computer, i.e.

    When was the most recent configuration change reported by change tracking Intelligence Pack for each machine?

    Type=ConfigurationChange | Measure Max(TimeGenerated) by Computer


     

    Hope this got you started on measure Max() – now try measure Min().
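For instance – a sketch mirroring the Max() query above – the earliest configuration change on record for each machine would be:

Type=ConfigurationChange | Measure Min(TimeGenerated) by Computer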

In the next installment we’ll look at the Avg() (average) statistical function with the measure command – particularly interesting with performance/capacity data!

    Stay tuned for that, and happy searching!

Azure Operational Insights Search How To: Part IV – Introducing the MEASURE command

This is the fourth installment of a Series (I don’t know yet how many posts there will be in the end, but I have at least 5 in mind at this point…) that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators.

    In the second post I built upon the concepts of the first one, and introduced some more complex flavors of filters that are possible. Now you should know all you need to extract the data set you need.

    In the third post I introduced the use of the pipeline symbol “|” and how to shape your results with search commands.

     

Today I will start talking (it will take more than one post) about our most versatile command so far: Measure.

Measure allows you to apply statistical functions to your data and aggregate results ‘grouped by’ a given field. There are multiple statistical functions that Measure supports. It might all sound complicated at this point, but as we walk thru some of them with examples I’m sure they’ll become clearer.

    Measure count()

The first statistical function we’ll work with (and the simplest to understand) is the count() function.

    Given a search query, i.e.

    Type=Event


you should already know that the ‘filters’ (previously called ‘facets’) on the left side of the screen show you a distribution of values by a given field for the results of the search you executed.

For example, in the screenshot above I am looking at the ‘Computer’ field – it tells me that, within the almost 3 million ‘Events’ I got back as results, there are 20 unique/distinct values for the ‘Computer’ field in those records. The tile only shows the top 5 (the 5 most common values written in the ‘Computer’ field), sorted by the number of documents that contain that specific value in that field. From the screenshot I can see that – among those almost 3 million events – 880 thousand come from the DMUSCETT-W2012 computer, 602 thousand from the DELLMUSCETT machine, and so forth…

What if I want to see all the values, since the tile shows only the top 5?

That’s what the measure command will let you do with the count() function. This function takes no parameters; you just specify the field you want to ‘group by’ – the ‘Computer’ field in this case:

    Type=Event | Measure count() by Computer


But ‘Computer’ is just a field IN each piece of data – there are no relational databases here, and there is no separate ‘Computer’ object anywhere. Only the values IN the data can tell which entity generated them, along with a number of other characteristics and aspects of the data – hence the term ‘facet’. But you can just as well group by other fields. Since our ‘original’ results (the almost 3 million events that we are piping into the ‘measure’ command) also have a field called EventID, we can apply the same technique to group by that field and get a count of events by EventID:

    Type=Event | Measure count() by EventID


And if you are not interested in the actual ‘count’ of records that contained a specific value, but only want a list of the values themselves, try adding a ‘Select’ command at the end and just select the first column:

    Type=Event | Measure count() by EventID | Select EventID


and you can even get fancy and pre-sort the results in the query (or you can just click the columns in the grid, too):

    Type=Event | Measure count() by EventID | Select EventID | Sort EventID asc

You should have gotten the idea from this couple of examples. It should be fairly straightforward. Try doing your own searches featuring Measure count() now!

There are a couple of important things and caveats to notice and/or emphasize:

1. The ‘Results’ we are getting are NOT the original ‘raw’ results anymore – they are ‘aggregated’ results, essentially ‘groups’ of results. Nothing to worry about; you just need to understand that you are interacting with a very different ‘shape’ of data (different from the original ‘raw’ shape) that gets created on the fly as a result of the aggregation/statistical function.
2. Measure count() today (at the time of this writing) only returns the TOP 100 distinct results. This limit does not apply to the other statistical functions we’ll talk about later. We have a tracking item on the feedback forum that you might want to vote on if this limit is annoying to you. The workaround you have today is to apply a more granular filter first (looking for more specific things) before the measure count() command – i.e. rather than asking for all computers that have reported events, you are probably only interested in the computers that have reported a SPECIFIC error EventID, and similar scenarios – see the sketch below.
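For example, a sketch of such a pre-filtered count (the EventID here is just an example value):

Type=Event EventID=2110 | Measure count() by Computer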

    In the next installment we’ll look at other statistical functions such as AVG, MIN, MAX, SUM and more!

    Till then, happy searching!

    Remember that all of the search-related feature requests we have are visible and tracked on the Azure Operational Insights feedback forum in their own category. Come and vote on the ones that matter to you, and suggest your own ideas!

Azure Operational Insights Search How To: Part III – Manipulating Results: the pipeline “|” and Search Commands

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

This is the third installment of a Series (I don’t know yet how many posts there will be in the end, but I have at least 5 in mind at this point…) that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators.

    In the second post I built upon the concepts of the first one, and introduced some more complex flavors of filters that are possible. Now you should know all you need to extract the data set you need.

     

    In this post we’ll look at how you can manipulate and have control over those results, once retrieved, by applying commands to transform them.

     

Commands in Advisor Search MUST follow the vertical pipe sign “|”. A filter must always be the first part of a query string: it tells the system what data set you’ll be working on, and you then “pipe” those results into a command. You can then further pipe them into another command, and so on.

    This is loosely similar to the Windows PowerShell pipeline.

    [Start PowerShell digression]

In general, the Advisor Search language tries to follow PowerShell style and guidelines to make it ‘sound’ familiar to our ITPro audience and to ease the learning curve. Anyhow, Advisor Search is not identical to PowerShell, for a number of reasons – mostly the fact that this is a specialized query language, not a general-purpose scripting language. In Advisor Search, all we do is GET data. We can’t really call methods, we don’t have functions, we don’t have loops or flow control… none of that. Our use case is just pulling some data we have previously collected, and shaping it to some extent so that it tells us something more useful and lets us unlock insights.

Yes, our data has ‘Types’, but we discussed in the first post how these are not real object types/classes – they are just a property on each record. There are no objects here – only data. Therefore, there are some things which we intentionally simplified from full-blown PowerShell syntax, given the more specialized use case of Search. Not having real objects and types, we considered it superfluous and redundant to use the Verb-Noun command format. I.e. in PowerShell you would use the Get-Process cmdlet and you’d get back actual .NET process objects you can interact with… but all our commands just work on pipeline input, which doesn’t have any type or ‘Noun’ – it’s just a ‘Result’ – and yes, we ‘got it’, but there is no need to explicitly ask to ‘Get’ something, since that’s all that Search does anyway!

We thought it would be stupid to force everyone to start their queries with Get-Result, for example, followed by the actual filter. We thought we’d just start WITH the filter – which, yes, obviously GETs you some ‘results’ – and then we can ‘pipe’ those ‘results’ into a command to transform and shape them before presenting them to us. Since all commands deal with ‘results’ coming from pipeline input, there is no need for a Noun in the command name (all commands are applicable to all results – no strong types, remember?), and all we have in our commands are VERBS. Now, when it comes to choosing verbs, we do stick to the PowerShell guidance and we think very hard before doing anything that doesn’t follow the PowerShell VERBS. We think this will empower you to intuitively tell what the ‘Sort’ or ‘Limit’ commands would do to the results, without looking up the documentation too often.

    [End PowerShell digression]

    Stefan had asked, but I could not answer thoroughly in 140 characters. Hence the long version.

     

So, we were saying: after a pipe, we can use commands, and those have VERB names so you can tell what they do. So let’s test our theory and see if you can tell what certain commands would do. I’ll describe them below; see if it makes sense.

     

    The very first command I want to introduce is SORT.

As you’d suspect, SORT allows you to define the sorting order by one (or multiple) fields. Even if you don’t use it, by default we enforce a descending order by time (= most recent results are always on top). This means that when you run a search, say

    Type=Event EventID=1234

    what we really execute for you is

    Type=Event EventID=1234 | Sort TimeGenerated desc

just because that is the type of experience you are used to with logs, i.e. the event viewer in Windows.

    Anyhow, you can use Sort to change the way we return results, i.e.

    Type=Event EventID=1234 | Sort TimeGenerated asc

    Type=Event EventID=1234 | Sort Computer asc

    Type=Event EventID=1234 | Sort Computer asc,TimeGenerated desc

    and so forth.

    This simple example in a nutshell gives you a feeling of how commands work: they change the shape of the results the filter got you in the first place.

     

The second, lesser-known command is LIMIT. ‘Limit’ is the PowerShell-like verb; using ‘TOP’ instead is also supported, which might sound familiar to some. They are identical behind the scenes. Imagine the scenario where you only want to know if there is ANY result at all for a given event – has it ever occurred? – but you are not interested in how many there are. Looking at the most recent one is enough. Consider the syntax

    Type=Event EventID=2110 | Limit 1

    Type=Event EventID=2110 | Top 1


Note that while there were 988 records with that EventID, the fields/facets/filters on the left side of the screen always show information about the results returned BY THE FILTER PORTION of the query – the part before any pipe “|” character. The ‘Results’ pane on the right, instead, only shows the single most recent result, since that is how we used a command to shape and transform those results!

     

My favorite command – also lesser known, but very useful in a variety of situations – is SELECT.

SELECT behaves like Select-Object in PowerShell: it gives you filtered results that don’t have all their original properties (which, again, you will still see in the facets), but that only carry the properties you specify.

    Example to try:

    Type=Event

    (now click ‘show more’ in one of the results and look at all the properties those results have)

    and then Select some of those explicitly

    Type=Event | Select Computer,EventID,RenderedDescription


This is particularly useful when you want to control the output and only pick the pieces of data that really matter for your exploration, which typically isn’t the full record. It is also useful when records of different ‘Types’ have SOME properties in common (but not ALL of their properties!), to produce an output that looks more naturally like a ‘table’ and will work well when exported to CSV and massaged in Excel.
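For instance, a sketch combining the commands seen so far in this post (all of these fields were used earlier here):

Type=Event EventID=1234 | Select Computer,EventID,RenderedDescription | Sort Computer asc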

    If you think this is useful, you might want to vote these ideas to make it even better http://feedback.azure.com/forums/267889-azure-operational-insights/suggestions/6519220-columns-in-search and http://feedback.azure.com/forums/267889-azure-operational-insights/suggestions/6519229-allow-resize-of-columns-in-table-view-for-aggregat .

     

    In the next post we’ll talk of our most powerful command: MEASURE. Stay tuned, and in the meantime: Happy searching!

Azure Operational Insights Search How To: Part II – More on Filtering, using Boolean Operators, the Time Dimension, Numbers and Ranges

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

This is the second installment of a Series (I don’t know yet how many posts there will be in the end) that walks thru the concepts of Microsoft Azure Operational Insights Search Syntax – while the full documentation and syntax reference is here, these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

    In my first post I introduced filtering, querying by keyword or by a field’s exact value match, and some Boolean operators. If you have not read that yet, please do, then come back to this one.

    In this second post we’ll build upon those concepts, and try some slightly more elaborate filters.

    So we left the other post with a query like

    EventLog=Application OR EventLog=System

Since we haven’t specified additional filters, this query will return the entries from both event logs for ALL computers that have sent such data.

    Clicking on one of the fields/filters will narrow down the query to a specific computer, excluding all other ones; the query would become something like

    EventLog=Application OR EventLog=System Computer=SERVER1.contoso.com

    which, as you’ll remember, given the implicit AND, is the same as

    EventLog=Application OR EventLog=System AND Computer=SERVER1.contoso.com

and gets evaluated in this explicit order – look at the parentheses:

    (EventLog=Application OR EventLog=System) AND Computer=SERVER1.contoso.com

    Now, just like for the event log field, you can bring back data only for a SET of specific machines, by OR’ing them

    (EventLog=Application OR EventLog=System) AND (Computer=SERVER1.contoso.com OR Computer=SERVER2.contoso.com OR Computer=SERVER3.contoso.com)

    Similarly, this other query will bring back % CPU Time only for the selected two machines

CounterName="% Processor Time"  AND InstanceName="_Total" AND (Computer=SERVER1.contoso.com OR Computer=SERVER2.contoso.com)

    and so forth.

     

Now, that should be enough about Boolean operators.

Let’s look at something else: with datetime and numeric fields, you can also search for values GREATER THAN, LESS THAN, GREATER THAN OR EQUAL TO, etc. – we use the simple operators >, <, >=, <=, != for this.

For example, I can query a specific event log for just a specific period of time; i.e. the last 24 hours can be expressed with the mnemonic expression below:

    EventLog=System TimeGenerated>NOW-24HOURS

    Sure, you can also control the time interval graphically, and most times you might want to do that,

    Time Controls and Selectors in System Center Advisor Search

but there are advantages to including a time filter right in the query:

1. it works great with dashboards, where you can override the time for each tile this way, regardless of the ‘global’ time selector on the dashboard page (Stas already described why this is useful) – see the sketch after this list
2. it will be great once we have scheduling of queries, to use in a monitoring fashion to periodically ‘keep an eye’ on certain things or KPIs
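For example (a sketch), a tile bound to the query below would always look at the last 7 days, no matter what the dashboard’s global time selector says:

Type=Event EventLog=System TimeGenerated>NOW-7DAYS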

    When filtering by time, keep in mind that you get results for the INTERSECTION of the two time windows: the one specified in the UI (S1) and the one specified in the query (S2).

    Intersection

This means that if the time windows don’t intersect (e.g. the UI is asking for ‘this week’ while the query asks for ‘last week’), there is no intersection and you get no results.

     

The comparison operators we used for the TimeGenerated field are also useful in other situations, for example with numeric fields.

For example, Advisor Legacy Configuration Assessment alerts have the following severities: 0 = Information, 1 = Warning, 2 = Critical. You can query for both ‘warning’ and ‘critical’ alerts, excluding informational ones, with this query:

    Type=ConfigurationAlert  Severity>=1

     

Last but not least, we support range queries. This means you can provide the beginning and the end of a range of values in a sequence. Example: show me the events from the Operations Manager event log where the EventID is greater than or equal to 2100 but not greater than 2199 (these would be Health Service Modules errors, mostly around connectivity issues with Advisor, BTW):

    Type=Event EventLog="Operations Manager" EventID:[2100..2199]


[Note that for the range syntax you MUST use the ‘:’ colon field:value separator and NOT the ‘equal’ sign, enclose the lower and upper ends of the range in square brackets, and separate them with two dots ‘..’]
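Combining this with the severity example above, the same ‘warning and critical’ selection could also be written as a range (a sketch):

Type=ConfigurationAlert Severity:[1..2]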

And that’s all for this time around! I hope you are learning something useful and applicable to your needs with this tutorial. On to the next post in the series, where I will start looking at the “|” pipeline and begin exploring search commands!

    Till then, happy searching!

Azure Operational Insights Search How To: Part I - How to filter big data

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

    With this blog post I am starting a series where I walk thru some concepts of the Microsoft Azure Operational Insights Search Syntax – the full documentation and syntax reference is here, but these posts are meant to guide your first steps with practical examples. I’ll start very simple, and build upon each example, so you can get an understanding of practical use cases for how to use the syntax to extract the insights you need from the data.

The first thing to know is that the first part of a search query (before any “|” vertical pipe character – of which we’ll talk in a future blog post) is always a FILTER – think of it as a WHERE clause in TSQL: it determines WHAT subset of data to pull out of the system, from the Big Data store. After all, searching a Big Data store is largely about specifying the characteristics of the data we want to extract, so it is natural that a query would start with the WHERE clause.

The most basic filters you can use are KEYWORDs – such as ‘error’ or ‘timeout’, or a computer name – and this type of simple query will generally return diverse shapes of data within the same result set. This is because we have different Types of data in the system – my query for ‘error’ in the screenshot below returned 100K ‘Event’ records (collected by the Log Management feature), 18 ‘Alert’ records (generated by Advisor Configuration Assessment) and 12 ‘ConfigurationChange’ records (captured by the Change Tracking Intelligence Pack):

    Types of System Center Advisor Search Results

These are NOT really object types/classes: if you are familiar with OpsMgr, please try to FORGET all you know about Classes and Objects in SCOM! It's much easier here: Type is just a tag, or a property – a string/name/category – that is attached to a piece of data.

    Some documents in the system are tagged as Type:Alert and some are tagged as Type:PerfHourly, or Type:Event... you get the idea.

    Each search 'result' (or document, or record, or entry) shows all the raw properties and their values for each of those pieces of data, and you can use those field names to specify in the filter that you want to retrieve only the records where the field has that given value.

'Type' is really just a field that all records have, but for any practical purpose it is no different from any other field.

Anyhow, by convention we established that, based on the value of the 'Type' field, a record will have a different 'shape' or form (different fields). Incidentally, Type=PerfHourly or Type=Event is also the syntax that you need to learn to query for hourly performance data aggregates or for events.

[Note that you can use either a colon or an equal sign after the field name and before the value: Type:Event and Type=Event are absolutely identical in meaning; you can choose the style you prefer.]

So, if the Type=PerfHourly records have a field called 'CounterName', you can write a query like Type=PerfHourly CounterName="% Processor Time" – this will give you only the performance data where the performance counter name is "% Processor Time".

You can also be more specific and throw an InstanceName="_Total" in there (if you know Windows Performance Counters, you know what I am talking about).

You can also click on a facet, and another field:value filter will be automatically added to your filter in the query bar – i.e. the screenshot below shows you where to click to add InstanceName:'_Total' to the query without typing:

    Interacting with Fields / Filters / Facets in System Center Advisor Search

    Your query now becomes

Type=PerfHourly CounterName="% Processor Time" InstanceName="_Total"

Note that you DO NOT HAVE to specify Type=PerfHourly at all to get this result. Since the fields 'CounterName' and 'InstanceName' (at the time of this writing) only exist on records of Type=PerfHourly, even just the query below is specific enough to bring back the exact same results as the longer, previous one:

CounterName="% Processor Time" InstanceName="_Total"

This is because all the filters in the query are evaluated as being in AND with each other: effectively, the more fields you add to the criteria, the fewer and more specific/refined results you get.

For example, this query

Type=Event EventLog="Windows PowerShell"

is identical to this query

Type=Event AND EventLog="Windows PowerShell"

and it will return all events that were logged to (and collected from) the 'Windows PowerShell' event log in Windows. If you add a filter multiple times (i.e. by clicking repeatedly on the same facet), the issue is purely cosmetic: it might clutter the search bar, but the query still returns the same identical results, since the implicit AND operator is always there.

You can easily reverse the implicit AND operator by using an explicit NOT operator, i.e.:

    Type:Event NOT(EventLog:"Windows PowerShell")

    or (equivalent)

Type=Event EventLog!="Windows PowerShell"

This will return all events from ALL OTHER logs that are NOT the 'Windows PowerShell' log.

Or you can use other Boolean operators, such as 'OR': the query below returns records for which the EventLog is either Application OR System

    EventLog=Application OR EventLog=System

    With the above query you’ll get entries for BOTH logs in the same result set.

Removing the OR instead (hence leaving the implicit AND in place), as in the following query

    EventLog=Application EventLog=System

will produce NO results – because there isn't an event log entry that belongs to BOTH logs: each event log entry is written to just one of the two logs.

    Easy.

    Till the next installment. I’ll try to keep a frequent pace.

Useful Operational Insights Search Query Collection

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

    This is a living document that will be periodically updated to collect useful, well-known, or sample queries to use in the Search experience in Microsoft Azure Operational Insights.

I dump new useful searches here as I come up with or stumble into them. I will keep this post periodically updated, so check it from time to time or subscribe to it.

    These are some of the queries I use in my own workspace’s dashboard

    My Dashboard in System Center Advisor

I hope this will provide useful examples to learn from… but remember, the full query language reference is published here: https://go.microsoft.com/fwlink/?LinkId=394544 – for when you don’t understand why a given search magically works (or doesn’t) in your environment.

They are grouped by broad categories that generally map to the Intelligence Pack that produces a specific ‘Type’ of data. We also have documentation on the Types we use in the various intelligence packs, and the meaning of their fields, here http://msdn.microsoft.com/en-us/library/azure/dn884648.aspx

     

    General Exploration Queries

    Which Management Group is generating the most data points?
    * | Measure count() by ManagementGroupName

    Distribution of data Types
    * | Measure count() by Type

    List all Computers
    ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | measure max(SourceSystem) by Computer | Sort Computer

    List all Computers with their most recent data's timestamp
    ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | measure max(TimeGenerated) by Computer | Sort Computer

    List all Computers whose last reported data is older than 4 hours
ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | Measure Max(TimeGenerated) as LastData by Computer | Where LastData<NOW-4HOURS | Sort Computer

Note – the ObjectName!= filters in the three queries above are just a workaround to filter out some performance data whose target object in SCOM is NOT a ‘Computer’, and which would hence have an improper value in that field.

Note #2 – if you see ‘duplicate’ computer names (the NETBIOS name and the FQDN of the same machine listed as distinct computers), this might be due to IIS logs – see the post here where I describe the issue with the ‘Computer’ field http://blogs.technet.com/b/momteam/archive/2014/09/19/iis-log-format-requirements-in-system-center-advisor.aspx . If you know for sure you have *other* data for that computer – not just IIS logs – you can then easily filter those out (another workaround), and the last query above becomes

Type!=W3CIISLog ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | Measure Max(TimeGenerated) as LastData by Computer | Where LastData<NOW-4HOURS | Sort Computer

     

    Alert Management

Note – if you have been using System Center Advisor Preview, Type=Alert used to be wired to the alerts generated by the Advisor Configuration Assessment scenario. With the introduction of the ‘Alert Management’ Intelligence Pack, which pulls your Operations Manager alerts into Operational Insights search, we have re-used Type=Alert for these ‘real’ and ‘reactive’ alerts, and have renamed the Advisor ‘legacy’ alerts (Configuration Assessment Alerts) to Type=ConfigurationAlert. You might need to update your saved searches.

    Alerts raised during the past 1 day grouped by their severity
    Type=Alert TimeRaised>NOW-1DAY | measure count() as Count by AlertSeverity

    Alerts raised during the past 1 day sorted by their repeat count value
    Type=Alert TimeRaised>NOW-1DAY | sort RepeatCount desc

    Alerts raised during the past 24 hours which are now closed
    Type=Alert TimeRaised>NOW-24HOUR AlertState=closed

    Last Modified time for Alerts raised during the past 24 hours which are now closed
    Type=Alert TimeRaised>NOW-24HOUR AlertState=closed | measure Max(TimeLastModified) by AlertName

    Critical alerts raised during the past 24 hours
    Type=Alert AlertSeverity=error TimeRaised>NOW-24HOUR

    Critical alerts raised during the past 24 hours which are still active
    Type=Alert AlertSeverity=error TimeRaised>NOW-24HOUR AlertState!=closed

    Sources with active alerts raised during the past 24 hours
    Type=Alert AlertState!=closed TimeRaised>NOW-24HOUR | measure count() as Count by SourceDisplayName

    Warning alerts raised during the past 24 hours
    Type=Alert AlertSeverity=warning TimeRaised>NOW-24HOUR

     

    Capacity (Aggregated Performance Data)

    All performance data
    Type=PerfHourly

    Average CPU utilization by Top 5 machines
    * Type=PerfHourly CounterName="% Processor Time" InstanceName="_Total" | Measure avg(SampleValue) as AVGCPU by Computer | Sort AVGCPU desc | Top 5

    Max CPU time used by HyperV by machine
    Type=PerfHourly CounterName="% Total Run Time" InstanceName="_Total"  ObjectName="Hyper-V Hypervisor Logical Processor" | Measure max(Max) as MAXCPU by Computer | Where MAXCPU>0

    CPU Utilization by VM/Virtual Core
    Type=PerfHourly ObjectName="Hyper-V Hypervisor Virtual Processor" CounterName="% Guest Run Time" NOT(InstanceName="_Total") | Measure Avg(SampleValue) by InstanceName

    Memory Utilization by VM/Virtual Core
    Type=PerfHourly ObjectName="Hyper-V Dynamic Memory VM" CounterName="Average Pressure" | Measure Avg(SampleValue) by InstanceName

    Top Hosts with Highest Core Utilization
    CounterName="% Core Utilization" Type=PerfHourly | Measure Avg(SampleValue) by Computer 

    Top Hosts with Highest Memory Utilization
    CounterName="% Memory Utilization" Type=PerfHourly | Measure Avg(SampleValue) by Computer 

    Top Hosts with Inefficient VMs
    CounterName="NumberVMOverUtilized" or CounterName="NumberVMIdle" or CounterName="NumberVMPoweredOff" Type=PerfHourly | Measure Avg(SampleValue) by Computer 

    Top Hosts by Utilization (mathematical average of CPU and Memory usage counters)
    CounterName="% Core Utilization" or CounterName="% Memory Utilization" Type=PerfHourly | Measure Avg(SampleValue) as CombinedCPUMemAvg by Computer 

     

    Log Management (Windows Events)
    This section contains a mix of query scenarios. Each Windows log has its own flavor and adds a different, unique perspective on what the system is doing. For some of the queries below to work, you will first have to collect the necessary event log. Notice that, since the log management pipeline is one of the simplest and easiest from a functionality perspective, many of the examples around the ‘Operations Manager’ event log are actually useful to troubleshoot Advisor-related discovery and connectivity issues that might be preventing some of the other intelligence packs and scenarios from working. Have fun searching logs!

    All Events
    Type=Event

    Count of Events containing the word "started" grouped by EventID
    Type=Event "started" | Measure count() by EventID

    Count of Events grouped by Event Log
    Type=Event | Measure count() by EventLog

    Count of Events grouped by Event Source
    Type=Event | Measure count() by Source

    Count of Events grouped by Event ID
    Type=Event | Measure count() by EventID

    All Events with level "Warning"
    Type=Event EventLevelName=warning

    Count of Events with level "Warning" grouped by Event ID
    Type=Event EventLevelName=warning | Measure count() by EventID

    How many connections to Operations Manager's SDK service by day
    Type=Event EventID=26328 EventLog="Operations Manager" | Measure count() interval 1DAY
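
    Note – the ‘interval’ keyword buckets the aggregated count over time. Assuming the same syntax accepts other units, you could count the same connections per hour instead of per day:

    Type=Event EventID=26328 EventLog="Operations Manager" | Measure count() interval 1HOUR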

    Events in the Operations Manager Event Log whose Event ID is in the range between 2000 and 3000
    Type=Event EventLog="Operations Manager" EventID:[2000..3000]

    Operations Manager Event Log’s Health Service Modules events around connectivity with Advisor
    Type=Event EventLog="Operations Manager" EventID:[2100..2199]

    Operations Manager Event Log’s Health Service Modules errors around Type Space (=Configuration Data) Subscription Module (if these errors are frequent, Predictions in Capacity Intelligence Pack might be affected/unavailable)
    Type=Event EventID=4502 "Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.TypeSpaceSubscriptionDataSource"

    When did my servers initiate restart?
    shutdown Type=Event EventLog=System Source=User32  EventID=1074 | Select TimeGenerated,Computer 

    Did my servers shutdown unexpectedly?
    Type=Event  EventID=6008 Source=EventLog

    SQL Server was waiting on an I/O request for longer than 15 seconds
    EventID=833 EventLog=Application  Source=MSSQLSERVER

    Windows Firewall Policy settings have changed
    Type=Event  EventLog="Microsoft-Windows-Windows Firewall With Advanced Security/Firewall"  EventID=2008  

    On which machines and how many times have Windows Firewall Policy settings changed
    Type=Event  EventLog="Microsoft-Windows-Windows Firewall With Advanced Security/Firewall"  EventID=2008  | measure count() by Computer 

     

    The following are substitutes you can use to get some of the same information as the ‘Antimalware Intelligence Pack’ by using Log Management (i.e. for logs you are pulling from WAD for Azure machines that don’t have the MMA agent installed)

    Computers with Microsoft Antimalware (SCEP/Defender/Essentials) installed
    Type=Event Source:"Microsoft Antimalware" | measure count() as Count by Computer

    Malware detections
    Type=Event Source:"Microsoft Antimalware" EventID:1116

    Computers with no signature update in the last 24 hours
    Type=Event Source:"Microsoft Antimalware" EventID:2000 | measure max(TimeGenerated) as lastdata by Computer | where lastdata < NOW-24HOURS

     

    Log Management (IIS Logs)

    All IIS Log Entries
    Type=W3CIISLog

    Count of IIS Log Entries by HTTP Request Method
    Type=W3CIISLog | Measure count() by csMethod

    Count of IIS Log Entries by Client IP Address
    Type=W3CIISLog | Measure count() by cIP

    IIS Log Entries for a specific client IP Address (replace with your own)
    Type=W3CIISLog  cIP="192.168.0.1" | Select csUriStem,scBytes,csBytes,TimeTaken,scStatus

    Count of IIS Log Entries by URL requested by client (without query strings)
    Type=W3CIISLog | Measure count() by csUriStem

    Count of IIS Log Entries by Host requested by client
    Type=W3CIISLog | Measure count() by csHost

    Count of IIS Log Entries by URL for the host "www.contoso.com" (replace with your own)
    Type=W3CIISLog csHost="www.contoso.com" | Measure count() by csUriStem

    Count of IIS Log Entries by HTTP User Agent
    Type=W3CIISLog | Measure count() by csUserAgent

    Total Bytes sent by Client IP Address
    Type=W3CIISLog | Measure Sum(csBytes) by cIP

    Total Bytes received by each Azure Role Instance    
    Type=W3CIISLog | Measure Sum(csBytes) by RoleInstance

    Total Bytes received by each IIS Computer
    Type=W3CIISLog | Measure Sum(csBytes) by Computer

    Total Bytes responded back to clients by each IIS Server IP Address
    Type=W3CIISLog | Measure Sum(scBytes) by sIP

    Total Bytes responded back to clients by Client IP Address
    Type=W3CIISLog | Measure Sum(scBytes) by cIP

    Average HTTP Request time by Client IP Address
    Type=W3CIISLog | Measure Avg(TimeTaken) by cIP

    Average HTTP Request time by HTTP Method
    Type=W3CIISLog | Measure Avg(TimeTaken) by csMethod

    [For more W3CIISLog search examples, also read the blog post I published earlier.]

     

    Change Tracking

    All Configuration Changes
    Type=ConfigurationChange

    All Software Changes
    Type=ConfigurationChange ConfigChangeType=Software

    All Windows Services Changes
    Type=ConfigurationChange ConfigChangeType=WindowsServices

    Change Type<Software> by Computer
    Type=ConfigurationChange ConfigChangeType=Software | Measure count() by Computer

    Total Changes by Computer
    Type=ConfigurationChange | measure count() by Computer

    Changes by Change Type
    Type=ConfigurationChange | measure count() by ConfigChangeType

    When was the most recent Change by Type?
    Type=ConfigurationChange | measure Max(TimeGenerated) by ConfigChangeType

    When was the most recent Change for each Computer?
    Type=ConfigurationChange | measure Max(TimeGenerated) by Computer

    List when Windows Services have been stopped
    Type=ConfigurationChange ConfigChangeType=WindowsServices SvcState=Stopped 

    List of all Windows Services that have been stopped, by frequency
    Type=ConfigurationChange ConfigChangeType=WindowsServices SvcState=Stopped | measure count() by SvcDisplayName

    Count of different Software change types
    Type=ConfigurationChange ConfigChangeType=Software | measure count() by ChangeCategory

     

    SQL Assessment

    Did the agent pass the prerequisite check? (if results are present, SQL Assessment data might not be complete, and you might want to check the RunAs account configuration http://technet.microsoft.com/en-us/library/dn818161.aspx )
    Type=SQLAssessmentRecommendation IsRollup=false FocusArea="Prerequisites"

    List of Focus Areas the Recommendations are categorized into
    Type=SQLAssessmentRecommendation  IsRollup=true | measure count() by FocusArea

    SQL Recommendation by Computer
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by Computer

    How much gain can I get by computer if I fix all its SQL recommendations?
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure Sum(RecommendationWeight) by Computer

    SQL Recommendation by Instance
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by SqlInstanceName

    SQL Recommendation by Database
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed| measure count() by DatabaseName

    How many SQL Recommendations are affecting a Computer, a SQL Instance or a Database?
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by AffectedObjectType

    How many times did each unique SQL Recommendation trigger?
    Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by Recommendation

    Prioritized Detail Recommendations for a monthly 'RecommendationPeriod' (replace with YYYY-MM as appropriate). Great recipe for Excel export.
    Type:SQLAssessmentRecommendation IsRollup=false RecommendationPeriod=2014-10 RecommendationResult=Failed | sort RecommendationWeight desc | Select RecommendationId,Recommendation,RecommendationResult,FocusArea,RecommendationWeight,Computer,AffectedObjectType,SqlInstanceName,DatabaseName

     

     

    System Update Assessment

    Missing Required Updates
    Type=RequiredUpdate | Select UpdateTitle,KBID,UpdateClassification,UpdateSeverity,PublishDate,Computer

    Missing Required Updates for server "SERVER1.contoso.com"
    Type=RequiredUpdate (UpdateSeverity=Critical and UpdateClassification="Security Updates" and Server="SERVER1.contoso.com") | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate

    Missing Critical Security Updates
    Type=RequiredUpdate (UpdateSeverity=Critical and UpdateClassification="Security Updates") | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate

    Missing Security Updates
    Type=RequiredUpdate UpdateClassification="Security Updates" | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate

    Missing Update Rollups
    Type=RequiredUpdate UpdateClassification="Update Rollups" | Select UpdateTitle,KBID,UpdateClassification,UpdateSeverity,PublishDate,Computer

    Missing Updates by Product
    Type=RequiredUpdate | Measure count() by Product

    Missing Updates for a specific product ("Windows Server 2012" in the example)
    Type=RequiredUpdate Product="Windows Server 2012"

     

    Malware Assessment

    Devices with Signatures out of date
    Type=ProtectionStatus | measure max(ProtectionStatusRank) as Rank by DeviceName | where Rank:250

    Protection Status updates per day
    Type=ProtectionStatus | Measure count(ScanDate) interval 1DAY | Sort TimeGenerated desc

    Malware detected grouped by 'threat'
    Type=ProtectionStatus NOT (ThreatStatus="No threats detected") | Measure count() by Threat

     

    Configuration Assessment (Legacy Advisor Scenario)
    NOTE: For the legacy Advisor Configuration Assessment scenario, in addition to the old Silverlight screens, some data is also indexed in the new Search feature for exploration purposes. Records of Type=ConfigurationObject are indexed and updated every time an object is discovered (or re-discovered) by Advisor Configuration Assessment. There are also records of Type=ConfigurationObjectProperty that represent the properties of those objects. These are only inserted in the index when their VALUE has CHANGED from the value Advisor had known since the previous discovery. This is somewhat similar to the ‘Change Tracking’ Intelligence Pack, but less sophisticated. Also, records of Type=ConfigurationAlert are indexed each time Configuration Assessment Alerts are fired on Advisor agents (even if it is a ‘repeat’, i.e. because the HealthService has restarted) by the Advisor Configuration Assessment alert rules you are not ignoring.

    All 'Advisor Managed' Computers that have reported Configuration Assessment data
    Type=ConfigurationObject ObjectType="Microsoft.Windows.Computer" | Measure count() by Computer

    All 'Advisor Managed' Computers that have reported Configuration Assessment data (alternate version)
    Type=ConfigurationObject ObjectType="Microsoft.Windows.Computer"  | Measure Max(TimeGenerated) by Computer

    Count of machines by Operating System
    Type=ConfigurationObject  ObjectType="Microsoft.Windows.OperatingSystem" | Measure count() by ObjectDisplayName

    All Property changes tracked by Advisor Configuration Assessment for Computer "OM54.contoso.com" (replace with your own computer name)
    Type="ConfigurationObjectProperty" RootObjectName="OM054.contoso.com"

    IP Address changes tracked by Advisor Configuration Assessment for Computer "OM54.contoso.com" (replace with your own computer name)
    Type="ConfigurationObjectProperty" Name="Microsoft.Windows.Computer.IPAddress" RootObjectName="OM054.contoso.com"

    Check SQL Collation settings for each database called "tempdb" on each SQL instance on each SQL server
    Type="ConfigurationObjectProperty" Name="Microsoft.SQLServer.Database.Collation" ObjectDisplayName="tempdb" | Select ObjectDisplayName, ParentObjectName, RootObjectName, Value

    Machines grouped by Organizational Unit
    Type="ConfigurationObjectProperty" Name="Microsoft.Windows.Computer.OrganizationalUnit" | Measure count() by Value | Where AggregatedValue>0

    All Alerts generated by Advisor Configuration Assessment 
    Type=ConfigurationAlert

    Worst Severity of Configuration Assessment Alerts by Computer
    Type=ConfigurationAlert | measure Max(Severity) by Computer 

    Configuration Assessment Alerts grouped by Rule/Monitor that generated them
    Type=ConfigurationAlert | measure count() by WorkflowName 

    Configuration Assessment Alerts for ‘SQL Server’ workload
    Type=ConfigurationAlert Workload="SQL Server"

    Active Machine-Generated Recommendations for 'Windows' (or 'SQL Server') Workloads
    Type=Recommendation RecommendationStatus=Active AdvisorWorkload=Windows
    Type=Recommendation RecommendationStatus=Active AdvisorWorkload="SQL Server" 

    Active Machine-Generated Recommendations grouped by Computer
    Type=Recommendation RecommendationStatus=Active | Measure count() by RootObjectName

    List Active Directory Sites (based on computers whose site has changed)
    Type=ConfigurationObjectProperty Name="Microsoft.Windows.Computer.ActiveDirectorySite" | Measure count() by Value

    Which machines have the most memory assigned? (this only tracks values that have changed – with this query, most of the time you will probably only have data for VMs with dynamic memory)
    Type=ConfigurationObjectProperty Name="Microsoft.Windows.OperatingSystem.PhysicalMemory" | Measure Max(Value) by RootObjectName

       

                   

    Other searches on blogs

    Stan has some useful ones mainly around System Update and Malware Assessments in this post http://cloudadministrator.wordpress.com/2014/10/19/system-center-advisor-restarted-time-matters-in-dashboard-part-6/ and about SQL Assessment in this other one http://cloudadministrator.wordpress.com/2014/10/23/microsoft-azure-operational-insights-preview-series-sql-assessment-part-7/

    For more W3CIISLog search examples, also read the other blog post I published here.

    For more Windows Event searches around IIS errors (cloned from the IIS MP), see another blog post here.

    I am also putting out a series of posts that guide you to take your first steps with the search syntax:

    1. http://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-first-steps-how-to-filter-data-part-i.aspx
    2. http://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-how-to-part-ii-more-on-filtering-using-boolean-operators-and-the-time-dimension.aspx
    3. http://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-how-to-part-iii-manipulating-results-the-pipeline-and-search-commands.aspx
    4. http://blogs.msdn.com/b/dmuscett/archive/2014/10/29/operational-insights-search-how-to-part-iv-introducing-the-measure-command.aspx
    5. http://blogs.msdn.com/b/dmuscett/archive/2014/10/29/azure-operational-insights-search-howto-part-v-max-and-min-statistical-functions-with-measure-command.aspx
    6. http://blogs.msdn.com/b/dmuscett/archive/2014/10/31/azure-operational-insights-search-how-to-part-vi-measure-avg-and-an-exploration-of-type-perfhourly.aspx
    7. http://blogs.msdn.com/b/dmuscett/archive/2014/11/10/azure-operational-insights-search-hot-to-part-vii-measure-sum-and-where-command.aspx
  • musc@> $daniele.work.ToString()

    W3C IIS Logs Search in Microsoft Azure Operational Insights

    • 0 Comments

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

    Last week we enabled the collection of IIS logs from your Operations Manager agents into System Center Advisor. If you are already using Advisor, you’ll notice we don’t just talk about how many ‘events’ there are on the Log Management tile anymore – we rephrased it to ‘Records’. This is a small but important tweak, as we will be adding more and more configurable data sources.

    If you are on Advisor Preview and you have an opinion on which data sources we should be adding (c’mon – I know you have an opinion!), then from the Advisor Portal click the ‘Feedback’ button, and use this link to see which ideas are already in the backlog http://feedback.azure.com/forums/267889-azure-operational-insights/category/88086-log-management-and-log-collection-policy or add/suggest the ones you would like to see!

    Log Management Overview Tile

    And if you are not yet using Advisor – head to our Onboarding Instructions page and Try it out!

     

    Back to the newly released feature (W3C IIS Logs collection and search), once you have an Advisor account, just follow what Joseph blogged about in order to configure IIS log collection http://blogs.technet.com/b/momteam/archive/2014/09/19/collect-amp-search-iis-log-in-advisor.aspx

    Once you have some data collected and you drill into the Log management page, we now have a breakdown by type (you see where this is going) and then specialized blades with other breakdowns by event log, by URI, and sample searches ready to use. 

    Log Management Drilldown page

     

    So, let’s delve into Search. The most basic search you can write for IIS logs comes from clicking on the first blade, ‘Log Types’ – in the screenshot it says I have a count of 222 ‘W3CIISLog’ records in the last 24 hours. Let’s click on that, which lands me in search with this query

    Type=W3CIISLog

    This will bring back all records. Notice that once we land on the search page, the default time interval is now 7 days, not the 24 hours of the page you came from.
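
    If you want to pin the search back to the 24-hour window you came from, you can also make the time filter explicit in the query itself – a sketch, assuming TimeGenerated is the timestamp field you want to bound:

    Type=W3CIISLog TimeGenerated>NOW-24HOURS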

    Nice, but I now want to get a breakdown of these log entries by client IP Address, and see which one downloaded (received) the most data from our sites/servers.

    Easy! Using our Measure command with the Sum() statistical function! I add a vertical pipe ‘|’ character after the query filter, followed by my measure command

    Type=W3CIISLog | Measure Sum(scBytes) by cIP

    How did I know the field name? Well, the facets/filters on the left-hand side of the screen also show the distribution of the various fields’ values in those log entries, and the entries themselves can be explored/viewed to look at the field names. For IIS specifically, the field names we use are slightly modified versions of the field names in the original IIS log, because we preferred not having dashes in the names, so we went with camelCase: if the original IIS field name was ‘cs-host’ it now becomes ‘csHost’; ‘s-ip’ becomes ‘sIP’ and so forth. You should be able to mentally map them fairly easily.
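
    For example, ‘cs-uri-stem’ becomes csUriStem and ‘time-taken’ becomes TimeTaken, so a query over those original W3C fields would look like this (the host name is a placeholder):

    Type=W3CIISLog csHost="www.contoso.com" | Measure Avg(TimeTaken) by csUriStem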

    As for the search query syntax, you can find it on TechNet https://go.microsoft.com/fwlink/?LinkId=394544

    Now, by looking at my facets above, I notice 2 requests with the ‘options’ method: normal web site visits wouldn’t be using that method on my server, so this must be someone scanning/fingerprinting the server (only my server, or maybe someone scanning an entire network to see which IPs have webservers of some sort running). So let’s click on the facet, and the query becomes

    Type=W3CIISLog  csMethod=options | Measure Sum(scBytes) by cIP

    and this narrows our results down to the one IP address issuing those requests:

    What I find very handy when looking at traffic logs is knowing my visitors: who are those IP addresses? That’s why I normally keep the whois.exe utility from Windows Sysinternals handy (or you can use an online whois service)

    whois.exe utility from Windows Sysinternals and Advisor Search

    So we know who this scan came from. But what else did this IP do? Let’s drill into the IP address and remove the filter for the method – to see every request from that IP (in case this was part of a larger scan)… in this case, we found no other activity from that address. There isn’t really anything we can do about it, so we move on. But now you could go back a couple of steps (using the query history on the right-hand side of the search screen – tip: toggle it with the ‘clock’ icon) and continue investigating what the next client IP did, and so forth.
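
    The drill-down query – the IP address is a placeholder, as in the earlier examples – would look something like this:

    Type=W3CIISLog cIP="192.168.0.1"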

    I hope I gave you a sense of how to move around W3CIISLogs in a security-type investigation.

     

    What about troubleshooting scenarios for a website/webserver?

    I could get a breakdown of requests by HTTP status code the server has returned

    Type=W3CIISLog  | measure count() by scStatus

    and let’s say I want to start investigating what those ‘500’ errors were… a few clicks and a few changes later, my query becomes

    Type=W3CIISLog   scStatus:500  csHost:"www.muscetta.com" | measure count() by csUriStem

    which shows me (based on the facets) that 14 IP addresses have been getting ‘500’ back on the WordPress comments page – so either my comments don’t work, or these were spam attempts that were blocked. With some other twist of the query, I can see which actual blog posts on my site the comments were meant for

    Type=W3CIISLog   scStatus:500  csHost:"www.muscetta.com"  csUriStem:"/wp-comments-post.php"  | measure count() by csReferer

    And I can check how many unique IP addresses have been failing to post comments

    Type=W3CIISLog   scStatus:500  csHost:"www.muscetta.com"  csUriStem:"/wp-comments-post.php"  | measure count() by cIP

     

    These are just some very basic examples to get you warmed up and give you a sense of what you can do and how you can interact with the logs – have fun searching your own W3C logs, and let us know what you think of Advisor via the ‘Feedback’ button.

  • musc@> $daniele.work.ToString()

    System Center Advisor has kept me busy and you should check it out

    • 0 Comments

    [Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]

     

    This blog has been quiet for over 8 months… in the meantime, several folks (inside Microsoft, and outside) have reached out, and keep reaching out, to me with APM-related questions.
    Sorry, I don’t work on, nor own, that feature anymore. In fact, I have not really worked on it for over a year. Even my previous post about AppInsights and future speculations… I was already not working on it anymore (albeit I had worked on AppInsights in the early days, when it still went by its codename), but I had to bid it farewell, and that’s what that post was.

    I have instead been busy with System Center Advisor for the last 16 or so months. First with small but useful things, then with the complete overhaul we did this past May at TechEd North America 2014.

    If you have not yet heard about it and have no clue what I am talking about, then you should definitely check it out. See the following resources if you want to learn more about what I am working on:

    VIDEOS

    Advisor Preview 2min Overview Video: http://aka.ms/unrpst

    Advisor Preview TechEd announcement Video: http://aka.ms/Aulpqc

    Joseph @ The Edge Show http://aka.ms/R4p9d0

    Advisor Preview Onboarding Steps Video: http://aka.ms/Lgt2zu 

    SOCIAL

    Advisor Preview Twitter Handle: @mscAdvisor

    RESOURCES

    Advisor Preview Onboarding Documentation: http://aka.ms/Wrbzug

    Advisor Preview Troubleshooting blog: http://aka.ms/G04tcq

    Advisor Preview feature requests can be made inside the Advisor portal by clicking the ‘Feedback’ link

  • musc@> $daniele.work.ToString()

    Microsoft Monitoring Agent, System Center Operations Manager and Visual Studio Application Insights

    • 4 Comments

    Since the release of System Center 2012 R2 Preview (and even more after GA was announced) a lot of people have asked me why we renamed the Operations Manager agent to "Microsoft Monitoring Agent". Some information that went out together with the GA of SC2012R2 can be found at the following link: http://technet.microsoft.com/en-us/library/dn465154.aspx
    And here is a post from Marnix, System Center MVP http://thoughtsonopsmgr.blogspot.com/2013/09/scom-2012-r2-hello-mma-microsoft.html    
    Essentially, Microsoft Monitoring Agent is not *only* the SCOM agent anymore – the agent is now licensed with System Center OR with Visual Studio. When it was first released, it could already be used when reporting to SCOM (for monitoring), and it could also be used for standalone IntelliTrace collection (a diagnostics scenario, more geared towards dev/app owners). Read more in these other blog posts by Larry: Introducing Microsoft Monitoring Agent and Performance Details in IntelliTrace.

    Enter ‘Application Insights’

    With Microsoft Monitoring Agent 2013 Update Rollup 1 (at the time of this writing available as a preview), Microsoft Monitoring Agent can now also be used to report APM data to the brand new Application Insights Preview feature in Visual Studio Online that was announced a couple of weeks ago. Application Insights is an Azure-backed SaaS solution allowing teams to “[…] get insights from monitoring and going back and make the app better. Some people call it DevOps [...] but it's a sort of holistic view of the application: 360 degrees telemetry data like usage, performance, exception data, crash data, all that you need to have in live site to know how well your application is doing.[…]” (see the complete interview with Soma here).

    You can read more also on
    http://blogs.msdn.com/b/somasegar/archive/2013/11/13/visual-studio-2013-launch-announcing-visual-studio-online.aspx
    http://blogs.msdn.com/b/visualstudioalm/archive/2013/11/13/announcing-application-insights-preview.aspx    
    Application Insights 360 Dashboard

    So what powers some (but not all) of the data that you have at your fingertips in Application Insights – like you might have imagined - is the APM agent within MMA: the same APM agent you can use with OpsMgr. And in Application Insights you’ll see the same familiar data you see in OpsMgr such as exceptions and performance events (which can be exported to IntelliTrace format), and performance counters, but this time in a multi-tenant SaaS solution specifically designed for DevOps teams.

    APM Events in Application Insights

    MMA 2013 UR1 Preview is available as a standalone download from the Microsoft Download Center (as well as from the Application Insights Preview itself, within the Visual Studio Online portal), and it is the first version of the agent that can connect both to on-premises System Center OpsMgr systems and to the SaaS service.
    http://www.microsoft.com/en-us/download/details.aspx?id=41143
    http://msdn.microsoft.com/en-us/library/dn481094.aspx
    Microsoft Monitoring Agent - Select Connect to Application Insights

    NOTE: Keep in mind that at the time of this writing, this is a CTP (“Preview”) release of the agent. It is not supported by CSS for non-Visual Studio Online-related scenarios. Even though we are not currently aware of any major compatibility issues between this CTP and SCOM (or when multi-homing between Application Insights and SCOM), only very limited testing was done for this agent working together with SCOM at this stage. We encourage SCOM customers NOT to use it in their production environments and wait for the final Microsoft Monitoring Agent 2013 Update 1 release.

    In the future, anyhow, dual homing could be used to let your agent differentiate what data to send to which solution: i.e. send only the alerting and performance information necessary for monitoring and triaging production issues to the on-prem System Center Operations Manager system, while the detailed and much more verbose code-level information can be sent for developers to consume in our multi-tenant SaaS APM offering within Team Foundation Service online (so you don’t have to worry about managing extra storage for APM data in the SCOM database), or to SCOM (maybe only data from some applications, i.e. ‘PROD’ applications), or to both systems in various combinations – based on environment, project, operational model, processes and teams/ownership. It is for example very practical to use Application Insights to conduct functional and load tests in development and test environments – without the need to stand up another OpsMgr infrastructure, or to affect the scale and performance of the one that is designed to handle ‘prod’ data – and feed rich, actionable diagnostic information into the development lifecycle, to improve those applications even before they go to production.

    Maarten, one of the System Center MVP’s, has also started a series of post on Application Insights where he started sharing his perspective about the powerful hybrid monitoring scenarios that have been enabled when using Microsoft Monitoring Agent with Application Insights and with System Center 2012 R2.

    APM for Azure PaaS

    Added benefit – MMA, when used with Visual Studio Online, can also be installed in Azure Cloud Services instances (PaaS) – which was not a supported scenario in System Center (see this post where I mentioned this before). This is the first time we are able to offer true APM monitoring for Azure PaaS.

    In OpsMgr, agents are uniquely identified by their FQDN (Fully Qualified Domain Name), and everything in SCOM from connector to DB to DAL to SDK relies on agent names. Machine names in most corporate networks are well-defined pieces of information, follow a logical naming convention, and rarely change. SCOM Management Servers also rely on Kerberos/AD on premises and/or certificates (again using the FQDN) to authenticate the agents, and expect to only be talking to ‘well known’, pre-authorized machines. But with Azure Cloud Services (PaaS) and IaaS (in certain configurations) you can have any number of cloned, identical, elastic instances of ‘roles’ (worker and web) deployed and running at any given time, which appear and disappear as you scale them up and down. Machine names don’t last and don’t matter much in Azure PaaS like they do on-prem… and it is much more natural to have Azure send APM data to Azure, not to on-prem – which would otherwise require opening inbound ports in your perimeter – remember the agent initiates the connection to the infrastructure it reports to, be it a SCOM Management Group or Application Insights in the cloud.

    The agent includes a brand new connector that can talk web-friendly protocols to report to the SaaS offering, which is a very different backend from an OpsMgr Management Server/Group. Application Insights uses a newly built backend running in Azure, written with cloud-first principles. The way to authenticate to the service is thru an ‘application key’ (which represents the Application Insights tenant); the cloud service does not use the machine’s identity. You place the application key in the application’s configuration file, and Visual Studio allows you to package/bundle a script to silently install and configure the agent automatically, so that every time your PaaS roles are re-deployed, the agent gets installed on them. Machines come and go, applications stay, and they need to be monitored – those applications and their lifecycle are what Application Insights and Visual Studio Online are all about.

    For infrastructure-level info you don’t need an agent, instead: from System Center Operations Manager you can of course keep using the Azure Management Pack, which polls the Azure Management API and does a better job of creating/disposing of those ‘elastic’ objects that come and go (thru discoveries); if you are only in the cloud (= no on-prem infrastructure) you can find that type of OS-level info (CPU/Memory/Disk) in the Azure Portal.

    Availability Monitoring

    Availability information (and other metrics such as external response time) that is tracked in Application Insights comes from synthetic tests providing an ‘Outside In’ perspective: single URL probes or Visual Studio webtests. If you are one of my OpsMgr readers, you would have probably understood this is backed by Global Service Monitor – the same service, offering ‘watcher nodes in the cloud’, that you can attach to OpsMgr.

    More than just APM (as we knew it in System Center)

    More explicit instrumentation can be added to apps in various ways, when reporting to Application Insights. These include:

    • Client-side usage monitoring: client-side monitoring instrumentation in Application Insights is a completely different solution from the one in OpsMgr. First, the focus is on usage, visitors, and their experience – but more in the analytics sense, rather than with the alerting angle of the one in OpsMgr. Second, enablement is different: Application Insights provides you with a JavaScript snippet that can be added to any website, even if not .NET – unlike in OpsMgr, where .NET server-side monitoring is used to hook up automatic injection of the JavaScript – but the change must be done by a developer. The manual method in the end proves more compatible with many applications and browsers.
    • Server SDKs, by which you can instrument logging for custom metrics in your code and have it report to the service directly
    • A client SDK for Windows Phone 8 apps, by which you can instrument logging for custom metrics in your code and have it report to the service
    • Deployment information can be collected – see the post from Charles – this is extremely useful to understand if changes in performance or reliability are related to deployments of new versions of the app/service
    • Beautiful Customizable Dashboards and a fresh, modern UI on top of all this data

    How can I try it out?

    Application Insights is currently in preview and you need an invitation to try it out. You might want to go to www.visualstudio.com, register for a VSO subscription, and add yourself to the waiting list by clicking the blue “Try Application Insights” button.

    Some more links

    Series of videos on how to use it http://channel9.msdn.com/Series/Application-Insights-for-Visual-Studio-Online     
    Forum on MSDN http://social.msdn.microsoft.com/Forums/vstudio/en-US/home?forum=ApplicationInsights
    Documentation on MSDN http://msdn.microsoft.com/en-us/library/dn481095.aspx

  • musc@> $daniele.work.ToString()

    Programmatically create APM objects and configuration (w/ APM Explorer sample app)

    • 0 Comments

    I have been speaking to multiple customers, and a lot of them had the same feedback: “the APM template/wizard is great, BUT what if I want to automate enablement of monitoring when I provision new applications, without using the UI ?”. The request seems fair, but our extensibility/programmability story for APM currently doesn’t easily allow that.

     

    The APM template, like all templates, generates a management pack (or adds “stuff” to an existing management pack). Many other templates actually create classes/discoveries/rules/monitors… but APM provides a lot of settings which are really peculiar to its functionality and don’t easily fit into the “standard” management pack/discoveries/rules/monitors pattern. What the APM template really does is capture INTENT, and use that information to generate the right configuration on the agent.

    Sure, it still creates an MP, and it does create an object (<EntityType>) for the “application group” you are defining, within that MP. If you are wondering what an “application group” is, you might want to refer to this previous post of mine that explains more at a high level what objects are created by APM first: http://blogs.technet.com/b/momteam/archive/2012/01/14/apm-object-model.aspx then come back here.

    It also creates a discovery for the application group. What the APM discovery has that is really special, is its data source configuration, which features a “special sauce” you can see below (from an exported MP in my demo environment):

    AppChunk APM Data Source Config

    Is that XML I see within the <AppChunk1> tag? It surely is: encoded XML, nested within the “outer” Management Pack’s XML… but XML nonetheless. For the non-programmers here, those &gt; and &lt; are encoded versions of the “>” and “<” angle brackets. It gets encoded this way in order to have XML within other XML… because technically the whole AppChunk1 is just a string – that is right: the MP schema has NOT been extended to support special APM “stuff” – it is all just a string that gets used by a data source module as its configuration. This configuration happens to be quite complex, but the MP will validate and import even if you write something incorrect in this string. BUT then the discovery module will choke on that input and fail (and raise an alert on the management server). Since “normally” the MP only gets written by our official UI/template, this is not an issue, because we guarantee we will write it correctly (if not, we likely have a bug somewhere).

    But this is the reason why it isn’t supported to edit it outside of the template: it is a non-trivial exercise, and it is easy to get it wrong without a public schema to validate against. Also, since the AppChunk1 module configuration is not in the MP schema, a future version of the module and template might change the way this piece of XML looks.

    So, with the warning that everything I am going to write from now on is TOTALLY UNSUPPORTED, I will show you how to look at what the template builds, and how to try to replicate it. I won’t use any “insider knowledge” nor code officially released by Microsoft or part of any product: I will just guide you thru looking at the XML output and making sense of it. When I did this myself, I came up with some SAMPLE CODE (which I will provide) that can generate the same XML.

    Sounds easy enough, so let’s take a look at this <AppChunk1> and look at it once you add some carriage returns to make it more readable:

    AppChunk1

    I highlighted a few different blocks in it:

    • some global settings about the application group (name, environment tag, a unique GUID)
    • server-side monitoring settings (global)
    • client-side monitoring settings (global)
    • application-component-specific configuration (which application components to enable monitoring for – this is essentially the list of “application components” within the “application group”; again, refer to this blog post for the object model and terminology)

    Most of the settings are self-explanatory, when you look at them… you will recognize they are all the same things that you configured in the template: namespaces, thresholds, etc…

    So with this knowledge, what does it take to create the same XML?

    It takes some code, of course. Some sample code is what I am going to provide in this post, linked below: I built a small sample application to demonstrate this. It is built with Visual Studio 2010 and compiled to run against .NET Framework 4.x. The sample application will let you connect to a management server (I only tested it against SC 2012 SP1), list all the applications you have (that have been discovered), and show whether they are already enabled for monitoring, in which MP, and some of the settings applied to them. Please note that this is just an EXAMPLE, so it has not gone thru full testing and there is no guarantee that it will keep working with future updates. It also doesn’t understand nor show things like group scoping of the template, or whether the same application has been configured more than once in multiple templates, etc. – in fact it might even be broken in some of those scenarios, as I have not done extensive testing!

    APM Explorer GUI

     

    To be totally fair and give due credit, the first part of the application was started by my friend Eric, when he was trying to figure out which applications he had configured and which ones he was still missing. So the first part of the code, which enumerates your endpoints, actually comes from him. I eventually ended up writing a SQL query and SSRS report for THAT scenario (see here http://blogs.technet.com/b/momteam/archive/2012/08/22/apm-configured-endpoint-report.aspx ) but then re-used the GUI for this “configuration” experiment instead.

    So back to configuration, the GUI is only meant to be a quick way to multi-select various “application components” from your inventory, and quickly create an MP to monitor them with APM:

    Right click Configure APM

     

    This brings up a form with the same basic settings as the template (note that not ALL the settings available in the template have been implemented in my sample!)

    Application Group Settings

     

    When you click OK, a management pack will be written to disk (in the same folder where you launched the EXE from).

    This is a fully “importable” MP that should work and create all the right things the template would have created, at least for the server-side settings.

     

    The tool isn’t extremely useful as-is (because you still have to go thru a UI, after all! – and if you have to use a UI, you might as well use the official one that ships with the product!)… but that is not the point! The main goal is to show that it is possible to create the right XML thru custom code, to automate provisioning of APM monitoring for your apps. I built a UI on top of it to let you validate how it works. But eventually, if you want to automate enablement of monitoring for your own apps, you will only really care about the “APMMPGenerator” class, which is the one that does the “dirty” work of creating the MP!

    As I wrote earlier, I didn’t use any insider knowledge, and this code is absolutely NOT the same code that is used within the product itself – it just happens to produce the same XML fragment as output. To further prove that this could be done by anyone, I purposely kept the code NOT elegant: by this I mean that I didn’t treat XML as such, nor use the classes in the framework that deal with XML like a true programmer would have done; no schema validation, nothing of that sort! Instead, I hacked the required XML together by using quick-and-dirty STRING manipulation and token replacement. While most real programmers will probably be thinking this code should be posted on TheDailyWTF, I stand behind my choice, and I believe many IT pros and Operations Manager administrators who don’t write code every day will actually appreciate it, find it more readable, and probably find it easier to port to PowerShell, Perl, or their favorite scripting language. APMMPGenerator is the main class in this sample code that is relevant to learn how to write the required pieces of the MP:

    APMMPGenerator class

     

    This class and its methods are heavily commented ‘step by step’, and show how you can generate the XML for a management pack to be used with SCOM to enable APM.

    Writing XML for management packs thru code (and concatenating strings) is in my opinion a very powerful technique that I have also used in the past to build other MPs that were “repetitive” (i.e. needed to contain many “similar” groups, rules, monitors, etc.), and it should allow people to more easily port the approach to other languages (i.e. PowerShell, for automation, sounds like a good choice…).

     

     

    Gotcha’s / Disclaimer

    • Client-side monitoring settings cannot be created by just writing XML “offline” and “outside” of an OpsMgr Management Group, because the real APM template creates a binary RunAs account that is used as an encryption key to secure the browser-to-server communication (so that random attackers cannot just feed bad data to the CSM Collector endpoint; they need to be valid/real browsers doing that). This is something that has to call the SDK on the management server, to see if such an account already exists or not, etc… it gets a lot trickier, and it is simply not possible, with the current design, to create that part of the MP “offline” just by crafting XML. This said, once the MP is imported, you can go and EDIT it again in the template, add client-side monitoring, apply/save, and the right things should happen there and then.
    • The tool’s code does NOT create a FOLDER and VIEWs for the monitored applications. I left that as an exercise for the reader. If you look at the views that the template creates, there really isn’t anything too special about them – they are just standard views, like those in any other MP. Hence I didn’t spend time there… there are examples of how to add views here http://msdn.microsoft.com/en-us/library/bb437626.aspx and here http://msdn.microsoft.com/en-us/library/bb960509.aspx (among other places…). Like the above, editing the template after the fact should add the views at that point, when saving.
    • Other than the above “AppChunk1”, there are a few more things that the class creates but I didn’t describe: things like references to other MPs, display strings, and info required to make the “template instance” appear in the “Authoring” pane of the console, so it can be further edited later on. I am not describing those since they are all “standard” Management Packs elements… documentation on MSDN, like for the views, above.
    • All of this (tool, sample code, post) is TOTALLY NOT SUPPORTED. I repeat it: NOT SUPPORTED. I am not encouraging anybody to use this! The only supported way to do this stuff is to use the template, which is what has been written by professional developers and tested. What I did here was to put myself in the customers’ shoes, look at what the template builds, and try to replicate it. I didn’t use any “insider knowledge” nor code owned by Microsoft to do this – I did what any one of you could have done: observe the MP, and try to build one that looks the same. Call it reverse engineering, if you wish. Anyway, since some people have expressed the need to automate enablement of monitoring… this is the only way I can think of to enable that with the current product. I know. There are plenty of smart people in the OpsMgr community who don’t get scared about creating custom solutions on top of the platform. This is a post for them.
    • All of the above is not supported. No, really. Just in case you missed it.  If you really want to use it, please evaluate in your test environment first! As expected, this solution is provided AS-IS, with no warranties and confers no rights. Future versions of this tool may be created based on time and requests.

     

  • musc@> $daniele.work.ToString()

    Lonely blog for almost a year, and see you at MMS 2013 next week

    • 0 Comments

    Wow, I haven’t written here in a while. My last post on this blog is from over a year ago, and referred to the BETA of System Center 2012 Service Pack 1!

    Since then, the final version of that Service Pack 1 has shipped, and Global Service Monitor has been made generally available too - http://blogs.technet.com/b/momteam/archive/2013/01/15/system-center-2012-sp1-operations-manager-is-generally-available.aspx

    I have not really been completely silent, tho – just on this blog. With regards to SP1, I recorded a short presentation for Microsoft Virtual Academy about what is new in SP1 http://technet.microsoft.com/en-US/video/JJ873818 – if you are coming to MMS 2013 next week, you will hear a lot more about these enhancements.

    I also blogged a few technical posts on the momteam blog; here are a few of them in case you missed them:

    APM Configured Endpoint report
    http://blogs.technet.com/b/momteam/archive/2012/08/22/apm-configured-endpoint-report.aspx

    Event-to-Alert ratio, reviewing problems and understanding trends for APM data in OpsMgr 2012
    http://blogs.technet.com/b/momteam/archive/2012/06/18/event-to-alert-ratio-reviewing-problems-and-understanding-trends-for-apm-data-in-opsmgr-2012.aspx

    APM Agent Throttling settings and other APM Overrides in SC2012 Operations Manager
    http://blogs.technet.com/b/momteam/archive/2012/12/19/apm-throttling-settings-and-other-apm-overrides-in-sc2012-operations-manager.aspx

    I also kept updating and fixing bugs in MPViewer and OverrideExplorer – for which I always update the same post here http://blogs.msdn.com/b/dmuscett/archive/2012/02/19/boris-s-tools-updated.aspx

    I have also been busy with a couple of personal projects, such as restoring a mistreated guitar I got in a thrift store ( http://www.muscetta.com/2013/01/21/restoring-an-electric-guitar/ ) and building one (almost) from scratch ( http://www.flickr.com/photos/dani3l3/sets/72157632658946681/ ).

     

    I will be at MMS 2013 next week, and you can catch me at a couple of different sessions:

    IM-B202 System Center 2012 SP1 Operations Manager Overview  - Tuesday, April 9 8:30 AM - 9:45 AM South Seas B

    IM-B318 Panel Discussion: System Center Operations Manager - Tuesday, April 9 10:15 AM - 11:30 AM Mandalay Bay Ballroom L

    AM-B302 Developers and Operations Engineers: System Center and Visual Studio - Wednesday, April 10 12:00 PM - 1:15 PM South Seas F

    AM-B306 DevOps: Azure Monitoring & Authoring Updates for Operations Manager 2012 SP1 - Thursday, April 11 2:45 PM - 4:00 PM Jasmine E

  • musc@> $daniele.work.ToString()

    Operations Manager 2012 SP1 BETA is out, and some cool things you might not (yet) know about it

    • 4 Comments

    It has been a couple of months since we released CTP2 (I had blogged about that here http://www.muscetta.com/2012/06/16/operations-manager-2012-sp1-ctp2-is-out-and-my-teched-na-talk-mgt302/ ) and we have now reached the Beta milestone!

    Albeit you might have already seen a number of posts about this last week (i.e. http://blogs.technet.com/b/server-cloud/archive/2012/09/10/system-center-2012-sp1-beta-available-evaluate-with-windows-server-2012.aspx or http://blogs.technet.com/b/momteam/archive/2012/09/11/system-center-2012-service-pack-1-beta-now-available-for-download.aspx), I see that the information on the blogs so far didn’t quite explain all the various new features that went into it, and I want to give a better summary specifically about the component that I work on: Operations Manager.

    Keep in mind the below is just my personal summary – the official one is here http://technet.microsoft.com/en-us/library/jj656650.aspx – and it actually does explain these things… but since the OpsMgr community reads a lot of blogs, I wanted to highlight some points of this release.

    Platform Support

    • Support for installing the product on Windows Server 2012 for all components: agent, server, databases, etc.
    • Support for using SQL Server 2012 to host the databases

    Cloud Services

    • Global Service Monitor – this is actually something that the Beta version enables, but the required MPs don’t currently ship with the Beta download directly – you will be able to sign up for the Beta of GSM here. Once you have registered and imported the new MPs, you will be able to use our cloud-based capability to monitor the health of your web applications from a geo-distributed perspective that Microsoft manages and runs on Windows Azure, just like you would from your own agent/watcher nodes. Think of it as an extension of your network, or “watcher nodes in the cloud”.

    APM-Related improvements

    This is my area and what I and the team I am in specifically work on – so I personally had the privilege of driving some of this work (not all – some other PMs drove some of it too!)

    • Support for IIS8 with APM (.NET application performance monitoring) – this enables APM to monitor applications running on Windows Server 2012, not just 2008 anymore. The new Windows Server 2012 and IIS8 Management packs are required for this to work. Please note that, if you have imported the previous, “Beta” Windows 8 Management packs, they will need to be removed prior to installing the official Windows Server 2012 Management Packs. About Windows Server 2012 support and MPs, read more here http://blogs.technet.com/b/momteam/archive/2012/09/05/windows-server-2012-system-center-operations-manager-support.aspx
    • Monitoring of WCF, ASP.NET MVC and .NET NT services – we made changes to the agent so that we better understand and present data related to calls to WCF Services, we support monitoring of ASP.NET MVC applications, and we enabled monitoring of Windows Services that are built on the .NET framework – the APM documentation here is updated in regards to these changes and refers to both 2012 RTM and SP1 (pointing out the differences, when needed) http://technet.microsoft.com/en-us/library/hh457578.aspx
    • Introduction of Azure SDK support – this means you can monitor applications that make use of Azure Storage with APM, and the agent is now aware of Azure tables, blobs and queues, as well as SQL Azure calls. It essentially means that APM events will tell you things like “your app was slow when copying that Azure blob” or “you got an access denied when writing to that table”
    • 360 .NET Application Monitoring Dashboards – this brings together different perspectives of application health in one place: it displays information from Global Service Monitor, .NET Application Performance Monitoring, and Web Application Availability Monitoring to provide a summary of health and key metrics for 3-tier applications in a single view. Documentation here http://technet.microsoft.com/en-us/library/jj614613.aspx
    • Monitoring of SharePoint 2010 with APM (.NET application performance monitoring) - this was a very common ask from the customers and field, and some folks were trying to come up with manual configurations to enable it (i.e. http://blogs.technet.com/b/shawngibbs/archive/2012/03/01/system-center-2012-operation-manager-apm.aspx ) but now this comes out of the box and it is, in fact, better than what you could configure: we had to change some of the agent code, not just configuration, to deal with some intricacies of Sharepoint…
    • Integration with Team Foundation Server 2010 and Team Foundation Server 2012 - functionality has also been enhanced in comparison to the previous TFS Synchronization management pack (which was shipped out of band, now it is part of Operations Manager). It allows Operations teams to forward APM alerts ( http://blogs.technet.com/b/momteam/archive/2012/01/23/custom-apm-rules-for-granular-alerting.aspx ) to Developers in the form of TFS Work Items, for things that operations teams might not be able to address (i.e. exceptions or performance events that could require fixes/code changes)
    • Conversion of Application Performance Monitoring events to IntelliTrace format – this enables developers to get information about exceptions from their applications in a format that can be natively used in Visual Studio. Documentation for this feature is not yet available, and it will likely appear as we approach the final release of the Service Pack 1. This is another great integration point between Operations and Development teams and tools, contributing to our DevOps story (my personal take on which was the subject of an earlier post of mine: http://www.muscetta.com/2012/02/05/apm-in-opsmgr-2012-for-dev-and-for-ops/)

    Unix/Linux Improvements

    Audit Collection Services

    • Support for Dynamic Access Control in Windows Server 2012 – when was the last time an update to ACS was made? Seems like a long time ago to me… Windows Server 2012 enhances the existing Windows ACL model to support Dynamic Access Control, and System Center 2012 Service Pack 1 (SP1) contributes to fulfilling these scenarios by providing enterprise-wide visibility into the use of Dynamic Access Control.

    Network Monitoring

    • Additional network devices models supported – new models have been tested and added to the supported list
    • Visibility into virtual network switches in vicinity dashboard – this requires integration with Virtual Machine Manager to discover the network switches exposed by the hypervisor

    Reminders:

    • Production use is NOT supported for customers who are not part of the TAP program
    • Upgrade from CTP2 to Beta is NOT Supported
    • Upgrade from 2012 RTM to SP1 Beta will ONLY be supported for customers participating in the TAP Program
    • Procedures not covered in the documentation might not work

    Download http://www.microsoft.com/en-us/download/details.aspx?id=34607

  • musc@> $daniele.work.ToString()

    Operations Manager 2012 SP1 CTP2 is out, and my TechED NA talk (MGT302)

    • 0 Comments

    As you might have already heard, this has been an amazing week at TechEd North America: System Center 2012 has been voted as the Best Microsoft Product at TechEd, and we have released the Community Technology Preview (CTP2) of all System Center 2012 SP1 components.

    I wrote a (quick) list of the changes in Operations Manager CTP2 in this other blog post, and many of those are related to APM (formerly AVIcode technology). I also demoed some of these changes in my session on Thursday – you can watch the recording here. I think one of the most-awaited changes is support for monitoring Windows Services written in .NET – but there is more than that!

    In the talk I also covered a bit of Java monitoring (which is the same as in 2012 – no changes in SP1), and my colleague Åke Pettersson talked about Synthetic Transactions and how to bring it all together (synthetics and APM) in a single new dashboard (also shipping in SP1 CTP2) that gives you a 360-degree view of your applications. The CTP2 documentation covers both the changes to APM and how to light up this new dashboard.

    When it comes to synthetics – I know you have been using them from your own agents/watcher nodes – but to get the complete outside-in (or last-mile) picture, we have now also announced the Beta of Global Service Monitor (it was even featured in the keynote!). Essentially, we extend your OpsMgr infrastructure to the cloud: you upload your tests to our Azure-based service, we run them against your Internet-facing applications from our watcher nodes in datacenters around the globe, and we feed the data back to your OpsMgr infrastructure, so that you can see how your application is available and responding from those locations. You can sign up for the consumer preview of GSM on the Connect site.

    Enjoy your beta testing! (Isn’t that what weekends are for, geeks?)

  • musc@> $daniele.work.ToString()

    Boris’s OpsMgr Tools – Updated

    • 63 Comments

    Over the years, Boris has released a set of phenomenal tools that saved OpsMgr administrators quite some time when performing common tasks in OpsMgr 2007 and 2007 R2.

    The sad news is that Boris has moved to another team within Microsoft. He has made a tremendous contribution over the years to the OpsMgr product, and I am sure he will rock on into his new role and team. At the same time he will be missed.

    Since I know many people use those tools, and in order not to let them go to waste, I asked Boris to give me the code and allow me to update and maintain the tools going forward. And so I did – I have updated the following tools to work with OpsMgr 2012:

     

    • MPViewer 2.3.3 – The previous version, 1.7 (which works with OpsMgr 2007 and 2007 R2), was released here. Version 2.3.3 has been updated to work with OpsMgr 2012, and now includes support for MPB files (MP Bundles), shows embedded resources in bundles (such as images or scripts), loads MPs asynchronously, and can Unseal and Unpack MP Bundles.
    • OverrideExplorer 3.7 – The previous version, 3.3 (which works with OpsMgr 2007 and 2007 R2), was released here. Version 3.7 has been updated to work with OpsMgr 2012 and includes some minor fixes, as well as the capability to export all overrides to an Excel spreadsheet. It also now shows both Windows and Unix computers in the computers view.
    • Proxy Settings 1.2 – The previous version, 1.1 (which works with OpsMgr 2007 and 2007 R2), was released here. Version 1.2 is functionally identical to the previous version, but has been recompiled to work with the OpsMgr 2012 SDK.
    • OverrideCreator 1.5 – The previous version (which works with OpsMgr 2007 and 2007 R2) was released here. Version 1.5 is functionally identical to the previous version, but has been recompiled to work with the OpsMgr 2012 SDK.

    All of the above tools require the Operations Manager console to be installed on the machine where you run them, as well as the .NET Framework 4.0.

    According to my information, the above four tools were the most used/useful. Feel free to comment if you need any other tool updated, and/or if you have bug reports or feature requests – albeit I don’t promise I will be able to fix or update everything :-)

    Disclaimer

    Just like their predecessors, it is necessary to make clear that this posting is provided "AS IS" with no warranties, and confers no rights.
    Use of the included utilities is subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

     

    Changelog / Updates

    [Updated on March 8th 2012 with MPViewer 1.9.1 that contains a fix for the Excel export of some MPs]

    [Updated on March 15th 2012 with MPViewer 2.0 that now allows you to Unseal/Unpack MPs and MPBundles]

    [Updated on March 21st 2012 with OverrideExplorer 3.5 which now allows to export Overrides to Excel]

    [Updated on July 19th 2012 with MPViewer 2.1 that now shows the PublicKeyToken for references/dependencies]

    [Updated on August 29th 2012 with MPViewer 2.1.2 that contains fixes to show Perf Objects, Counters and Frequency for some more modules]

    [Updated on September 29th 2012 with MPViewer 2.2 that contains cosmetic as well as reliability/responsiveness fixes]

    [Updated on October 3rd 2012 with MPViewer 2.2.1 that contains a fix for a crash when opening Unsealed MPs]

    [Updated on November 20th 2012 with OverrideExplorer 3.6 that contains a fix for the “change target” operation that was creating broken overrides when changing target from a group to another group]

    [Updated on April 26th 2013 with MPViewer 2.2.2 that contains a fix for some rules in the IIS MP that were incorrectly being reported as not generating alerts, and another fix for the "unseal/unbundle" menu item that sometimes was not being enabled]

    [Updated on May 9th 2013 with MPViewer 2.3 that now can also handle MP Bundles that contain multiple ManagementPacks in a single bundle]

    [Updated on May 14th 2013 with OverrideCreator 1.5 – first working version for OpsMgr 2012]

    [Updated on November 23rd 2013 with OverrideExplorer 3.7 - now includes Unix computers in the computers view]

    [Updated on February 17th 2014 with MPViewer 2.3.2 - now shows (most) event IDs and Event Sources for Event Rules]

    [Updated on March 21st 2014 with MPViewer 2.3.3 - now allows both HTML and XLS export in bulk thru command line - more info in the comment thread below]

  • musc@> $daniele.work.ToString()

    A couple of OpsMgr / APM Posts

    • 0 Comments

    Just some shameless personal plug here, pointing out that I recently wrote two technical posts on the momteam blog about the APM feature in Operations Manager 2012 – maybe you want to check them out:

    1. APM object model – describes the object model that gets created by the APM Template/Wizard when you configure .NET application monitoring
    2. Custom APM Rules for Granular Alerting – explains how you can leverage management pack authoring techniques to create alerting rules with super-granular criteria (going beyond what the GUI lets you do)

    Hope you find them useful – if you are one of my “OpsMgr readers” :-)

  • musc@> $daniele.work.ToString()

    Operations Manager 2012 Release Candidate is out of the bag!

    • 0 Comments

    Go read the announcement at http://blogs.technet.com/b/server-cloud/archive/2011/11/10/system-center-operations-manager-2012-release-candidate-from-the-datacenter-to-the-cloud.aspx

    This is the first public release since I joined the team (I started in this role the day after the team shipped the Beta), and the first release that contains some direct output of my work. It feels so good!

    Documentation has also been refreshed – it starts here http://technet.microsoft.com/en-us/library/hh205987.aspx

    The part specifically about the APM feature is here http://technet.microsoft.com/en-us/library/hh457578.aspx

    Enjoy!

  • musc@> $daniele.work.ToString()

    Repost: Useful SetSPN tips

    • 0 Comments

    I just saw that my former colleague (PFE) Tristan has posted an interesting note about the use of SetSPN “–A” vs SetSPN “–S”. I normally don’t repost other people’s content, but I thought this would be useful: there are a few SPNs used in OpsMgr and it is not always easy to get them all right… and, by reading his post, you can pick up a few tricks I was not aware of.
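
    To make the tip concrete, here is a hedged example of the two flags side by side – the account and server names below are made up, and MSOMSdkSvc is, to my knowledge, the SPN prefix registered for the OpsMgr SDK (Data Access) service:

    # List the SPNs currently registered for the (hypothetical) Data Access account
    setspn -L CONTOSO\omDASAccount

    # -A adds the SPN blindly; -S (available since Windows Server 2008) first
    # checks the forest for duplicates and refuses to add one - prefer -S
    setspn -S MSOMSdkSvc/om01 CONTOSO\omDASAccount
    setspn -S MSOMSdkSvc/om01.contoso.com CONTOSO\omDASAccount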

    Check out the original post at http://blogs.technet.com/b/tristank/archive/2011/10/10/psa-you-really-need-to-update-your-kerberos-setup-documentation.aspx

  • musc@> $daniele.work.ToString()

    A month in a new life

    • 0 Comments

    Hey, I have just realized that I have been in my new PM role for a month already – time flies!

    If you are one of my OpsMgr readers, in case you haven’t noticed, I have been silent here but I have published a post on the momteam blog – check it out: http://blogs.technet.com/b/momteam/archive/2011/08/12/application-performance-monitoring-in-opsmgr-2012-beta.aspx

    If you are one of those few readers interested in following what I do, instead – I can tell you that I am loving the new job. Lots to do, of course, and that also applies to the private sphere – did you know that relocating to another continent takes some energy and effort? – but we are settling in nicely and things are going very smoothly overall.

  • musc@> $daniele.work.ToString()

    I have been chosen; Farewell my friends...

    • 1 Comments

    I have been in Premier Field Engineering for nearly 7 years (it was not even called PFE when I joined – it was just "another type of support"…) and I have to admit that it has been a fun, fun ride: I worked with awesome people and managed to make a difference with our products and services for many customers – directly working with some of those customers, as well as indirectly thru the OpsMgr Health Check program – the service I led for the last 3+ years, which nowadays gets delivered hundreds of times a year around the globe by my fellow PFEs.

    But it is time to move on: I have decided to go thru a big life change for me and my family, and I won't be working as a Premier Field Engineer anymore as of next week.

    But don't panic - I am staying at Microsoft!

    I have actually never been closer to Microsoft than now: we are packing and moving to Seattle the coming weekend, and on July 18th I will start working as a Program Manager in the Operations Manager product team, in Redmond. I am hoping this will enable me to make a difference with even more customers.

    Exciting times ahead - wish me luck!

     

    That said – PFE is hiring! If you are interested in working for Microsoft – we have open positions (including my vacant position in Italy) for almost all the Microsoft technologies. Simply visit http://careers.microsoft.com and search on “PFE”.

    As for the OpsMgr Health Check, don't you worry: it will continue being improved - I left it in the hands of some capable colleagues: Bruno Gabrielli, Stefan Stranger and Tim McFadden - and they have a plan and commitment to update it to OpsMgr 2012.

  • musc@> $daniele.work.ToString()

    Improved ACS Partitions Query

    • 0 Comments

    This has been sitting on my hard drive for a long time. Long story short, the report I posted at Audit Collection Services Database Partitions Size Report had a couple of bugs:

    1. it did not consider the size of the dtString_XXX tables, only that of the dtEvent_XXX tables – this would still give you an idea of the trends, but it could lead to quite different size calculations
    2. the query failed on some instances that had been installed with wrong (unsupported) collation settings.

    I fixed both bugs, but I no longer have a machine with SQL 2005 and Visual Studio 2005… so I can’t rebuild my report – and I don’t want to distribute one that only works on SQL 2008, because I know SQL 2005 is still out there. This is partly what held this post back.

    Rather than wait any longer, therefore, I decided to just give you the fixed query – a quick way to run it from PowerShell follows the query itself. Enjoy :-)

     

    --Query to get the Partition Table
    --for each partition we launch the sp_spaceused stored procedure to determine the size and other info
    
    --partition list
    select PartitionId,Status,PartitionStartTime,PartitionCloseTime 
    into #t1
    from dbo.dtPartition with (nolock)
    order by PartitionStartTime Desc 
    
    
    --sp_spaceused holder table for dtEvent
    create table #t2 (
        PartitionId nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        rows nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        reserved nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        data nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        index_size nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        unused nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS    
    )
    
    --sp_spaceused holder table for dtString
    create table #t3 (
        PartitionId nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        rows nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        reserved nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        data nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        index_size nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
        unused nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS    
    )
    
    
    set nocount on
    
    --vars used for building Partition GUID and main table name
    declare @partGUID nvarchar(MAX)
    declare @tblName nvarchar(MAX)
    declare @tblNameComplete nvarchar(MAX)
    declare @schema nvarchar(MAX)
    DECLARE @vQuery NVARCHAR(MAX)
    
    --cursor
    declare c cursor for 
        select PartitionID from #t1
    open c
    fetch next from c into @partGUID
    
    --start cursor usage
    while @@FETCH_STATUS = 0
    begin
    
    --tblName - first usage for dtEvent
    set @tblName = 'dtEvent_' + @partGUID
    
    --retrieve the schema name
    SET @vQuery = 'SELECT @dbschema = TABLE_SCHEMA from INFORMATION_SCHEMA.tables where TABLE_NAME = ''' + @tblName + ''''
    EXEC sp_executesql @vQuery,N'@dbschema nvarchar(max) out, @dbtblName nvarchar(max)',@schema out, @tblname
    
    --tblNameComplete
    set @tblNameComplete = @schema + '.' + @tblName
    
    INSERT #t2 
        EXEC sp_spaceused @tblNameComplete
    
    
        
        
        
    --tblName - second usage for dtString
    set @tblName = 'dtString_' + @partGUID
    
    --retrieve the schema name
    SET @vQuery = 'SELECT @dbschema = TABLE_SCHEMA from INFORMATION_SCHEMA.tables where TABLE_NAME = ''' + @tblName + ''''
    EXEC sp_executesql @vQuery,N'@dbschema nvarchar(max) out, @dbtblName nvarchar(max)',@schema out, @tblname
    
    --tblNameComplete
    set @tblNameComplete = @schema + '.' + @tblName
    
    INSERT #t3 
        EXEC sp_spaceused @tblNameComplete
    
        
        
        
    fetch next from c into @partGUID
    end
    close c
    deallocate c
    
    
    --select * from #t2
    --select * from #t3
    
    
    --results
    select #t1.PartitionId, 
        #t1.Status, 
        #t1.PartitionStartTime, 
        #t1.PartitionCloseTime, 
        #t2.rows,
        (CAST(LEFT(#t2.reserved,LEN(#t2.reserved)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.reserved,LEN(#t3.reserved)-3) AS NUMERIC(18,0))) as 'reservedKB', 
        (CAST(LEFT(#t2.data,LEN(#t2.data)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.data,LEN(#t3.data)-3) AS NUMERIC(18,0)))as 'dataKB', 
        (CAST(LEFT(#t2.index_size,LEN(#t2.index_size)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.index_size,LEN(#t3.index_size)-3) AS NUMERIC(18,0))) as 'indexKB', 
        (CAST(LEFT(#t2.unused,LEN(#t2.unused)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.unused,LEN(#t3.unused)-3) AS NUMERIC(18,0))) as 'unusedKB'
    from #t1
    join #t2
    on #t2.PartitionId = ('dtEvent_' + #t1.PartitionId)
    join #t3
    on #t3.PartitionId = ('dtString_' + #t1.PartitionId)
    order by PartitionStartTime desc
    
    
    
    --cleanup
    drop table #t1
    drop table #t2
    drop table #t3
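
    If you want to run this on a schedule rather than pasting it into Management Studio every time, here is a minimal PowerShell sketch using Invoke-Sqlcmd – the server name and script path are made-up placeholders, and OperationsManagerAC is the default ACS database name (adjust all three for your environment):

    # Run the partition-size query from PowerShell (sketch - requires the SQL
    # module/snap-in that provides Invoke-Sqlcmd)
    $acsSqlServer = 'ACSSQL01'                                            # hypothetical server\instance
    $query = Get-Content 'C:\Scripts\AcsPartitionSize.sql' | Out-String   # the query above, saved to a file
    Invoke-Sqlcmd -ServerInstance $acsSqlServer -Database 'OperationsManagerAC' -Query $query |
        Format-Table PartitionId, Status, PartitionStartTime, dataKB, indexKB -AutoSize
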
  • musc@> $daniele.work.ToString()

    OpsMgr Agents and Gateways Failover Queries

    • 0 Comments

    The following article by Jimmy Harper explains very well how to set up agents’ and gateways’ failover paths thru PowerShell: http://blogs.technet.com/b/jimmyharper/archive/2010/07/23/powershell-commands-to-configure-gateway-server-agent-failover.aspx . This is the approach I also recommend, and that article is great – I encourage you to check it out if you haven’t done so yet!

    Anyhow, when checking which failover paths have actually been configured, the PowerShell approach suggested by Jimmy is rather slow – especially if your agent count is high. In the Operations Manager Health Check tool I also used that technique at the beginning, but eventually moved to SQL queries purely for performance reasons. We have been using these SQL queries quite successfully for about three years now.
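
    For reference, the cmdlet-based approach looks roughly like the sketch below. It is written against the OpsMgr 2012 OperationsManager module (cmdlet and property names are from my recollection of the 2012 SDK – treat them as an assumption and adapt for 2007 R2); it works, but is noticeably slower than the SQL route once the agent count grows:

    # Enumerate every agent's primary and failover servers through the SDK -
    # convenient, but slow in large environments
    Import-Module OperationsManager
    Get-SCOMAgent | ForEach-Object {
        New-Object PSObject -Property @{
            Agent    = $_.DisplayName
            Primary  = $_.PrimaryManagementServerName
            Failover = ($_.GetFailoverManagementServers() | ForEach-Object { $_.DisplayName }) -join ';'
        }
    } | Format-Table Agent, Primary, Failover -AutoSize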

    But this is the season of giving… and I guess SQL queries can be a gift, right? Therefore I am now donating them as a Christmas gift to the OpsMgr community :-)

    Enjoy – and Merry Christmas!

     

    --GetAgentForWhichServerIsPrimary
    SELECT SourceBME.DisplayName as Agent,TargetBME.DisplayName as Server
    FROM Relationship R WITH (NOLOCK) 
    JOIN BaseManagedEntity SourceBME 
    ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
    JOIN BaseManagedEntity TargetBME 
    ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
    WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceCommunication() 
    AND SourceBME.DisplayName not in (select DisplayName 
    from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId 
    from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
    and IsDeleted ='0') 
    AND SourceBME.DisplayName not in (select DisplayName from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.ManagementServer') 
    and IsDeleted ='0') 
    AND R.IsDeleted = '0'
    
    
    --GetAgentForWhichServerIsFailover
    SELECT SourceBME.DisplayName as Agent,TargetBME.DisplayName as Server
    FROM Relationship R WITH (NOLOCK) 
    JOIN BaseManagedEntity SourceBME 
    ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
    JOIN BaseManagedEntity TargetBME 
    ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
    WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceSecondaryCommunication() 
    AND SourceBME.DisplayName not in (select DisplayName 
    from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId 
    from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
    and IsDeleted ='0') 
    AND SourceBME.DisplayName not in (select DisplayName 
    from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId 
    from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.ManagementServer') 
    and IsDeleted ='0') 
    AND R.IsDeleted = '0'
    
    
    --GetGatewayForWhichServerIsPrimary
    SELECT SourceBME.DisplayName as Gateway, TargetBME.DisplayName as Server
    FROM Relationship R WITH (NOLOCK) 
    JOIN BaseManagedEntity SourceBME 
    ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
    JOIN BaseManagedEntity TargetBME 
    ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
    WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceCommunication() 
    AND SourceBME.DisplayName in (select DisplayName 
    from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId 
    from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
    and IsDeleted ='0') 
    AND R.IsDeleted = '0'
        
    
    --GetGatewayForWhichServerIsFailover
    SELECT SourceBME.DisplayName As Gateway, TargetBME.DisplayName as Server
    FROM Relationship R WITH (NOLOCK) 
    JOIN BaseManagedEntity SourceBME 
    ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
    JOIN BaseManagedEntity TargetBME 
    ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
    WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceSecondaryCommunication() 
    AND SourceBME.DisplayName in (select DisplayName 
    from dbo.ManagedEntityGenericView WITH (NOLOCK) 
    where MonitoringClassId in (select ManagedTypeId 
    from dbo.ManagedType WITH (NOLOCK) 
    where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
    and IsDeleted ='0') 
    AND R.IsDeleted = '0'
    
    
    --xplat agents
    select bme2.DisplayName as XPlatAgent, bme.DisplayName as Server
    from dbo.Relationship r with (nolock) 
    join dbo.RelationshipType rt with (nolock) 
    on r.RelationshipTypeId = rt.RelationshipTypeId 
    join dbo.BasemanagedEntity bme with (nolock) 
    on bme.basemanagedentityid = r.SourceEntityId 
    join dbo.BasemanagedEntity bme2 with (nolock) 
    on r.TargetEntityId = bme2.BaseManagedEntityId 
    where rt.RelationshipTypeName = 'Microsoft.SystemCenter.HealthServiceManagesEntity' 
    and bme.IsDeleted = 0 
    and r.IsDeleted = 0 
    and bme2.basemanagedtypeid in (SELECT DerivedTypeId 
    FROM DerivedManagedTypes with (nolock) 
    WHERE BaseTypeId = (select managedtypeid 
    from managedtype where typename = 'Microsoft.Unix.Computer') 
    and DerivedIsAbstract = 0)
  • musc@> $daniele.work.ToString()

    Got Orphaned OpsMgr Objects?

    • 0 Comments

    Have you ever wondered what would happen if, in Operations Manager, you deleted a Management Server or Gateway that managed objects (such as network devices) or had agents pointing uniquely to it as their primary server?

    The answer is simple, but not very pleasant: you get ORPHANED objects, which will linger in the database but which you won’t be able to “see” or re-assign from the GUI anymore.

    So the first thing I want to share is a query to determine IF you have any of those orphaned agents. Even if you already know you do – since you are not able to "see" them from the console – you might still have to dig their names out of the database. Here's a query I got from a colleague in our reactive support team:


    -- Check for orphaned health services (e.g. agent).
    declare @DiscoverySourceId uniqueidentifier;
    SET @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();
    SELECT TME.[TypedManagedEntityid], HS.PrincipalName
    FROM MTV_HealthService HS
    INNER JOIN dbo.[BaseManagedEntity] BHS WITH(nolock)
        ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
    -- get host managed computer instances
    INNER JOIN dbo.[TypedManagedEntity] TME WITH(nolock)
        ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
        AND TME.[IsDeleted] = 0
    INNER JOIN dbo.[DerivedManagedTypes] DMT WITH(nolock)
        ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
    INNER JOIN dbo.[ManagedType] BT WITH(nolock)
        ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
        AND BT.[TypeName] = N'Microsoft.Windows.Computer'
    -- only with missing primary
    LEFT OUTER JOIN dbo.Relationship HSC WITH(nolock)
        ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
        AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
        AND HSC.[IsDeleted] = 0
    INNER JOIN DiscoverySourceToTypedManagedEntity DSTME WITH(nolock)
        ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
        AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
    WHERE HS.[IsAgent] = 1
    AND HSC.[RelationshipId] IS NULL;

    Once you have identified the agent you need to re-assign to a new management server, this is doable from the SDK. Below is a PowerShell script I wrote which will re-assign it to the RMS; it has to run from within the OpsMgr Command Shell.
    You still need to change the logic that chooses which agent – this is meant as a starting base… you could easily expand it into accepting parameters and/or consuming an input text file, or using a different Management Server than the RMS (a sketch of such a variant follows the script)… you get the point.

    $mg = (get-managementgroupconnection).managementgroup
    $mrc = Get-RelationshipClass | where {$_.name -like "*Microsoft.SystemCenter.HealthServiceCommunication*"}
    $cmro = new-object Microsoft.EnterpriseManagement.Monitoring.CustomMonitoringRelationshipObject($mrc)
    $rms = (get-rootmanagementserver).HostedHealthService

    $deviceclass = $mg.getmonitoringclass("HealthService")
    $mc = Get-connector | where {$_.Name -like "*MOM Internal Connector*"}

    Foreach ($obj in $mg.GetMonitoringObjects($deviceclass))
    {
        #the next line should be changed to pick the right agent to re-assign
        if ($obj.DisplayName -match 'dsxlab')
        {
            Write-host $obj.displayname
            $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
            $cmro.SetSource($obj)
            $cmro.SetTarget($rms)
            $imdd.Add($cmro)
            $imdd.Commit($mc)
        }
    }
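
    As an example of that expansion, here is a hedged sketch that reads agent names from an input text file and re-assigns them to an arbitrary Management Server instead of the RMS – the file path and server name are made-up placeholders:

    param(
        [string]$AgentListFile = 'C:\Scripts\agents.txt',        # hypothetical input file, one agent name per line
        [string]$TargetServerName = 'omserver02.contoso.com'     # hypothetical Management Server FQDN
    )
    $agentNames = Get-Content $AgentListFile

    $mg  = (get-managementgroupconnection).managementgroup
    $mrc = Get-RelationshipClass | where {$_.name -like "*Microsoft.SystemCenter.HealthServiceCommunication*"}
    $mc  = Get-connector | where {$_.Name -like "*MOM Internal Connector*"}
    $deviceclass = $mg.getmonitoringclass("HealthService")

    # use the chosen server's health service as the relationship target instead of the RMS
    $allHS  = $mg.GetMonitoringObjects($deviceclass)
    $target = $allHS | where {$_.DisplayName -eq $TargetServerName}

    foreach ($obj in $allHS)
    {
        if ($agentNames -contains $obj.DisplayName)
        {
            Write-Host $obj.DisplayName
            $cmro = new-object Microsoft.EnterpriseManagement.Monitoring.CustomMonitoringRelationshipObject($mrc)
            $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
            $cmro.SetSource($obj)
            $cmro.SetTarget($target)
            $imdd.Add($cmro)
            $imdd.Commit($mc)
        }
    }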

     

    Similarly, you might get orphaned network devices. The script below is used to re-assign all Network Devices to the RMS. This script is actually something I have had even longer than the other one (yes, it has been sitting in my "digital drawer" for a couple of years or more…) and uses the same concept – only you might notice that the relationship's source and target are "reversed", since the relationship types are different:

    • the Management Server (source) "manages" the Network Device (target)
    • the Agent (source) "talks" to the Management Server (target)

    With a bit of added logic it should be easy to have it work for specific devices only – a sketch of that filter follows the script.

    $mg = (get-managementgroupconnection).managementgroup

    $mrc = Get-RelationshipClass | where {$_.name -like "*Microsoft.SystemCenter.HealthServiceShouldManageEntity*"}

    $cmro = new-object Microsoft.EnterpriseManagement.Monitoring.CustomMonitoringRelationshipObject($mrc)
    $rms = (get-rootmanagementserver).HostedHealthService

    $deviceclass = $mg.getmonitoringclass("NetworkDevice")

    # the connector lookup is hoisted out of the loop (the original fetched it on every iteration)
    $mc = Get-connector | where {$_.Name -like "*MOM Internal Connector*"}

    Foreach ($obj in $mg.GetMonitoringObjects($deviceclass))
    {
        Write-host $obj.displayname
        $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
        $cmro.SetSource($rms)
        $cmro.SetTarget($obj)
        $imdd.Add($cmro)
        $imdd.Commit($mc)
    }
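
    And here is the filter mentioned above, in case you want to touch only specific devices – the name pattern is just a made-up example:

    # Same loop, limited to devices matching a (hypothetical) display-name pattern
    Foreach ($obj in $mg.GetMonitoringObjects($deviceclass))
    {
        if ($obj.DisplayName -match '^10\.1\.')   # e.g. only devices named 10.1.x.x
        {
            Write-host $obj.displayname
            $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
            $cmro.SetSource($rms)
            $cmro.SetTarget($obj)
            $imdd.Add($cmro)
            $imdd.Commit($mc)
        }
    }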

     

    Disclaimer

    The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
