Welcome to this week’s edition of having fun with Operations Manager and Windows Media Center. In the first part of this series we discussed why it is beneficial to create your own class structure. In the second part we discussed a wizard-driven method to create your own class structure and discovery. In this blog post we will show how to gather the information you will need to effectively monitor your application.
This blog post will specifically cover how to identify and use items which are relevant to the application you want to monitor including:
- Building your health model and alerting
- Performance counters
- Event logs
Building your health model and alerting:
One of the biggest strengths of Operations Manager is its health model. The health model approach can be applied to an object like a server (is it healthy based upon core metrics such as CPU, disk, network and memory?). It can also be applied to a smaller object such as a disk (is it online, is it heavily queuing, is it heavily fragmented, does it have sufficient free disk space?). Health models can also be applied to larger objects like Active Directory or Exchange, based upon the health of the entire application spanning various physical or virtual systems.
I often explain the health model using the graphics below. Our first level health model is a sketch of what we know about the application. It may be rough but it provides the key pieces that you need to know to better understand if the application is or is not healthy. To identify these conditions we look for building blocks which are easy to add in Operations Manager such as required services, important performance counters and relevant application events. This is how we develop the sketch of the application health model.
In Operations Manager the sketch phase looks like the example below, which shows the Health Explorer for the object created as part of the Media Center Monitoring management pack. Its health is determined by the health of the underlying services and other custom monitors in Operations Manager.
As we learn more about the application we can add more details to the health model, so that over time it comes closer to matching the actual application. This is more like a picture of the application.
Please note that for this health model we are assessing the health of only the application that we are monitoring. We are working from the assumption that existing management packs will handle any underlying dependencies such as the operating system health. We will discuss this in more detail later in this series.
We also need to determine an effective alerting model for our application. Rules and monitors can be used to check for conditions and create alerts. These alerts can then be sent to the subscribers that you define based upon the subscriptions and channels that you create in Operations Manager. For our alerting model it’s key to identify what conditions we will look for (specific events, services, performance counters) and what level of alerting we will provide for those conditions. Key points to consider when designing your alerting model:
- Critical means that you need to take action on the item now. If it’s not important enough to wake you up at 2:00 am when the condition occurs it’s most likely not critical.
- Warning events indicate issues which need to be investigated but may not need to be sent as part of a notification. For this management pack, however, we will send warning alerts when such issues occur (loss of TV signal, for example).
- Informational events are most often displayed only on the Operations Manager console and not delivered by email. For the Media Center Monitoring management pack we will be making an exception to this rule as the goal of the MP is to email based upon updates which occur in the environment. Examples of these are adding a new recording and deleting a recording. When these conditions occur we want to send an informational alert.
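The three alert levels above can be sketched as a small notification policy. This is only an illustration in Python, not how Operations Manager stores subscriptions: the names `Severity`, `DEFAULT_NOTIFY`, `MEDIA_CENTER_OVERRIDES` and `should_notify` are hypothetical, and the default of not emailing warnings is an assumption drawn from the "may not need to be sent" wording.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFORMATIONAL = "informational"


# Assumed defaults: which severities are emailed rather than console-only.
DEFAULT_NOTIFY = {
    Severity.CRITICAL: True,        # act now, so always notify
    Severity.WARNING: False,        # investigate, but often console-only
    Severity.INFORMATIONAL: False,  # normally shown only in the console
}

# Exceptions described for the Media Center Monitoring management pack:
# warnings (loss of TV signal) and informational updates (recording
# added or deleted) are both emailed.
MEDIA_CENTER_OVERRIDES = {
    Severity.WARNING: True,
    Severity.INFORMATIONAL: True,
}


def should_notify(severity, overrides=None):
    """Return True if an alert of this severity should be emailed."""
    policy = {**DEFAULT_NOTIFY, **(overrides or {})}
    return policy[severity]
```

Calling `should_notify(Severity.INFORMATIONAL)` follows the general guidance, while passing `MEDIA_CENTER_OVERRIDES` applies the exception this management pack makes.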
We also need to decide which of the items we are monitoring are a good choice for a rule and which are a good choice for a monitor. Here are the key decision factors that I use:
- Does this event indicate that there is an issue with the application which needs to be resolved (i.e. should it impact the health state)? And do we have another event which indicates that it is healthy again? If so then this should be a monitor. In the items below we have monitors for MCUpdate health and Media Center Extender health.
- Does this event indicate normal functioning, but under conditions which we want to send notifications on (such as a recording being completed)? If so, this should be a rule.
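The two decision factors above can be written down as a tiny helper, which is handy when you are working through a long list of events. This is a hedged Python sketch: the function name `choose_workflow` and its parameters are hypothetical, and treating an unhealthy event with no matching healthy event as a rule is an assumption of this sketch, not something stated above.

```python
def choose_workflow(affects_health, has_healthy_event):
    """Suggest 'monitor' or 'rule' for an event, per the factors above.

    affects_health:    the event signals an issue that should change
                       the application's health state.
    has_healthy_event: another event exists that signals the
                       application is healthy again.
    """
    if affects_health and has_healthy_event:
        return "monitor"
    # Assumption: without a matching healthy event there is nothing to
    # reset the state, so this sketch falls back to an alerting rule.
    return "rule"


# An extender disconnecting affects health and has a healthy counterpart.
print(choose_workflow(affects_health=True, has_healthy_event=True))
# A completed recording is normal functioning that we only notify on.
print(choose_workflow(affects_health=False, has_healthy_event=False))
```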
The graphic below shows an example of the alerting model which was used for the application to monitor Windows Media Center:
The easiest building block to identify for our application is what services it is dependent upon. For me the quickest way to identify these is to open services.msc, connect to the remote system and browse the services which are listed. If you are familiar with the common Windows services it’s pretty easy to identify things which are outside of the standard built-in ones. For this example I was looking for keywords such as "Media Center" which I found with the following three services: Media Center Extender Service, Windows Media Center Receiver Service and Windows Media Center Scheduler Service. If you aren’t sure about a service take the full name to the search engine of your choice and it will give you details about it to help to identify if it is or is not part of the application which you want to monitor.
For each of these services find the short name and note that for later (Mcx2Svc in the example below).
Document the services including their full name and short name for later addition to monitoring in Operations Manager.
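If you have already exported the service list (short name plus display name), the keyword scan described above is easy to sketch in Python. The helper `find_app_services` and the `SERVICES` sample are hypothetical. Only Mcx2Svc is named in this post; the short names ehRecvr and ehSched are what I believe the other two services use, so treat them as assumptions and confirm them on your own system.

```python
def find_app_services(services, keyword):
    """Return (short_name, display_name) pairs whose display name
    contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (short, display)
        for short, display in services
        if needle in display.lower()
    ]


# Illustrative inventory; in practice this comes from services.msc
# or an export of the remote system's service list.
SERVICES = [
    ("Dnscache", "DNS Client"),
    ("Mcx2Svc", "Media Center Extender Service"),
    ("Spooler", "Print Spooler"),
    ("ehRecvr", "Windows Media Center Receiver Service"),   # assumed short name
    ("ehSched", "Windows Media Center Scheduler Service"),  # assumed short name
]

# Print the short name and full name to document for later.
for short, display in find_app_services(SERVICES, "Media Center"):
    print(f"{short}\t{display}")
```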
The next item to look for is custom performance counters for the application you want to monitor. For this example I was checking whether Windows Media Center has its own performance counters. A scan through Performance Monitor (on the server itself open perfmon, choose to add counters, and scan through the list for any specific to your application) did not indicate the existence of any Windows Media Center specific performance counters.
Some quality time spent on my search engine of choice also did not lead to any indications that there are custom performance counters which are added as part of Windows Media Center.
The next big item we are looking for is which events are relevant to your particular application. A scan through the Event Viewer (open eventvwr, connect to the remote computer and scan through the various event logs to identify any specific to your application) indicated that Media Center has its own event log, which makes it much easier to identify relevant events for this application. If there had not been a separate log, I would have begun looking through the Application log for any sources which indicate that the events came from Media Center.
To identify the initial relevant events for the Windows Media Center application I started by reviewing this log and using filters to remove events which I had already documented in Excel. The trick below shows how you can find what events are being recorded by an application and filter out the ones that you have already documented by adding an exclude filter on the event IDs (see the -1, -17, -0, -115, -117, -24, -3, -6, -301 in the screenshot below for an example).
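The same exclude-filter idea works on an exported list of events. The Python sketch below parses a filter string in the "-ID" form quoted above and keeps only the events that are not yet documented. The function names and the `SAMPLE_EVENTS` data are hypothetical and only illustrate the technique; this is not how Event Viewer implements its filter.

```python
def parse_excluded_ids(filter_text):
    """Parse an exclude filter such as '-1, -17' into a set of event IDs."""
    excluded = set()
    for token in filter_text.split(","):
        token = token.strip()
        if token.startswith("-"):
            excluded.add(int(token[1:]))
    return excluded


def undocumented_events(events, filter_text):
    """Return the events whose ID is not excluded by the filter."""
    excluded = parse_excluded_ids(filter_text)
    return [event for event in events if event["id"] not in excluded]


# The exclusion list quoted in the text above.
EXCLUDE_FILTER = "-1, -17, -0, -115, -117, -24, -3, -6, -301"

# Made-up sample of exported events (IDs only, for illustration).
SAMPLE_EVENTS = [{"id": 1}, {"id": 117}, {"id": 21}, {"id": 301}]

for event in undocumented_events(SAMPLE_EVENTS, EXCLUDE_FILTER):
    print("Still to document: event ID", event["id"])
```

Each time you document another event, add its ID to the filter and rerun the review, so the list of remaining events keeps shrinking.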
Using the Media Center event log made it easy to identify relevant events which had occurred historically. When reviewing this I was looking for the following key pieces of information: Event number, Event source, Event ID (and of course what computer it was being logged by and what event log it was being written to). These pieces of information are what we will need to write our rules or monitors which will be discussed later in this blog series. Two examples of this are shown below with the abrupt disconnection of a Media Center Extender and the removal of a recording.
Event ID: 117
Source: Media Center Extender
Event ID: 21
A complete review of this log, with each event added to the filter as it was documented, resulted in an Excel spreadsheet of relevant events. The spreadsheet categorizes each event as a monitor or a rule, records whether it will alert, and notes what level of alert will be generated when the event occurs (as shown below).
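If you would rather keep this catalog as a file you can script against, a CSV works just as well as Excel. The sketch below is one possible layout in Python: the column names, the `write_catalog` helper, the log name and the monitor/rule and severity values for the two example events are my assumptions for illustration. The source for event 21 is left blank because it is not listed above.

```python
import csv
import io

# Hypothetical column layout for the event catalog.
FIELDS = ["event_id", "source", "log", "workflow", "alerts", "severity"]


def write_catalog(rows, stream):
    """Write the documented events to a CSV stream with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


CATALOG = [
    # Abrupt extender disconnection: assumed to feed a health monitor.
    {"event_id": 117, "source": "Media Center Extender",
     "log": "Media Center", "workflow": "monitor",
     "alerts": "yes", "severity": "warning"},
    # Recording removed: normal operation, so an informational rule.
    {"event_id": 21, "source": "",
     "log": "Media Center", "workflow": "rule",
     "alerts": "yes", "severity": "informational"},
]

buffer = io.StringIO()
write_catalog(CATALOG, buffer)
print(buffer.getvalue())
```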
Summary: A review of the server specifically focused on your application will give you the basic building blocks that you will need to monitor your application including key events, services and performance counters. By gathering this information we can start the process to build out our health model and alerting model.
In the next part of this blog series we will look at adding rules, monitors and services in Operations Manager for the application that you want to monitor.