Let me state up front: this issue was not caused by upgrading my lab environment from Operations Manager 2012 to 2012 R2. It appeared soon after the upgrade, and this post walks through the symptoms I saw and the debug steps I took. The steps are the same ones I would have used in Operations Manager 2012, but several details have changed in 2012 R2. Those changes slowed me down, so I decided to document them here. The differences in 2012 R2 versus 2012 are identified in blue below.
In my environment, the only server which was reporting correctly to Operations Manager 2012 R2 was the management server as shown below.
I took the following debug steps documented in this blog post:
- Rebuilding the Health Service State folder
- Changing the automatically approved agent setting and working with the agent
- Fixing my mistake on the Operations Manager server
Rebuilding the Health Service State folder in Operations Manager 2012 R2
My first thought was that there were issues with the management server as a result of the upgrade. To get to a clean state, I decided to rebuild the health service state folder.
The process to do this is to stop each of the three services:
- Microsoft Monitoring Agent
- System Center Data Access Service
- System Center Management Configuration
These service name changes have also been covered by Stefan Roth at http://blog.scomfaq.ch/2013/07/01/quick-post-scom-2012-r2-new-windows-service-names/.
The health service folder has been moved in 2012 R2. It now resides at: C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Health Service State
The process to rebuild this folder is to stop each of the services (only one or two may be required, but I’ve always done it this way), rename the Health Service State folder (I usually call it Health Service State_old), and restart the three services. The folder is then rebuilt automatically.
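The stop, rename, restart sequence above can be sketched as a small script. This is only a sketch under stated assumptions, not a supported procedure: it assumes `net stop` and `net start` are given the service display names listed in this post, that the script runs elevated on the management server, and that the folder lives at the 2012 R2 path given above. The function names are my own, made up for illustration.

```python
import subprocess
from pathlib import Path

# Display names as listed in this post (assumption: passed to `net` as-is).
SERVICES = [
    "Microsoft Monitoring Agent",
    "System Center Data Access Service",
    "System Center Management Configuration",
]

# Folder location in 2012 R2, as given in this post.
STATE_DIR = Path(
    r"C:\Program Files\Microsoft System Center 2012 R2"
    r"\Operations Manager\Server\Health Service State"
)


def service_commands(action, services=SERVICES):
    """Build the `net stop` / `net start` command lines; nothing is executed here."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    return [["net", action, name] for name in services]


def rename_state_folder(state_dir, suffix="_old"):
    """Rename the folder to '<name>_old' and return the new path."""
    state_dir = Path(state_dir)
    target = state_dir.with_name(state_dir.name + suffix)
    if target.exists():
        raise FileExistsError(f"{target} already exists; move it out of the way first")
    state_dir.rename(target)
    return target


def rebuild_state_folder(state_dir=STATE_DIR, run=subprocess.run):
    """Stop the three services, rename the folder, then start the services again."""
    for cmd in service_commands("stop"):
        run(cmd, check=True)  # raises if a command reports failure
    renamed = rename_state_folder(state_dir)
    for cmd in service_commands("start"):
        run(cmd, check=True)
    return renamed
```

Calling `rebuild_state_folder()` with no arguments would act on the real folder, so it is worth trying the rename against a scratch directory first. The `run` parameter exists only so the command execution can be swapped out while testing.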
Changing the automatically approved agent setting and working with the agent
Upon investigation I found that connections were being closed immediately on the servers being monitored by Operations Manager (see errors 20070 and 21016 in the Operations Manager event log).
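To pull just those two event IDs out of the log from a command prompt, one option is `wevtutil` with an XPath filter. The helper below only builds the command line. It is a sketch that assumes the log is named "Operations Manager" as in the event viewer, and the function names are my own.

```python
def event_query(event_ids):
    """Build an XPath filter matching any of the given event IDs."""
    clauses = " or ".join(f"EventID={int(i)}" for i in event_ids)
    return f"*[System[({clauses})]]"


def wevtutil_command(event_ids, log="Operations Manager", count=20):
    """Command line that prints the newest matching events as text."""
    return [
        "wevtutil", "qe", log,
        f"/q:{event_query(event_ids)}",
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # read newest events first
        "/f:text",       # plain-text output
    ]
```

Passing the result of `wevtutil_command([20070, 21016])` to `subprocess.run` on the affected server should list the most recent occurrences of either event.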
Security on the Operations Manager server was set to automatically approve…
In the past I have seen situations where a setting gets stuck and needs to be manually reset. I changed the setting to reject new manual agent installations, then stopped and started the service on the agent computer. After making this change, I wanted to restart the agent on the server which was not reporting correctly to Operations Manager. The service name has changed (as shown earlier in this blog post) to “Microsoft Monitoring Agent”.
To validate that the agent was installed correctly, I attempted to open the Operations Manager Agent object in the control panel, but it no longer appeared (see the search for Operations in the top right of the control panel shown below).
This is because the control panel object is now called “Microsoft Monitoring Agent” (logically enough).
I validated that the agent's configuration was correct, then checked the same configuration on another agent. The problem was occurring on multiple servers monitored by Operations Manager, which pointed to an issue on the Operations Manager server itself rather than on the agents reporting it.
Fixing my mistake on the Operations Manager server
What I found was that my Operations Manager server did not have a correct IP address and DNS configuration. I had rebuilt the server from the VHD file and had not remembered to hard-code the IP address and DNS server configuration. The result was that this server could not communicate correctly with the domain and was having strange issues like this as a result (oops!). I hard-coded the IP address and set the DNS server configuration correctly.
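In hindsight, a quick name-resolution check run on the Operations Manager server would probably have pointed at the DNS problem much sooner. Here is a minimal sketch using only the Python standard library. The host names in the usage comment are hypothetical placeholders, not the names from my lab.

```python
import socket


def resolve(name):
    """Return the sorted list of addresses a name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def check_names(names):
    """Map each name to its resolved addresses so failures stand out as empty lists."""
    return {name: resolve(name) for name in names}


# Example usage (hypothetical names; substitute your own domain controller and agents):
# print(check_names(["dc01.lab.local", "agent01.lab.local"]))
```

An empty list for the domain controller or for every agent suggests the server's own DNS settings are the thing to look at first.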
My Operations Manager 2012 R2 environment looked much better after the DNS change was in place and the Operations Manager server had been rebooted as shown below.
Summary: My issue traced back to misconfigured network settings on the Operations Manager server, but it proved to be an interesting debugging episode. It showcased several things that are different in Operations Manager 2012 R2 and showed how an environment responds when DNS or domain membership is impacted on an Operations Manager server.