Using Azure Monitor to track system availability works extremely well for most configurations, but what about situations where you can’t install a Log Analytics agent on the system (for example, because the OS is not supported, or because the device is a router where installing an agent is not possible)? For these use cases, we have found it useful to provide a ping-level monitor. This blog post details the solution we developed, which provides ping-level monitoring at an extremely low monthly cost.
The architecture that we are using for this solution runs in Azure Automation using a watcher node and it consists of three runbooks:
- PingMonitor-Watcher.ps1: This script runs every 60 seconds to check whether any ping tests fail, based on the criteria defined in the “PingMonitorDevices” variable (which contains JSON content populated by the PingMonitor-Updater.ps1 script).
- PingMonitor-Updater.ps1: This script automates the population of the “PingMonitorDevices” variable, for example adding or deleting the devices to be tested by the PingMonitor-Watcher.ps1 script.
- PingMonitor-Action.ps1: This script activates when a ping fails for one of the systems defined in the “PingMonitorDevices” variable.
This solution also requires one or more Hybrid Runbook workers where the watcher and action scripts will execute.
Installing the solution:
Pre-requisites: This solution assumes that you already have the following:
- An Azure subscription
- A resource group containing an Azure Automation account
- One or more Azure Hybrid Runbook workers
Adding the runbooks:
Once we have our Azure Automation environment, we can create the three required runbooks as the PowerShell runbook type, using the names defined above (PingMonitor-Watcher, PingMonitor-Updater, PingMonitor-Action). These scripts are available for download here. Once the runbooks have been added, you can save and publish them. After they are created, they should look like the screenshot below:
Create the following three variables with their appropriate content (PingMonitorDevices, PingMonitorWorkspaceId, PingMonitorWorkspaceKey):
- PingMonitorDevices – String which is created as blank, but populated with the correct JSON content by the PingMonitor-Updater.ps1 script (create this as NOT encrypted)
- PingMonitorWorkspaceId – String which contains the Log Analytics Workspace ID (create this as encrypted).
- PingMonitorWorkspaceKey – String which contains the Log Analytics Workspace Key (create this as encrypted).
Once created these should look like the following:
Populating the PingMonitorDevices variable:
The PingMonitor-Updater script populates the information required to ping the various systems. We run this script and provide the following information:
- Device: Name of the device to ping. Required string.
- VariableName: Name of the variable where the content is stored. Optional string; defaults to ‘PingMonitorDevices’.
- ThresholdMinutes: Threshold, in minutes, for how long the system must be unresponsive to ping. Optional int32; defaults to 5.
- SuppressMinutes: How long, in minutes, to suppress alerts before re-sending the alert. Optional int32; defaults to 15.
- Delete: Deletes the record instead of creating one. Optional boolean; defaults to “$false”.
- TestTime: Performs a one-time test of all the connections in your PingMonitorDevices variable, so you can see the execution time and confirm everything is set correctly. Defaults to “$true”.
In my example I used a single value of “TestServer” for the device and took the defaults for the remainder. Below shows the JSON value which the script put into this variable:
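As a rough illustration of what such a variable could hold, the sketch below builds a record from the parameters listed above and stores it as a JSON string. The field layout is an assumption based on the parameter names, not the actual schema written by PingMonitor-Updater.ps1, and the function names are hypothetical. Python is used only to illustrate the idea; the real runbooks are PowerShell.

```python
import json


def build_device_record(device, threshold_minutes=5, suppress_minutes=15):
    """Return one hypothetical device entry using the documented defaults."""
    # Assumption: field names simply mirror the Updater's parameter names.
    return {
        "Device": device,
        "ThresholdMinutes": threshold_minutes,
        "SuppressMinutes": suppress_minutes,
    }


def upsert_device(devices_json, record, delete=False):
    """Add, replace, or delete a record in the JSON string held by the variable."""
    # The variable is created blank, so treat an empty string as an empty list.
    devices = json.loads(devices_json) if devices_json.strip() else []
    # Drop any existing entry for the same device so names stay unique.
    devices = [d for d in devices if d["Device"] != record["Device"]]
    if not delete:
        devices.append(record)
    return json.dumps(devices)


# Start from a blank variable and add a single device with the defaults.
variable_value = upsert_device("", build_device_record("TestServer"))
print(variable_value)
```

Running the same call with `delete=True` would remove the entry again, which mirrors the role of the Delete parameter.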
Scheduling the watcher task:
Now that all of the scripts and variables are in place, we can configure PingMonitor-Watcher to run as a watcher task. This is done under Process Automation / Watcher tasks.
We add a watcher task, which I named “PingMonitor”, with a frequency of 1 minute. I pointed it to the PingMonitor-Watcher runbook for the watcher and the PingMonitor-Action runbook for the action.
This creates the task and the user experience shows the last watcher status in this same pane.
Additionally, you can dig into the various watcher task runs to see if data is being written such as in the example below:
How does this all come together?
So how does this all work once it’s installed? The watcher task checks every minute for ping failures. If it finds one, it writes the details to the PingMonitor_CL custom log in the Log Analytics workspace. If none of the pings fail, it writes nothing to the workspace. Once this information is in Log Analytics, we can use Azure Monitor to send an alert whenever a ping failure occurs. We can also surface this information via dashboards in Azure (both of these topics will be covered in the next blog post).
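The per-device decision the watcher makes on each pass might look roughly like the sketch below. This is an illustration under stated assumptions, not the logic of the published PingMonitor-Watcher.ps1: it assumes the watcher remembers when a device first stopped responding and when a failure was last logged, and it interprets ThresholdMinutes and SuppressMinutes as described in the parameter list. The function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def should_alert(now, first_failure, last_alert,
                 threshold_minutes=5, suppress_minutes=15):
    """Decide whether a failing device should be logged on this pass.

    first_failure: when the device first stopped answering (None if it is up).
    last_alert:    when a failure was last logged (None if never).
    """
    if first_failure is None:
        return False  # device answers ping, so nothing is written
    if now - first_failure < timedelta(minutes=threshold_minutes):
        return False  # still inside the threshold window
    if last_alert is not None and now - last_alert < timedelta(minutes=suppress_minutes):
        return False  # an alert was logged recently, so suppress the repeat
    return True


now = datetime(2024, 1, 1, 12, 0)
print(should_alert(now, None, None))                        # healthy device
print(should_alert(now, now - timedelta(minutes=3), None))  # down, under threshold
print(should_alert(now, now - timedelta(minutes=6), None))  # down, over threshold
```

With the defaults, a device has to be unreachable for at least five minutes before anything is logged, and a repeat entry is held back for fifteen minutes after the previous one.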
It is important to note that the watcher task has a 30-second maximum to complete. Additionally, as mentioned earlier in this blog post, this solution does NOT write data when systems respond to ping successfully. We only write data when errors are occurring (i.e., systems are offline).
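For readers curious how a failure record could reach the PingMonitor_CL log, the sketch below prepares a request for the Azure Monitor HTTP Data Collector API, which accepts a workspace ID and key and appends `_CL` to the supplied log type. Whether the published runbooks use this API is an assumption (the workspace ID and key variables suggest it), and the record fields shown are hypothetical. The sketch only builds the URL, headers, and body; it does not send anything.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timezone


def build_request(workspace_id, workspace_key, records, log_type="PingMonitor"):
    """Build the URL, headers, and body for a Data Collector API POST."""
    body = json.dumps(records).encode("utf-8")
    rfc1123_date = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
    # The signature covers the method, body length, content type, date, and resource.
    string_to_sign = (
        f"POST\n{len(body)}\napplication/json\nx-ms-date:{rfc1123_date}\n/api/logs"
    )
    digest = hmac.new(
        base64.b64decode(workspace_key),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    url = f"https://{workspace_id}.ods.opinsights.azure.com/api/logs?api-version=2016-04-01"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"SharedKey {workspace_id}:{signature}",
        "Log-Type": log_type,  # Log Analytics stores this as PingMonitor_CL
        "x-ms-date": rfc1123_date,
    }
    return url, headers, body


# Placeholder credentials for illustration only; not a real workspace.
fake_id = "00000000-0000-0000-0000-000000000000"
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
# Assumption: these record fields are hypothetical, not the runbook's schema.
url, headers, body = build_request(fake_id, fake_key, [{"Device": "TestServer", "Status": "PingFailed"}])
print(url)
print(headers["Log-Type"])
```

Because nothing is posted when every ping succeeds, a request like this would only be built on passes where at least one device has failed.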
Azure Cost breakdown:
To get the data into Log Analytics, this solution uses both Azure Automation runbooks and watcher tasks. The prices for these are below:
- Watcher tasks:
  - Charged at $0.002 per hour.
  - This would be up to $1.50 a month.
  - However, the first one is free each month, so the watcher task should be free if this is the only watcher task.
- Azure Automation runbooks:
  - Charged at $0.002 per minute.
  - On average these jobs run in about 1 minute to log the data into Log Analytics.
  - However, the first 500 minutes are free, so this task should be free if it is the only Azure Automation runbook.
- In general, the cost for this, regardless of whether existing watcher tasks and Azure Automation runbooks are in place, should be less than $5 a month.
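The arithmetic behind these figures can be checked with a short sketch. It uses only the rates quoted above (the hourly task rate, the per-minute runbook rate, and the 500 free minutes); current Azure pricing may differ. The 31-day month and the 1,500 job minutes in the example are assumptions chosen for illustration, and the function names are hypothetical.

```python
HOURLY_TASK_RATE = 0.002      # dollars per hour, as quoted in this post
RUNBOOK_RATE_PER_MIN = 0.002  # dollars per job minute, as quoted in this post
FREE_RUNBOOK_MINUTES = 500    # free job minutes per month, as quoted in this post


def hourly_task_cost(hours_in_month=24 * 31, free=True):
    """Monthly cost of the hourly-billed task; zero when it is the free one."""
    return 0.0 if free else round(hours_in_month * HOURLY_TASK_RATE, 2)


def runbook_cost(job_minutes):
    """Monthly cost of runbook job time after the free minutes are used up."""
    billable = max(0, job_minutes - FREE_RUNBOOK_MINUTES)
    return round(billable * RUNBOOK_RATE_PER_MIN, 2)


# Worst case for the hourly task: a full 31-day month with no free allowance.
print(hourly_task_cost(free=False))
# Assumed example: 1,500 job minutes in a month, of which 1,000 are billable.
print(runbook_cost(1500))
```

Under these assumptions a full month of the hourly task rounds to just under $1.50, and even 1,500 job minutes adds only about $2, which is consistent with the estimate of less than $5 a month.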
Two other existing blog posts have a similar goal (pinging systems via Azure Monitor/Log Analytics). Their approaches focus on running scheduled tasks or running the tasks via an Azure Hybrid Runbook worker. Those approaches are good as well and would be the best choice if you need ping statistics (response time, etc.), but if the goal is to keep the cost down, the approach in this blog post is significantly less expensive to run on a monthly basis.
Summary: If you are looking for a cost-effective way to get notifications from Azure Monitor when systems or devices are not responding, you will want to try this solution out! In the next part of this blog series, Cameron Fuller will show how we can alert on failed ping responses and how this data can be showcased in an Azure dashboard.