Disassembled

In this blog post, we will disassemble the structure of Azure alerts for metrics and logs, compare what each type of alert includes, and point out challenges with the alert functionality currently available in Azure. In the first part of this blog series, we introduced the new dynamic threshold functionality available in Azure Monitor.

So, what do the alerts look like?

The answer is that it varies widely based on the type of alert (metric-based or Log Analytics-based).

Metric-based alert format:

Below is a sample alert format for a metric-based alert, based on what I have seen for this CPU alert.

Subject format:

  • <Fired or Resolved>:Sev<# for severity> Azure Monitor Alert <Alert Name> on <Resource name> ( <Resource type> ) at <date time>

Subject samples:

  • Fired:Sev3 Azure Monitor Alert CPU Utilization Alert on ln-tctester-03 ( microsoft.compute/virtualmachines ) at 3/27/2022 3:19:40 AM
  • Resolved:Sev3 Azure Monitor Alert CPU Utilization Alert on ln-tctester-03 ( microsoft.compute/virtualmachines ) at 3/27/2022 4:58:14 AM
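Because the subject follows a fixed pattern, it can be split back into its parts, for example by a ticketing system that receives these emails. The following is a minimal Python sketch under stated assumptions: the regular expression is derived only from the samples above, the function and group names are hypothetical, and the date format is assumed to be the US style shown in the samples.

```python
import re
from datetime import datetime

# Pattern derived from the subject samples above (an assumption, not an
# official format specification). Group names are illustrative.
SUBJECT_RE = re.compile(
    r"^(?P<condition>Fired|Resolved):Sev(?P<severity>\d+) "
    r"Azure Monitor Alert (?P<alert_name>.+) on (?P<resource>\S+) "
    r"\( (?P<resource_type>\S+) \) at (?P<timestamp>.+)$"
)


def parse_subject(subject: str) -> dict:
    """Split an alert email subject into its individual parts."""
    match = SUBJECT_RE.match(subject.strip())
    if match is None:
        raise ValueError(f"Unrecognized subject: {subject!r}")
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    # Assumes the month/day/year, 12-hour clock format seen in the samples.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%m/%d/%Y %I:%M:%S %p"
    )
    return fields


sample = (
    "Fired:Sev3 Azure Monitor Alert CPU Utilization Alert on ln-tctester-03 "
    "( microsoft.compute/virtualmachines ) at 3/27/2022 3:19:40 AM"
)
parsed = parse_subject(sample)
print(parsed["condition"], parsed["severity"], parsed["alert_name"])
```

A resource name that contains spaces would not match this pattern, so treat it as a starting point rather than a complete parser.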

Body sample: (Alert sensitivity, Metric alert condition type, Metric name, and Metric namespace are fields that do not exist in Log Analytics-based alerts)

  • <URL link to the alert in Azure Monitor>
  • Alert name                             <Name of the alert>
  • Severity                                 Sev3
  • Monitor condition                   Fired
  • Affected resource                   <VM name>
  • Resource type                        compute/virtualmachines
  • Resource group                      <RG name>
  • Subscription                           <Subscription>
  • Monitoring service                  Platform
  • Signal type                              Metric
  • Fired time                                <Date/Time fired>
  • Alert ID                                    <GUID alert ID>
  • Alert rule ID                             <URL link to the alert ID>
  • Time aggregation                     Average
  • Metric value (when alert fired) 76475
  • Alert sensitivity                         High
  • Operator                                   LowerThan
  • Threshold                                02253553515001
  • Number of violations               4
  • Number of examined periods  4
  • Metric alert condition type   DynamicThresholdCriteria
  • Metric name                           Percentage CPU
  • Metric namespace                 Compute/virtualMachines

Microsoft has effectively provided all of the potentially relevant content in the email. This is logical as these emails may be sent to ticketing systems, so any relevant fields should be included.

For a metric alert, the email contains pretty much everything you need to know about the alert condition.

From a usability perspective, the alert has the relevant information, including a link to the alert in Azure Monitor, a link to the alert rule, the metric's value, and how many violations occurred versus how many periods were examined.
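To make the relationship between Operator, Threshold, Number of violations, and Number of examined periods concrete, here is a minimal Python sketch. It is an illustration only, not Azure's implementation: the function names and sample values are hypothetical, and for a DynamicThresholdCriteria alert the threshold is calculated by Azure rather than set by hand.

```python
import operator

# Operator names as they appear in the sample alert bodies in this post.
# Mapping them to Python comparisons is an assumption for illustration.
OPERATORS = {
    "GreaterThan": operator.gt,
    "LowerThan": operator.lt,
}


def count_violations(values, op_name, threshold):
    """Count how many examined periods breach the threshold."""
    compare = OPERATORS[op_name]
    return sum(1 for value in values if compare(value, threshold))


def should_fire(values, op_name, threshold, min_violations):
    """Fire when enough of the examined periods are in violation."""
    return count_violations(values, op_name, threshold) >= min_violations


# Hypothetical metric values for four examined periods.
examined = [0.10, 0.20, 0.15, 0.05]
print(count_violations(examined, "LowerThan", 0.5))            # 4
print(should_fire(examined, "LowerThan", 0.5, min_violations=4))  # True
```

In this example all four examined periods are below the threshold, which matches the "4 violations out of 4 examined periods" pattern in the sample body.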

Log Analytics-based alert format:

Now let’s look at what happens if you decide to use Log Analytics as your data source for an alert. Below is a sample alert format for a Log Analytics-based alert, based on what I have seen for this CPU alert. The subject format is the same as for metric-based alerts and is repeated below.

Subject format:

  • <Fired or Resolved>:Sev<# for severity> Azure Monitor Alert <Alert Name> on <Resource name> ( <Resource type> ) at <date time>

Subject samples:

  • Fired:Sev3 Azure Monitor Alert CPU Utilization Alert on ln-tctester-03 ( microsoft.compute/virtualmachines ) at 3/27/2022 3:19:40 AM

Body sample: (the fields from Description onward do not exist for metric-based alerts and are moved to the bottom for readability)

  • <URL link to the alert in Azure Monitor>
  • Alert name                             <Name of the alert>
  • Severity                                 Sev3
  • Monitor condition                   Fired
  • Affected resource                  <VM name>
  • Resource type                       compute/virtualmachines
  • Resource group                     <RG name>
  • Subscription                           <Subscription>
  • Monitoring service                  Log Alerts V2
  • Signal type                              Log
  • Fired time                               <Date/Time fired>
  • Alert ID                                   <GUID alert ID>
  • Alert rule ID                             <URL link to the alert ID>
  • Time aggregation                    Count
  • Operator                                  GreaterThan
  • Threshold                                0
  • Metric value (when alert fired) 1
  • Number of violations                1
  • Number of examined periods  1
  • Description                            <Alert description>
  • Dimension name1                 _ResourceId
  • Dimension value1                  <Resource Id>
  • Filtered search results          <URL to filtered search results>
  • Search results                        <URL to search results>
  • Search query                          <Kusto query created for the alert>
  • Target resource types           [‘Microsoft.OperationalInsights/workspaces’]

The additional fields cover the alert description and how query results are split, and they include links to the search results and the query that ran as part of the alert. The alert does not know what the query is checking for; it only knows that the query ran and what result came back.
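The two body samples can also be compared programmatically. The short Python sketch below uses the field names exactly as they appear in the samples in this post (the variable names are my own) and computes which fields are unique to each alert type.

```python
# Field names copied from the metric-based body sample in this post.
METRIC_FIELDS = {
    "Alert name", "Severity", "Monitor condition", "Affected resource",
    "Resource type", "Resource group", "Subscription", "Monitoring service",
    "Signal type", "Fired time", "Alert ID", "Alert rule ID",
    "Time aggregation", "Metric value (when alert fired)",
    "Alert sensitivity", "Operator", "Threshold", "Number of violations",
    "Number of examined periods", "Metric alert condition type",
    "Metric name", "Metric namespace",
}

# Field names copied from the Log Analytics-based body sample in this post.
LOG_FIELDS = {
    "Alert name", "Severity", "Monitor condition", "Affected resource",
    "Resource type", "Resource group", "Subscription", "Monitoring service",
    "Signal type", "Fired time", "Alert ID", "Alert rule ID",
    "Time aggregation", "Operator", "Threshold",
    "Metric value (when alert fired)", "Number of violations",
    "Number of examined periods", "Description", "Dimension name1",
    "Dimension value1", "Filtered search results", "Search results",
    "Search query", "Target resource types",
}

metric_only = sorted(METRIC_FIELDS - LOG_FIELDS)
log_only = sorted(LOG_FIELDS - METRIC_FIELDS)
shared = sorted(METRIC_FIELDS & LOG_FIELDS)

print("Only in metric alerts:", metric_only)
print("Only in log alerts:", log_only)
print("Shared field count:", len(shared))
```

Based on these samples, four fields appear only in the metric alert and seven only in the log alert, while the remaining 18 are shared.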

NOTE: As an example, for a high CPU condition, the alert may not know how high the CPU is, how long it has been that high, and other relevant pieces of information.

Challenges with Azure Monitor alerts:

Alert readability: The first big challenge with these alerts is the volume of information that is contained in the alert. The sheer amount of data makes understanding what is in the alert very challenging. In a later blog post, I will show a simplified version of alerting focused on only providing the required information.

  • Metric alert: The email sent for a metric alert is three pages long (tested by copying and pasting into Word)
  • Log alert: The email sent for a Log Analytics query-based alert is four pages long (this will vary depending on the URL length and query length)

Alert customization: Customization of alerts is not available for either metrics or logs. That is, you cannot suppress specific fields from an alert, add fields to an alert, or change the default structure of an alert.

 

Summary: This blog post covered the internals of what is included in alerts for metrics and logs and highlighted the differences between them. It also raised two significant challenges I have seen when working with them (readability and customization). The next blog post will show how to create custom alert formats for Log Analytics alerts.