Azure Monitor Alerting with PowerShell

In my opinion, part of monitoring with Azure Monitor is being able to alert at scale. Alerting at scale means a few things to me. First, it means being able to create alerts in a programmatic fashion. Second, it means creating alerts that cover multiple resources at once, for instance a CPU alert for all current and future IaaS VMs, or DTU alerts for Azure SQL PaaS across entire resource groups. For alerting at scale with Azure Monitor we have ARM templates and, of course, PowerShell.

Azure Monitor Powershell Modules

At present, we have four main PowerShell modules.

  • Az.Monitor
  • Az.OperationalInsights
  • Az.AlertsManagement
  • Az.ApplicationInsights

I’m not going to list all the commands available, because there are a lot. The Application Insights and Operational Insights (Log Analytics) modules both have commands for creating, modifying and managing their respective resources. The module of interest for this post is Az.Monitor. There is also a fifth module, OmsIngestionAPI, which I’ve covered here. In my opinion, the commands available in Az.AlertsManagement should just be included in the Az.Monitor module. That said, the commands in Az.AlertsManagement are largely related to smart groups and action rules.
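Rather than listing every command here, you can ask PowerShell for them. The snippet below is a minimal sketch that assumes the Az modules are already installed on your machine. It returns nothing for a module that is missing.

```powershell
# Show which of the four modules are installed locally, and at what version
Get-Module -ListAvailable -Name Az.Monitor, Az.OperationalInsights, Az.AlertsManagement, Az.ApplicationInsights |
    Select-Object Name, Version

# List only the Action Group commands in Az.Monitor
Get-Command -Module Az.Monitor -Noun AzActionGroup*

# Count the commands exposed by the two alert-related modules
Get-Command -Module Az.Monitor, Az.AlertsManagement |
    Group-Object Source |
    Select-Object Name, Count
```

The exact command list depends on the module version you have installed, so treat the output on your machine as the source of truth.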

Action Groups

Because you cannot create an Alert in Azure Monitor via PowerShell without providing an Action Group, we’re going to start with Action Groups. The available commands for Action Groups in the Az.Monitor module are as follows:

  • Get-AzActionGroup
  • New-AzActionGroup
  • New-AzActionGroupReceiver
  • Remove-AzActionGroup
  • Set-AzActionGroup

Thanks to PowerShell’s wonderful verb-noun command naming, these are pretty clear on what they do. Except for New-AzActionGroup. This command does not do what you think it does. In fact, it’s not even used in the creation of new Action Groups. When looking at the command, the only required parameter is ActionGroupId. What this command does is create a PowerShell object with the Action Group ID in it.

To create a new Action Group you need to use New-AzActionGroupReceiver and Set-AzActionGroup and your code will look something like this.

$email = New-AzActionGroupReceiver -Name "hasmug" -EmailReceiver -EmailAddress "hasmug@hasmug.com"
Set-AzActionGroup -Name "HASMUG Action Group" -ResourceGroup $rg -ShortName "MUG" -Receiver $email
This example creates a new email receiver “hasmug@hasmug.com” and then creates a new Action Group called HASMUG Action Group. It then adds the email receiver to that action group.
You’ll use New-AzActionGroup when actually creating alerts. For reasons unknown to me, you have to use it. Let’s say you’ve already created your Action Group and you just want to create a new alert that uses it. You would first use Get-AzActionGroup, as I have done below.
I have assigned the result to the variable $act and then passed $act.id to New-AzActionGroup.
$act = Get-AzActionGroup -ResourceGroupName VS_Monitoring -Name email
$action = New-AzActionGroup -ActionGroupId $act.id
The output of both is below.

[Screenshot: output of Get-AzActionGroup and New-AzActionGroup]
Why the alert commands won’t take the first object but will take the second is pretty annoying to me. The only difference is that the New-AzActionGroup command gave the value a name to go with it.
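If you want to see the difference for yourself, you can compare the two objects directly. This is a small sketch that reuses the resource group and Action Group names from the example above. It only inspects the objects and makes no changes in Azure.

```powershell
$act    = Get-AzActionGroup -ResourceGroupName VS_Monitoring -Name email
$action = New-AzActionGroup -ActionGroupId $act.Id

# Compare the .NET types of the two objects
$act.GetType().FullName
$action.GetType().FullName

# List the properties each object exposes
$act    | Get-Member -MemberType Property
$action | Get-Member -MemberType Property
```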
You can use New-AzActionGroupReceiver to create receivers for any of the different types of actions that are available, and then add them to an Action Group.
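As a rough sketch of what that looks like with more than one receiver type, the example below builds an email, an SMS and a webhook receiver and puts all three in one Action Group. The resource group name, phone number and webhook URL are placeholders I made up for illustration. It assumes the same Az.Monitor command set used in this post, so check Get-Command against your installed version if a parameter is not recognised.

```powershell
# Placeholder resource group: swap in your own
$rg = "VS_Monitoring"

# Email receiver, same as the earlier example
$email = New-AzActionGroupReceiver -Name "hasmug-email" -EmailReceiver -EmailAddress "hasmug@hasmug.com"

# SMS receiver (placeholder country code and phone number)
$sms = New-AzActionGroupReceiver -Name "oncall-sms" -SmsReceiver -CountryCode "1" -PhoneNumber "5555555555"

# Webhook receiver (placeholder URL)
$webhook = New-AzActionGroupReceiver -Name "ticket-webhook" -WebhookReceiver -ServiceUri "https://example.com/hooks/alerts"

# Create (or overwrite) the Action Group with all three receivers
Set-AzActionGroup -Name "HASMUG Action Group" -ResourceGroupName $rg -ShortName "MUG" -Receiver $email, $sms, $webhook
```

Because Set-AzActionGroup both creates and updates, pass the full list of receivers you want each time you run it.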

Azure Monitor Metrics

In my opinion, most alerts should be some sort of metric-based alert. Don’t get me wrong, there are numerous things you can do with log search alerts, and often you may need to. But when it comes to Azure resources, there is a lot Azure Monitor Metrics can cover, from Storage Accounts to Azure SQL PaaS and IaaS VMs. I prefer to use metric alerts where I can because they are stateful within Azure Monitor, meaning Azure Monitor remembers the last state that metric was in. Log search alerts, by contrast, will keep alerting on the same issue each time a log is ingested with the same threshold breach. So for instance, if you set up a CPU alert for a VM and that VM breaches the threshold for 30 minutes, a metric alert is going to alert you one time, and a log search alert is going to alert you at whatever collection interval you have set for performance counters.
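To make the stateful versus stateless difference concrete, here is a toy simulation in plain PowerShell. It does not call Azure Monitor at all. The sample values and both function names are made up for illustration, and it simply models "notify on the transition into a breach" against "notify on every evaluation that sees a breach".

```powershell
# Toy model only: one CPU sample per evaluation, six breaches in a row
$samples   = 40, 55, 95, 97, 93, 96, 94, 92, 60, 45
$threshold = 90

function Get-StatefulAlertCount {
    param([int[]]$Samples, [int]$Threshold)
    $fired   = 0
    $inAlert = $false
    foreach ($value in $Samples) {
        $breached = $value -ge $Threshold
        # Notify only when moving from healthy to breached
        if ($breached -and -not $inAlert) { $fired++ }
        $inAlert = $breached
    }
    return $fired
}

function Get-StatelessAlertCount {
    param([int[]]$Samples, [int]$Threshold)
    # Notify on every evaluation that sees a breach
    return @($Samples | Where-Object { $_ -ge $Threshold }).Count
}

$stateful  = Get-StatefulAlertCount  -Samples $samples -Threshold $threshold
$stateless = Get-StatelessAlertCount -Samples $samples -Threshold $threshold
"Stateful notifications:  $stateful"   # 1
"Stateless notifications: $stateless"  # 6
```

With these samples the stateful model notifies once for the whole breach, while the stateless model notifies six times, once per breached sample.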

Get Resource Specific Metrics

You can get any currently available metric by using Get-AzMetricDefinition. You will also need the Resource ID of the specific resource you want the metrics for. Unfortunately, I am not aware of a way to get available metrics for a specific Azure service without having an existing resource. You can, however, view all the supported metrics based on Azure services here.

So for instance, if you wanted to see all the available metrics for a Log Analytics workspace, you could run the following commands, using Get-AzOperationalInsightsWorkspace and Get-AzMetricDefinition.

$workspace = Get-AzOperationalInsightsWorkspace -Name VS-Sandlot -ResourceGroupName VS_Monitoring

Get-AzMetricDefinition -ResourceId $workspace.ResourceId
First note, pay close attention to the values. These are what’s used in both PowerShell and ARM templates. For instance, in Log Analytics when using Windows VMs, one of the available RAM counters is “% Committed Bytes In Use”. But the actual value you need to use is “Average_% Committed Bytes In Use”.
Second note, if you want to get live metrics off of a resource you can use Get-AzMetric. This command has a lot of parameter options to collect metrics with.
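Both notes can be sketched in a few lines. The example below reuses the workspace from above, prints just the metric names you would pass to the alert commands, and then pulls the last hour of data for one of them. The one hour window and five minute grain are arbitrary choices on my part.

```powershell
$workspace = Get-AzOperationalInsightsWorkspace -Name VS-Sandlot -ResourceGroupName VS_Monitoring

# Print only the metric names, i.e. the values to use as -MetricName
Get-AzMetricDefinition -ResourceId $workspace.ResourceId |
    ForEach-Object { $_.Name.Value }

# Pull the last hour of CPU data at a 5 minute grain
$query = @{
    ResourceId      = $workspace.ResourceId
    MetricName      = "Average_% Processor Time"
    StartTime       = (Get-Date).AddHours(-1)
    EndTime         = Get-Date
    TimeGrain       = New-TimeSpan -Minutes 5
    AggregationType = "Average"
}
$metric = Get-AzMetric @query

# Each data point carries a timestamp and the requested aggregation
$metric.Data | Select-Object TimeStamp, Average
```

Looking at real values like this before picking a threshold is a cheap way to avoid alerts that either never fire or fire constantly.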

Create Metric Alerts with PowerShell

To create a Metric Alert we need quite a few things. First for the alert itself, we need:

  • Resource Group Name
  • Window Size
  • Frequency
  • Target ResourceId
  • Condition
  • Action Group
  • Severity
  • Dimension Selection
  • Time Aggregation
  • Operator
  • Threshold

Second, we need three PowerShell commands: two to create our alert criteria and dimensions, and one that creates the actual alert. To create the criteria we use New-AzMetricAlertRuleV2Criteria. Dimension selection is not a required parameter for the criteria command. However, there are instances where not defining a dimension will give us very generic alert results, so I use New-AzMetricAlertRuleV2DimensionSelection, especially when setting alerts against a Log Analytics workspace.


#set dimensions of Alert to Computer. This will alert on all current and future computer members of the workspace
$dim = New-AzMetricAlertRuleV2DimensionSelection -DimensionName "Computer" -ValuesToInclude "*"

#set alert criteria and counter % Processor Time
$criteria = New-AzMetricAlertRuleV2Criteria -MetricName "Average_% Processor Time" `
    -DimensionSelection $dim `
    -TimeAggregation Average `
    -Operator GreaterThanOrEqual `
    -Threshold 90

This code sets up a dimension on the Computer name, then sets our criteria: the % Processor Time counter, average time aggregation, greater than or equal as the operator, and a threshold of 90. Notice that $dim has been passed into the criteria command as the DimensionSelection parameter.

Finally, we’ll use Add-AzMetricAlertRuleV2 to create the alert.

Add-AzMetricAlertRuleV2 -Name "Windows and Linux CPU Alert" `
    -ResourceGroupName $RGObject.ResourceGroupName `
    -WindowSize 00:05:00 `
    -Frequency 00:01:00 `
    -TargetResourceId $ResourceId `
    -Condition $criteria `
    -ActionGroup $action `
    -Severity $severity

This command creates the alert in the resource group you specify, against the Resource ID you specify, using the Action Group we got earlier in the post along with our criteria, severity, window size, and evaluation frequency.
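Putting the pieces from this post together, an end to end version might look something like the sketch below. The resource group, workspace and Action Group names are the ones used earlier in the post. The severity of 2 and the use of splatting in place of backticks are my own choices for readability.

```powershell
# Names reused from earlier examples: swap in your own
$rgName    = "VS_Monitoring"
$workspace = Get-AzOperationalInsightsWorkspace -Name "VS-Sandlot" -ResourceGroupName $rgName

# Look up the existing Action Group and wrap its ID for the alert command
$act    = Get-AzActionGroup -ResourceGroupName $rgName -Name "email"
$action = New-AzActionGroup -ActionGroupId $act.Id

# Alert on every current and future computer in the workspace
$dim = New-AzMetricAlertRuleV2DimensionSelection -DimensionName "Computer" -ValuesToInclude "*"

$criteriaParams = @{
    MetricName         = "Average_% Processor Time"
    DimensionSelection = $dim
    TimeAggregation    = "Average"
    Operator           = "GreaterThanOrEqual"
    Threshold          = 90
}
$criteria = New-AzMetricAlertRuleV2Criteria @criteriaParams

# Splat the alert settings instead of chaining backticks
$alertParams = @{
    Name              = "Windows and Linux CPU Alert"
    ResourceGroupName = $rgName
    WindowSize        = New-TimeSpan -Minutes 5
    Frequency         = New-TimeSpan -Minutes 1
    TargetResourceId  = $workspace.ResourceId
    Condition         = $criteria
    ActionGroup       = $action
    Severity          = 2   # assumed severity for illustration
}
Add-AzMetricAlertRuleV2 @alertParams
```

Splatting keeps each setting on its own line without relying on trailing backticks, which are easy to break with a stray space.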

 

Summary

When we start looking into automating alerts, there are a number of options available between PowerShell and ARM templates. This post has gone over just a few of the key things you need when using PowerShell to automate alerts. That said, when you start needing to create 5 or 10 alerts for IaaS VMs, and then more alerts for Azure SQL, Azure Functions or App Services, you’re now looking at potentially dozens or hundreds of alerts. The need to start automating that, just to save time, becomes a lot more apparent.
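As a rough idea of where this leads, the sketch below loops over a small table of alert definitions and creates one alert per row against the same workspace. The table layout, alert names, thresholds and severities are all hypothetical. Only the two counter names come from this post.

```powershell
# Names reused from earlier examples: swap in your own
$rgName    = "VS_Monitoring"
$workspace = Get-AzOperationalInsightsWorkspace -Name "VS-Sandlot" -ResourceGroupName $rgName
$act       = Get-AzActionGroup -ResourceGroupName $rgName -Name "email"
$action    = New-AzActionGroup -ActionGroupId $act.Id
$dim       = New-AzMetricAlertRuleV2DimensionSelection -DimensionName "Computer" -ValuesToInclude "*"

# Hypothetical alert definitions: one row per alert to create
$definitions = @(
    @{ Name = "CPU Alert";    Metric = "Average_% Processor Time";          Threshold = 90; Severity = 2 }
    @{ Name = "Memory Alert"; Metric = "Average_% Committed Bytes In Use"; Threshold = 85; Severity = 2 }
)

foreach ($def in $definitions) {
    # Build the criteria for this row
    $criteria = New-AzMetricAlertRuleV2Criteria -MetricName $def.Metric -DimensionSelection $dim -TimeAggregation Average -Operator GreaterThanOrEqual -Threshold $def.Threshold

    # Create the alert for this row
    $alertParams = @{
        Name              = $def.Name
        ResourceGroupName = $rgName
        WindowSize        = New-TimeSpan -Minutes 5
        Frequency         = New-TimeSpan -Minutes 1
        TargetResourceId  = $workspace.ResourceId
        Condition         = $criteria
        ActionGroup       = $action
        Severity          = $def.Severity
    }
    Add-AzMetricAlertRuleV2 @alertParams
}
```

Adding another alert then becomes a one line change to the table instead of another copy of the whole script.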

 

The follow-up post for IaaS can be found here: https://www.systemcenterautomation.com/2020/01/azure-monitor-alerting-at-scale-iaas/