Did you know we can now monitor and alert on Key Vault certificates and their expiration? I’m sure you did; there’s post after post of people showing this with Event Grid. So what’s the point of my post then? Well, dear reader, if you’ve ever read any of my posts or know anything about me, you know I write about Azure Monitor, Log Analytics and the like. I’m here to tell you that we don’t need Event Grid to do these alerts.
For years now, the ability to monitor and alert on Key Vault certificate expiry has been an extremely common ask. This ability was finally added, per the official docs: Monitoring Key Vault with Azure Event Grid | Microsoft Learn. This uses Event Grid, though.
This decision is honestly a little perplexing and frustrating, as we already have an official tool in Azure for these types of things: Azure Monitor. In many environments, it’s generally a best practice to have all monitoring data as centralized as possible, but that’s a topic for another post. One of the things the docs tout is the ability to send the topic from Event Grid to a Logic App, Azure Function or even a Webhook! If you read that and are thinking it sounds an awful lot like what Action Groups can do, you would be correct. From Azure Monitor, we can send alerts to all those services and more, such as sending the alerts directly to your ITSM ticketing system, such as ServiceNow. All without the extra service in the middle, I might add.
To alert on certificate expiration, all you need is a Log Analytics workspace, in addition to the Key Vault that houses your certificate(s) and secret(s). Yes, as a bonus, I will also show you how to alert on secret expiration.
To monitor certificate and secret expiration, you’ll need to enable diagnostic settings on your Key Vault(s), if they aren’t enabled already.
Then, when importing/generating your certificates, you have options for the lifetime action type, and the slider bar changes depending on whether you select email at a percentage of lifetime or at a number of days before expiry.
If you want to do this to certificates that have already been generated/imported, you can click on the certificate in your Key Vault and then select Issuance Policy, to bring up a similar context menu.
Setup at Scale
Right now some of you might be thinking: but Billy, I’ve got hundreds or thousands of Key Vaults, each with dozens of secrets and certificates in them, and no one can do this by hand. We’ve got you covered. First, there is an API you can use: Event Subscriptions – REST API (Azure Event Grid) | Microsoft Learn
If anyone knows whether this can be done from an ARM/Bicep template, please let me know because I’d love to include that as well.
But the easier method would be to use Azure Policy. We have a built-in policy called “Certificates should have the specified lifetime action triggers”.
This policy sets exactly what I just showed above for a single certificate, for all Key Vaults you assign it to. You can set it to either a number of days before expiration or a percentage of lifetime, just like above. I must really applaud the Key Vault PG for this one; oftentimes something like this is an afterthought in other Azure services.
When we do anything in Key Vaults, it generates logs. These are some of the Operations that happen, but by no means an exhaustive list.
The two OperationName values that we can use to alert on certificate expiration and secret expiration are “CertificateNearExpiryEventGridNotification” and “SecretNearExpiryEventGridNotification”.
What is interesting is when you do CertificateImport, you get two fields that show the Cert issue date and expiration date in UTC time. Why this is interesting will be clear in just a few sentences.
Take note of these two fields, certificateProperties_nbf_t and certificateProperties_exp_t. As noted they’re currently in UTC time.
However, when we get events with “CertificateNearExpiryEventGridNotification”, the equivalent fields are no longer in UTC time.
This was quite confusing to me and to the customer I was working on this for. I informed our customer that we could still alert on this at scale from Azure Monitor, but he really wanted to know how many days were left until expiration. So I kept at it. My friend for years and now fellow CSA Jim Reid correctly pointed out that this was the expiration date. However, I still looked at it and said nah, all times in Azure are in UTC, as we’ve been taught and told for years now. So I reached out to the Product Group (PG), who informed me that this is indeed the expiration date, but that it is in Unix time. (*stares blankly*) So I meandered over to our docs and found we actually have an entire scalar function called unixtime_seconds_todatetime().
So I plugged the function into a query and voilà.
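If you want to sanity-check one of these values outside of Log Analytics, here is a minimal Python sketch of the same idea: treat the number as seconds since the Unix epoch and turn it into a UTC datetime. This is only an illustration of the conversion, not how unixtime_seconds_todatetime() is implemented, and the sample value is made up rather than taken from a real Key Vault log.

```python
from datetime import datetime, timezone


def unix_seconds_to_utc(seconds: float) -> datetime:
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical sample value, stored as a double like the *_d columns in the logs.
sample_exp = 1700000000.0
print(unix_seconds_to_utc(sample_exp).isoformat())  # 2023-11-14T22:13:20+00:00
```

Passing the timezone explicitly matters here; without it, Python would convert to the local time of whatever machine runs the script.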
Alert and Query
With everything set, I can now provide the issue date and the expiration date, and calculate how many days are left until expiry with datetime_diff().
For Certificate Expiration this is the query I came up with.
AzureDiagnostics
| where OperationName =~ 'CertificateNearExpiryEventGridNotification'
| extend CertExpire = unixtime_seconds_todatetime(eventGridEventProperties_data_EXP_d), CertIssue = unixtime_seconds_todatetime(eventGridEventProperties_data_NBF_d)
| extend DaysTillExpire = datetime_diff("Day", CertExpire, now())
| project ResourceId, CertName = eventGridEventProperties_subject_s, DaysTillExpire, CertExpire, CertIssue
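To make the DaysTillExpire arithmetic concrete, here is a rough Python sketch of the same calculation. It assumes we count whole UTC calendar days between "now" and the expiration date; KQL's datetime_diff() may treat partial days differently, so take this as an approximation of the query rather than an exact port. The function name and sample values are hypothetical.

```python
from datetime import datetime, timezone


def days_till_expire(exp_seconds: float, now: datetime) -> int:
    """Whole UTC calendar days from `now` until the Unix-time expiry.

    Returns a negative number if the expiry date has already passed.
    """
    expire = datetime.fromtimestamp(exp_seconds, tz=timezone.utc)
    return (expire.date() - now.astimezone(timezone.utc).date()).days


# Hypothetical values: expiry of 2023-11-14 evaluated on 2023-10-15.
now = datetime(2023, 10, 15, 12, 0, tzinfo=timezone.utc)
print(days_till_expire(1700000000.0, now))  # 30
```

Taking "now" as a parameter instead of calling the clock inside the function keeps the result repeatable, which is handy when checking the numbers against what the alert shows.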
For our alert, we can set our conditions like this, including the Dimension values for CertName (change the cert name to the secret name for the secret alert) and DaysTillExpire, with an evaluation window of whatever you like. I think 1 day is sufficient, though, and it will save a tiny fraction on cost.
For secret expiration, just change a few field names along with the operation name.
AzureDiagnostics
| where OperationName =~ 'SecretNearExpiryEventGridNotification'
| extend SecretExpire = unixtime_seconds_todatetime(eventGridEventProperties_data_EXP_d), SecretIssue = unixtime_seconds_todatetime(eventGridEventProperties_data_NBF_d)
| extend DaysTillExpire = datetime_diff("Day", SecretExpire, now())
| project ResourceId, SecretName = eventGridEventProperties_subject_s, DaysTillExpire, SecretExpire, SecretIssue
Our fired alert looks like this, with our included Dimension values.
With our Azure Monitor Action Groups, we can simultaneously send an Email/SMS/Push, as well as send to a Logic App, Azure Function, Azure Automation Runbook, Event Hub, Webhook and Secure Webhook.
One last note, we can also alert on actual expired Secrets and Certs with the operation name(s): CertificateExpiredEventGridNotification, SecretExpiredEventGridNotification.
With the power of Azure Monitor, Log Analytics and Azure Policy, we can now configure notifications, as well as actions, at scale for our Key Vault secret and certificate expiry, while also centralizing our logging data for both auditing and reporting purposes.