BrunoGabrielli
Ciao Readers,
Just a week without writing, and the moment for another blog post has arrived.
In this post, I am going to show you how to set up alerts for disconnected Arc agents using Azure Monitor. If you are not familiar with Azure Arc, it is a service that lets you manage and govern your hybrid cloud resources from a single pane of glass. You can read more about it on the Azure Arc overview public documentation page.
One of the benefits of using Arc is that it allows you to collect data from your hybrid resources so that you can monitor their health and performance. It is a prerequisite for enabling Azure Monitor on those resources. With that in mind, why is it important to get an alert when a hybrid virtual machine gets disconnected, or when the Arc agent status is reported as Offline? Ouch, you did not know they were offline!!!
There are several reasons, ranging from management to compliance and including monitoring, why you need to know whether your resources are communicating properly. Let me give you a few of them:
- When a hybrid virtual machine is onboarded, every connection is authenticated using a Managed Identity created automatically during the onboarding process. This System Assigned Managed Identity is renewed automatically, but it can be marked as expired if the system does not communicate for more than 60 days. Should this be the case, there is no way to reset the identity: you have to offboard and re-onboard the machine, together with all the installed extensions and configurations.
- When the hybrid machine is disconnected, no monitoring data can be sent. This can lead to something really bad, like:
  - Customers go blind about infrastructure health.
  - The machine keeps the unsent monitoring data in the local cache on the C drive, using up to 10 GB of disk space.
  - Old cached data is deleted, so monitoring data loss is expected.
  - Machines with small disks can quickly and easily run out of disk space. Can you imagine that on a Domain Controller?
I just gave you two reasons and, given them, I do not think you need any additional ones, right? By now I think you have understood the importance of being alerted as soon as possible when an Arc agent gets disconnected. Yes, the sooner, the better.
Therefore, you will agree with me that it is necessary to create an alert. To achieve this, you can take advantage of the ability to Create alerts with Azure Resource Graph and Log Analytics.
Let us have a look at the query to be used. The query should give you back one line per monitored server whose last status is reported as Disconnected (any alert should give you actionable information, and the affected resource is the first item on that list).
A good query should return records for hybrid machines that have not been connected for a given amount of time. The value is your choice, but I would recommend something not too wide (15 minutes could be a good compromise).
Once you have a good record set, you should configure the alert rule to use Table rows as the Measure and Count as the Aggregation type. The Aggregation granularity, which drives the time range the query will consider, could be set to 1 day.
The alert rule logic will then be configured to measure the number of rows returned by the query. The alert will fire if any records (even a single one) are returned.
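To picture how the alert logic described above behaves, here is a minimal Python sketch. It is only an illustration and not how Azure Monitor is implemented: the function name `alert_fires` and the "greater than 0" threshold are my assumptions, chosen to match the statement that a single returned row is enough to fire the alert.

```python
def alert_fires(rows, threshold=0):
    """Hypothetical model of the alert condition (an assumption, not Azure code).

    Measure = Table rows, Aggregation type = Count: the measured value is
    simply the number of rows returned by the query.
    """
    measured_value = len(rows)
    # Assumed condition: "greater than 0", so even one row fires the alert.
    return measured_value > threshold


# No disconnected machines returned: the alert stays quiet.
no_rows = []
# One disconnected machine returned (hypothetical name): the alert fires.
one_row = [{"Computer": "srv-01", "status": "Disconnected"}]

quiet = alert_fires(no_rows)
fired = alert_fires(one_row)
```

With an empty result set the condition is false, while any non-empty result set makes it true, which is exactly why the query should return rows only for machines you want to be alerted about.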
Assuming that you prefer to get an alert when resources have not been connected for the last 15 minutes, you can create an alert that uses a query similar to the following one:
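Code:
// Query Azure Resource Graph from Log Analytics
arg("").resources
// Keep only Azure Arc-enabled servers
| where type == "microsoft.hybridcompute/machines"
// Keep only machines whose agent status is Disconnected
| where tostring(properties.status) == "Disconnected"
// lastStatusChange is the time of the most recent status change
| extend lastContactedDate = todatetime(properties.lastStatusChange)
// Keep machines that have been Disconnected for at least 15 minutes
| where lastContactedDate <= ago(15m)
| extend status = tostring(properties.status)
| project id, Computer=name, status, lastContactedDate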
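If you want to reason about the filter offline before creating the rule, the same logic can be mirrored in a few lines of Python. This is a rough sketch under stated assumptions: the sample records, machine names, resource IDs and the fixed `NOW` timestamp are all made up for illustration, and the function `disconnected_machines` is hypothetical. Only the property names (`type`, `properties.status`, `properties.lastStatusChange`) come from the query above.

```python
from datetime import datetime, timedelta, timezone


def disconnected_machines(resources, now, threshold=timedelta(minutes=15)):
    """Mirror the query filters on a list of dict records (illustrative only)."""
    cutoff = now - threshold  # same role as ago(15m) in the query
    rows = []
    for res in resources:
        # where type == "microsoft.hybridcompute/machines"
        if res["type"] != "microsoft.hybridcompute/machines":
            continue
        props = res.get("properties", {})
        # where tostring(properties.status) == "Disconnected"
        if props.get("status") != "Disconnected":
            continue
        # extend lastContactedDate = todatetime(properties.lastStatusChange)
        last_change = datetime.fromisoformat(props["lastStatusChange"])
        # where lastContactedDate <= ago(15m)
        if last_change <= cutoff:
            rows.append({
                "id": res["id"],
                "Computer": res["name"],
                "status": props["status"],
                "lastContactedDate": last_change,
            })
    return rows


# Hypothetical sample data; none of these names or IDs come from a real tenant.
NOW = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ARC = "microsoft.hybridcompute/machines"
SAMPLE_RESOURCES = [
    {"id": "/sample/srv-01", "name": "srv-01", "type": ARC,
     "properties": {"status": "Disconnected",
                    "lastStatusChange": "2024-01-01T11:30:00+00:00"}},
    {"id": "/sample/srv-02", "name": "srv-02", "type": ARC,
     "properties": {"status": "Disconnected",
                    "lastStatusChange": "2024-01-01T11:55:00+00:00"}},
    {"id": "/sample/srv-03", "name": "srv-03", "type": ARC,
     "properties": {"status": "Connected",
                    "lastStatusChange": "2024-01-01T09:00:00+00:00"}},
    {"id": "/sample/vm-01", "name": "vm-01",
     "type": "microsoft.compute/virtualmachines", "properties": {}},
]

rows = disconnected_machines(SAMPLE_RESOURCES, NOW)
computers = [row["Computer"] for row in rows]
```

In this made-up data set only `srv-01` is returned: `srv-02` changed status just 5 minutes before `NOW`, `srv-03` is connected, and `vm-01` is not an Arc-enabled server.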
Running the suggested query returns something similar to the following image, which fires the alert in line with the sample alert logic condition:
I trust you will all be more than able to continue with the alert creation, so I will stop here and avoid straining your eyes any further.
Thanks for reading through!!!
Disclaimer
The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without a warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.