Guest daisami
Posted August 23, 2022

This article is part of a series on Azure Monitor. Please read "How to leverage Azure Monitor to meet functional and non-functional requirements - No.1 overview" first before reading this post.

This post dives into the Compute category, which is article No. 2 among the monitoring categories below.

| Article No | Monitoring category | Monitoring target | Note |
|---|---|---|---|
| 2 | compute | Reboot | monitor reboot frequency |
| 2 | compute | CPU | monitor CPU usage |
| 2 | compute | Memory | monitor memory usage |
| 3 | compute/ inside OS | log file | monitor event log and syslog |
| 3 | compute/ inside OS | Process | monitor available process |
| 4 | Storage/Disk | Disk | monitor disk usage |
| 4 | Storage/Disk | folder/file | monitor folder usage and file size |
| 5 | Endpoint/IPv4 address | response/service | monitor specific address and port |
| 5 | Endpoint/IPv4 address | Web site Scenario | monitor web scenario |
| 6 | Network | Connectivity | monitor vNIC and VNET peering |
| 6 | Network | Firewall | monitor Azure Firewall rule usage |
| 7 | Backup | Backup | monitor backup status |
| 7 | Azure Resources | Resource health | monitor resource availability |

There are three monitoring targets in the Compute monitoring objective:
- Reboot
- CPU
- Memory

Each target can be monitored in more than one way. Let's look at them in turn.

1. Reboot monitoring

We will try three options for reboot monitoring:
- Azure Monitor for VMs
- Activity Log
- Resource Health

1.1 Azure Monitor for VMs for reboot monitoring

Once you have configured Azure Monitor for your VM, logs usually become available in the Log Analytics workspace within 10 to 30 minutes. Note that it might take 8 to 10 hours right after the Log Analytics workspace is set up.

Run a Kusto query against the "InsightsMetrics" table to retrieve the records from the last 5 minutes whose "Name" is "Heartbeat" and whose "Computer" is "CentOSVM01".

    InsightsMetrics
    | where Name == "Heartbeat"
    | where Computer == "CentOSVM01"
    | where TimeGenerated > ago(5m)
    | order by TimeGenerated

The result shows that the agent sends a heartbeat about once per minute. Each record indicates that the OS on the VM is running.
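To check the roughly one-minute heartbeat cadence over a longer period, a query along the following lines could count heartbeats per one-minute bin. This is only a sketch: the 30-minute window and the `HeartbeatCount` column name are illustrative choices, not part of the original query.

```kusto
// Sketch: count heartbeats per minute for one VM.
// Assumes the same InsightsMetrics table and computer name as the query above.
InsightsMetrics
| where Name == "Heartbeat"
| where Computer == "CentOSVM01"
| where TimeGenerated > ago(30m)                              // illustrative window
| summarize HeartbeatCount = count() by bin(TimeGenerated, 1m)
| order by TimeGenerated asc
```

`summarize` only emits bins that contain at least one record, so a minute in which the VM sent nothing shows up as a gap between rows rather than as a row with a count of 0.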
We need an alert that sends a notification when the heartbeats stop showing up. The minimum granularity for alerts is 1 minute. Set up an action group, or create a new one, and configure it to send a notification to your email address. You can check the status of this alert rule on the Log Analytics page.

In our test, the email was delivered 9 minutes after shutting down the VM. We set the granularity to 5 minutes, but it took about 9 minutes in total. The email is based on the default template, but you can send a customized mail by using PaaS services such as Logic Apps or Azure Automation.

The email keeps being delivered every 5 minutes, so you can disable the rule on the portal. Don't forget to enable the rule again after restarting your VM.

If you haven't specified a time range in the Kusto query, you can set the time range for the logs on the portal. You can visualize the data as charts on the "Chart" tab, and you can download the logs as CSV or Excel files.

An action on an action group can act as a trigger. This means that an alert detection can start Azure Functions or Logic Apps, so you can extend your operations with your own apps on these PaaS services.

Trigger complex actions with Azure Monitor alerts - Azure Monitor
Tutorial: send email with Logic Apps - Azure App Service

Reboot monitoring can therefore send a notification and run a primary action from the same trigger, rather than only sending a templated mail. You can also send a customized mail by using Logic Apps.

1.2 Activity Log for reboot monitoring

On the "Monitor | Activity Log" page, configure the ServiceHealth and ResourceHealth categories to be sent to a Log Analytics workspace. Azure Blob Storage might be enough if you only need to store the logs, but storing them in a Log Analytics workspace lets you retrieve and analyze them.

The reboot log is generated only when a user runs a reboot operation on the Azure portal, so it can't be used to detect OS issues, OS updates that include a reboot, or reboot operations performed inside the OS.
    AzureActivity
    | where CategoryValue == "ResourceHealth" or CategoryValue == "ServiceHealth"
    | where Properties contains "Rebooted"
    | where TimeGenerated between(datetime("2022-07-03 00:00:00") .. datetime("2022-08-11 17:00:00"))
    | order by TimeGenerated desc

1.3 Resource Health

Resource Health, which is part of Service Health, is useful for validating the status of Azure resources. You can add an alert rule on Service Health. The alert rule can send a mail when the Azure platform recognizes the resource as unhealthy, which includes reboot, shutdown, and other events.

Finally, here is the check result of reboot monitoring.

| No. | Type category | Goal and outcome | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | monitoring | Azure Monitor can set up short granularity for detections | 1 min |
| 3 | monitoring | Azure Monitor can set up threshold detections | OK |
| 4 | monitoring | Azure Monitor can set up retry detections | OK |
| 5 | monitoring | Azure Monitor can suspend and resume threshold checks | OK |
| 6 | monitoring | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs for a specific duration | OK |
| 8 | statistics | Azure Monitor can visualize statistical data | OK |
| 9 | automation | Azure Monitor can run a primary action based on alert rules | OK |
| 10 | automation | Azure Monitor can send validation results | OK |

2. CPU monitoring

Here is an option to monitor CPU usage:
- "Percentage CPU" VM metric

2.1 "Percentage CPU" VM metric

Choose the "Percentage CPU" metric on your VM menu. Choose "Avg" or "Max", and on "New alert rule" configure a notification to be sent when the average or maximum CPU usage is xx% or higher. You can reuse an alert rule that you created for reboot monitoring.

If you use Dynamic Thresholds, Azure Monitor fires triggers based on thresholds that it tailors to the metric.

Create alerts with Dynamic Thresholds in Azure Monitor - Azure Monitor

Choose a static threshold if you have to align with company policies or specific system policies, for example "CPU usage is 80% or higher". It takes about 4 minutes to receive a mail after the CPU usage of your VM becomes high.
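If the VM also reports to the Log Analytics workspace used in section 1.1, a similar static-threshold check could be sketched as a log query. This is an alternative to the "Percentage CPU" platform metric, not the method configured in this post. It assumes that Azure Monitor for VMs writes processor data to InsightsMetrics under the "Processor" namespace with the name "UtilizationPercentage". The 80% value, the 5-minute bins, and the `AvgCpu` and `MaxCpu` column names are illustrative.

```kusto
// Sketch: 5-minute average and maximum CPU usage for one VM.
// Assumption: VM insights populates Namespace "Processor" / Name "UtilizationPercentage".
InsightsMetrics
| where Computer == "CentOSVM01"
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| where TimeGenerated > ago(30m)                               // illustrative window
| summarize AvgCpu = avg(Val), MaxCpu = max(Val) by bin(TimeGenerated, 5m)
| where AvgCpu >= 80                                           // example static threshold
| order by TimeGenerated desc
```

Only the 5-minute bins whose average is at or above the example threshold are returned, so an empty result means no bin crossed it.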
To test the rule, put a heavy CPU load on the VM. You can disable your rules on the Azure portal. You can configure any time range for the graphs by choosing "Custom" as the "Time range" on the Virtual Machine metric.

Finally, here is the check result of CPU monitoring.

| No. | Type category | Outcome and goal | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | monitoring | Azure Monitor can set up short granularity for detections | 1 min |
| 3 | monitoring | Azure Monitor can set up threshold detections | OK |
| 4 | monitoring | Azure Monitor can set up retry detections | OK |
| 5 | monitoring | Azure Monitor can suspend and resume threshold checks | OK |
| 6 | monitoring | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs for a specific duration | OK |
| 8 | statistics | Azure Monitor can visualize statistical data | OK |
| 9 | automation | Azure Monitor can run a primary action based on alert rules | OK |
| 10 | automation | Azure Monitor can send validation results | OK |

3. Memory usage monitoring

Here are some options to monitor memory usage:
- Performance counter on Log Analytics agent
- (Preview) VM metrics

3.1 Performance counter on Log Analytics agent

Configure performance counters on "Agents configuration" of Log Analytics. Then find the data tables for memory usage by entering "memory" in the search box. After a while, the LogManagement tables are populated based on the configuration.

"% Available Memory" is the percentage of memory that is available. "Used Memory MBytes" is the memory in use (MB).

Here is an example query that searches for records where the VM has less than 20% available memory. A healthy VM, which has 20% or more available memory, won't show up as a record.

    Perf
    | where Computer == "CentOSVM01"
    | where CounterName == "% Available Memory"
    | where CounterValue < 20
    | order by TimeGenerated desc

Do not confuse the values when you configure the threshold of the alert. In the alert rule settings, the threshold is the number of rows in the query result. Set the condition so that the alert fires when the query returns one or more rows.
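As a sketch of how the two values are kept apart, the percentage limit can stay inside the query while the alert rule only counts the rows returned. The 5-minute window and the `AvgAvailablePct` column name are illustrative and not from the original query.

```kusto
// Sketch: return one row only when the VM is low on memory.
// The 20% limit is applied here, inside the query.
Perf
| where Computer == "CentOSVM01"
| where CounterName == "% Available Memory"
| where TimeGenerated > ago(5m)                                // illustrative window
| summarize AvgAvailablePct = avg(CounterValue) by Computer
| where AvgAvailablePct < 20
```

With a query shaped like this, the alert condition would be on the number of results (greater than 0), so the 20 never appears in the alert threshold field.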
Note that the xx% memory usage value is not what you set as the threshold. As in the CPU monitoring scenario, you can disable your rules on the Azure portal and configure any time range for the graphs by choosing "Custom" as the "Time range" on the Virtual Machine metric.

3.2 (Preview) VM metrics

Choose the "Available Memory Bytes (Preview)" metric on your VM menu. The settings are almost the same as for CPU usage.

Finally, here is the check result of memory monitoring.

| No. | Type category | Outcome and goal | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | monitoring | Azure Monitor can set up short granularity for detections | 1 min |
| 3 | monitoring | Azure Monitor can set up threshold detections | OK |
| 4 | monitoring | Azure Monitor can set up retry detections | OK |
| 5 | monitoring | Azure Monitor can suspend and resume threshold checks | OK |
| 6 | monitoring | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs for a specific duration | OK |
| 8 | statistics | Azure Monitor can visualize statistical data | OK |
| 9 | automation | Azure Monitor can run a primary action based on alert rules | OK |
| 10 | automation | Azure Monitor can send validation results | OK |

With this, the series of articles on Azure Monitor is under way. In the next post, we will dive deep into the "compute/ inside OS" monitoring objective.

Special thanks for this post: [attachment=21004:name] Avanade Japan K.K., Director - Japan Microsoft Azure Platform Services Lead & Japan Azure CoE Lead.