
How to leverage Azure Monitor to meet functional and non-functional requirements - No.2 Compute



Guest daisami
Posted

This article is part of a series on Azure Monitor. Please read "How to leverage Azure Monitor to meet functional and non-functional requirements - No.1 overview" before this post. This post takes a close look at the Compute category (article No. 2 among the monitoring categories listed below, highlighted in blue in the original).

 

| Article No | Monitoring category | Monitoring target | Note |
|---|---|---|---|
| 2 | compute | Reboot | monitor reboot frequency |
| | | CPU | monitor CPU usage |
| | | Memory | monitor memory usage |
| 3 | compute/ inside OS | log file | monitor event log and syslog |
| | | Process | monitor available process |
| 4 | Storage/Disk | Disk | monitor disk usage |
| | | folder/file | monitor folder usage and file size |
| 5 | Endpoint/IPv4 address | response/service | monitor specific address and port |
| | Web site | Scenario | monitor web scenario |
| 6 | Network | Connectivity | monitor vNIC and VNET peering |
| | | Firewall | monitor Azure Firewall rule usage |
| 7 | Backup | Backup | monitor backup status |
| | Azure Resources | Resource health | monitor resource availability |

 

 

 

The Compute monitoring objective has three monitoring targets:

 

  1. Reboot
  2. CPU
  3. Memory

 

Each target can be monitored in more than one way. Let's look at them in turn.

 

1. Reboot monitoring

 

 

We will try the three options below for reboot monitoring.

 

  1. Azure Monitor for VMs
  2. Activity Log
  3. Resource Health

1.1 Azure Monitor for VMs for reboot monitoring

 

 

If you have configured Azure Monitor for your VM, logs normally become available in the Log Analytics workspace within 10 to 30 minutes. Note that it might take 8 to 10 hours right after the Log Analytics workspace is set up.

 

mediumvv2px400.png.c848ffd55b51e77ac1538032ffc90e20.png

 

Run a Kusto query against the "InsightsMetrics" table to retrieve the records from the last 5 minutes whose "Name" is "Heartbeat" and whose "Computer" is "CentOSVM01".

 

mediumvv2px400.png.eb774ac41ef9cdd4b75f042908015f82.png

 

 

 

 

 

InsightsMetrics
| where Name == "Heartbeat"
| where Computer == "CentOSVM01"
| where TimeGenerated > ago(5m)
| order by TimeGenerated

 

 

 

 

 

 

 

You can see that the agent sends a heartbeat about once per minute. Each record indicates that the OS on the VM is running. We need an alert that sends a notification when the heartbeats stop showing up. The minimum granularity for alerts is 1 minute. Select an existing action group or create a new one, and configure it to send notifications to your email address. You can check the status of this alert rule on the Log Analytics page.
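The "no heartbeat in the window" condition can be sketched outside Azure as follows. This is a minimal illustration in plain Python, not the Azure Monitor implementation: the function name `heartbeat_missing`, the 5-minute window, and the sample timestamps are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_missing(heartbeats, now, window=timedelta(minutes=5)):
    """Return True when no heartbeat timestamp falls inside the last `window`."""
    cutoff = now - window
    return not any(ts > cutoff for ts in heartbeats)


# Hypothetical evaluation time and heartbeat records (one per minute).
now = datetime(2022, 8, 11, 17, 0, tzinfo=timezone.utc)
running = [now - timedelta(minutes=m) for m in range(1, 6)]   # newest is 1 min old
stopped = [now - timedelta(minutes=m) for m in range(9, 14)]  # newest is 9 min old

print(heartbeat_missing(running, now))  # VM still reporting
print(heartbeat_missing(stopped, now))  # VM silent for the whole window
```

In an alert rule the same idea is expressed the other way round: the query returns the heartbeats inside the window, and the rule fires when the result is empty.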

 

mediumvv2px400.png.9f223e7cc0f5a967279e902d8b8f0e9a.png

 

This email was delivered 9 minutes after the VM was shut down. We set the granularity to 5 minutes, but the notification took about 9 minutes in total.

 

mediumvv2px400.png.4f92b8ddeb4aa5989469c7080f8bd91e.png

 

The email is based on the default template, but you can send a customized mail by using PaaS services such as Logic Apps or Azure Automation. The email kept being delivered every 5 minutes, so you can disable the rule on the portal as follows. Don't forget to enable the rule again after restarting your VM.

 

mediumvv2px400.png.54e30662598f11d65ab654611ca4d2c5.png

 

If you haven't specified a time range in the Kusto query, you can set it on the portal as follows. You can visualize the data as charts on the "Chart" tab, and you can download the logs as CSV or Excel files.

 

mediumvv2px400.png.11ece7ad46ae62bec171746a9194e4bb.png

 

An action group can also trigger actions. This means an alert detection can invoke Azure Functions or Logic Apps, so you can extend your operations with your own apps on those PaaS services.

 

mediumvv2px400.png.97f0a24dbd2069dbb3a12906e4588aa3.png

 

Trigger complex actions with Azure Monitor alerts - Azure Monitor

Tutorial: send email with Logic Apps - Azure App Service

 

With reboot monitoring, an alert can send a notification and trigger a primary action at the same time, not just send a template mail. You can also send a customized mail by using Logic Apps.
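As a rough idea of what a custom handler behind an action group might do, the sketch below builds a mail subject and body from an alert payload. It assumes the action uses the common alert schema and that the field names under `data.essentials` match what is shown here; check them against the current schema documentation. The function name `build_custom_mail`, the rule name, and the sample payload values are hypothetical.

```python
def build_custom_mail(payload):
    """Build a (subject, body) pair from an alert payload (assumed common alert schema)."""
    essentials = payload["data"]["essentials"]
    subject = "[{severity}] {alertRule} is {monitorCondition}".format(**essentials)
    body = "\n".join([
        "Alert rule : " + essentials["alertRule"],
        "Fired at   : " + essentials["firedDateTime"],
        "Targets    : " + ", ".join(essentials["alertTargetIDs"]),
    ])
    return subject, body


# Trimmed, hypothetical payload; a real one carries many more fields.
sample_payload = {
    "schemaId": "azureMonitorCommonAlertSchema",
    "data": {
        "essentials": {
            "alertRule": "heartbeat-missing-CentOSVM01",
            "severity": "Sev2",
            "monitorCondition": "Fired",
            "firedDateTime": "2022-08-11T08:09:00Z",
            "alertTargetIDs": [
                "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
                "/providers/Microsoft.Compute/virtualMachines/CentOSVM01"
            ],
        }
    },
}

subject, body = build_custom_mail(sample_payload)
print(subject)
print(body)
```

The same parsing step would sit inside an Azure Function or be replaced by a "Parse JSON" style step in a Logic App; only the mail-sending part differs.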

 

 

 

1.2 Activity Log for reboot monitoring

 

 

On the "Monitor | Activity Log" page, configure the ServiceHealth and ResourceHealth categories to be sent to the Log Analytics workspace. Refer to the following screenshot for the setup. Azure Blob Storage might be enough if you only need to store the logs, but storing them in a Log Analytics workspace lets you query and analyze them.

 

mediumvv2px400.png.9add60c67ed51e78142c44c00138c480.png

 

A reboot log entry is generated only when a user runs the reboot operation on the Azure portal, so this log can't be used to detect OS issues, OS updates that reboot the VM, or reboots initiated from inside the OS.

 

mediumvv2px400.png.8e85b1d7e342f8b6c927a858002e9216.png

 

 

 

 

 

AzureActivity
| where CategoryValue == "ResourceHealth" or CategoryValue == "ServiceHealth"
| where Properties contains "Rebooted"
| where TimeGenerated between(datetime("2022-07-03 00:00:00") .. datetime("2022-08-11 17:00:00"))
| order by TimeGenerated desc

 

 

 

 

 

 

 

1.3 Resource Health

 

 

Resource Health, which is found under Service Health, is useful for validating the status of Azure resources. You can add a Resource Health alert rule from the Service Health page.

 

mediumvv2px400.png.cddcb184546d8ee24b49e4cf4bb9e3d0.png

 

These alert rules can send a mail when the Azure platform recognizes the resource as unhealthy, for example because of a reboot or a shutdown.

 

 

 

Finally, here are the check results for reboot monitoring.

 

| No | Type / category | Goal and outcome | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | | Azure Monitor can setup short granularity for detections | 1 min |
| 3 | | Azure Monitor can setup thresholds detections | OK |
| 4 | | Azure Monitor can setup retry detections | OK |
| 5 | | Azure Monitor can suspend and resume for checking threshold | OK |
| 6 | | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs with specific duration | OK |
| 8 | | Azure Monitor can visualize statistic data | OK |
| 9 | automation | Azure Monitor can have primary action based on alert rules | OK |
| 10 | | Azure Monitor can send validation results | OK |

 

 

 

2. CPU monitoring

 

 

Here is an option to monitor CPU usage.

 

  1. "Percentage CPU" VM metric

2.1 "Percentage CPU" VM metric

 

 

Choose the "Percentage CPU" metric on your VM menu. Choose "Avg" or "Max" as the aggregation, and on "New alert rule" configure a notification for when the average or maximum CPU usage is xx% or higher.

 

mediumvv2px400.png.73f4a9b63dddf90bbe1fd46360de6a64.png

 

You can reuse the action group that you created for reboot monitoring. If you use Dynamic Thresholds, Azure Monitor fires alerts based on thresholds that it tailors to the metric's historical behavior.

 

mediumvv2px400.png.8da7ce20e0cc50cce62449eaca01d66f.png

 

Create alerts with Dynamic Thresholds in Azure Monitor - Azure Monitor

Choose a static threshold if you have to align with company policies or specific system policies, for example "CPU usage is 80% or higher".
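The choice between the average and the maximum aggregation matters for a static threshold, because a short spike can exceed the limit while the average stays well below it. The sketch below shows this with plain Python; the function name `breaches`, the 80% limit, and the sample values are assumptions for illustration, not Azure Monitor code.

```python
from statistics import mean


def breaches(samples, threshold, aggregation="Average"):
    """Return True when the aggregated CPU samples reach the static threshold."""
    if aggregation == "Average":
        value = mean(samples)
    elif aggregation == "Maximum":
        value = max(samples)
    else:
        raise ValueError("unsupported aggregation: " + aggregation)
    return value >= threshold


# Hypothetical one-minute CPU samples (%) inside one evaluation window.
cpu_samples = [35.0, 42.0, 95.0, 40.0, 38.0]

print(breaches(cpu_samples, 80, "Average"))  # the average is 50%, below the limit
print(breaches(cpu_samples, 80, "Maximum"))  # the 95% spike reaches the limit
```

If the policy is about sustained load, the average is usually the better fit; if any spike counts as a violation, the maximum is.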

 

It takes about 4 minutes to receive a mail after the VM's CPU usage becomes high. Here is an example of putting a heavy CPU load on the VM.

 

mediumvv2px400.png.fb77aeee068679b3b7803c4f5887167d.png

 

You can disable your rules on the Azure portal.

 

mediumvv2px400.png.fa4ffa1142e320d97b2f42db888fa067.png

 

You can set any time range for the graphs by choosing "Custom" for "Time range" in the virtual machine metrics.

 

mediumvv2px400.png.ef4f51d04fb9e11ee2b7157836a720aa.png

 

 

 

Finally, here are the check results for CPU monitoring.

 


| No | Type / category | Outcome and goal | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | | Azure Monitor can setup short granularity for detections | 1 min |
| 3 | | Azure Monitor can setup thresholds detections | OK |
| 4 | | Azure Monitor can setup retry detections | OK |
| 5 | | Azure Monitor can suspend and resume for checking threshold | OK |
| 6 | | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs with specific duration | OK |
| 8 | | Azure Monitor can visualize statistic data | OK |
| 9 | automation | Azure Monitor can have primary action based on alert rules | OK |
| 10 | | Azure Monitor can send validation results | OK |

 

 

 

3. Memory usage monitoring

 

 

Here are some options to monitor memory usage.

 

  1. Performance counter on Log Analytics Agent
  2. (Preview) VM metrics

3.1 Performance counter on Log Analytics agent

 

 

Configure the performance counters on "Agents configuration" of Log Analytics. Then find the data tables for memory usage by typing "memory" in the search box.

The LogManagement tables are populated according to this configuration after a while. "% Available Memory" is the percentage of memory that is still available (not the percentage in use). "Used Memory Mbytes" is the amount of memory in use, in MB.

 

mediumvv2px400.png.e66b39d0a96619d3af76f37fc8cf99bc.png

 

Here is an example query that searches for records where the VM has less than 20% available memory. A healthy VM, which has 20% or more available memory, won't return any records.

 

 

 

 

 

Perf
| where Computer == "CentOSVM01"
| where CounterName == "% Available Memory"
| where CounterValue < 20
| order by TimeGenerated desc

 

 

 

 

 

mediumvv2px400.png.7d7f764e3678c165ba9ee97267546fd7.png

 

Be careful not to confuse the values when you configure the alert threshold. In the alert rule settings, the threshold is the number of rows returned by the query, not a memory percentage. Set the condition so that the alert fires when the query returns one or more rows. The xx% memory limit belongs in the query itself (the CounterValue filter), not in the alert threshold.
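The two different numbers can be seen side by side in the sketch below: the percentage limit is used to filter records, and the alert threshold is compared against the number of rows that survive the filter. This is a plain-Python illustration under assumed data; the function names `low_memory_rows` and `alert_fires` and the sample records are hypothetical, and only the column names come from the query above.

```python
def low_memory_rows(records, computer, limit_percent=20):
    """Keep only the rows the query would return: available memory below the limit."""
    return [
        r for r in records
        if r["Computer"] == computer
        and r["CounterName"] == "% Available Memory"
        and r["CounterValue"] < limit_percent
    ]


def alert_fires(rows, row_threshold=0):
    """The alert threshold counts result rows; it is not a memory percentage."""
    return len(rows) > row_threshold


# Hypothetical Perf records for one evaluation window.
records = [
    {"Computer": "CentOSVM01", "CounterName": "% Available Memory", "CounterValue": 35.2},
    {"Computer": "CentOSVM01", "CounterName": "% Available Memory", "CounterValue": 18.4},
    {"Computer": "CentOSVM01", "CounterName": "% Available Memory", "CounterValue": 12.9},
]

rows = low_memory_rows(records, "CentOSVM01")
print(len(rows), alert_fires(rows))
```

With "greater than 0" as the row threshold, a single low-memory record is enough to fire the alert; raising the row threshold requires several low readings in the window.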

As in the CPU monitoring scenario, you can disable your rules on the Azure portal and set any time range for the graphs by choosing "Custom" for "Time range" in the virtual machine metrics.

 

3.2 (Preview) VM metrics

 

 

Choose the "Available Memory Bytes (Preview)" metric on your VM menu. The settings are almost the same as for CPU usage.
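One difference from the CPU metric is the unit: this metric is reported in bytes, so a policy written as a percentage has to be converted before it can be used as a threshold. A minimal sketch of that arithmetic follows; the function name and the 8 GiB VM size are assumptions for the example.

```python
def available_memory_threshold_bytes(total_gib, min_available_percent):
    """Convert 'at least N% of memory available' into a byte threshold."""
    total_bytes = total_gib * 1024 ** 3
    return total_bytes * min_available_percent // 100


# Hypothetical VM with 8 GiB of memory and a 20% minimum-available policy.
threshold = available_memory_threshold_bytes(8, 20)
print(threshold)  # 1717986918 bytes, roughly 1.6 GiB
```

The alert condition would then be "less than" this value, since the metric measures what is still available rather than what is used.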

 

mediumvv2px400.png.70db62b8010597e22729142658fd2dc5.png

 

 

 

Finally, here are the check results for memory monitoring.

 


| No | Type / category | Outcome and goal | Result |
|---|---|---|---|
| 1 | monitoring | Azure Monitor can satisfy functional requirements | OK |
| 2 | | Azure Monitor can setup short granularity for detections | 1 min |
| 3 | | Azure Monitor can setup thresholds detections | OK |
| 4 | | Azure Monitor can setup retry detections | OK |
| 5 | | Azure Monitor can suspend and resume for checking threshold | OK |
| 6 | | Azure Monitor can send a mail for detection results | OK |
| 7 | statistics | Azure Monitor can retrieve workspace logs with specific duration | OK |
| 8 | | Azure Monitor can visualize statistic data | OK |
| 9 | automation | Azure Monitor can have primary action based on alert rules | OK |
| 10 | | Azure Monitor can send validation results | OK |

 

 

 

With this post, the detailed part of this Azure Monitor series is under way. In the next post, we will dive deep into the "compute/ inside OS" monitoring objective.

 

 

 

 

 

Special thanks for this post.

 

[attachment=21004:name] Avanade Japan K.K., Director - Japan Microsoft Azure Platform Services Lead & Japan Azure CoE Lead.

 
