Troubleshoot Policy Issues for Sending logs to Log Analytics/Storage Account for Azure Resources

  • Thread starter: Shikhaghildiyal

Azure Custom Policies



Azure custom policies are used to extend and customize the behavior of Azure services to meet specific requirements that are not addressed by the built-in policies.



For guidance on fixing common issues when creating custom policies, see: Troubleshooting Common Custom Policy Issues in Policy Development



Custom Policy for Sending Logs to Log Analytics Workspace and Storage Account



We use custom policies to send logs to a Log Analytics workspace and a Storage Account so that all required logs are collected and stored consistently and automatically for analysis, auditing, and compliance. This improves logging and monitoring coverage, strengthens security and compliance, and streamlines log management.



How to Troubleshoot Common Issues in Custom Policy for Sending Logs to Log Analytics Workspace and Storage Account



After adding a custom policy and assigning it to a subscription, the compliance report may show issues that are hard to understand and fix. The use cases and fixes below help identify the root cause quickly and resolve the issues.



Use Case 1 - Logs are enabled, but the resource is still marked as non-compliant



Below is an example for function apps where logs are enabled, but the resource is still marked as non-compliant.



Shikhaghildiyal_0-1722691758265.png



Shikhaghildiyal_1-1722691795222.png





Fix: Check the diagnostic settings and make sure that every log category required by the policy rule in your policy definition is enabled. If any required log category is not enabled, the settings will not match the policy rule, and the resource will be marked as non-compliant.



In the screenshot below for function apps, we can see that not all logs are enabled:



Shikhaghildiyal_2-1722691846545.png



To fix this issue, go back to the policy definition and add an entry for each log category inside the “logs” section. Note that log categories vary between products, so always check the log categories of the resource type before adding them to a logs policy.



Shikhaghildiyal_3-1722691888493.png
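
The check described above can also be done offline against the JSON view of a diagnostic setting. The following is a minimal sketch, not part of the original policy: the helper name missing_log_categories is hypothetical, the category names are only illustrative, and it assumes the JSON lists its log entries under properties.logs with "category" and "enabled" keys.

```python
def missing_log_categories(required_categories, diagnostic_setting):
    """Return the required categories that are not enabled in the setting."""
    logs = diagnostic_setting.get("properties", {}).get("logs", [])
    # Collect only the categories that are present and switched on.
    enabled = {
        entry.get("category")
        for entry in logs
        if entry.get("enabled") and entry.get("category")
    }
    return sorted(set(required_categories) - enabled)


# Illustrative diagnostic setting: one category enabled, one disabled.
setting = {
    "properties": {
        "logs": [
            {"category": "FunctionAppLogs", "enabled": True},
            {"category": "AppServiceAuthenticationLogs", "enabled": False},
        ]
    }
}

# Categories the (hypothetical) policy rule expects to be enabled.
required = ["FunctionAppLogs", "AppServiceAuthenticationLogs"]

print(missing_log_categories(required, setting))
# → ['AppServiceAuthenticationLogs']
```

An empty result means that every category named in the policy rule is enabled in the diagnostic setting; any name that is returned is a likely cause of the mismatch.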



Use Case 2 - Reason for non-compliance is incorrect for resources where logs are not enabled

In a few cases, you will see that a resource is non-compliant, but the reported reason for non-compliance is not correct.



Below is a screenshot of the reason for non-compliance. This is a very common issue that has been observed in multiple products.

The current value is shown as both “true” and “false”.



Shikhaghildiyal_4-1722691933681.png



Fix: Check whether your product has multiple log categories, then follow the steps below.



1. If Resource has Multiple Log Categories

If your product has multiple log categories, you must add a “count” expression to your policy definition so that it checks all logs, counts the matching values, and marks the resource as compliant or non-compliant based on that count.

For example, in the screenshot below, the Batch account has two log categories, that is, more than one. Here we need the count expression so that it counts all logs and the compliance report reflects the correct result.



Shikhaghildiyal_5-1722691971333.png



In this policy definition, the count expression is applied only to “All logs” in the “category group”, because there are no multiple categories for “All Metrics”. If your resource has multiple categories for “All Metrics”, you can add a count expression for “All Metrics” as well.



Shikhaghildiyal_6-1722691995970.png



Shikhaghildiyal_7-1722692011951.png



Make sure to check whether the “All Logs” category is part of a “category group”. You can always confirm this in the JSON view of the diagnostic settings.



Shikhaghildiyal_8-1722692041444.png
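
To see why a count-based check reports differently from a plain comparison over every log entry, here is a simplified model. It is only a sketch and not Azure Policy's evaluation engine: the function names and the category names CategoryA, CategoryB, and CategoryC are hypothetical, and it assumes a plain comparison over an array passes only when every element matches.

```python
def current_values(logs):
    """Values a plain comparison on 'enabled' would see, one per log entry."""
    return [entry.get("enabled") for entry in logs]


def compliant_without_count(logs):
    """Plain comparison: every log entry must have enabled == True."""
    return all(value is True for value in current_values(logs))


def compliant_with_count(logs, required_categories):
    """Count-style check: count the enabled entries among the required ones."""
    matched = sum(
        1
        for entry in logs
        if entry.get("enabled") is True
        and entry.get("category") in required_categories
    )
    return matched == len(required_categories)


# Hypothetical resource: two required categories enabled, one unrelated one off.
logs = [
    {"category": "CategoryA", "enabled": True},
    {"category": "CategoryB", "enabled": True},
    {"category": "CategoryC", "enabled": False},
]

print(current_values(logs))  # → [True, True, False]
print(compliant_without_count(logs))  # → False
print(compliant_with_count(logs, {"CategoryA", "CategoryB"}))  # → True
```

In this model, the mixed list of values is what surfaces as a "true, false" reason, while the count-based check looks only at the entries the policy actually requires.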



2. If Resource has All Logs and Audit Logs



The All logs and Audit logs categories vary between products. For some products, All logs includes Audit logs, which means that enabling All logs enables Audit logs automatically. For other products, the All logs and Audit logs categories are completely separate. In that case, you must explicitly enable both All logs and Audit logs and add the same logic to the policy definition. If you do not, the compliance report will mark your resources as non-compliant with the reason [true, false], which means that some logs are enabled and some are not.



In the screenshot below, logs are enabled but Audit logs are not. In this scenario, the compliance report marks the resource as non-compliant because logs are only partially enabled, so the reason for non-compliance is shown as true and false.



Shikhaghildiyal_9-1722692105842.png

Shikhaghildiyal_10-1722692117216.png

Resources where no logs are enabled are also marked as non-compliant with the same true and false reason, because the policy definition checks only All logs and not Audit logs.



Shikhaghildiyal_11-1722692152436.png



Fix: First recheck the JSON view of the diagnostic settings and confirm that separate category groups are defined for Audit and All logs, as in the screenshot below. Then update your policy definition to include Audit logs as well.



Shikhaghildiyal_12-1722692175239.png



The updated policy definition will look like this:



Shikhaghildiyal_13-1722692203297.png





Once you update the policy definition, the compliance report will show the correct reason for non-compliant resources where no logs are enabled. The compliance reason will look like this:



Shikhaghildiyal_14-1722692228867.png

For resources where logs are only partially enabled, logs will be fully enabled once the remediation task has run, and the resources will be marked as compliant.
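
Whether both category groups are enabled can be read from the JSON view of the diagnostic settings. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and it assumes the JSON lists its log entries under properties.logs with "categoryGroup" and "enabled" keys, using the group names "allLogs" and "audit". Confirm the exact names in your own JSON view.

```python
def enabled_category_groups(diagnostic_setting):
    """Return the set of category groups that are switched on."""
    logs = diagnostic_setting.get("properties", {}).get("logs", [])
    return {
        entry.get("categoryGroup")
        for entry in logs
        if entry.get("categoryGroup") and entry.get("enabled")
    }


def missing_category_groups(diagnostic_setting, required=("allLogs", "audit")):
    """Return the required category groups that are not enabled."""
    return sorted(set(required) - enabled_category_groups(diagnostic_setting))


# Illustrative setting: All logs is enabled, Audit logs is not.
partial = {
    "properties": {
        "logs": [
            {"category": None, "categoryGroup": "allLogs", "enabled": True},
            {"category": None, "categoryGroup": "audit", "enabled": False},
        ]
    }
}

print(missing_category_groups(partial))
# → ['audit']
```

A non-empty result corresponds to the partially enabled state described above, which a remediation task is expected to correct.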





Use Case 3 - Resource has no multiple log categories and the non-compliance reason is misleading

If your non-compliant resource shows the reason “true, false” and neither of the cases described in Use Case 2 applies, assign the same policy in another subscription and check the non-compliance reason there. If the reason differs between subscriptions, capture a HAR trace and inspect the values in the GET and POST requests before deciding on the next step.



It may be a UI issue, in which case you must contact the product group (PG) team to understand the fix.
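
The HAR inspection mentioned above can be sketched as follows. A HAR file is JSON with the recorded requests under log.entries. The function names, the URL fragment, and the example URLs are hypothetical; use whatever part of the request URL identifies the compliance calls in your own trace.

```python
import json


def load_har(path):
    """Load a HAR file exported from the browser developer tools."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_har(har, url_fragment):
    """List (method, status, url) for GET and POST requests matching a fragment."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        method = request.get("method")
        url = request.get("url", "")
        if method in ("GET", "POST") and url_fragment in url:
            rows.append((method, response.get("status"), url))
    return rows


# Hypothetical in-memory trace with placeholder URLs.
har = {
    "log": {
        "entries": [
            {
                "request": {"method": "POST", "url": "https://example.invalid/policyStates?sub=A"},
                "response": {"status": 200},
            },
            {
                "request": {"method": "GET", "url": "https://example.invalid/other"},
                "response": {"status": 200},
            },
        ]
    }
}

for row in summarize_har(har, "policyStates"):
    print(row)
```

Running the same summary on traces captured in both subscriptions makes it easier to compare the requests and responses side by side before raising the issue with the PG team.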



For example, see the screenshots below for the logs policy. The non-compliance reason differs between two subscriptions, although the logic of the policy for sending logs to Log Analytics is the same in both.



In the first screenshot, the reason for non-compliance is blank. In the second screenshot, for the same policy, the reason is also blank, but it says "no related resources match the effect detail", which is misleading. This is a UI issue; the policy logic is correct here.



Shikhaghildiyal_0-1722692698747.png



Shikhaghildiyal_16-1722692297523.png
