Seamless Recovery: How to Automate Azure Spot VM Start-Ups After Eviction with Azure Functions

  • Thread starter: wernerrall

Introduction

Azure offers services to suit businesses of every size and budget. One of them is a heavily discounted virtual machine called a Spot instance. A Spot instance is a special kind of virtual machine that Azure can evict at any time when it needs the capacity back for standard virtual machines. Why would I want to run one, then? Because it is very cheap: in some cases, up to 90% off.

Here are a few key points to consider:

  • Cost Savings: Spot VMs can provide discounts of up to 90% compared to regular pricing, though typical savings are often in the range of 60-80%.
  • Pricing Fluctuation: The price of Spot VMs fluctuates based on supply and demand in the Azure data centers. When demand is low, prices drop, making them extremely cost-effective.
  • Availability: Spot VMs can be evicted when Azure needs the capacity for other VMs, so they are best suited for non-critical workloads that can handle interruptions.
  • Use Cases: Spot VMs are ideal for batch processing, testing and development, stateless applications, and any other workloads that can be paused and resumed.
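To make the discount figures above concrete, here is a small worked example in Python. The hourly price of 0.20 and the 730 billed hours per month are hypothetical numbers chosen only for the arithmetic; real Spot prices vary by region, VM size, and time.

```python
def spot_monthly_cost(base_hourly_price: float, discount: float, hours: float = 730.0) -> float:
    """Return the monthly cost of a VM after applying a Spot discount.

    base_hourly_price: regular price per hour (hypothetical figure).
    discount: fraction off the regular price, e.g. 0.8 for 80%.
    hours: hours billed in the month (730 is used here as a monthly average).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(base_hourly_price * (1.0 - discount) * hours, 2)


if __name__ == "__main__":
    regular = spot_monthly_cost(0.20, 0.0)
    # Compare the typical range (60-80%) and the best case (90%).
    for discount in (0.6, 0.8, 0.9):
        spot = spot_monthly_cost(0.20, discount)
        print(f"{discount:.0%} off: {spot:.2f} per month instead of {regular:.2f}")
```

Under these assumed numbers, a VM that costs 146.00 per month at the regular price would cost 29.20 at an 80% discount.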



So how can I start my Spot VMs back up once they have been evicted from their current host? Azure Functions can help with this!

Requirements

  1. A spot instance Virtual Machine
  2. An Azure Function with a managed identity that has the Virtual Machine Contributor role
  3. Access to the Activity Log so we can create an Alert
  4. Access to Action Groups in Azure Monitor to create the action that will call the Function



Let’s go!



  1. Here is my Spot VM, already created.

If you are not sure which of your VMs are Spot VMs, the Resource Graph query below will list them. You can also find it on GitHub: RallTheory/AzureFunctions/AllSpotVMsGraphQuery.KQL at main · WernerRall147/RallTheory (github.com)

Code:
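resources
| where type == 'microsoft.compute/virtualmachines'
// Spot VMs carry an evictionPolicy property (Deallocate or Delete)
| where properties has 'evictionPolicy'
| project
    resourceGroup,
    vmName = name,
    vmSize = tostring(properties.hardwareProfile.vmSize),
    osType = tostring(properties.storageProfile.osDisk.osType),
    evictionPolicy = tostring(properties.evictionPolicy),
    // Current power state, e.g. PowerState/running or PowerState/deallocated
    powerState = tostring(properties.extended.instanceView.powerState.code)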

  2. Let’s create our Azure Alert for Spot VMs. This requires an extra step, because we do not yet know what an eviction looks like in the logs.

To simulate an eviction, run this command in Azure Cloud Shell (or anywhere else the Azure CLI is available).

az vm simulate-eviction -g <yourresourcegroup> -n <yourserver>







  3. If we wait 5 to 10 minutes, we can see what an eviction looks like in the Activity Log of the virtual machine.

  4. Next, we click on the EvictSPOTVM activity and create an alert for it.

We do have to modify some of the alert parameters, because I want my function to run every time.

Click Review + create for now. We will come back here once our function has been created.

  5. I created an Azure Function app with an HttpTrigger function. The only requirement here is that your Azure Function is a PowerShell function. For more details, see: Create a PowerShell function using Visual Studio Code - Azure Functions | Microsoft Learn

The code I used is in the GitHub repo: RallTheory/AzureFunctions/AzFunction_SpotVM_Evicted_StartUp.ps1 at main · WernerRall147/RallTheory (github.com)
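The script linked above is PowerShell and is the one to use. Purely as a language-neutral illustration of what a function like this has to do first, here is a rough Python sketch that pulls the affected VM out of the alert payload. It assumes the action group sends the Azure Monitor common alert schema, where the affected resource IDs are listed under `data.essentials.alertTargetIDs`; the names `VmRef`, `parse_vm_resource_id`, and `vms_from_alert` are hypothetical and do not come from the linked script, which may work differently.

```python
from typing import NamedTuple


class VmRef(NamedTuple):
    """The three pieces needed to address a VM (hypothetical helper type)."""
    subscription_id: str
    resource_group: str
    vm_name: str


def parse_vm_resource_id(resource_id: str) -> VmRef:
    """Split an Azure VM resource ID into subscription, resource group, and name."""
    parts = resource_id.strip("/").split("/")
    # Map each key segment (lowercased) to the value segment that follows it.
    pairs = {parts[i].lower(): parts[i + 1] for i in range(0, len(parts) - 1, 2)}
    try:
        return VmRef(pairs["subscriptions"], pairs["resourcegroups"], pairs["virtualmachines"])
    except KeyError as exc:
        raise ValueError(f"not a VM resource ID: {resource_id!r}") from exc


def vms_from_alert(payload: dict) -> list[VmRef]:
    """Return the VMs named in a common-alert-schema payload (assumed format)."""
    essentials = payload.get("data", {}).get("essentials", {})
    return [parse_vm_resource_id(rid) for rid in essentials.get("alertTargetIDs", [])]


if __name__ == "__main__":
    # Made-up payload fragment for illustration; real payloads carry many more fields.
    sample = {
        "data": {
            "essentials": {
                "alertTargetIDs": [
                    "/subscriptions/00000000-0000-0000-0000-000000000000"
                    "/resourceGroups/my-rg/providers/Microsoft.Compute"
                    "/virtualMachines/my-spot-vm"
                ]
            }
        }
    }
    for vm in vms_from_alert(sample):
        print(vm.resource_group, vm.vm_name)
```

Key segments are matched case-insensitively because the casing of resource IDs is not always consistent between tools.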



  6. Now that our function has been created successfully, we need to tell the Alert and Action Group to call it. We go back to our Resource Group, open our Action Group, and in its Actions section point the action at our Azure Function and save.

Now we are all set!



So what happens now?

When an eviction happens, the Azure Alert calls the action group.

The action group calls the function.

The function checks whether the VMs have been evicted and tries to bring them back up. (In the future I will be working on some backoff or circuit breaker patterns for this.)
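The backoff idea mentioned above as future work could take roughly the following shape. This is only a sketch in Python, not the author's implementation (the actual function is PowerShell): `start_with_backoff` and its parameters are hypothetical, and `start_vm` stands in for whatever call actually starts the VM. The motivation is that a start attempt right after an eviction may fail while capacity is still unavailable, so retrying with growing pauses is gentler than retrying in a tight loop.

```python
import time
from typing import Callable


def start_with_backoff(
    start_vm: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 30.0,
    max_delay: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call start_vm until it succeeds and return the number of attempts used.

    start_vm: placeholder for the real start call; it should raise on failure.
    sleep: injectable so the wait can be skipped or recorded in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            start_vm()
            return attempt
        except Exception:
            # A real function would likely catch only capacity-related errors.
            if attempt == max_attempts:
                raise
            # Delay doubles each time: base, 2*base, 4*base, ... capped at max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    raise ValueError("max_attempts must be at least 1")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_start() -> None:
        # Simulated start call that fails twice before succeeding.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("simulated: no capacity yet")

    waits: list[float] = []
    attempts = start_with_backoff(flaky_start, base_delay=1.0, sleep=waits.append)
    print(attempts, waits)
```

A circuit breaker would go one step further and stop trying altogether for a while after repeated failures; that is left out here to keep the sketch short. Keep in mind that a function app has an execution time limit, so long waits may be better handled by re-triggering the function than by sleeping inside it.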



What does it look like on our VM?

First of all, the VM is running.

Secondly, we can see in the Activity Log that the Azure Function is what started my VM.



This opens up many other possibilities and lets us think in a cloud way: provisioning only the resources we need, at the time we need them. It also shows the power of Azure Functions, and that you do not need to be a Level 400 developer to write advanced scripts.



Disclaimer

The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts or Power BI Dashboards are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts or Power BI Dashboards be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages. This blog post was written with the help of generative AI.

