Cary_Chai
Azure Container Apps is a powerful platform for running serverless, containerized applications and microservices. As part of the team's ongoing commitment to improving Azure Container Apps' performance, we've recently made significant changes to make its scaling and load balancing behavior more intuitive and better aligned with customer expectations.
Hopefully, some of the insights from our experience in working with Envoy and KEDA will be helpful to you.
Load Balancing Algorithm Update
In the past, Azure Container Apps relied on the ring hash load balancing algorithm to distribute incoming requests across containers. Ring hash aims for minimal redistribution when instances are added or removed: it hashes request properties to map each request to a stable upstream instance. As a result, however, some instances receive an uneven share of requests. This is especially apparent during load tests with a small number of clients, and it can lead to bottlenecks.
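To see why a small client population skews ring hashing, consider this toy consistent-hash balancer (a minimal sketch, not Envoy's implementation): each client key hashes to a fixed point on the ring, so every request from that client lands on the same replica, and with only a few clients most replicas never see traffic at all.

```python
import hashlib
from bisect import bisect

def h(key: str) -> int:
    # Stable 64-bit hash of a string key
    return int(hashlib.md5(key.encode()).hexdigest()[:16], 16)

class RingHash:
    def __init__(self, instances, vnodes=100):
        # Place several virtual nodes per instance on the ring
        self.ring = sorted(
            (h(f"{inst}#{v}"), inst)
            for inst in instances for v in range(vnodes)
        )
        self.points = [p for p, _ in self.ring]

    def pick(self, client_key: str) -> str:
        # First ring point clockwise from the request's hash
        i = bisect(self.points, h(client_key)) % len(self.ring)
        return self.ring[i][1]

instances = [f"replica-{n}" for n in range(20)]
ring = RingHash(instances)

# 5 clients, 100 requests each: every client sticks to one replica,
# so at most 5 of the 20 replicas receive any traffic at all.
counts = {}
for c in range(5):
    for _ in range(100):
        target = ring.pick(f"client-{c}")
        counts[target] = counts.get(target, 0) + 1
print(len(counts), "of", len(instances), "replicas used")
```

The upside of this stickiness is exactly what session affinity needs; the downside is the idle replicas it leaves behind when affinity isn't required.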
Switching to Round Robin
To address this issue, we transitioned Azure Container Apps to use the round robin load balancing algorithm when session affinity is not enabled for an app. These are some of the benefits you can expect to see:
- Uniform Request Distribution: Round robin evenly distributes requests among containers, reducing the likelihood that one replica gets overloaded and helps utilize all resources effectively.
- Improved Scalability: With a balanced request load, Azure Container Apps can scale more effectively.
- Predictable Behavior: Developers can now rely on more consistent behavior across containers, simplifying troubleshooting and monitoring.
We’ve observed a significant improvement in overall system performance since implementing this change, and customers can expect better resource utilization. When session affinity is enabled, Azure Container Apps still uses ring hashing to route sequential requests from the same client to the same upstream instance.
Below is an example of an app running 20 instances handling requests from 1,000 clients. The graphs compare how traffic is distributed under the two approaches; both scenarios have session affinity disabled:
Figure 2. Azure Container Apps running with a round robin load balancing algorithm.
Horizontal Pod Autoscale Thresholds
Azure Container Apps uses KEDA and the Kubernetes Horizontal Pod Autoscaler (HPA) to handle scaling of replicas. Customers can set up custom scale rules that determine when their application scales out. A common rule is a CPU utilization threshold: for example, with a threshold of 80%, the app scales out when the average CPU utilization across its replicas crosses 80%.
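As a concrete illustration, a CPU scale rule in a container app's YAML definition looks roughly like the following. Field names follow the publicly documented Azure Container Apps scale schema, but treat this as a sketch of the relevant section rather than a complete app definition:

```yaml
scale:
  minReplicas: 1
  maxReplicas: 10
  rules:
    - name: cpu-scaling
      custom:
        type: cpu
        metadata:
          type: Utilization
          value: "80"
```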
One challenge some Azure Container Apps customers faced was apps not scaling as expected when these CPU utilization thresholds were met, caused by HPA's built-in default tolerance of 10%. Because of this tolerance, an app set to scale at 80% CPU utilization would only scale once utilization crossed 88%.
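The arithmetic behind that 88% figure is simple: HPA only acts when the ratio of current to target utilization moves outside its tolerance band. A quick sketch of the scale-out side of that check:

```python
def hpa_scales_out(current_pct: float, target_pct: float,
                   tolerance: float = 0.10) -> bool:
    # HPA derives desired replicas from the ratio current/target and
    # ignores the signal while the ratio stays within the tolerance band.
    return current_pct / target_pct > 1.0 + tolerance

# With the default 10% tolerance, an 80% target does not trigger at 85%...
print(hpa_scales_out(85, 80))   # False
# ...and only fires once utilization passes 80 * 1.10 = 88%.
print(hpa_scales_out(89, 80))   # True
```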
Adjusting Tolerance Levels
To address this issue, we fine-tuned the HPA configuration, adding an offset that compensates for the default 10% tolerance so Azure Container Apps scales as customers expect. In the previous scenario, a container app with a scale rule of 80% CPU utilization now scales when average CPU utilization crosses 80%, instead of 88%. This change ensures that Azure Container Apps responds promptly to increased demand and scales out as expected.
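The post doesn't publish the exact configuration change, but conceptually the fix amounts to offsetting the target that HPA sees so the effective trigger lands back on the customer's number. A hypothetical sketch of that compensation:

```python
TOLERANCE = 0.10  # HPA's default tolerance

def effective_trigger(target_pct: float, tolerance: float = TOLERANCE) -> float:
    # Utilization at which stock HPA actually scales out for a given target
    return target_pct * (1.0 + tolerance)

def adjusted_target(user_target_pct: float, tolerance: float = TOLERANCE) -> float:
    # Hypothetical offset: shrink the target handed to HPA so that
    # target * (1 + tolerance) equals what the customer asked for
    return user_target_pct / (1.0 + tolerance)

print(round(effective_trigger(80), 1))                    # 88.0
print(round(effective_trigger(adjusted_target(80)), 1))   # 80.0
```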
Conclusion
The Azure Container Apps team is continually investing in performance. By switching to round-robin load balancing and fine-tuning HPA thresholds, we’ve made Azure Container Apps more reliable, efficient, and responsive. Please let us know what you think of these changes, and what other performance improvements you’d like to see, on the Azure Container Apps GitHub.
Thank you for being part of our journey toward a more performant Azure Container Apps!