Enhancing Resilience in Azure SQLDB Hyperscale Named Replicas with High Availability (HA) replica.

Attinder_Pal_Singh · Oct 26, 2024

In this blog post, we will explore how to manage the resiliency and high availability of read workloads offloaded to a Hyperscale named replica. Offloading read workloads to a read replica has many use cases. We'll discuss a scenario involving Contoso, an energy company that has offloaded its read API workloads to named replicas. Contoso uses dedicated named replicas for multiple APIs and reporting workloads to fetch details such as power outage status, energy usage, billing information, and business reports.

Additionally, we'll explore how to create high availability (HA) replicas for Hyperscale named replicas using the Azure Portal, PowerShell, and Azure CLI.

About named replicas: A named replica is a compute-only secondary replica available exclusively in the Azure SQL Database Hyperscale tier. It can be added on demand to offload read workloads from the primary Hyperscale replica. Since it uses the same storage as the primary Hyperscale replica, it does not incur additional storage or license costs. Read applications can connect to a named replica via a dedicated connection endpoint, and isolated access can be configured for the users. Each named replica can have a different compute size or service level objective (SLO). It also has its own buffer pool and SSD cache to retain hot data pages in memory or local to the compute node, enhancing read performance. Read more about named replicas.

Scenario: Contoso experienced a few application connectivity issues due to unplanned outages, such as a process crash or load balancing. Azure’s infrastructure can dynamically reconfigure servers when heavy workloads arise in the SQL Database service, which might cause an application to lose its connection to the database. These errors, known as transient faults, can be investigated and managed by following the practices outlined in the article Troubleshoot common connection issues.
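One practice commonly recommended for transient faults is retry logic with a growing delay between attempts. The following is a minimal sketch of that pattern, not Contoso's implementation: the function name retry_with_backoff and its arguments are hypothetical, and the command being retried is whatever read operation your application runs.

```shell
# Retry a command with exponential backoff (illustrative helper, not an Azure tool).
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while true; do
    # Run the command; stop retrying as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))   # double the wait before the next attempt
  done
}
```

Retries like this absorb short interruptions of a few seconds. They do not help when the replica stays unreachable for longer than the retry budget.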

However, Contoso’s issue was prolonged because they didn’t have a high availability (HA) secondary replica for their named replica. Without an HA secondary replica, during an unplanned event, a new named replica is provisioned and recovers automatically, but the failover process can take from a few seconds to minutes, during which the application cannot connect to the named replica. Since Contoso directed critical read workloads to named replicas, even a brief outage caused incidents for their customers.

To prevent such prolonged incidents, Contoso added an HA secondary replica for their named replica. This setup ensures that during unplanned outages, automatic failover to the HA secondary replica minimizes downtime. Additionally, Contoso could use the HA secondary replica to handle read workloads from APIs, further enhancing their system performance.

NOTE: The HA secondary replica has the same compute size as the named replica and scales with it when you scale the named replica's compute. An application that directs its read workload to a named replica can load balance that workload onto the HA secondary replica by using the connection string parameter "ApplicationIntent=ReadOnly".
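As an illustration of where that parameter goes, here is a small sketch that assembles an ADO.NET-style connection string. The helper name build_readonly_conn_string is hypothetical, the .database.windows.net suffix assumes the public Azure cloud, and authentication settings are left out on purpose.

```shell
# Build a connection string that requests read-only intent (illustrative helper).
# Authentication keywords are deliberately omitted; add the ones your app uses.
build_readonly_conn_string() {
  server="$1"     # logical server name, e.g. contososerver
  database="$2"   # named replica database name
  printf 'Server=tcp:%s.database.windows.net,1433;Database=%s;ApplicationIntent=ReadOnly;Encrypt=True;\n' \
    "$server" "$database"
}

# Print the string for the named replica used in this post's examples.
build_readonly_conn_string contososerver contoso_named_replica_db
```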

Let’s discuss the steps to create an HA secondary replica for an existing named replica.

Azure Portal

To create an HA secondary replica for a named replica, follow these steps:

1. Go to the Azure portal and select your Hyperscale database.
2. Under Data Management, select Replicas. Choose the existing named replica.
3. Under Settings, select Compute + storage.
4. Add the desired number of High-Availability secondary replicas to your configuration and select Apply.

PowerShell

The -HighAvailabilityReplicaCount input parameter in ‘Set-AzSqlDatabase’ can be used to add an HA secondary replica for a named replica. For more information, see Set-AzSqlDatabase.

To validate that this property is set, you can use the PowerShell cmdlet ‘Get-AzSqlDatabase’. For more information, see Get-AzSqlDatabase (Az.Sql).

Example:

Add an HA secondary replica for the named replica contoso_named_replica_db under the logical server contososerver:

Set-AzSqlDatabase -ResourceGroupName contosorg -DatabaseName contoso_named_replica_db -ServerName contososerver -HighAvailabilityReplicaCount 1

Validate that the HA secondary replica was added for the named replica:

Get-AzSqlDatabase -ResourceGroupName contosorg -DatabaseName contoso_named_replica_db -ServerName contososerver | Select-Object -ExpandProperty HighAvailabilityReplicaCount

Azure CLI

By using the --ha-replicas input parameter in the ‘az sql db replica create’ command, you can add an HA secondary replica for a named replica. For more details on this CLI command, see az sql db replica.

To validate that the HA secondary replica was added, you can use the CLI command 'az sql db show'. For more information, see az sql db show.

Example:

Add an HA secondary replica for the named replica contoso_named_replica_db:

az sql db replica create -g contosorg -n contoso_named_replica_db -s contososerver --secondary-type named --partner-database contoso_named_replica_db --partner-server contososerver --ha-replicas 1

Validate that the HA secondary replica has been added:

az sql db show -g contosorg -n contoso_named_replica_db -s contososerver --query "highAvailabilityReplicaCount"
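Note that 'az sql db replica create' provisions a replica. For a named replica that already exists, one option worth checking against the CLI reference is 'az sql db update', which also lists an --ha-replicas parameter. The sketch below wraps that command and the validation query in two helpers. The function names are hypothetical, and the use of 'az sql db update' here is an assumption, since this post itself only shows 'az sql db replica create'.

```shell
# Hypothetical helpers wrapping the Azure CLI; names are illustrative only.

# Set the number of HA secondary replicas on an existing database or named replica.
# Assumption: `az sql db update` accepts --ha-replicas for Hyperscale databases.
set_ha_replica_count() {
  resource_group="$1"; server="$2"; database="$3"; count="$4"
  az sql db update -g "$resource_group" -s "$server" -n "$database" --ha-replicas "$count"
}

# Print the current HA replica count as a bare value.
get_ha_replica_count() {
  resource_group="$1"; server="$2"; database="$3"
  az sql db show -g "$resource_group" -s "$server" -n "$database" \
    --query "highAvailabilityReplicaCount" -o tsv
}
```

For example, run set_ha_replica_count contosorg contososerver contoso_named_replica_db 1 and then get_ha_replica_count contosorg contososerver contoso_named_replica_db to confirm the reported count.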

Limitations:

Currently, read traffic is not automatically load balanced between a named replica and its HA replicas. To use the HA secondary replica for read operations, the application connection string must include the parameter "ApplicationIntent=ReadOnly".
Named replicas do not currently support geo failover or failover groups. In the event of a geo failover where the primary Hyperscale database fails over to a geo secondary replica, you have two options:
1. Add a named replica for the geo secondary replica when it becomes primary. This saves compute costs because the replica is provisioned only when required.
2. Pre-provision a named replica for the geo secondary replica so that the primary and geo secondary environments have the same configuration and replica topology.
Please note that the connection string or endpoint for each named replica is unique. After a geo failover, you therefore need to update the connection string of any application that reads from a named replica so that it points to the endpoint of the named replica provisioned for the geo secondary replica.

FAQs: HA secondary replicas for named replicas:

Question: If we scale a named replica from 40 to 80 cores, will the HA secondary replicas also scale automatically?

Answer: Yes, if a named replica is scaled up or down, the associated HA secondary replicas will scale automatically without any manual intervention.

Question: In the absence of an HA secondary replica, if there is a planned failover (e.g., for maintenance), will the new named replica compute be created before the failover to minimize downtime?

Answer: Yes, for planned failovers, a temporary secondary replica is created automatically, and a hot failover is performed.

Question: If there is an unplanned but manageable failover, such as load balancing, will the failover take more time because a new compute needs to be created, or will the system create a new compute before the failover?

Answer: Load balancing is considered a planned failover. In this situation, a temporary secondary replica is created automatically, and a hot failover is performed.

Question: If there is an unplanned failover due to a failure or problem with the compute, will the failover take more time because a new compute needs to be created?

Answer: Yes, if the failover is unplanned (e.g., due to SQL crashes or replica node failures), it will take more time because a new SQL instance needs to be created and recovered on a different node.

Question: Do the Log service and Page servers have HA secondary replicas by default, independent of the compute HA replica?

Answer: Yes, they maintain internal replicas to ensure resiliency, regardless of whether you add an HA replica.

Conclusion

Creating an HA replica for your Hyperscale named replica is a straightforward process that can significantly enhance the reliability and performance of your applications. Whether you use the Azure Portal, PowerShell, or Azure CLI, the steps are simple and effective. By implementing HA replicas, you help ensure that your critical read workloads keep running smoothly and efficiently, even in the face of potential failures. Please share your feedback and questions by leaving a comment; you can also email us at sqlhsfeedback AT microsoft DOT com. We are eager to hear from you all!
