Lesson Learned #455:Optimizing Connection Pooling for Application Workloads: Thread Contention

  • Thread starter Thread starter Jose_Manuel_Jurado
  • Start date Start date
J

Jose_Manuel_Jurado

This week, I have been working on a support case where our client experienced a 5-second delay in database connectivity. They were measuring this using StopWatch. As they informed me, they were launching a series of processes, 50-100 concurrent processes to be precise. I would like to share the experience gained in understanding the reasons behind this 5-second delay.



The first step is to try to understand how these concurrent processes are launched by adding some logs to the calls. Following the code, we observe the following behavior:



Our customer using Connection Timeout in the connectionString the value 5 seconds.



Following the call to the function 'Opening the connection' which calls 'await ConnectWithRetriesAsync(sConnectionStringDummy, threadId)', we then observe that a second later the call to create the connection is initiated, and then 2 seconds later it concludes with several errors of calls exceeding 5 seconds until the connection is established. However, by using the .NET timer and not the connection timer, we see that there is contention among the threads being created. This indicates that establishing the connection is not the issue, and that in addition to what the 50-100 threads may be requesting from the connection pooling, there may also be a delay dependent on the machine's capacity that does not represent the actual time taken to establish the connection.​



Additionally, incorporating a very low connection timeout causes the connection pooling, along with the contention, to induce more blockages at the machine level.​



2023-11-26 19:06:13.978: Thread 4: Opening the connection
2023-11-26 19:06:14.048: Thread 4: Calling
2023-11-26 19:06:14.178: Thread 4: End Calling
2023-11-26 19:06:16.081: Thread 4: Opening the connection
2023-11-26 19:06:16.086: Thread 4: Calling
2023-11-26 19:06:16.087: Thread 4: End Calling
2023-11-26 19:06:16.091: Thread 4: Opening the connection
2023-11-26 19:06:16.093: Thread 4: Calling
2023-11-26 19:06:16.095: Thread 4: End Calling
2023-11-26 19:06:16.098: Thread 4: Opening the connection
2023-11-26 19:06:16.100: Thread 4: Calling
2023-11-26 19:06:16.105: Thread 4: End Calling
2023-11-26 19:06:16.109: Thread 4: Opening the connection
2023-11-26 19:06:16.112: Thread 4: Calling
2023-11-26 19:06:16.122: Thread 4: End Calling
2023-11-26 19:06:16.130: Thread 4: Opening the connection
2023-11-26 19:06:16.133: Thread 4: Calling
2023-11-26 19:06:16.135: Thread 4: End Calling
2023-11-26 19:06:16.144: Thread 4: Opening the connection
2023-11-26 19:06:16.145: Thread 4: Calling
2023-11-26 19:06:16.156: Thread 4: End Calling
2023-11-26 19:06:16.162: Thread 4: Opening the connection
2023-11-26 19:06:16.210: Thread 4: Calling
2023-11-26 19:06:16.222: Thread 4: End Calling
2023-11-26 19:06:16.226: Thread 4: Opening the connection
2023-11-26 19:06:16.229: Thread 4: Calling
2023-11-26 19:06:16.231: Thread 4: End Calling
2023-11-26 19:06:16.238: Thread 4: Opening the connection
2023-11-26 19:06:16.239: Thread 4: Calling
2023-11-26 19:06:16.254: Thread 4: End Calling
2023-11-26 19:06:16.260: Thread 4: Opening the connection
2023-11-26 19:06:16.262: Thread 4: Calling
2023-11-26 19:06:16.264: Thread 4: End Calling
2023-11-26 19:06:16.269: Thread 4: Opening the connection
2023-11-26 19:06:16.273: Thread 4: Calling
2023-11-26 19:06:16.275: Thread 4: End Calling
2023-11-26 19:06:16.286: Thread 4: Opening the connection
2023-11-26 19:06:16.287: Thread 4: Calling
2023-11-26 19:06:16.289: Thread 4: End Calling
2023-11-26 19:06:16.293: Thread 4: Opening the connection
2023-11-26 19:06:16.294: Thread 4: Calling
2023-11-26 19:06:16.296: Thread 4: End Calling
2023-11-26 19:06:16.299: Thread 4: Opening the connection
2023-11-26 19:06:16.301: Thread 4: Calling
2023-11-26 19:06:16.307: Thread 4: End Calling
2023-11-26 19:06:16.311: Thread 4: Opening the connection
2023-11-26 19:06:16.313: Thread 4: Calling
2023-11-26 19:06:16.315: Thread 4: End Calling
2023-11-26 19:06:16.320: Thread 4: Opening the connection
2023-11-26 19:06:16.323: Thread 4: Calling
2023-11-26 19:06:16.325: Thread 4: End Calling
2023-11-26 19:06:16.330: Thread 4: Opening the connection
2023-11-26 19:06:16.332: Thread 4: Calling
2023-11-26 19:06:16.335: Thread 4: End Calling
2023-11-26 19:06:16.339: Thread 4: Opening the connection
2023-11-26 19:06:16.341: Thread 4: Calling
2023-11-26 19:06:16.343: Thread 4: End Calling
2023-11-26 19:06:16.349: Thread 4: Opening the connection
2023-11-26 19:06:16.351: Thread 4: Calling
2023-11-26 19:06:16.354: Thread 4: End Calling
2023-11-26 19:06:16.358: Thread 4: Opening the connection
2023-11-26 19:06:16.360: Thread 4: Calling
2023-11-26 19:06:16.362: Thread 4: End Calling
2023-11-26 19:06:16.365: Thread 4: Opening the connection
2023-11-26 19:06:16.368: Thread 4: Calling
2023-11-26 19:06:16.371: Thread 4: End Calling
2023-11-26 19:06:16.373: Thread 4: Opening the connection
2023-11-26 19:06:16.375: Thread 4: Calling
2023-11-26 19:06:16.375: Thread 4: End Calling
2023-11-26 19:06:16.380: Thread 4: Opening the connection
2023-11-26 19:06:16.382: Thread 4: Calling
2023-11-26 19:06:16.393: Thread 4: End Calling
2023-11-26 19:06:16.436: Thread 4: Opening the connection
2023-11-26 19:06:16.438: Thread 4: Calling
2023-11-26 19:06:16.440: Thread 4: End Calling
2023-11-26 19:06:16.445: Thread 4: Opening the connection
2023-11-26 19:06:16.457: Thread 4: Calling
2023-11-26 19:06:16.458: Thread 4: End Calling
2023-11-26 19:06:31.090: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:31.173: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:31.220: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:31.669: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:32.494: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:33.666: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:35.242: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:35.605: Thread 4: Retry 1 due to A task was canceled.. Will retry in 5,25 seconds.
2023-11-26 19:06:37.981: Thread 4: Opening the connection
2023-11-26 19:06:37.981: Thread 4: Calling
2023-11-26 19:06:37.982: Thread 4: End Calling
2023-11-26 19:06:37.983: Thread 4: Connected - 00:00:00.00 0ebfd703-c5f1-46f2-b2c7-e989213b251e
2023-11-26 19:06:38.020: Thread 4: Executed the command - 00:00:00.03 - Result: 1



In the same situation, we suggested to increase the command timeout to 30 seconds and let's see the results. We reduced the retry in the threads contention.



2023-11-26 19:26:31.264: Thread 4: Opening the connection
2023-11-26 19:26:31.812: Thread 4: Calling
2023-11-26 19:26:32.237: Thread 4: End Calling
2023-11-26 19:26:33.999: Thread 4: Opening the connection
2023-11-26 19:26:34.035: Thread 4: Calling
2023-11-26 19:26:34.038: Thread 4: End Calling
2023-11-26 19:26:34.043: Thread 4: Opening the connection
2023-11-26 19:26:34.045: Thread 4: Calling
2023-11-26 19:26:34.248: Thread 4: End Calling
2023-11-26 19:26:34.346: Thread 4: Opening the connection
2023-11-26 19:26:34.348: Thread 4: Calling
2023-11-26 19:26:34.351: Thread 4: End Calling
2023-11-26 19:26:34.358: Thread 4: Opening the connection
2023-11-26 19:26:34.361: Thread 4: Calling
2023-11-26 19:26:34.363: Thread 4: End Calling
2023-11-26 19:26:34.368: Thread 4: Opening the connection
2023-11-26 19:26:34.370: Thread 4: Calling
2023-11-26 19:26:34.372: Thread 4: End Calling
2023-11-26 19:26:34.380: Thread 4: Opening the connection
2023-11-26 19:26:34.386: Thread 4: Calling
2023-11-26 19:26:34.387: Thread 4: End Calling
2023-11-26 19:26:34.406: Thread 4: Opening the connection
2023-11-26 19:26:34.409: Thread 4: Calling
2023-11-26 19:26:34.412: Thread 4: End Calling
2023-11-26 19:26:34.417: Thread 4: Opening the connection
2023-11-26 19:26:34.418: Thread 4: Calling
2023-11-26 19:26:34.422: Thread 4: End Calling
2023-11-26 19:26:34.428: Thread 4: Opening the connection
2023-11-26 19:26:34.430: Thread 4: Calling
2023-11-26 19:26:34.433: Thread 4: End Calling
2023-11-26 19:26:34.436: Thread 4: Opening the connection
2023-11-26 19:26:34.453: Thread 4: Calling
2023-11-26 19:26:34.500: Thread 4: End Calling
2023-11-26 19:26:34.528: Thread 4: Opening the connection
2023-11-26 19:26:34.533: Thread 4: Calling
2023-11-26 19:26:34.542: Thread 4: End Calling
2023-11-26 19:26:49.363: Thread 4: 204b6366-03ac-4237-a863-420b2cfef34e ExecutionTime: 208 ConnectionTime:0
2023-11-26 19:26:49.364: Thread 4: Connected - 00:00:17.93 204b6366-03ac-4237-a863-420b2cfef34e
2023-11-26 19:26:49.420: Thread 4: Executed the command - 00:00:00.05 - Result: 1



So, the lessons learned here:




  1. Connection Pool Optimization:
    • Ensure that the connection pool is properly configured. Adjust the maximum and minimum pool size to align with your concurrency needs and the database server's capabilities.

  2. Improving Retry Logic:
    • If you are using retry logic (ConnectWithRetriesAsync), ensure it isn’t contributing to the issue. Consider implementing exponential backoff to space out reconnection attempts and reduce server load.

  3. Limiting Concurrent Connections:
    • Consider implementing logic to limit the number of concurrent connections. This can be done using semaphores or by implementing a task queue.

  4. Profiling and Monitoring:
    • Use profiling and monitoring tools to pinpoint where exactly contention occurs and to better understand your application's behavior.



Measuring with Performance Counters for Technical Blog Articles:



To measure and analyze thread contention and database performance in a C# application, you can use performance counters in Windows. These counters provide valuable insights into various aspects of your application's performance, including threading and database interactions.


  1. .NET CLR LocksAndThreads Counters:
    • Use counters such as # of current logical Threads, # of current physical Threads, # of total recognized Threads, and # of total contention to monitor thread behavior and contention.
    • These counters can help identify high contention scenarios, where threads are frequently waiting to acquire locks.

  2. Using External Profiling Tools.



Code



using System;
using Polly;
using Microsoft.Data.SqlClient;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.IO;
using System.Collections;

namespace HealthCheck
{
class Program
{
const string LogFolder = "c:\\temp\\Mydata";
const string LogFilePath = LogFolder + "\\log";
const string LogExtPath = ".log";
static async Task Main(string[] args)
{
//ClsCheck oClsCheck = new ClsCheck();
//oClsCheck.Main(bSingle:false,bDifferentConnectionString:true);
//DeleteLogFile();
DeleteDirectoryIfExists(LogFolder);
int numberOfThreads = 50000; //Nr Threads
int maxDegreeOfParallelism = 200; //Nr Threads to run concurrent
string connectionString = "data source=tcp:servername.database.windows.net,1433;initial catalog=test;User Id=username@microsoft.com;Password=Pwd!;ConnectRetryCount=3;ConnectRetryInterval=10;Connection Timeout=30;Max Pool Size=100;MultipleActiveResultSets=false;Min Pool Size=0;Pooling=true;Authentication=Active Directory Password;PoolBlockingPeriod=NeverBlock;Connection Lifetime=0";
//ClsEvents.EventCounterListener oClsEvents = new ClsEvents.EventCounterListener();
//ClsEvents.SqlClientListener olistener = new ClsEvents.SqlClientListener();

var semaphore = new SemaphoreSlim(maxDegreeOfParallelism);

var tasks = new Task[numberOfThreads];
for (int i = 0; i < numberOfThreads; i++)
{
tasks = Task.Run(async () =>
{
await semaphore.WaitAsync();
try
{
await ExecuteQueryAsync(connectionString, false);
}
finally
{
semaphore.Release();
}
});
}

await Task.WhenAll(tasks);
}

static async Task ExecuteQueryAsync(string connectionString, Boolean bDifferentConnString=false)
{
int threadId = Thread.CurrentThread.ManagedThreadId;
TimeSpan ts;
string elapsedTime;
string sConnectionStringDummy = "";
try
{
Log($"Thread {threadId}: Opening the connection",threadId);
if (bDifferentConnString)
{
sConnectionStringDummy = sConnectionStringDummy + connectionString + ";Application Name=Testing by JMJD - SQL " + threadId.ToString();
}
else
{
sConnectionStringDummy = sConnectionStringDummy + connectionString + ";Application Name=Testing by JMJD - SQL Unique" ;
}
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
SqlConnection connection = await ConnectWithRetriesAsync(sConnectionStringDummy, threadId);

ts = stopWatch.Elapsed;

elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
ts.Hours, ts.Minutes, ts.Seconds,
ts.Milliseconds / 10);
Log($"Thread {threadId}: Connected - {elapsedTime} " + connection.ClientConnectionId.ToString(),threadId);


//Log($"Thread {threadId}: Executing the command",threadId);

SqlCommand command = new SqlCommand("SELECT 1", connection);
command.CommandTimeout = 5;

stopWatch.Reset();
stopWatch.Start();

object result = await ExecuteCommandWithRetriesAsync(command);

stopWatch.Stop();
ts = stopWatch.Elapsed;
elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
ts.Hours, ts.Minutes, ts.Seconds,
ts.Milliseconds / 10);
Log($"Thread {threadId}: Executed the command - {elapsedTime} - Result: {result}", threadId);
//Log($"Thread {threadId}: Closing the connection");
connection.Close();
}
catch (OperationCanceledException canc)
{
Log($"Thread {threadId}: Error (Cancelation): {canc.Message}", threadId);
}
catch (Exception ex)
{
Log($"Thread {threadId}: - Error (Exception): {ex.Message}", threadId);
}
}

static async Task<SqlConnection> ConnectWithRetriesAsync(string connectionString, int threadId)
{
Log($"Thread {threadId}: Calling", threadId);
SqlConnection connection = new SqlConnection(connectionString);
connection.StatisticsEnabled = true;
Log($"Thread {threadId}: End Calling", threadId);

var policy = Policy
.Handle<Exception>()
.WaitAndRetryAsync(5, retryAttempt => TimeSpan.FromSeconds(connection.ConnectionTimeout * 1.05),
(exception, timespan, retryCount, context) =>
{
Log($"Thread {threadId}: Retry {retryCount} due to {exception.Message}. Will retry in {timespan.TotalSeconds} seconds.", threadId);
});

await policy.ExecuteAsync(async () =>
{
try
{
await connection.OpenAsync();
IDictionary currentStatistics = connection.RetrieveStatistics();
if (currentStatistics["ExecutionTime"].ToString() != "0" || currentStatistics["ConnectionTime"].ToString() != "0")
{
Log("Thread "+ threadId.ToString() + ": " + connection.ClientConnectionId.ToString() + " ExecutionTime: " + currentStatistics["ExecutionTime"] + " ConnectionTime:" + currentStatistics["ConnectionTime"], threadId);
}
}
catch (Exception ex)
{
throw;
}
});

return connection;
}

static async Task<object> ExecuteCommandWithRetriesAsync(SqlCommand command)
{
var policy = Policy
.Handle<Exception>()
.WaitAndRetryAsync(5, retryAttempt => TimeSpan.FromSeconds(command.CommandTimeout * 1.05),
(exception, timespan, retryCount, context) =>
{
Log($"Retry {retryCount} due to {exception.Message}. Will retry in {timespan.TotalSeconds} seconds.");
});

object result = null;
await policy.ExecuteAsync(async () =>
{
try
{
result = await command.ExecuteScalarAsync();
}
catch (Exception ex)
{
throw;
}
});

return result;
}

static void Log(string message, int iThread = 0)
{
var ahora = DateTime.Now;
string logMessage = $"{ahora.ToString("yyyy-MM-dd HH:mm:ss.fff")}: {message}";
string sLogFile = LogFolder + "\\log" + iThread.ToString() + LogExtPath;
//Console.WriteLine(logMessage);
try
{
using (FileStream stream = new FileStream(sLogFile, FileMode.Append, FileAccess.Write, FileShare.ReadWrite))
{
using (StreamWriter writer = new StreamWriter(stream))
{
writer.WriteLine(logMessage);
}
}
}
catch (IOException ex)
{
Console.WriteLine($"Error writing in the log file: {ex.Message}");
}
}

public static void DeleteDirectoryIfExists(string path)
{
try
{
if (Directory.Exists(path))
{
Directory.Delete(path, true);
}
Directory.CreateDirectory(path);
}
catch (Exception ex)
{
Console.WriteLine($"Error deleting the folder: {ex.Message}");
}
}
}
}



Enjoy!

Continue reading...
 
Back
Top