Posted October 27, 20231 yr The SQL Server TDS protocol provides two main paths for query execution (Language and RPC events.) You can trace these events using the Batch::Starting/Completed (Language) and RPC:Starting/Completed (RPC) XEvents. Language events are text streams requiring full parsing and may be susceptible to injection attacks. Language events also require educated guesses. For example, should select 1 return a smallint, bigint? Example: SQLExecDirect(“sp_who 50”) SQL Server receives Type=Language Query=”sp_who 50” SQL Server must parse the query, make an educated guess for the data type of 50 and then execute the command. Injection attack example: An application prompts the user for the session id value and fails to validate the input. Instead, the user enters “50;drop database production” The query submitted to SQL Server is “sp_who 50;drop database production” which is clearly not expected. In fact, malicious users can hide payload using the string null termination hidden character (0x0) “50;0x0drop database production” The SQL language standard allows null termination and other extended characters, but most editors treat them as string terminators. Looking at such a pattern may only show you 50; hiding the malicious payload that SQL Server sees, parses, and executes. RPCs are precisely bundled commands. The term Remote Procedure Call is commonly associated with COM objects or other client server communication protocols. At the high level an RPC provides an Id and marshaled parameters. In SQL Server the Id is a stored procedure name the SQL Server maps to an object id and the parameters are natively streamed. For example: int ID = 50; {call sp_who(?)} ß ID is bound to ‘?’ as an int SQL Server receives Type = RPC ID = sp_who Parameter Count = 1 Parameter[0].Type = int Parameter[0].Value = 50 The RPC Advantages The parameters are sent to the SQL Server in native format removing any guessing. Variable length parameters are sent to SQL Server with a length specifier, read in from the input stream directly, avoiding parsing. The Id requires lookup to match and object id, making it difficult for an attacker to perform TSQL injection. The RPC is faster because all the parsing and guessing is avoided. SQL Server can look up the definition of the procedure to be executed, use the natively bundled parameters, and execute. What is a Packed/Batch RPC The TDS protocol allows multiple RPC requests to be executed in a batch. Packed/Batched RPC Execution foreach insert in OneMillionList command += “{call sp_prepexec …. {Insert} } SQLExecDirect(command) Process results The client and SQL Server send and read the entire command stream into memory and then process the RPCs. SQL Server does not read 1st RPC and process it and then read 2nd RPC and process it, etc. The TDS protocol does not accommodate the read, execute, repeat loop because the client sent ## RPCs in the same batch and is waiting for a response that the entire command was received. Once the entire command is acknowledged by SQL Server the client can process results. The client does not have logic (perhaps outside MARS connections or TSQL cursors) to interleave the initial command stream and results processing. Thus, all million requests need to be read on the server and stored in memory before execution can begin. Non-Packed RPC Execution foreach insert in OneMillionList { SQLExecDirect(“{call sp_prepexec …. {Insert} } -- Each of these is a command submit and requires result processing Process results (SqlFetch, SQLNextResult, etc…) } The packed/batched RPCs require the client to stream everything in a single request, using more memory but reducing the network roundtrips used by Non-Packed (single request) submission loops. The packed/batched RPCs require more memory on the SQL Server than the single submission loops but are faster because SQL can optimize processing the same RPC Id from the packed/batched mode. Not A Bulk Copy Replacement Many documents have confused the packed/batch RPC mode with bulk copy activities. While faster than Non-Packed RPCs, Packed/Batched RPC are not a BULK COPY replacement. Bulk copy sets up in insert stream and inserts the rows as they are streamed to the SQL Server. The rows go onto the data pages and are fully hashed, checkpoint/lazy writer aware where-as the packed RPC must place the parameters in memory and then transfer to the data pages. The memory is not released until the entire packed/batched RPC is complete. sqlcmd Example {call sp_who(1)} {call sp_who(1)} {call sp_who(1)} {call sp_who(1)} go SQL Server receives all 4 RPC execution requests in a single stream from the client. pyodbc Example Any SQL Server client can submit a packed/batched RPC request. Here is an example in pyodbc. ''' use tempdb go drop table if exists t go drop procedure if exists spTest go create table t (id int, id2 int) go create procedure spTest @id int, @id2 int as begin insert into t values (@id, @id2) end go Packed/Batched RPC: fast_executemany = True event_sequence name [TextData] session_id timestamp 223 rpc_starting declare @p1 int set @p1=NULL exec sp_prepare @p1 output,N'@P1 int,@P2 int',N'insert into t(id, id2) values (@P1, @P2)',1 select @p1 68 2023-10-23 08:18:21.9143952 224 rpc_starting exec sp_execute 1,0,0 68 2023-10-23 08:18:21.9165193 225 rpc_starting exec sp_execute 1,1,1 68 2023-10-23 08:18:21.9217097 event_sequence name [TextData] session_id timestamp 230 rpc_starting exec sp_describe_undeclared_parameters N' EXEC spTest @P1,@P2 ' 68 2023-10-23 08:18:21.9351613 231 rpc_starting exec spTest @id=0,@id2=0 68 2023-10-23 08:18:21.9408572 232 rpc_starting exec spTest @id=1,@id2=1 68 2023-10-23 08:18:21.9754044 Singleton RPC: fast_executemany = False event_sequence name [TextData] session_id timestamp 21 rpc_starting declare @p1 int set @p1=NULL exec sp_prepexec @p1 output,N'@P1 int,@P2 int',N'insert into t(id, id2) values (@P1, @P2)',0,0 select @p1 66 2023-10-23 08:20:30.0116869 22 rpc_starting declare @p1 int set @p1=1 exec sp_prepexec @p1 output,N'@P1 int,@P2 int',N'insert into t(id, id2) values (@P1, @P2)',1,1 select @p1 66 2023-10-23 08:20:30.0178314 23 rpc_starting declare @p1 int set @p1=2 exec sp_prepexec @p1 output,N'@P1 int,@P2 int',N'insert into t(id, id2) values (@P1, @P2)',2,2 select @p1 66 2023-10-23 08:20:30.0211486 event_sequence name [TextData] session_id timestamp 27 rpc_starting declare @p1 int set @p1=NULL exec sp_prepexecrpc @p1 output,N'spTest',0,0 select @p1 66 2023-10-23 08:20:30.0718299 28 rpc_starting declare @p1 int set @p1=6 exec sp_prepexecrpc @p1 output,N'spTest',1,1 select @p1 66 2023-10-23 08:20:30.0791241 29 rpc_starting declare @p1 int set @p1=7 exec sp_prepexecrpc @p1 output,N'spTest',2,2 select @p1 66 2023-10-23 08:20:30.0829108 ''' import pyodbc connection_string = ' DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=tempdb;UID=sa;PWD=xxxxxxxxxxxxx;' conn = pyodbc.connect(connection_string) conn.autocommit = False conn.execute("set nocount on") ''' cursor = conn.cursor() params = [(i,i) for i in range(5)] cursor.fast_executemany = False cursor.executemany("insert into t(id, id2) values (?, ?)", params) cursor.commit() cursor.close() ''' cursor = conn.cursor() params = [(i,i) for i in range(2000000)] cursor.fast_executemany = True cursor.executemany("{call spTest(?, ?)}", params) cursor.commit() cursor.close() USERSTORE_SXC The memory for handling packed/batched RPCs is associated with the USERSTORE_SXC. select * from sys.dm_os_memory_clerks A packed/batched RPC allocates memory to memory objects (usually MEMOBJ_PROCESSRPC) associated with the USERSTORE_SXC. select * from sys.dm_os_memory_objects Cache and User Stores are monitored by the SQL Server Resource Monitor(s) and when a cache or store grows large or there is memory pressure the cache or store is issued shrink commands. The reported problem when using packed/batched RPCs is that even a dbcc freesystemcache(‘ALL’) execution is unable to shrink the USERSTORE_SXC. This is because the memory is in use by MEMOBJ_PROCESSPMO’s and cannot be released until the packed/batched RPC has fully completed. If your SQL Server experiences spikes in the USERSTORE_SXC and MEMOBJ_PROCESSRPC objects locate the packed/batched RPC requests and ask the client to honor a reasonable batch size limit. Instead of submitting 1 million requests in a single packed/batched RPC submitting 100,000 reduces the memory usage by a factor of 10x. Better yet, if these are insert requests ask the client application to use the Bulk Copy interfaces which reduce memory and increase performance over the packed/batched RPC. SQL Internals - As of Oct 2022 The packed/batched RPCs are consumed by SQL Server in a routine named GetCommandInput. The GetCommandInput loops calling GetNextRPC, consuming the entire request stream. As the individual RPC requests are read from the stream, RpcInfo objects are created using a MEMOBJ_PROCESSRPC memory object. The RpcInfo holds information about the RPC execution such as, the Id, number of parameters, parameter types and values, is the parameter an output parameter, etc. The RpcInfo objects are chained together and passed to the SQL Server execution engine for processing. Note: While the GetCommandInput loop is processing the session_id does NOT appear in sys.dm_exec_requests because the request is not executing yet. This is an aspect that is being looked at for improved visibility. Note: Network performance has a critical impact on the memory consumption timespan on the SQL Server. For example, I executed the pyodbc example and after it consumed significant USERSTORE_SXC I attached the debugger to the python process, stalling the client. SQL Server continued to hold the memory until I resumed the python client. Note: Packet size can be a key factor in network performance. Here is the stack of s stalled client attempting to stream the packed/batched RPCs while SQL Server is trying to load them into memory from the incoming TDS stream. msodbcsql17!AddRpcSprocParameters msodbcsql17!BuildTDSRPCs msodbcsql17!BuildExecSproc msodbcsql17!ExportImp::SQLExecDirectW msodbcsql17!ExportImp::SQLExecute ODBC32!SQLExecute The RpcInfo list is then processed by the SQL Server execution engine and at this point the request appears in sys.dm_exec_requests. Note: While GetCommandInput is reading in the stream the client’s command timeout may not be honored. A TDS client streams the command and then waits for the acknowledgement that the command was fully received. The timeout applies to the wait for the acknowledgement not the streaming of the command. This means a large packed/batched RPC fully streams and allocates memory and then honors the cancel request. (However, a TSQL KILL statement is honored, cancelling the GetCommandInput streaming activity.) Once all the RpcInfo list has been processed the packed/batched RPC sends the final done (SRV_DONEFINAL) to the client and cleans up the RpcInfo list. The cleanup is where the memory from the execution is released, the USERSTORE_SXC memory drops and the MEMOBJ_PROCESSRPC objects are destroyed. Pattern of Packed/Batched RPCs MEMOBJ_PROCESSRPC are commonly limited to 10 memory pages in SQL Server (8192 * 10). GetCommandInput places as many of the RPC requests onto 10-page sized PMOs as possible. Each time a 10-page sized PMO is filled a new MEMOBJ_PROCESSRPC PMO is created and added to the RpcInfo chain. The number of MEMOBJ_PROCESSRPC entries of size (8192 * 10) in sys.dm_os_memory_objects is a sign of ongoing packed/batched RPC activities. Recommendation The client application should limit the size of the packed/batched RPC request to a size that provides limited memory consumption while allowing for the performance advantages. - Bob Dorr Continue reading...
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.