Jump to content

Efficient Partitioning of Large Tables in PostgreSQL and SQL Server using the First Letter


Recommended Posts

Guest fasaf
Posted

Partitioning by First Letter:

 

When partitioning by first letter, we can either create a computed column on the first letter or use a range partition. However, the most elegant solution is to use only the letter itself, as it follows a predictable order regardless of the length of the key field.

 

 

 

As a bonus, I’m showing here how you can load the data in parallel to different letter-based tables and then switch in those tables to unified tables. This will enable you to load data very fast, more details in another blog post that I wrote (Efficiently Generating and Loading 1 Billion Rows into a Relational Database Content in just an hour - Microsoft Community Hub).

 

 

 

PostgreSQL Solution:

 

To implement partitioning by first letter in PostgreSQL, we can use the CREATE TABLE command with the PARTITION BY clause. Here's an example code snippet to create a table partitioned by first letter:

 

CREATE TABLE
QR_A (a
VARCHAR
(10)
NOT NULL
, b
INT
NOT NULL
);

 

CREATE TABLE
QR_B (a
VARCHAR
(10)
NOT NULL
, b
INT
NOT NULL
);

 

ALTER TABLE
QR_A
ADD CONSTRAINT
StartWithA

 

CHECK
( a >= 'a'
AND
a < 'b' );

 

ALTER TABLE
QR_B
ADD CONSTRAINT
StartWithB

 

CHECK
( a >= 'b'
AND
a < 'c' );

 

INSERT INTO
QR_A
values
('a1',1);

 

INSERT INTO
QR_B
values
('b1',1);

 

Next, we can create the full partitioned table:

 

CREATE TABLE
QR(a
VARCHAR
(10)
NOT NULL
, b
INT
NOT NULL
)

 

PARTITION BY RANGE
(a);

 

Last, we will attach the letter-based tables into the full table.

 

ALTER TABLE
QR
ATTACH PARTITION
QR_A

 

FOR VALUES FROM
('a')
TO
('b');

 

ALTER TABLE
QR
ATTACH PARTITION
QR_B

 

FOR VALUES FROM
('b')
TO
('c');

 

 

 

SQL Server Solution:

 

To implement partitioning by first letter in SQL Server, we can use the CREATE PARTITION FUNCTION and CREATE PARTITION SCHEME commands. Here's an example code snippet to create a partition function and scheme for a table partitioned by first letter:

 

CREATE PARTITION FUNCTION
qr_pf (varchar(12))

 

AS RANGE
RIGHT
FOR VALUES
('a', 'b', 'c');

 

CREATE PARTITION SCHEME
qr_ps

 

AS PARTITION
pfSalesRight

 

ALL
TO
([Primary]);

 

Then, we are creating and populating the letter-based tables:

 

CREATE TABLE
dbo.QR_A (a
VARCHAR
(12), b
INT
) ;

 

CREATE TABLE
dbo.QR_B (a
VARCHAR
(12), b
INT
) ;

 

ALTER TABLE
dbo.QR_A

 

WITH CHECK ADD CONSTRAINT
ckLetterA

 

CHECK
(a IS NOT NULL
AND
a >= 'a' and a < 'b');

 

ALTER TABLE
dbo.QR_B

 

WITH CHECK ADD CONSTRAINT
ckLetterB

 

CHECK
(a IS NOT NULL
AND
a >= 'b'
AND
a < 'c');

 

INSERT
QR_A
values
('a',1),('ab',1);

 

INSERT
QR_B
values
('ba',1),('b',1);

 

Finally, we can create the partitioned table and switch-in the letter-based tables with the following code snippet:

 

CREATE TABLE
dbo.QR (

 

a
VARCHAR
(12),

 

b
INT

 

)
ON
qr_ps(a);

 

ALTER TABLE
QR_A
SWITCH TO
QR
PARTITION
2;

 

ALTER TABLE
QR_B
SWITCH TO
QR
PARTITION
3;

 

 

 

Conclusion:

 

Partitioning large tables based on the first letter of the key field is an effective way to improve query performance and simplify index maintenance. By using the appropriate commands in PostgreSQL or SQL Server, we can easily create a partitioned table that is optimized for our use case.

 

Continue reading...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...