Preamble
Partitioning is one of the most popular recent PostgreSQL features: many people look to partitioned tables as a way to improve performance and broaden scalability. However, partitioning poses a problem: how can you partition an existing table without locking up the database? The answer is that pg_rewrite can help you with PostgreSQL table partitioning. Here, you'll learn how to use pg_rewrite to solve this problem in the most elegant way possible.
Installing pg_rewrite
pg_rewrite is open source and can be downloaded quickly and easily from our GitHub profile. The following listing shows how to clone the repository:
[hs@fedora src]$ git clone https://github.com/cybertec-postgresql/pg_rewrite.git
Cloning into 'pg_rewrite'...
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 22 (delta 3), reused 22 (delta 3), pack-reused 0
Receiving objects: 100% (22/22), 44.51 KiB | 1.78 MiB/s, done.
Resolving deltas: 100% (3/3), done.
Once cloning is complete, we can enter the directory:
[hs@fedora src]$ cd pg_rewrite/
To build the code, we have to make sure that the correct version of pg_config is in the path. I have a working binary in my home directory, so building the code is easy for me:
[hs@fedora pg_rewrite]$ which pg_config
~/pg14/bin/pg_config
If pg_config is in the path, all you have to do is run "make install" to compile and install the code:
[hs@fedora pg_rewrite]$ make install
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 -fPIC -I. -I./ -I/home/hs/pg14/include/postgresql/server -I/home/hs/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_rewrite.o pg_rewrite.c
...
/usr/bin/install -c -m 644 concurrent.bc '/home/hs/pg14/lib/postgresql/bitcode'/pg_rewrite/./
cd '/home/hs/pg14/lib/postgresql/bitcode' && /usr/bin/llvm-lto -thinlto -thinlto-action=thinlink -o pg_rewrite.index.bc pg_rewrite/pg_rewrite.bc pg_rewrite/concurrent.bc
The next thing to do is to adjust the postgresql.conf file. pg_rewrite has to be loaded as a library when the server starts; it won't function otherwise. Configuring PostgreSQL to load pg_rewrite works as follows:
wal_level = logical
max_replication_slots = 1                  # ... or increase the existing value by 1
shared_preload_libraries = 'pg_rewrite'    # ... or add the library to the existing ones
wal_level has to be adjusted to make sure that the WAL contains enough information for logical decoding to work. In addition, you'll need enough replication slots to run logical decoding safely; pg_rewrite needs one slot to operate.
After these changes are done, restart your server and verify that the variables are properly set:
test=# SHOW shared_preload_libraries;
 shared_preload_libraries
--------------------------
 pg_rewrite
(1 row)
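While you are at it, you can also confirm the wal_level and how many replication slots are still free. The queries below are merely a convenience of ours, not something pg_rewrite requires you to run:

-- must report 'logical' after the restart
SHOW wal_level;

-- number of replication slots still available; pg_rewrite needs one free slot
SELECT current_setting('max_replication_slots')::int - count(*) AS free_slots
FROM pg_replication_slots;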
Finally, you have to enable the pg_rewrite extension to ensure that the partition_table function is available:
test=# CREATE EXTENSION pg_rewrite;
CREATE EXTENSION
test=# \x
Expanded display is on.
test=# \df *partition_table*
List of functions
-[ RECORD 1 ]-------+---------------------------------------------------
Schema              | public
Name                | partition_table
Result data type    | void
Argument data types | src_table text, dst_table text, src_table_new text
Type                | func
Voilà, you have a working version of pg_rewrite. Let's take a look at how we can use it.
Creating a sample table
After installing pg_rewrite, we can create a sample table, which we'll want to partition later on:
test=# CREATE TABLE t_number (x numeric);
CREATE TABLE
test=# INSERT INTO t_number
          SELECT random() - 0.5
          FROM generate_series(1, 1000000);
INSERT 0 1000000
For convenience, the table is filled with one million random values. Some are above zero, others below. Let's check the distribution:
test=# SELECT x < 0, count(*) FROM t_number GROUP BY 1;
 ?column? | count
----------+--------
 f        | 499729
 t        | 500271
(2 rows)
The data looks as expected, so we can proceed to the next step:
test=# ALTER TABLE t_number ADD PRIMARY KEY (x);
ALTER TABLE
PostgreSQL can only identify individual rows if a primary key is present. Without one, pg_rewrite runs into problems and refuses to work, so it is critical to ensure that a primary key exists.
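If you are unsure whether a table already has a primary key, you can look it up in the system catalog. The following query is our own convenience, not something pg_rewrite requires:

-- list the primary key constraint of t_number, if one exists
SELECT conname, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 't_number'::regclass
  AND contype = 'p';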
Creating table partitions in PostgreSQL
Before splitting the table, we need to devise a partitioning strategy to apply to our database. In my example, I simply put negative numbers into one partition and positive numbers into another. Here's how it works:
test=# CREATE TABLE t_part_number (x numeric PRIMARY KEY)
          PARTITION BY RANGE (x);
CREATE TABLE
test=# CREATE TABLE t_part_number_neg
          PARTITION OF t_part_number
          FOR VALUES FROM (MINVALUE) TO (0);
CREATE TABLE
test=# CREATE TABLE t_part_number_pos
          PARTITION OF t_part_number
          FOR VALUES FROM (0) TO (MAXVALUE);
CREATE TABLE
Make sure that the constraints and primary keys of the old and the new table are identical; otherwise, pg_rewrite will produce an error. Keep in mind that all we want to do is redistribute the data, nothing else.
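To double-check which rows each partition will accept, you can ask PostgreSQL for the implicit partition constraints. This query is again just an illustration of ours and needs PostgreSQL 12 or later for pg_partition_tree:

-- show the implicit constraint of each leaf partition
SELECT relid::regclass AS part_name,
       pg_get_partition_constraintdef(relid) AS part_constraint
FROM pg_partition_tree('t_part_number')
WHERE isleaf;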
PostgreSQL table partitioning with almost no locking
Now that all the necessary components are in place, we can rewrite the data:
test=# SELECT partition_table('t_number', 't_part_number', 't_old_number');
 partition_table
-----------------

(1 row)
The function takes three arguments:

- t_number: the source table that has to be rewritten
- t_part_number: the freshly partitioned table the data is copied into; as the listing below shows, it takes over the name of the source table
- t_old_number: the new name given to the original table

The old table sticks around in case something goes wrong. The benefit is that no data can be lost; the drawback is that additional storage is required. Whether or not you drop the old table afterwards makes no difference here: during the repartitioning itself, both copies exist, so twice the space is needed either way.
The final structure will look like this:
test=# \d+
                                        List of relations
 Schema |       Name        |       Type        | Owner | Persistence | Access method |  Size   | Description
--------+-------------------+-------------------+-------+-------------+---------------+---------+-------------
 public | t_number          | partitioned table | hs    | permanent   |               | 0 bytes |
 public | t_old_number      | table             | hs    | permanent   | heap          | 42 MB   |
 public | t_part_number_neg | table             | hs    | permanent   | heap          | 21 MB   |
 public | t_part_number_pos | table             | hs    | permanent   | heap          | 21 MB   |
(4 rows)
As you can see, the original table is still in place and uses up 42 MB of storage. Together, our two partitions (21 MB each) require the same amount of space.
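If you want to confirm that every row ended up in the correct partition before getting rid of the old table, tableoid comes in handy. Both statements below are our own suggestion; dropping t_old_number is optional and only safe once you are satisfied with the result:

-- tableoid reveals which partition each row ended up in
SELECT tableoid::regclass AS part_name, count(*)
FROM t_number
GROUP BY 1;

-- reclaim the 42 MB held by the old table
DROP TABLE t_old_number;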
pg_rewrite is a useful tool for turning existing tables into PostgreSQL partitioned tables. It only needs a short lock, which makes it superior to doing the same job with PostgreSQL core alone, where long and extensive table locks are required.
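For comparison, here is a minimal sketch of the manual alternative without pg_rewrite. It holds an ACCESS EXCLUSIVE lock on the source table for the entire duration of the copy, which is exactly what pg_rewrite avoids:

-- manual repartitioning: blocks ALL reads and writes until COMMIT
BEGIN;
LOCK TABLE t_number IN ACCESS EXCLUSIVE MODE;
INSERT INTO t_part_number SELECT * FROM t_number;
ALTER TABLE t_number RENAME TO t_old_number;
ALTER TABLE t_part_number RENAME TO t_number;
COMMIT;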
Finally …
We want to share some good news: we offer ready-made OpenStreetMap (OSM) data for PostgreSQL on our new GIS site. There you will find prebuilt PostgreSQL dumps that can easily be loaded into your system. GIS databases (PostGIS) are frequently very large and can make full use of partitioning.
See this helpful blog post for details on how to migrate partitioned data.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.