Preamble
Table partitioning is one of the most popular features of recent PostgreSQL versions. Automatic partition creation, however, is not yet supported. This article shows what you can do to work around that.
Use cases for automatic partition creation
In essence, there are only two use cases:
- Time-triggered creation: for example, at the end of the current month, create the partition for the following month.
- On-demand creation: if a row is inserted that does not fit any existing partition, create a new partition for it.
Both variants, time-triggered and on-demand partitioning, are discussed below.
Automatic partition creation for time-triggered partitioning
Here, the lack of PostgreSQL support is at worst annoying. The restriction can be overcome in a number of ways:
Using the operating system scheduler
You can use the operating system scheduler (cron on Unix-like systems or the “Task Scheduler” on Windows). The job would connect to the database using psql and run the appropriate CREATE TABLE and ALTER TABLE ... ATTACH PARTITION statements.
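For example, a monthly cron job could feed statements like the following to psql. This is just a sketch: the table measurement (range-partitioned by a timestamp column) and the month boundaries are made-up names for illustration.
CREATE TABLE measurement_2025_01
   (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);

ALTER TABLE measurement
   ATTACH PARTITION measurement_2025_01
   FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');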
Such a job is typically scheduled directly on the database server machine. That is not always possible, for example if you are using a hosted database and have no access to the operating system. In that case, the job can be scheduled on a different machine.
Using the database’s job scheduler
PostgreSQL has no built-in job scheduling system. However, extensions like pg_timetable and pg_cron let you schedule database jobs. Adding a third-party extension to your software introduces another dependency, and it may not be possible on a hosted database. On the other hand, it keeps the job out of the operating system scheduler, which makes it less likely to be forgotten when you move the database.
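As a minimal sketch of what that could look like with pg_cron: the helper function and the table measurement are made up for illustration, and the schedule is arbitrary.
-- hypothetical helper that creates the partition for the coming month
CREATE FUNCTION create_next_month_partition() RETURNS void
   LANGUAGE plpgsql AS
$$DECLARE
   first_day date := date_trunc('month', current_date) + interval '1 month';
BEGIN
   EXECUTE format(
      'CREATE TABLE IF NOT EXISTS %I PARTITION OF measurement '
      'FOR VALUES FROM (%L) TO (%L)',
      'measurement_' || to_char(first_day, 'YYYY_MM'),
      first_day,
      first_day + interval '1 month'
   );
END;$$;

-- run it at 03:00 on the 25th of every month
SELECT cron.schedule('0 3 25 * *', 'SELECT create_next_month_partition()');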
Using pg_partman
pg_partman is a PostgreSQL extension that specializes in partition management. It was most important before “declarative partitioning” was introduced in version 10, when it helped you set up the required triggers and constraints. Most of those features are no longer needed, but its support for automatic partition creation can still be useful.
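As a rough sketch, setting up automatic monthly partitions with pg_partman could look like this, assuming pg_partman 4.x and a partitioned table measurement with a timestamp column ts (check the extension’s documentation for the exact signature):
SELECT partman.create_parent(
   p_parent_table => 'public.measurement',
   p_control      => 'ts',
   p_type         => 'native',
   p_interval     => 'monthly'
);

-- calling partman.run_maintenance() periodically (for example from
-- pg_cron) then creates future partitions ahead of time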
Automatic partition creation for on-demand partitioning
This is more challenging, and the pain of PostgreSQL core’s lack of support is more acute here.
The naïve approach to try is a BEFORE INSERT trigger that creates and attaches a new partition. But that will result in one of the following “object in use” errors:
ERROR: cannot CREATE TABLE .. PARTITION OF "tab" because it is being used by active queries in this session
or
ERROR: cannot ALTER TABLE "tab" because it is being used by active queries in this session
Essentially, once the INSERT has started, you can no longer modify the definition of the table.
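For illustration, here is a sketch of such a naive trigger for the table tab defined later in this article; running an INSERT with it in place produces the first of the errors above:
CREATE FUNCTION naive_part_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   /* fails: "tab" is in use by the very INSERT that fired the trigger */
   EXECUTE format(
      'CREATE TABLE %I PARTITION OF tab FOR VALUES IN (%L)',
      'tab_' || to_char(NEW.ts, 'YYYY-MM-DD'),
      NEW.ts::date
   );
   RETURN NEW;
END;$$;

CREATE TRIGGER naive_part_trig
   BEFORE INSERT ON tab FOR EACH ROW
   EXECUTE FUNCTION naive_part_trig();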
LISTEN and NOTIFY as a workaround
Since the processing has to happen in a different transaction, the new partition must be created at another time. We obviously cannot create the partition before the INSERT starts, so it would have to happen afterwards. But the partition has to exist at the time of the INSERT, so we would seem to be at an impasse.
But there is a way out: rather than creating the partition in a single step, we can first create a table and attach it to the partitioned table as a partition later. The sequence of events could then look like this:
- in the BEFORE INSERT trigger, create a new table that is not yet a partition (skipped if the table already exists)
- insert the new row into the prospective partition instead of into the partitioned table
- at some later time, attach the newly-created table as a partition
That leaves the question of how to perform the attachment “at some later time”. PostgreSQL has a convenient feature for asynchronous processing: LISTEN and NOTIFY. A session can register for notifications on a channel (usually a table name) with LISTEN and will then be notified asynchronously whenever another backend calls NOTIFY for that channel. The sending backend can attach a message string to the notification.
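To see the mechanism in isolation, here is what an exchange between two psql sessions could look like (the payload and the PID are made up):
-- session 1
LISTEN tab;

-- session 2
NOTIFY tab, '2025-01-01';

-- session 1 then displays, the next time it checks for input:
-- Asynchronous notification "tab" with payload "2025-01-01"
-- received from server process with PID 12345.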
We can use this together with a row-level BEFORE INSERT trigger as follows:
- create a new table that will become a new partition (if it does not yet exist)
- insert the new row into that table
- if a new table was created in the first step, NOTIFY with the new partition key as the argument
- skip the original INSERT, since we already inserted the row in the partition
Note that this requires a default partition to exist: without one, inserting a row that does not match an existing partition fails during tuple routing, before the trigger ever runs.
A daemon process in a separate session waits for notifications and attaches the new table to the partitioned table.
A sample implementation of automatic partition creation on demand
The partitioned table is defined as follows:
CREATE TABLE tab (
   id bigint GENERATED ALWAYS AS IDENTITY,
   ts timestamp NOT NULL,
   data text
) PARTITION BY LIST ((ts::date));

CREATE TABLE tab_def PARTITION OF tab DEFAULT;
Using list partitioning is not strictly necessary, but I chose it because it keeps the listener code simpler.
The trigger
The trigger looks like this:
CREATE FUNCTION part_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   BEGIN
      /* try to create a table for the new partition */
      EXECUTE
         format(
            'CREATE TABLE %I (LIKE tab INCLUDING INDEXES)',
            'tab_' || to_char(NEW.ts, 'YYYY-MM-DD')
         );

      /* tell the listener to attach the partition
       * (only if a new table was created) */
      EXECUTE
         format(
            'NOTIFY tab, %L',
            to_char(NEW.ts, 'YYYY-MM-DD')
         );
   EXCEPTION
      WHEN duplicate_table THEN
         NULL;  -- ignore: the table already exists
   END;

   /* insert into the new partition */
   EXECUTE
      format(
         'INSERT INTO %I VALUES ($1.*)',
         'tab_' || to_char(NEW.ts, 'YYYY-MM-DD')
      )
      USING NEW;

   /* skip the insert into the partitioned table */
   RETURN NULL;
END;$$;

CREATE TRIGGER part_trig
   BEFORE INSERT ON tab FOR EACH ROW
   WHEN (pg_trigger_depth() < 1)
   EXECUTE FUNCTION part_trig();
To understand why the trigger function returns NULL, read my blog post on that topic. The WHEN clause in the trigger definition avoids infinite recursion: without it, inserting a row into a partition would call the trigger function again.
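As a quick smoke test, we can insert a row and look for it in the standalone table. Note that the generated table names contain hyphens, so they have to be double-quoted; the date below is arbitrary:
INSERT INTO tab (ts, data) VALUES ('2025-01-01 12:00:00', 'something');

-- until the listener attaches the partition, the row is in the new table
SELECT id, ts, data FROM "tab_2025-01-01";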
The listener
For the listener, we need a PostgreSQL client API that supports asynchronous notifications, so I chose C:
#include <libpq-fe.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/select.h>
#include <errno.h>

static int attach_partition(PGconn *conn, char *table, char *datestr)
{
    PGresult *res;
    char *esc_tab, *esc_part, *esc_date;
    /* identifiers are at most 63 bytes long */
    char stmt[400], part[64];

    /* escape table and partition name */
    esc_tab = PQescapeIdentifier(conn, table, 63);
    snprintf(part, 63, "%s_%s", table, datestr);
    esc_part = PQescapeIdentifier(conn, part, 63);
    esc_date = PQescapeLiteral(conn, datestr, 10);

    /* construct statement */
    sprintf(stmt,
            "ALTER TABLE %s ATTACH PARTITION %s FOR VALUES IN (%s)",
            esc_tab, esc_part, esc_date);
    free(esc_tab);
    free(esc_part);
    free(esc_date);

    /* execute statement */
    if ((res = PQexec(conn, stmt)) == NULL)
    {
        fprintf(stderr, "Out of memory sending statement\n");
        return 0;
    }
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "Error attaching partition: %s\n",
                PQresultErrorMessage(res));
        return 0;
    }
    PQclear(res);

    return 1;
}

int main(int argc, char **argv)
{
    PGconn *conn;
    int sock;
    PGresult *res;

    /* connect to PostgreSQL with default parameters */
    if ((conn = PQconnectdb("application_name=listener")) == NULL)
    {
        fprintf(stderr, "Out of memory connecting to PostgreSQL\n");
        return 1;
    }
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "Error connecting to PostgreSQL: %s\n",
                PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* get network socket for the connection */
    if ((sock = PQsocket(conn)) < 0)
    {
        fprintf(stderr, "Error getting connection network socket\n");
        PQfinish(conn);
        return 1;
    }

    /* listen on a channel */
    if ((res = PQexec(conn, "LISTEN tab")) == NULL)
    {
        fprintf(stderr, "Out of memory sending LISTEN\n");
        PQfinish(conn);
        return 1;
    }
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "Error listening on channel: %s\n",
                PQresultErrorMessage(res));
        PQfinish(conn);
        return 1;
    }
    PQclear(res);

    while (1)
    {
        fd_set sockets;
        struct pgNotify *note;

        /* block waiting for activity on the network socket */
        FD_ZERO(&sockets);
        FD_SET(sock, &sockets);
        errno = 0;
        if (select(sock + 1, &sockets, NULL, NULL, NULL) < 0)
        {
            perror("Error waiting for notification");
            PQfinish(conn);
            return 1;
        }

        /* consume the input */
        if (PQconsumeInput(conn) == 0)
        {
            fprintf(stderr, "Error receiving data: %s\n",
                    PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* check for notifications */
        note = PQnotifies(conn);
        if (note != NULL)
        {
            if (!attach_partition(conn, note->relname, note->extra))
            {
                PQfinish(conn);
                return 1;
            }
            PQfreemem(note);
        }
    }
}
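On a Linux system, the listener can presumably be built with something like gcc -o listener listener.c -I$(pg_config --includedir) -L$(pg_config --libdir) -lpq (the exact invocation depends on your installation) and started with the usual PG* environment variables pointing at the target database.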
Performance and robustness
The sample code above should be free of race conditions. However, its performance will not be stellar. Let’s compare the speed of 100,000 inserts with and without the trigger:
test=# TRUNCATE tab;
TRUNCATE TABLE
test=# \timing
Timing is on.
test=# INSERT INTO tab (ts, data)
          SELECT clock_timestamp(), 'something'
          FROM generate_series(1, 100000);
INSERT 0 0
Time: 3354.275 ms (00:03.354)
test=# ALTER TABLE tab DISABLE TRIGGER part_trig;
ALTER TABLE
test=# TRUNCATE tab;
TRUNCATE TABLE
Time: 20.005 ms
test=# INSERT INTO tab (ts, data)
          SELECT clock_timestamp(), 'something'
          FROM generate_series(1, 100000);
INSERT 0 100000
Time: 120.869 ms
Even though the code runs correctly, the trigger adds an overhead of roughly 32 microseconds per row. Since there are no indexes on the table, the baseline insert is very cheap, which is why the overall processing time ends up about 28 times longer than without the trigger. Writing the trigger function in C would reduce the overhead, at the price of added complexity.
Conclusion
While it is fairly easy to work around the missing support for time-triggered partition creation, creating partitions on demand when rows are inserted is more difficult. We found a solution using triggers, LISTEN and NOTIFY, but it is complicated and its performance impact is severe. Better support from the PostgreSQL core would be welcome here.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!