Preamble
I’ve already covered a few specifics of zheap, a storage engine for PostgreSQL. The goal of zheap is to keep table bloat under control so that UPDATE-heavy workloads don’t make tables grow excessively. When a row is updated, standard PostgreSQL keeps both the old and the new version of the row in the same data file. While not harmful in itself, in some circumstances this can result in table bloat. zheap instead keeps old row versions in an undo log, an idea similar to Oracle’s approach. The problem was that, until now, this zheap undo was never cleaned up. For that reason we created a “discard worker” to take care of the cleanup: it removes undo logs that are no longer needed and thereby minimizes UPDATE-related table bloat.
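To make the bloat problem concrete, here is a minimal sketch using stock PostgreSQL with the regular heap (no zheap involved). The table name is purely illustrative and the exact sizes depend on your system, but the second size check should come out at roughly twice the first, because the old row versions stay in the data file until VACUUM reclaims them:

test=# CREATE TABLE t_bloat (id int, payload text);
CREATE TABLE
test=# INSERT INTO t_bloat SELECT i, 'x' FROM generate_series(1, 1000000) AS i;
INSERT 0 1000000
test=# SELECT pg_size_pretty(pg_relation_size('t_bloat'));
test=# UPDATE t_bloat SET payload = 'y';  -- old row versions remain in the heap
UPDATE 1000000
test=# SELECT pg_size_pretty(pg_relation_size('t_bloat'));  -- roughly twice the size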
Getting PostgreSQL and zheap
Before we look at what the discard worker does, you need a PostgreSQL build with zheap support. To make things easier for everyone, we have prepared a container that lets you test zheap right away.
zheap was originally intended to be an extension. Unlike other extensions, however, it cannot simply be installed on top of PostgreSQL because it is too intrusive; zheap needs core patches.
Additionally, I want to emphasize that this is a tech preview rather than a production-ready version. We’re delighted to make it available to the public in order to solicit comments and perhaps even some bug reports.
The Docker container is available free of charge, and the installation process is easy and straightforward:
[hs@fedora ~]$ docker run --name zHeap -p 5432:5432 \
     -d cybertecpostgresql/zheap
Connect to the newly created instance with psql as usual. The default username and password are both postgres:
Username: postgres
Password: postgres
[hs@fedora ~]$ psql -h localhost -p 5432 -U postgres

UNDO: Cleaning up using the discard worker
Once PostgreSQL with zheap support is operational, we can look at the process table as follows:
[hs@fedora ~]$ ps axf | grep post
...
2403084 ?  Ss  0:00  \_ /home/hs/pgzheap/bin/postgres -D ./dbzheap
2403086 ?  Ss  0:00      \_ postgres: checkpointer
2403087 ?  Ss  0:00      \_ postgres: background writer
2403088 ?  Ss  0:00      \_ postgres: walwriter
2403089 ?  Ss  0:00      \_ postgres: autovacuum launcher
2403090 ?  Ss  0:00      \_ postgres: stats collector
2403091 ?  Ss  0:00      \_ postgres: undo discard worker
2403093 ?  Ss  0:00      \_ postgres: logical replication launcher
...
The discard worker is launched automatically. Let’s look at what it does and at how the undo log is stored.
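The worker is visible in the operating system’s process list above. Assuming the patched build registers it like any other background worker, it should also show up in pg_stat_activity; the pattern used below is only a guess based on the process title and may need adjusting:

test=# SELECT pid, backend_type, backend_start
         FROM pg_stat_activity
        WHERE backend_type ILIKE '%undo%';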
Before we can create a table and populate it with data, a configuration variable needs to be changed:
test=# SHOW default_table_access_method;
 default_table_access_method
-----------------------------
 heap
(1 row)

test=# SET default_table_access_method TO zheap;
SET
This setting tells PostgreSQL which storage engine to use by default when a new table is created. Normally heap is what we want; in this case, however, the preferred option is zheap. Alternatively, we can add the zheap option to each CREATE TABLE statement (an example follows a bit further below). In many circumstances, though, it is easier to set it for the duration of the session, as I just did:
test=# CREATE TABLE t_large (id serial, name text);
CREATE TABLE
It doesn’t really matter how the table is set up. Any layout will do.
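As a side note, here is a short sketch of the per-table alternative mentioned above: the access method can be chosen with the standard USING clause, and the system catalogs reveal which access method a table ended up with. The t_demo table is purely illustrative:

test=# CREATE TABLE t_demo (id serial, name text) USING zheap;
CREATE TABLE
test=# SELECT relname, amname
         FROM pg_class c
         JOIN pg_am a ON a.oid = c.relam
        WHERE relname IN ('t_large', 't_demo');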
Testing zheap UNDO with INSERT load
Let’s begin loading data:
test=# BEGIN;
BEGIN
test=*# INSERT INTO t_large (name)
        SELECT 'dummy' FROM generate_series(1, 10000000);
INSERT 0 10000000
test=*# \d+
                                   List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | hs    | permanent   | zheap         | 193 MB     |
 public | t_large_id_seq | sequence | hs    | permanent   |               | 8192 bytes |
(2 rows)
The ability of zheap to roll back in the event of an error is crucial, which means a great deal of undo has to be written. Before we commit the transaction, let’s take a look at the undo directories:
[hs@fedora dbzheap]$ ls -l ./pg_undo/ && ls -l ./base/undo/
total 8
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001A84012A0
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001C9F48970
total 432128
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065A00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065B00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065C00000
...
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FD00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FE00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FF00000
As you can see, we’ve created a fairly large number of files in our base/undo directory. Let’s commit and see what happens:
test=*# COMMIT;
COMMIT
Once the new discard worker kicks in, PostgreSQL will recycle (= delete) these logs. Let’s take a closer look:
[hs@fedora dbzheap]$ ls -l ./pg_undo/ && ls -l ./base/undo/
total 8
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001A84012A0
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001C9F48970
total 2048
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FF00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0080000000
As expected, the files are gone.
Testing zheap UNDO with UPDATE load
The same will happen when we run UPDATE statements:
test=# BEGIN;
BEGIN
test=*# UPDATE t_large SET id = id - 1;
UPDATE 10000000
What matters here is that a lot of undo has been produced. As you can see, the undo directory is about 600 MB in size:
[hs@fedora dbzheap]$ cd base/undo/
[hs@fedora undo]$ du -h
603M	.
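As a rough sanity check (my own arithmetic, not part of the original measurement): about 600 MB of undo for 10 million updated rows works out to roughly 60 bytes of undo per row, which is plausible for storing the old versions of such small rows:

test=# SELECT round(603 * 1024 * 1024 / 10000000.0) AS undo_bytes_per_row;
 undo_bytes_per_row
--------------------
                 63
(1 row)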
Now, let’s commit the transaction:
test=*# COMMIT;
COMMIT
What follows the commit statement is particularly intriguing:
[hs@fedora undo]$ date && du -h
Fr 12. Nov 11:55:50 CET 2021
603M	.
[hs@fedora undo]$ date && du -h
Fr 12. Nov 11:55:57 CET 2021
2,0M	.
Within seconds, the zheap UNDO logs are gone. Note that the cleanup is NOT part of COMMIT; it is carried out by the discard worker, whose job is to make sure the log is not discarded too early. Keep in mind: yours might not be the only transaction at work, so the cleanup has to be asynchronous (just as VACUUM is).
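Since old undo may still be needed by concurrent sessions, a long-running transaction can delay the discard, much as it delays VACUUM. A plain pg_stat_activity query (standard PostgreSQL, nothing zheap-specific) helps spot such transactions:

test=# SELECT pid, state, xact_start, now() - xact_start AS xact_age
         FROM pg_stat_activity
        WHERE xact_start IS NOT NULL
        ORDER BY xact_start
        LIMIT 5;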
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.