Preamble
Recently there have been a few extremely detailed and highly technical posts on Planet PostgreSQL about “bloat”. Bloat is an unavoidable side effect of Postgres’s excellent multi-version concurrency control (MVCC) model: when data is changed while other users may still be reading it, Postgres creates new copies of the affected rows instead of overwriting them in place. As a result, your tables (and indexes) will eventually take up more disk space than they strictly need, especially on busy multi-user systems where autovacuum occasionally cannot keep up with the rate of data changes.
But OK, let’s skip the more in-depth details of bloat and bloat fighting this time and focus on how to quickly determine whether one of your tables actually has a “bloat problem” that warrants countermeasures such as VACUUM FULL, CLUSTER, or pg_repack.
Estimation approaches
There are two common approaches here: use the Postgres extension “pgstattuple” (part of the official contrib modules), which since Postgres 9.5 also provides a function named pgstattuple_approx(), or rely on a community-developed pure SQL query that requires no setup and also works on older versions. You can find such scripts on the PostgreSQL Wiki bloat page; a few variants exist that differ slightly. Both methods are very cheap estimations that make educated guesses based on statistics; a simplified sketch of the idea behind the SQL approach follows below. The “pgstattuple” extension also has a pgstattuple() function, which we will include in our comparison because it reports exact bloat figures, but it can be expensive since it performs a full table scan.
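To give a feel for what the statistics-based SQL scripts do under the hood, here is a deliberately simplified sketch, not the actual Wiki query (which additionally accounts for tuple headers, alignment padding, NULL bitmaps, fillfactor and more). It compares a table’s physical size with a rough expectation derived from planner statistics; “your_table” is a placeholder, and the 24-byte tuple header is an approximation.

-- Simplified illustration only: physical size vs. a rough expectation
-- built from planner statistics (live tuple count * average row width).
SELECT c.relname,
       pg_relation_size(c.oid) AS real_size_bytes,
       (c.reltuples * (24 + s.avg_row_bytes))::bigint AS rough_expected_bytes
FROM pg_class c
JOIN (SELECT sum(avg_width) AS avg_row_bytes
      FROM pg_stats
      WHERE schemaname = 'public' AND tablename = 'your_table') s ON true
WHERE c.relname = 'your_table';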
So let’s move on and try to use our different estimation methods to show how bloat happens and how it can be found.
Generating bloat artificially
The first important question is: how can we generate bloat on purpose for our tests? As it turns out, it is not entirely trivial.
We’ll use the handy command-line performance testing tool “pgbench”, which ships with a standard Postgres installation, to generate test data (UPDATE statements). The tricky part is making sure that the old versions of the updated rows are not cleaned up and marked for re-use during the test; otherwise we won’t get the bloat we are after. The simplest way to keep those old row versions pinned is to open a parallel transaction in REPEATABLE READ mode and run a data-reading query in that session: every row visible to that “repeatable” snapshot must be kept around until the transaction finishes. One could also turn off autovacuum, either globally in postgresql.conf or per table with “ALTER TABLE ... SET (autovacuum_enabled = false)” (a small sketch follows below), but only the open-snapshot approach also prevents possible HOT pruning (on-the-fly, page-level mini-vacuuming) for updates of columns that are not indexed.
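For completeness, this is roughly what the per-table autovacuum switch mentioned above could look like. It is optional here, since the parallel REPEATABLE READ session already keeps the old row versions around, and on its own it would not stop HOT pruning.

-- Optional alternative: disable autovacuum for the test table only
-- (remember to re-enable it after the experiment).
ALTER TABLE pgbench_accounts SET (autovacuum_enabled = false);
-- ... run the bloat-generating workload ...
ALTER TABLE pgbench_accounts RESET (autovacuum_enabled);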
First, we’ll set up the test schema with a scaling factor of 10, meaning 1 million bank accounts will be initialized in our chosen test table, “pgbench_accounts” (pgbench creates 100,000 accounts per unit of scale). Then we’ll run 1 million account updates, which should roughly double the table’s size and leave it with a bloat ratio of about 50%.
Running the test
krl@r840:~$ pgbench -i -s10 pgbench
creating tables...
100000 of 1000000 tuples (10%) done (elapsed 0.04 s, remaining 0.32 s)
...
1000000 of 1000000 tuples (100%) done (elapsed 0.44 s, remaining 0.00 s)
set primary keys...
done.

pgbench=# \dt+
                      List of relations
 Schema |       Name       | Type  | Owner |  Size   | Description
--------+------------------+-------+-------+---------+-------------
 public | pgbench_accounts | table | krl   | 128 MB  |
 public | pgbench_branches | table | krl   | 40 kB   |
 public | pgbench_history  | table | krl   | 0 bytes |
 public | pgbench_tellers  | table | krl   | 40 kB   |
(4 rows)
Let’s run our “vacuum stopper” query in a separate session now, then begin the data update procedure.
krl@pgbench=# begin transaction isolation level repeatable read;
BEGIN
krl@pgbench=#* select * from pgbench_accounts limit 1;
 aid │ bid │ abalance │ filler
─────┼─────┼──────────┼────────────────────────────────────────────────────────────────────────────────────
   3 │   1 │        0 │
Let’s now begin our bloat-generation test. To get the best possible performance, I’m using the “prepared statements” query protocol (--protocol=prepared). Take note that the --transactions flag is per client.
krl@r840:~$ pgbench --no-vacuum --client=2 --jobs=2 --transactions=500000 --protocol=prepared pgbench
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: prepared
number of clients: 2
number of threads: 2
number of transactions per client: 500000
number of transactions actually processed: 1000000/1000000
latency average: 0.000 ms
tps = 1903.859294 (including connections establishing)
tps = 1904.357546 (excluding connections establishing)

krl@pgbench=# \dt+
                      List of relations
 Schema │       Name       │ Type  │ Owner │  Size  │ Description
────────┼──────────────────┼───────┼───────┼────────┼─────────────
 public │ pgbench_accounts │ table │ krl   │ 256 MB │
 public │ pgbench_branches │ table │ krl   │ 35 MB  │
 public │ pgbench_history  │ table │ krl   │ 50 MB  │
 public │ pgbench_tellers  │ table │ krl   │ 42 MB  │
(4 rows)
Completed…and it appears that the size of our test table, “pgbench_accounts,” has indeed doubled.
Next, let’s install the “pgstattuple” extension so we can try to obtain some estimates.
We also need to end our parallel session and run VACUUM ANALYZE, so that the internal statistics used for the guesswork are brought up to date and the old row versions are marked as reusable free space.
krl@pgbench=#* rollback;
ROLLBACK
krl@pgbench=# VACUUM ANALYZE pgbench_accounts;
VACUUM
krl@pgbench=# CREATE EXTENSION pgstattuple;
CREATE EXTENSION
Let’s now examine the estimate figures. In order to cut down on noise, I’ve only chosen the most important columns here and also added our table name to the WHERE clause of the community query.
Note: in the case of pgstattuple_approx(), a scanned_percent value of 0 confirms that we are indeed looking at an estimated value; under unfavorable conditions the function can still fall back to scanning the table.
krl@pgbench=# select table_len, scanned_percent, approx_free_space, approx_free_percent from pgstattuple_approx('pgbench_accounts');
 table_len │ scanned_percent │ approx_free_space │ approx_free_percent
───────────┼─────────────────┼───────────────────┼─────────────────────
 268607488 │               0 │         131164800 │    48.8314011558754
(1 row)

krl@pgbench=# \i /temp/bloat_investigation/table_bloat_ioguix.sql
 real_size │ extra_size │   extra_ratio
───────────┼────────────┼──────────────────
 268607488 │  136126464 │ 50.6785812315106
(1 row)

pgbench=# select table_len, free_space, free_percent from pgstattuple('pgbench_accounts');
 table_len │ free_space │ free_percent
───────────┼────────────┼──────────────
 268607488 │  131689396 │        49.03
(1 row)
Making sense of the results
Now we have both the estimates and the exact number. The estimates from pgstattuple_approx() (approx_free_space = 131164800) and from the pure SQL query (extra_size = 136126464) are both very close to the exact free_space value of 131689396 reported by pgstattuple(), off by roughly 0.5% and 3% respectively.
To wrap up: for simple tables consisting mostly of fixed-size data type columns, the estimation methods work surprisingly well and can certainly be used as the basis of a bloat removal check or alert, along the lines of the sketch below.
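As an illustration, here is one possible shape such a check could take. This is only a sketch, not a battle-tested monitoring query: it assumes the pgstattuple extension is installed, and the 30% threshold and the restriction to the public schema are arbitrary example values to tune for your own workload (some free space is perfectly normal, e.g. due to fillfactor).

-- Hedged sketch of a bloat alert: list ordinary tables in the public schema
-- whose estimated free-space percentage exceeds an example threshold of 30%.
SELECT c.oid::regclass AS table_name,
       a.table_len,
       a.approx_free_percent
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
CROSS JOIN LATERAL pgstattuple_approx(c.oid) a
WHERE c.relkind = 'r'
  AND n.nspname = 'public'
  AND a.approx_free_percent > 30
ORDER BY a.approx_free_percent DESC;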
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.