Preamble
After you ANALYZE
a PostgreSQL table to collect value distribution statistics, you will find the gathered statistics for each column in the pg_stats
system view. This article will explain the meaning of the correlation
column and its impact on index scans.
Physical vs. logical ordering
Most common PostgreSQL data types have an ordering: they support the operators <
, <=
, =
, >=
and >
.
These data types can be used with the “standard” index type, the B-tree index.
The values in a column of this kind tell you how the table rows should be put in order. An index on this column will be sorted according to that ordering.
A PostgreSQL table is made up of one or more 8KB block files. The physical ordering refers to the arrangement of the rows as they are stored in the file.
You can examine the physical ordering of the rows by selecting the ctid
system column: it contains the block number and the item number inside the block, which describe the physical location of the table row.
The correlation for a column is a value between -1 and 1. It reveals how well logical and physical ordering match each other.
- The rows are stored in the table file in ascending column order if the correlation is 1; otherwise, they are stored in descending order.
- Less ideal matching is denoted by values between -1 and 1.
- There is no relationship between the physical and logical orders when the value is 0.
Why should I care?
On your tables, you’ll create indexes for quicker access.
The effectiveness of an index scan is impacted by a column’s correlation.
A full index or a portion of it is read in index sequential order during an index scan. The table’s corresponding row is retrieved for each entry that is located (this is skipped in an “index only scan,” but that is another story).
The rows that are fetched will come from all over the table if the correlation of the indexed column is close to zero. This will lead to a large number of randomly distributed reads of various table blocks.
The next row fetched during the index scan, however, tends to be in the same or the following table block if the correlation is close to 1 or -1.
High correlation offers two benefits:
- If many of the table rows fetched during the index scan are contained in the same table block, only a small number of blocks need to be read from storage because blocks read by the database are cached in shared memory.
- Due to the proximity of the blocks that need to be read from storage, sequential I/O—which is much faster than random I/O on spinning disks—is enabled.
An example
Let’s make two identical tables with different correlations, but with the same content:
CREATE TABLE corr (id, val) AS SELECT i, 'some text ' || i FROM generate_series(1, 100000) AS i; CREATE INDEX corr_idx ON corr (id); VACUUM (ANALYZE) corr; SELECT correlation FROM pg_stats WHERE tablename = 'corr' AND attname = 'id'; correlation ------------- 1 (1 row) CREATE TABLE uncorr AS SELECT * FROM corr ORDER BY random(); CREATE INDEX uncorr_idx ON uncorr (id); VACUUM (ANALYZE) uncorr; SELECT correlation FROM pg_stats WHERE tablename = 'uncorr' AND attname = 'id'; correlation ------------- -0.00522369 (1 row)
In order to compare index scans on the two tables, we disable bitmap index scans.
Then we assess how index scans function:
SET enable_bitmapscan = off; EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM corr WHERE id BETWEEN 1001 AND 1300; QUERY PLAN --------------------------------------------------- Index Scan using corr_idx on corr (cost=0.29..15.23 rows=297 width=19) (actual time=0.108..0.732 rows=300 loops=1) Index Cond: ((id >= 1001) AND (id <= 1300)) Buffers: shared hit=6 Planning time: 0.456 ms Execution time: 1.049 ms (5 rows) EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM uncorr WHERE id BETWEEN 1001 AND 1300; QUERY PLAN --------------------------------------------------- Index Scan using uncorr_idx on uncorr (cost=0.29..978.15 rows=298 width=19) (actual time=0.105..2.352 rows=300 loops=1) Index Cond: ((id >= 1001) AND (id <= 1300)) Buffers: shared hit=303 Planning time: 0.548 ms Execution time: 2.736 ms (5 rows)
Since all blocks were already in shared buffers, 2.7 milliseconds is actually not that bad.
The 303 blocks from the second query will perform much worse than the 6 from the first if some of these blocks need to be read from disk!
Each result row for the second query was located in a different table block. This resulted in the touching of 300 blocks. Index blocks make up the final three blocks.
Only three table blocks are touched by the first query:
SELECT ctid, id FROM corr WHERE id BETWEEN 1001 AND 1300; ctid | id ---------+------ (6,58) | 1001 (6,59) | 1002 (6,60) | 1003 (6,61) | 1004 (6,62) | 1005 (6,63) | 1006 (6,64) | 1007 ... (8,37) | 1294 (8,38) | 1295 (8,39) | 1296 (8,40) | 1297 (8,41) | 1298 (8,42) | 1299 (8,43) | 1300 (300 rows)
In fact, blocks 6, 7 and 8 of the table contain every row!
Correlation and the optimizer
The PostgreSQL optimizer figures out how much it will cost to run an SQL statement in each way.
Better cost estimates for an index scan can be obtained by using the correlation, which will help you make better plan decisions.
If the correlation is close to or equal to 1, the PostgreSQL optimizer will favor index scans.
Correlation and BRIN indexes
The BRIN index (block range index) was first made available in PostgreSQL 9.5.
This index functions by keeping track of the minimum and maximum values across all ranges of table block values. Only columns with perfect correlation can benefit from it. Its size advantage over the B-tree index makes it an intriguing alternative for large tables.
How to make use of correlation?
Keep the table in index order if you need to efficiently scan larger sections of an index.
In PostgreSQL, there are no “index ordered tables.”
Still, there are two ways to keep a column’s correlation high:
- Automatically:Good examples for that are primary key columns generated by sequences or measurements with a timestamp. If the table rows are inserted in logical column order and there are no updates or deletes on the table, the physical ordering will be identical to the logical ordering.
A BRIN index might be an interesting choice because correlation is always perfect in this scenario.Table partitioning can be used to delete outdated data from a table without affecting the physical ordering.
- Clustering: The SQL statement
CLUSTER
can be used to rewrite a table so that the physical ordering is identical to the logical ordering of an index.However, subsequent modifications of the table will reduce the correlation again. Because of that, you need to re-cluster the table regularly to maintain high correlation. This is annoying, becauseCLUSTER
blocks all concurrent access to the table.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
RELATED POSTS
Enteros and Cloud FinOps: Driving Database Performance and Observability in the Financial Sector
- 19 February 2025
- Database Performance Management
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Optimizing Cost Estimation in the Tech Sector: How Enteros Leverages Logical Models and Cloud FinOps for Smarter Database Management
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Enteros and Cloud FinOps: Transforming Database Performance and Observability in the Real Estate Sector
- 18 February 2025
- Database Performance Management
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…
Optimizing Database Performance for EdTech Firms: How Enteros Enhances RevOps with Advanced Database Performance Software
In the fast-evolving world of finance, where banking and insurance sectors rely on massive data streams for real-time decisions, efficient anomaly man…