Preamble
When looking into PostgreSQL performance, it’s important to know how to spot problems and figure out what the server is really doing. So this post is all about finding slow queries in PostgreSQL and identifying performance weak spots.
Performance issues can be approached in a variety of ways. Three techniques, though, have emerged as particularly effective for quickly evaluating issues. Here are my top three tips for dealing with subpar performance:
- Use the slow query log.
- Check execution plans with auto_explain.
- Rely on aggregate information in pg_stat_statements.
Each method has its own pros and cons, which will be discussed in this post.
Making use of the PostgreSQL slow query log – method 1
Utilizing PostgreSQL’s slow query log is a more conventional strategy for combating slow queries. A line will be added to the log if a query takes longer than a predetermined amount of time. This makes it easy to find queries that are taking too long, so developers and administrators can act quickly and know where to look.
Turning the slow query log on
The slow query log is not active in the default configuration, so it has to be turned on first. There are two ways to do that. The first is to modify postgresql.conf and enable the slow query log globally:
log_min_duration_statement = 5000
If log_min_duration_statement is set to 5000 in postgresql.conf, PostgreSQL will send slow queries—those that take more than 5 seconds—to the logfile. No server restart is required if this line in postgresql.conf is changed. A “reload” will be enough.
postgres=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
You can either directly call the SQL function shown above or use an init script to do this.
Changes to postgresql.conf apply to the entire instance, which might be too broad. Often you need to be more precise, so it can make sense to apply the change only to a specific user or database.
postgres=# ALTER DATABASE test SET log_min_duration_statement = 5000;
ALTER DATABASE
The ALTER DATABASE command allows you to change a single database’s configuration parameter.
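If the setting should apply to just one user instead, the same mechanism works at the role level. A minimal sketch, using a hypothetical role name:

-- "analytics" is a hypothetical role name; adjust it to your environment
ALTER ROLE analytics SET log_min_duration_statement = 5000;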
Let’s reconnect and run a slow query:
postgres=# \c test
You are now connected to database "test" as user "hs".
test=# SELECT pg_sleep(10);
 pg_sleep
----------

(1 row)
In my example, pg_sleep simply tells PostgreSQL to wait for 10 seconds. When we look at the log file, the desired entry is already there:
2018-08-20 08:19:28.151 CEST [22845] LOG: duration: 10010.353 ms statement: SELECT pg_sleep(10);
Now you can look at the statement and see why it is slow. This can be done by running “explain analyze,” which will execute the statement and provide you with an execution plan.
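A minimal sketch of that step, reusing the statement from the log entry above:

-- EXPLAIN ANALYZE actually executes the statement and reports real timings,
-- so you can see where the time is spent
EXPLAIN ANALYZE SELECT pg_sleep(10);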
The slow query log’s benefits and drawbacks
The slow query log lets you inspect a slow query immediately: whenever a single query exceeds the desired threshold, you can react right away. However, this strategy’s main strength is also its biggest weakness: the slow query log tracks single queries. What if the poor performance is caused by a huge number of relatively fast queries? We can all agree that a query taking 10 seconds is expensive. But what happens if we run 1 million queries that take 500 milliseconds each? None of those queries will ever show up in the slow query log, because they are still considered “fast”. What you will find instead are things like backups, CREATE INDEX, and bulk loads. If you rely solely on the slow query log, you might never locate the real root cause: the slow query log was built to identify individual slow statements, not aggregate load.
Checking unstable execution plans – method 2 to fix slow queries in PostgreSQL
Our next technique also targets individual queries. Your database may function perfectly most of the time, but every now and then a query misbehaves. The goal is to find and fix exactly those queries, and the auto_explain module is one way to do that.
The concept is the same as what the slow query log does: Add entries to the log whenever something is slow. In the case of auto_explain, the entire execution plan—not just the query—can be found in the log file. Why is it important? Think about this illustration:
test=# CREATE TABLE t_demo AS
           SELECT * FROM generate_series(1, 10000000) AS id;
SELECT 10000000
test=# CREATE INDEX idx_id ON t_demo (id);
CREATE INDEX
test=# ANALYZE;
ANALYZE
I have just created a table with 10 million rows and defined an index on it. Take a look at these two nearly identical queries:
test=# explain SELECT * FROM t_demo WHERE id < 10;
                                 QUERY PLAN
---------------------------------------------------------------------------
 Index Only Scan using idx_id on t_demo  (cost=0.43..8.61 rows=10 width=4)
   Index Cond: (id < 10)
(2 rows)

test=# explain SELECT * FROM t_demo WHERE id < 1000000000;
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on t_demo  (cost=0.00..169248.60 rows=10000048 width=4)
   Filter: (id < 1000000000)
 JIT:
   Functions: 2
   Inlining: false
   Optimization: false
(6 rows)
Although the queries are essentially identical, PostgreSQL will use entirely different execution plans. The first query only returns a handful of rows, so an index scan is used. The second query fetches all the data, so a sequential scan is preferred. Even though the queries look similar, the runtime will be completely different: the first query executes in a millisecond or so, while the second one might very well take up to half a second or even a second (depending on hardware, load, caching, and so on). The trouble is that a million executions may be fast because their parameters happen to be suitable, while in rare cases someone asks for something that leads to a bad plan or simply returns a lot of data. What can be done? Use auto_explain.
The best time to use auto_explain is when you find a query that takes too long (for any reason).
Here’s the concept: PostgreSQL has the option to send the plan to the logfile for later review if a query exceeds a predetermined threshold.
Here’s an illustration:
test=# LOAD 'auto_explain';
LOAD
test=# SET auto_explain.log_analyze TO on;
SET
test=# SET auto_explain.log_min_duration TO 500;
SET
The LOAD command loads the auto_explain module into the current database connection, which is convenient for a demo. In a production system, you would use postgresql.conf, ALTER DATABASE, or ALTER ROLE to load the module. If you want to make the change in postgresql.conf, add the following line to the config file:
session_preload_libraries = 'auto_explain';
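In practice, the auto_explain thresholds are usually configured in postgresql.conf alongside that line. A minimal sketch, reusing the 500 ms threshold from the session above:

# load auto_explain into every new connection
session_preload_libraries = 'auto_explain'
# log the complete plan of every statement running longer than 500 ms
auto_explain.log_min_duration = '500ms'
# include actual runtimes (like EXPLAIN ANALYZE); this adds some overhead
auto_explain.log_analyze = on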
session_preload_libraries ensures that the module is loaded into every new database connection, so the LOAD command is no longer needed. Once the configuration change is in place (don’t forget to call pg_reload_conf()), you can try the following query:
test=# SELECT count(*) FROM t_demo GROUP BY id % 2;
  count
---------
 5000000
 5000000
(2 rows)
The query will take longer than 500 milliseconds, and the expected results will appear in the log file:
2018-08-20 09:51:59.056 CEST [23256] LOG:  duration: 4280.595 ms  plan:
        Query Text: SELECT count(*) FROM t_demo GROUP BY id % 2;
        GroupAggregate  (cost=1605370.36..1805371.32 rows=10000048 width=12) (actual time=3667.207..4280.582 rows=2 loops=1)
          Group Key: ((id % 2))
          ->  Sort  (cost=1605370.36..1630370.48 rows=10000048 width=4) (actual time=3057.351..3866.446 rows=10000000 loops=1)
                Sort Key: ((id % 2))
                Sort Method: external merge  Disk: 137000kB
                ->  Seq Scan on t_demo  (cost=0.00..169248.60 rows=10000048 width=4) (actual time=65.470..876.695 rows=10000000 loops=1)
As you can see, a complete plan with actual runtimes (essentially the output of EXPLAIN ANALYZE) ends up in the logfile.
Advantages and disadvantages of checking unstable execution plans
The advantage of this approach is that you can take a deep look at certain slow queries and see when a query decides on a bad plan. However, it is still hard to gather overall information because there might be millions of queries running on your system.
Checking pg_stat_statements – the 3rd way to fix slow queries in PostgreSQL
The third technique is to use pg_stat_statements. The idea behind pg_stat_statements is to group queries that are identical except for their parameters and to expose aggregated runtime statistics in a system view.
In my opinion, pg_stat_statements is really the Swiss army knife among these tools: it helps you understand what is actually going on in your system.
Add the following line to postgresql.conf and restart your server to enable pg_stat_statements:
shared_preload_libraries = 'pg_stat_statements'
In your database, execute “CREATE EXTENSION pg_stat_statements.”
It will provide the following view:
test=# CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION
test=# \d pg_stat_statements
                    View "public.pg_stat_statements"
       Column        |       Type       | Collation | Nullable | Default
---------------------+------------------+-----------+----------+---------
 userid              | oid              |           |          |
 dbid                | oid              |           |          |
 toplevel            | bool             |           |          |
 queryid             | bigint           |           |          |
 query               | text             |           |          |
 plans               | bigint           |           |          |
 total_plan_time     | double precision |           |          |
 min_plan_time       | double precision |           |          |
 max_plan_time       | double precision |           |          |
 mean_plan_time      | double precision |           |          |
 stddev_plan_time    | double precision |           |          |
 calls               | bigint           |           |          |
 total_exec_time     | double precision |           |          |
 min_exec_time       | double precision |           |          |
 max_exec_time       | double precision |           |          |
 mean_exec_time      | double precision |           |          |
 stddev_exec_time    | double precision |           |          |
 rows                | bigint           |           |          |
 shared_blks_hit     | bigint           |           |          |
 shared_blks_read    | bigint           |           |          |
 shared_blks_dirtied | bigint           |           |          |
 shared_blks_written | bigint           |           |          |
 local_blks_hit      | bigint           |           |          |
 local_blks_read     | bigint           |           |          |
 local_blks_dirtied  | bigint           |           |          |
 local_blks_written  | bigint           |           |          |
 temp_blks_read      | bigint           |           |          |
 temp_blks_written   | bigint           |           |          |
 blk_read_time       | double precision |           |          |
 blk_write_time      | double precision |           |          |
 wal_records         | bigint           |           |          |
 wal_fpi             | bigint           |           |          |
 wal_bytes           | numeric          |           |          |
The view tells you which types of queries have been executed, how often they were called, how much execution time they consumed in total, and how much that runtime varies.
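To spot the biggest time consumers, you can simply sort the view. A minimal sketch, assuming the column names shown above (PostgreSQL 13 or later):

-- top 10 query types by total execution time
SELECT substring(query, 1, 60) AS short_query,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;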
Advantages and disadvantages of pg_stat_statements
The good thing about this module is that it lets you find huge numbers of fast queries that, taken together, might be causing a heavy load. Additionally, pg_stat_statements provides information about the I/O behavior of each type of query. The downside is that it can be hard to track down individual slow executions: because the data is aggregated, a query that is usually fast but occasionally slow tends to disappear into the averages.
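The same view can also shed light on I/O behavior. A minimal sketch, assuming track_io_timing is enabled so that blk_read_time is populated:

-- query types reading the most blocks from outside shared buffers
SELECT substring(query, 1, 60) AS short_query,
       calls,
       shared_blks_read,
       round(blk_read_time::numeric, 2) AS read_time_ms
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;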
Conclusion
If you know when to use each technique, PostgreSQL makes it easy to find slow queries and bottlenecks. This article is meant to give you a quick look at the options you have and the steps you can take to find problems with performance.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.