Preamble
The idea of checkpoints is fundamental to PostgreSQL. However, many people don’t know what they actually are, nor do they understand how to tune checkpoints to reach maximum efficiency. This article will discuss checkpoints and checkpoint tuning in an effort to clarify these crucial database internals.
How PostgreSQL writes data
Before delving deeper into checkpoints, it’s critical to comprehend how PostgreSQL writes data.
The most important point is that we must assume a crash could happen at any time. Why does that matter? Because the database has to stay uncorrupted no matter what. As a consequence, PostgreSQL cannot simply write data directly to the data files. Why not? Suppose we wanted to write the value “1234” to a data file. What if the machine crashed after writing only “12”? The result would be a broken row somewhere in the table. Missing index entries, half-written rows, or any other kind of corruption must be avoided at all costs.
Consequently, a more sophisticated way of writing data is needed. How does it work? PostgreSQL sends data to the WAL (Write Ahead Log) first, as soon as possible. Think of the WAL as a paranoid, strictly sequential log of binary changes. If a row is added, the WAL might contain a record saying that a row in a data file has to be changed, a couple of instructions to adjust index entries, perhaps an instruction to write an additional block, and so on. It is simply a stream of changes.
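You can watch this happening yourself by comparing the WAL position before and after a statement. The following is only a small sketch: the table t_demo is hypothetical, and pg_current_wal_lsn() / pg_wal_lsn_diff() are available in PostgreSQL 10 and later:

test=# SELECT pg_current_wal_lsn() AS before_lsn \gset
test=# INSERT INTO t_demo VALUES (1234);
test=# -- how many bytes of WAL did the insert generate?
test=# SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn'));

Even a tiny insert advances the WAL position, because every change is logged before it touches anything else.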
Once the data has been sent to the WAL, PostgreSQL modifies the cached version of the block in shared buffers. Note that the data is still not in the data files: at this point we have WAL entries plus changes in shared buffers. If a read request comes in, it will not hit the data files either, because the data is found in the cache.
Eventually, those modified in-memory pages are written to disk by the background writer. The important point here is that data may be written out of order, and that is perfectly fine. Remember: if a user wants to read data, PostgreSQL checks shared buffers before asking the operating system for the block, so the order in which dirty blocks are flushed does not matter. It can even be beneficial to write data a little later, to increase the chance of flushing many changes to disk in a single I/O request.
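If you want to see how many shared buffers are currently dirty (modified in memory but not yet written to the data files), the pg_buffercache extension shipped with the contrib packages can help. A minimal sketch, assuming contrib is installed:

test=# CREATE EXTENSION IF NOT EXISTS pg_buffercache;
test=# -- count blocks in shared buffers that have not been flushed yet
test=# SELECT count(*) AS dirty_buffers FROM pg_buffercache WHERE isdirty;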
Getting rid of WAL
However, we cannot keep writing to the WAL forever. At some point the space has to be recycled.
Checkpoints exist to make sure that all dirty buffers created up to a certain point have been written to disk, so that the WAL up to that point can be recycled. PostgreSQL does this in the checkpointer process, which writes the outstanding changes to disk. However, this process deliberately does not write data as fast as possible: the I/O curve should be flattened so that response times stay consistent.
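Both the checkpointer and the background writer are visible as normal backend processes (PostgreSQL 10 and later), and a superuser can also force an immediate checkpoint by hand, for example before a planned shutdown. A small sketch:

test=# SELECT pid, backend_type FROM pg_stat_activity
       WHERE backend_type IN ('checkpointer', 'background writer');
test=# CHECKPOINT;   -- force an immediate checkpoint (requires superuser)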
The parameter to manage checkpoint spread is…
test=# SHOW checkpoint_completion_target;
 checkpoint_completion_target
-------------------------------
 0.5
(1 row)
The idea is that the checkpoint’s writes are spread out so that they finish when half of the time until the next expected checkpoint has passed. In practice, a value between 0.7 and 0.9 seems to be the best choice for most workloads, but feel free to experiment a little.
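One possible way to change the value without editing postgresql.conf by hand is ALTER SYSTEM followed by a reload; the 0.9 below is simply the upper end of the range recommended above:

test=# ALTER SYSTEM SET checkpoint_completion_target = 0.9;
test=# SELECT pg_reload_conf();   -- the parameter only needs a reload, not a restart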
NOTE: In PostgreSQL 14 the default of this parameter has been raised to 0.9, so most end users will no longer need to touch it.
The next important question is: when does a checkpoint actually start? There are a couple of parameters controlling this behavior:
test=# SHOW checkpoint_timeout;
 checkpoint_timeout
--------------------
 5min
(1 row)
test=# SHOW max_wal_size;
 max_wal_size
--------------
 1GB
(1 row)
If the load on your system is low, checkpoints happen after a fixed amount of time; the default is 5 minutes. We recommend raising this value to improve write performance.
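On a write-heavy machine one might, for example, raise both values along these lines. The numbers are purely illustrative, and both parameters can be changed with a plain reload:

test=# ALTER SYSTEM SET checkpoint_timeout = '30min';
test=# ALTER SYSTEM SET max_wal_size = '16GB';
test=# SELECT pg_reload_conf();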
PLEASE NOTE: Feel free to change these values. They affect performance, but they never put your data at risk.
max_wal_size is a bit trickier, because it is a soft limit rather than a hard one, so be prepared: the WAL can grow beyond this size. The purpose of the parameter is to tell PostgreSQL how much WAL is allowed to accumulate, so that checkpoints can be adjusted accordingly. The general rule is: raising this value leads to higher space consumption, but improves write performance at the same time.
So why not simply set max_wal_size to infinity? The first reason is obvious: you would need a lot of space. But there is more: if your database crashes, PostgreSQL has to replay all changes made since the last checkpoint. In other words, with absurd amounts of WAL accumulated since the last checkpoint, your database may take much longer to recover after a crash. And while performance does improve as checkpoint distances grow, the benefit is limited; at some point, throwing more storage at the problem no longer helps.
The background writer does write some dirty blocks to disk, but in many cases the checkpoint process itself does a lot more of the work. It therefore makes sense to focus on checkpointing rather than on tuning the background writer.
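Whether most writes really come from the checkpointer, the background writer, or the backends themselves can be checked in pg_stat_bgwriter (on the PostgreSQL versions discussed here; newer releases split these counters into separate views):

test=# SELECT checkpoints_timed, checkpoints_req,
              buffers_checkpoint, buffers_clean, buffers_backend
       FROM pg_stat_bgwriter;

A high checkpoints_req value relative to checkpoints_timed is a hint that max_wal_size is too small for the workload.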
min_wal_size: The mysterious parameter
People frequently ask (in training sessions or as PostgreSQL 24×7 support clients) what min_wal_size and max_wal_size actually mean; there is a lot of confusion around these two parameters. Let me try to explain what is going on. As mentioned before, PostgreSQL adjusts its checkpoint distances on its own and keeps the WAL below max_wal_size. However, if your system is idle, PostgreSQL will gradually reduce the amount of WAL again until it reaches min_wal_size. This does not happen instantly; it happens gradually, over a long period of time.
Let’s illustrate this with a simple scenario: consider a system that sees heavy writes during the week but is idle on the weekend. On Friday afternoon the WAL is therefore quite large. Over the weekend, however, PostgreSQL will gradually shrink the WAL. When the workload picks up again on Monday, those removed WAL files have to be recreated, which may cause performance issues.
That is why it can be a good idea not to set min_wal_size too low (compared to max_wal_size): it avoids having to create new WAL files the moment the load increases again.
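To raise min_wal_size and keep an eye on how much WAL is currently kept around, something along these lines can be used. The 2GB is purely an illustrative value, and pg_ls_waldir() requires superuser or pg_monitor privileges:

test=# ALTER SYSTEM SET min_wal_size = '2GB';
test=# SELECT pg_reload_conf();
test=# -- how many WAL segments exist right now, and how much space do they use?
test=# SELECT count(*) AS wal_segments, pg_size_pretty(sum(size)) AS total_size
       FROM pg_ls_waldir();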
Finally …
Checkpoints are a crucial topic when it comes to achieving good performance.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!