About sequences
Sequences are used to generate artificial numeric primary key columns for tables.
A sequence provides a “new ID” that is guaranteed to be unique, even if many database sessions use the sequence at the same time.
Sequences are not transaction safe, because they are not supposed to block the caller. That is intentional, not a flaw.
Consequently, if a transaction requests a new value from a sequence and then rolls back, there will be a “gap” in the values stored in the database. In the rare case that you really need a “gap-less” series of values, a sequence is not the right choice for you.
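To see such a gap in action, here is a minimal sketch using a scratch sequence (the name my_seq is just for illustration):

CREATE SEQUENCE my_seq;

BEGIN;
SELECT nextval('my_seq');   -- returns 1
ROLLBACK;

SELECT nextval('my_seq');   -- returns 2; the value 1 is lost, leaving a gap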
PostgreSQL’s traditional way of using sequences (nextval('my_seq')) differs from the SQL standard, which uses NEXT VALUE FOR <sequence generator name>.
New developments in PostgreSQL v10
Identity columns
Version 10 of PostgreSQL introduced the standard SQL way of defining a table with an automatically generated unique value:
GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]
Here is an example:
CREATE TABLE my_tab (
   id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
   ...
);
Behind the scenes, this uses a sequence, and it is roughly equivalent to the traditional
CREATE TABLE my_tab (
   id bigserial PRIMARY KEY,
   ...
);
which is a shorthand for
CREATE SEQUENCE my_tab_id_seq;

CREATE TABLE my_tab (
   id bigint PRIMARY KEY DEFAULT nextval('my_tab_id_seq'::regclass),
   ...
);

ALTER SEQUENCE my_tab_id_seq OWNED BY my_tab.id;
The problem with such a primary key column is that the generated value is a default value, so it will be overridden if a user explicitly inserts a different value.
This is usually not what you want, because it will lead to a constraint violation error as soon as the sequence counter reaches the same value. Rather, you want the explicit insertion to fail, since it was probably done by mistake.
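Here is a minimal sketch of how that failure comes about, assuming the my_tab definition above (and no other columns):

INSERT INTO my_tab (id) VALUES (1);   -- explicit value overrides the default;
                                      -- the sequence is not advanced
INSERT INTO my_tab DEFAULT VALUES;    -- the sequence also generates 1:
                                      -- duplicate key violation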
For this you use GENERATED ALWAYS:

CREATE TABLE my_tab (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   ...
);
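With that definition, a plain explicit insertion is rejected (a sketch; the exact error message depends on the PostgreSQL version):

INSERT INTO my_tab (id) VALUES (42);
-- fails with an error pointing you to the OVERRIDING SYSTEM VALUE clause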
You can still override the generated value, but you’ll have to use the OVERRIDING SYSTEM VALUE clause for that, which makes it much harder for such an INSERT to happen by mistake:
INSERT INTO my_tab (id) OVERRIDING SYSTEM VALUE VALUES (42);
New system catalog pg_sequence
Before PostgreSQL v10, a sequence’s metadata (start value, increment and others) was stored in the sequence itself.
This information is now stored in the new catalog table pg_sequence.
The only data that remain in the sequence itself are the values changed by the sequence manipulation functions nextval, currval, lastval and setval.
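For example, you can look up the metadata of the sequence behind my_tab like this (column names as in the v10 catalog):

SELECT seqstart, seqincrement, seqmin, seqmax, seqcache, seqcycle
FROM pg_sequence
WHERE seqrelid = 'my_tab_id_seq'::regclass;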
Transactional DDL for sequences
In PostgreSQL, a sequence is a “special table” containing just one row.
Unlike in “regular tables,” where an UPDATE creates a new row version and marks the old one as invalid instead of changing the existing row, the single row of a sequence is updated in place whenever the sequence values change. PostgreSQL can afford this more efficient approach because sequence operations are supposed to be fast and cannot be rolled back.
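You can see that single row by querying the sequence like a table; in v10 it exposes just the columns last_value, log_cnt and is_called:

SELECT * FROM my_tab_id_seq;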
Before PostgreSQL v10, all sequence metadata was kept in the sequence itself (as described in the previous section), which meant that ALTER SEQUENCE, which also changed the sequence’s single row, could not be rolled back.
Since catalog updates are transaction safe in PostgreSQL, and PostgreSQL v10 moved the sequence metadata to the new catalog table pg_sequence, this restriction could be lifted with the new release.
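A quick sketch of the new behavior, using my_tab_id_seq from above:

BEGIN;
ALTER SEQUENCE my_tab_id_seq INCREMENT BY 10;
ROLLBACK;

-- from v10 on, the ALTER SEQUENCE is rolled back as well:
-- the sequence still increments by 1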
Performance regression with ALTER SEQUENCE
I cheated a little bit when I said that adding the new catalog table made ALTER SEQUENCE transaction safe, because there is one form of ALTER SEQUENCE that changes the values kept in the sequence itself:
ALTER SEQUENCE my_tab_id_seq RESTART;
If only some variants of ALTER SEQUENCE were transaction safe and others weren’t, this would lead to surprising and buggy behavior.
That problem was fixed with this commit:
commit 3d79013b970d4cc336c06eb77ed526b44308c03e
Author: Andres Freund <andres@anarazel.de>
Date:   Wed May 31 16:39:27 2017 -0700

    Make ALTER SEQUENCE, including RESTART, fully transactional.

    Previously, the changes to the "data" part of the sequence, i.e., the
    one containing the current value, were not transactional, whereas the
    definition, including minimum and maximum values, were. That leads to
    odd behavior if a schema change is rolled back, with the potential
    that out-of-bound sequence values can be returned.

    To avoid the issue, create a new relfilenode fork whenever ALTER
    SEQUENCE is executed, similar to how TRUNCATE and RESTART IDENTITY
    are already handled.

    This commit also makes ALTER SEQUENCE RESTART transactional, as it
    seems to be too confusing to have some forms of ALTER SEQUENCE behave
    transactionally and some forms not. This way, setval() and nextval()
    are not transactional, but DDL is, which seems to make sense.

    This commit also rolls back parts of the changes made in 3d092fe540
    and f8dc1985f, as they're now not needed anymore.

    Author: Andres Freund
    Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de
    Backpatch: Bug is in master/v10 only
So every ALTER SEQUENCE statement will now create a new data file for the sequence; the old one gets deleted during COMMIT. This is similar to the way TRUNCATE, CLUSTER, VACUUM (FULL) and some ALTER TABLE statements are implemented.
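You can watch this happen via the sequence’s relfilenode in pg_class (a quick check, using my_tab_id_seq as an example):

SELECT relfilenode FROM pg_class WHERE relname = 'my_tab_id_seq';

ALTER SEQUENCE my_tab_id_seq RESTART;

SELECT relfilenode FROM pg_class WHERE relname = 'my_tab_id_seq';
-- the relfilenode has changed: a new data file was created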
This makes ALTER SEQUENCE in PostgreSQL v10 much slower than in earlier versions. You might expect that this statement won’t be used very often, so that it shouldn’t hurt performance.
However, Depesz recommends the following function to efficiently obtain a gap-less block of sequence values:
CREATE OR REPLACE FUNCTION multi_nextval(use_seqname text, use_increment integer)
RETURNS bigint AS $$
DECLARE
   reply bigint;
BEGIN
   PERFORM pg_advisory_lock(123);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY ' || use_increment::text;
   reply := nextval(use_seqname);
   EXECUTE 'ALTER SEQUENCE ' || quote_ident(use_seqname)
           || ' INCREMENT BY 1';
   PERFORM pg_advisory_unlock(123);
   RETURN reply;
END;
$$ LANGUAGE plpgsql;
This function only works properly when called on an existing sequence, and it does not return the first value of the gap-less block of sequence values, but the last one.
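A sample call, assuming my_tab_id_seq exists:

-- reserve a gap-less block of 1000 sequence values;
-- the return value is the last value of the block
SELECT multi_nextval('my_tab_id_seq', 1000);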
Since this function calls ALTER SEQUENCE not only once but twice, you can imagine that every application that uses it a lot will experience quite a performance hit when upgrading to PostgreSQL v10.
Fortunately, the same thing can be done with the standard sequence manipulation functions, so you can keep an efficient version of the function in PostgreSQL v10:
CREATE OR REPLACE FUNCTION multi_nextval(use_seqname regclass, use_increment integer)
RETURNS bigint AS $$
DECLARE
   reply bigint;
   -- derive the advisory lock key from the sequence's OID
   lock_id bigint := use_seqname::oid::bigint;
BEGIN
   PERFORM pg_advisory_lock(lock_id);
   reply := nextval(use_seqname);
   PERFORM setval(use_seqname, reply + use_increment - 1, TRUE);
   PERFORM pg_advisory_unlock(lock_id);
   RETURN reply + use_increment - 1;
END;
$$ LANGUAGE plpgsql;
If you want the function to return the first value of the gap-less sequence value block instead, replace the final RETURN statement with:

RETURN reply;
Keep in mind that both the original and the improved function rely on advisory locks, so they only work reliably if all sequence values are obtained through the function.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.