Effective Use of SQL Identity Columns (Oracle & Postgres)
The concept of SQL tables with an auto incremental column for its
primary key it’s obviously not new, as it is easily implemented using
sequences. Now, on December of 2003 the SQL:2003 standard
introduced the identity column type, along with the syntax GENERATED
AS IDENTITY
, which is currently supported by virtually all RDBMS
providers. Let’s take a quick look at how we can declare primary key
columns with auto generated identifiers, without relying on
vendor-specific syntax!
Plain Sequences
This is the most straightforward way to produce primary key values. Sequences generate unique integers and can be incremented by an arbitrary number like 1, 20, -1, etc. Consider the following table:
We could then create a sequence to generate valid data for our id
primary key column:
This approach works fine but assumes some sort of nomenclature is in
place—you know which sequence belongs to the student
table due to
its name alone. Another developer might mistakenly use a different
sequence, an ad hoc value, or even drop the sequence entirely (no
errors will be thrown). Without any corresponding documentation there
really is no way to tell at a glance how id
values are produced,
perhaps they’re managed in a different system altogether.
Attached Sequences
Auto incremental primary key columns are so common that Postgres
offers a shorthand. We could redefine the student
table like
so:
In contrast, Oracle offers no shorthands and requires the declaration of a sequence beforehand, using the NEXTVAL function as the default value for the column:
These last two examples make the table student
depend on a certain
sequence, which can’t be dropped on a whim. The table declaration
itself makes it very clear how the primary key values are generated,
and thus our intentions more evident.
Of course, other RDBMS vendors have their own unique ways of generating unique values (see what I did there?), such as MySQL’s AUTO_INCREMENT attribute, which also works in H2:
It’s important to keep in mind that auto generated clauses do not necessarily create the primary key constraint automatically.
Identity Columns
Identity columns are the ultimate way to declare auto generated
primary key columns, using good ol’ standard SQL (ISO/IEC 9075) and
nothing else. Here’s what the student
table could be rewritten as,
using the identity clause:
Customized Sequence Properties
Because identity columns have an implicit sequence attached to it,
they can be tailored similarly to the CREATE SEQUENCE
statement:
ALWAYS or BY DEFAULT?
The identity column clause has two different flavors: identifiers can
be generated ALWAYS
or BY DEFAULT
. The former instructs the
database to always generate a value, and will actually throw an error
if you do try to insert one yourself:
Ultimately it all comes down to taste. Personally, I find the BY
DEFAULT
behaviour a bit more forgiving (say you need to change an ID
for whatever reason).
ORM Support
Object–relational mapping providers have great support for identity
columns out-of-the-box. In fact, it’s the default for primary key
properties in Entity Framework Core. Here’s how we could map the
student
table to a JPA Entity, using GenerationType.IDENTITY:
Conclusion
Identity columns are a great and universal way to declare primary key columns for our tables. Because it’s part of the SQL standard, this particular syntax works with almost all the major RDBMS vendors, hopefully decreasing the cognitive load necessary when working on multiple systems.