Serializable Snapshot Isolation - Part 1
2024-12-10
title: ‘Serializable Snapshot Isolation - Part 1’ date: 2024-12-10T05:33:08Z tags: - Database
summary: description: How Serializable Snapshot Isolation works and is implemented in PostgreSQL.
Snapshot Isolation
Let us recap how Snapshot Isolations works first.
The concept of SI is quite simple: A transaction only sees data that has been committed before it starts. Even if a newer version of data has been committed(updated) by another transaction, it can not see this update unless the update is done by this transaction itself. Upon commit, it will abort if any conflict is detected.
Internal Tuple Fields
Every time a transaction begins, a unique uint_32
monotonic increasing* transaction ID (txid) is assigned to it.
Tuples (table rows) have some extra fields (columns) that are related to txid and normally not visible to the user. Here are the fields that PostgreSQL used for SI:
t_min
: the transaction ID (txid) of the transaction that inserted this tuple.t_max
: the transaction that deleted or updated this tuple. 0 means this tuple has not been deleted or updated yet.t_cid
: is the number of the commands executed within the current transaction, starting from 0.t_ctid
: current tuple identifier, a pointer points to itself or its update.
Next, we examine how t_min
, t_max
, t_cid
, and t_ctid
might change during typical CRUD operations in PostgreSQL.
Initial Setup
Let’s assume we start with a simple table:
CREATE TABLE example (
id SERIAL PRIMARY KEY,
value INTEGER
);
Initial state (no rows):
ctid |
value |
xmin |
xmax |
t_ctid |
Notes |
---|---|---|---|---|---|
(none) | (none) | (none) | (none) | (none) | Table is empty. |
Step 1: Insert (Create)
INSERT INTO example (value) VALUES (10);
- Action: A new row is inserted.
- Effect:
xmin
is set to the transaction ID of the insert,xmax
isNULL
(indicating the row is valid), andt_ctid
isNULL
(no new version).
ctid |
value |
xmin |
xmax |
t_ctid |
Notes |
---|---|---|---|---|---|
(0,1) |
10 |
100 |
NULL |
NULL |
Row inserted by xid=100 . |
Step 2: Read
SELECT * FROM example WHERE value = 10;
- Action: The row is read.
- Effect: No change to the internal metadata (
xmin
,xmax
,t_ctid
) because reads do not modify the row. The visibility check will be discussed later.
ctid |
value |
xmin |
xmax |
t_ctid |
Notes |
---|---|---|---|---|---|
(0,1) |
10 |
100 |
NULL |
NULL |
Row read. No changes. |
Step 3: Update
UPDATE example SET value = 20 WHERE value = 10;
- Action: The row is updated. PostgreSQL creates a new row version for the update, links the old row to the new row via
t_ctid
, and setsxmax
for the old row to the current transaction ID. The new row’sxmin
is the current transaction ID, andt_ctid
isNULL
for the new row.
ctid |
value |
xmin |
xmax |
t_ctid |
Notes |
---|---|---|---|---|---|
(0,1) |
10 |
100 |
200 |
(0,2) |
Old version marked deleted (xid=200 ). |
(0,2) |
20 |
200 |
NULL |
NULL |
New version inserted (xid=200 ). |
Step 4: Delete
DELETE FROM example WHERE value = 20;
- Action: The row is deleted. PostgreSQL marks the row as invalid by setting its
xmax
to the current transaction ID (300
). No new version is created, sot_ctid
remainsNULL
.
ctid |
value |
xmin |
xmax |
t_ctid |
Notes |
---|---|---|---|---|---|
(0,1) |
10 |
100 |
200 |
(0,2) |
Old version remains unchanged. |
(0,2) |
20 |
200 |
300 |
NULL |
Row marked deleted (xid=300 ). |
Behavior Recap
- Insert: Creates a new row with
xmin
setxmax
andt_ctid
asNULL
. - Read: Does not modify
xmin
,xmax
, ort_ctid
. - Update: Marks the old version with
xmax
and creates a new version linked viat_ctid
. - Delete: Marks the row as invalid by setting
xmax
.
Visibility Check
After we know how CRUD modified these fields, let’s examine how a transaction determines whether it can see a tuple.
Commit Log:
PostgreSQL uses Commit Log (clog) to keep transaction statuses. Think of clog as an array of states, and the index is the txid. For example, transaction 11 is running, so clog[11] == IN_PROGRESS
, and when it’s finished, it will set clog[11] = COMMITTED
or clog[11] = ABORTED
.
Snapshot:
In PostgreSQL, upon transaction begin, it gets a snapshot. Unless the isolation level is READ COMMITTED, the transaction will get a snapshot upon every SQL execution. The version number of a snapshot looks like xmin:xmax
. Every txid less than xmin is done, so-called inactive, either committed or aborted; those committed are visible. Every txid greater than xmax is not assigned yet, considered active and thus invisible. txid between xmin and xmax could be done or in progress, this is the small range you will need extra check.
Rules:
if status(t_xmin) == ABORTED: INVISIBLE;
RULE 1: It is obvious that the tuple inserted by an aborted transaction is invisible.if status(t_xmin) == IN_PROGRESS:
if t_xmin != curr_txid: INVISIBLE;
RULE 2: This tuple is inserted by another tx and is in progress, thus invisible.if t_xmin == curr_txid && t_xmax != 0: INVISIBLE;
RULE 3: This tuple has been updated by this transaction, thus invisible.if t_xmin == curr_txid && t_xmax == 0: VISIBLE;
RULE 4: This tuple is the newest version inserted by this transaction itself, thus visible.
if status(t_xmin) == COMMITTED:
if snapshot_status(t_xmin) == ACTIVE: INVISIBLE;
RULE 5: Even if this tuple is inserted by a committed transaction, it’s considered active when taking the snapshot, thus invisible.elif t_xmax == INVALID || status(t_xmax) == ABORTED: VISIBLE
RULE 6: The tuple is inserted by a committed transaction, and not been updated or the transaction that updated this tuple aborted, thus visible.elif status(t_xmax) == IN_PROGRESS:
if t_xmax == curr_txid: INVISIBLE;
RULE 7: This tuple has a newer version insterted by this transaction and thus invisible.else: VISIBLE;
RULE 8: Here t_xmin is committed and t_xmin is in progress, it is still visible.
elif status(t_xmax) == COMMITTED:
if snapshot_status(t_xmax) == ACTIVE: VISIBLE;
RULE 9: Similar to 3.1, t_xmax is considered active when taking the snapshot, and thus visible.else: INVISIBLE;
RULE 10: Both t_xmin and t_max are committed here, means this tuple is already updated or deleted by a committed transaction, and thus invisible.
Put these together will be something like:
bool is_tuple_visible(Tuple *tuple, TransactionId curr_txid, Snapshot *snapshot)
{
if (status(tuple->t_xmin) == ABORTED)
return false;
if (status(tuple->t_xmin) == IN_PROGRESS) {
if (tuple->t_xmin != curr_txid)
return false;
return (tuple->t_xmax == 0);
}
if (status(tuple->t_xmin) == COMMITTED) {
if (snapshot_status(tuple->t_xmin) == ACTIVE)
return false;
if (tuple->t_xmax == INVALID || status(tuple->t_xmax) == ABORTED)
return true;
if (status(tuple->t_xmax) == IN_PROGRESS)
return (tuple->t_xmax != curr_txid);
if (status(tuple->t_xmax) == COMMITTED)
return (snapshot_status(tuple->t_xmax) == ACTIVE);
}
return false; // Default case, should not reach here
}
These rules covered how visibility checks are applied in PostgreSQL in most cases, except for subtransactions. You might wonder how different transactions under different isolation levels apply these rules. Well, it is quite simple:
- READ UNCOMMITTED: Not implemented. Choosing this is the same as choosing READ COMMITTED.
- READ COMMITTED: Get a new snapshot on every command.
- REPEATABLE READ: Only get a snapshot at the beginning of the transaction.
- SERIALIZABLE: Same as REPEATABLE READ, but apply additional rules to check if the transaction violates constraints.
In the next part, we will see how SSI works.