Isolation levels
View as MarkdownAn isolation level determines which effects of concurrent transactions are visible to a transaction during its execution.
Supported isolation levels
Materialize accepts the following isolation levels:
| Isolation level | Behavior in Materialize |
|---|---|
| Strict Serializable | Default. Provides serializability and linearizability. |
| Serializable | Provides serializability but not linearizability. |
Bounded Staleness <duration> |
Public preview. Serves reads at a timestamp at most <duration> stale; never blocks, errors if the bound cannot be met. |
| Read Uncommitted, Read Committed, Repeatable Read | Accepted for compatibility; treated as Serializable. |
Serializable
Serializable prevents the following three phenomena1:
| Phenomenon | Description |
|---|---|
| P1 (Dirty read) | A transaction T1 modifies a row; another transaction T2 reads the row before T1 commits. If T1 rolls back, T2 has read a row that was never committed. |
| P2 (Non-repeatable read) | A transaction T1 reads a row; another transaction T2 modifies or deletes that row and commits. If T1 attempts to reread the row, it may see the modified value or discover that the row no longer exists. |
| P3 (Phantom) | A transaction T1 reads a set of rows that match a specific search condition; another transaction T2 inserts rows that also match the condition and commits. If T1 repeats the read with the same search condition, it gets a different set of rows. |
Serializable also guarantees that the result of concurrently executing transactions is equivalent to some serial execution of those transactions. A serial execution is one in which each transaction completes before the next one begins. However, Serializable does not guarantee linearizability; that is, it does not guarantee that the serial order matches the real-time order of the transactions. For example, if transaction T1 completes before transaction T2 begins in real time, the result may be equivalent to a serial execution in which T2 executes before T1.
Non-linearizable orderings are more likely to occur when querying indexes and
materialized views with large propagation delays. For example, suppose
transaction T1 queries table t and is followed in real time by transaction
T2, which queries a computationally expensive materialized view mv defined
over t. If the two transactions execute sufficiently close together, mv may
not yet reflect the latest updates to t that T1 observed, so T2 may
not observe all rows visible to T1.
Logical timestamp selection
When using the serializable
isolation level, the logical timestamp may be arbitrarily ahead of or behind the
system clock. For example, at a wall clock time of 9pm, Materialize may choose
to execute a serializable query as of logical time 8:30pm, perhaps because data
for 8:30–9pm has not yet arrived. In this scenario, now() would return 9pm,
while mz_now() would return 8:30pm.
Strict Serializable
Strict Serializable provides all the guarantees of Serializable isolation and additionally guarantees linearizability. With linearizability, the serial order matches the real-time order of the transactions. For example, if transaction T1 completes before transaction T2 begins in real time, the result is equivalent to a serial execution in which T1 executes before T2.
More concretely, suppose transaction T1 queries table t and is followed in
real time by transaction T2, which queries a computationally expensive
materialized view mv defined over t. Under Strict Serializable, T2 is
guaranteed to observe all rows visible to T1.
The linearizable guarantee applies only to transactions (including single-statement SQL queries, which are implicitly single-statement transactions), not to data written while ingesting from upstream sources.
-
If a piece of data has been fully ingested from an upstream source, it is not guaranteed to appear in the next read transaction. See real-time recency for more details.
-
However, once that data is included in the results of a read transaction, all subsequent read transactions are guaranteed to see it.
Logical timestamp selection
When using the strict
serializable isolation level,
Materialize attempts to keep the logical timestamp reasonably close to wall
clock time. In most cases, the logical timestamp of a query will be within a few
seconds of the wall clock time. For example, when executing a strict
serializable query at a wall clock time of 9pm, Materialize will choose a
logical timestamp within a few seconds of 9pm, even if data for 8:30–9pm has not
yet arrived and the query will need to block until the data for 9pm arrives. In
this scenario, both now() and mz_now() would return 9pm.
Real-time recency
To enable this feature in your Materialize region, contact our team.
Materialize offers a form of “end-to-end linearizability” known as real-time
recency. When using real-time recency, all client-issued SELECT statements
include at least all data visible to Materialize in any external source (i.e.,
sources created with CREATE SOURCE that use CONNECTIONs, such as Kafka,
MySQL, and PostgreSQL sources) after Materialize receives the query. This is
what we mean by linearizable––the results are guaranteed to contain all visible
data according to physical time.
For example, real-time recency ensures that if you have just performed an
INSERT into a PostgreSQL database that Materialize ingests as a source, all of
your real-time recency queries will include the just-written data in their
results.
Note that real-time recency only guarantees that the results will contain at least the data visible to Materialize when we receive the query. We cannot guarantee that the results will contain only the data visible when Materialize receives the query. For instance, the rate at which Materialize ingests data might include additional data made visible after the timestamp we determined to be “real-time recent.” Another example is that, due to network latency, the timestamp from the external system that we determine to be “real-time recent” might be later (i.e. include more data) than you expected.
Because Materialize waits until it ingests the data from the external system, real-time recency queries can have additional latency. This latency is introduced by both the time it takes us to ingest and commit data from the source and the time spent communicating with it to determine what data it has made available to us (e.g., querying PostgreSQL for the replication slot’s LSN).
Details
- Real-time recency is only available with sessions running at the strict serializable isolation level.
- Enable this feature per session using
SET real_time_recency = true. - Control the timeout for connecting to the external source to determine the
timestamp with the
real_time_recency_timeoutsession variable. - Real-time recency queries only guarantee visibility of data from external
systems (e.g., sources like Kafka, MySQL, and PostgreSQL). Real-time recency
queries do not offer any form of guarantee when querying Materialize-local
objects, such as
LOAD GENERATORsources or system tables. - Each real-time recency query connects to each external source transitively referenced in the query. The more external sources that are referenced, the greater the likelihood of latency caused by the network or ingestion rates.
- Materialize doesn’t currently offer a mechanism to provide a “lower bound” on the data we consider to be “real-time recent” in an external source. Real-time recency queries return at least all data visible to Materialize when our client connection communicates with the external system.
Bounded Staleness
The Bounded Staleness isolation level lets you trade exact freshness for
predictable latency. A query under bounded staleness is served at a timestamp
that is at most <duration> stale—but never blocks waiting for input
collections to catch up. If no timestamp within <duration> of “now” is
available, the query errors immediately.
This sits between Serializable (no freshness ceiling) and Strict Serializable (always freshest, may wait for inputs to advance) on the freshness/latency spectrum. What distinguishes bounded staleness is that it never waits on input frontiers — it errors instead.
Syntax
SET TRANSACTION_ISOLATION TO 'bounded staleness <duration>';
<duration> is a duration string like 5s, 500ms, or 1m 30s (compact
forms like 1m30s are also accepted). It must be greater than 0.
Bounded staleness is set through the transaction_isolation session variable,
as shown above. Because it carries a duration, it cannot be set via the
BEGIN ... ISOLATION LEVEL keyword form, which only accepts the standard
isolation-level names.
For example:
-- Read data no more than 5 seconds stale, never block.
SET TRANSACTION_ISOLATION TO 'bounded staleness 5s';
-- A tighter bound for dashboards.
SET TRANSACTION_ISOLATION TO 'bounded staleness 250ms';
Behavior
When a query runs under bounded staleness, Materialize picks the freshest
timestamp at which every queried collection has data, subject to the constraint
that the timestamp is no more than <duration> stale relative to the current
logical time. The query never blocks waiting for sources, materialized views,
or indexes to advance. If even the freshest available timestamp is more than
<duration> stale, the query errors with SQLSTATE 40001
(serialization_failure):
ERROR: cannot serve query under bounded staleness 5s; freshest available
timestamp is 7000ms older than the bound
40001 is a serialization failure; how to react is up to the application.
Common responses include retrying, falling back to a different isolation level,
or surfacing a “data unavailable” state.
Restrictions
-
Read-only. Writes (
INSERT,UPDATE,DELETE,COPY FROM) are not permitted under bounded staleness. The first write in a session running at this isolation level errors. -
Mutually exclusive with
real_time_recency. Setting both errors at session-variable validation time. -
Single timeline. Bounded staleness only applies to queries on the standard wall-clock timeline. Queries that touch other timelines error at planning time.
-
Bound must be greater than
0. A bound of zero is rejected; use Strict Serializable if you need exact freshness.
When to use Bounded Staleness
Pick bounded staleness when predictable latency matters more than perfect freshness, and “data is too stale” is a more useful signal than blocking. Common cases: dashboards, metric panels, alert evaluation.
-
Prefer Strict Serializable when you need linearizable, end-to-end fresh reads, or when your application is willing to wait for the freshest data.
-
Prefer Serializable when you do not need a freshness bound at all and want the simplest, lowest-latency reads.
-
Bounded staleness is read-only: if your session needs writes, use one of the other isolation levels.
Isolation levels and query latency
Strict Serializable provides stronger consistency guarantees but may have slower reads than Serializable.
-
Strict Serializable (the default) may need to wait for recent writes to propagate through materialized views and indexes before serving a read, so that the read reflects the real-time order of transactions.
- Real-time recency (available only with Strict Serializable) introduces additional latency, since Materialize waits to determine and ingest the latest data available in upstream sources before serving the query.
-
Serializable does not wait for writes to propagate. It reads a consistent (but possibly slightly stale) snapshot, which avoids that latency at the cost of linearizability. However, if a consistent snapshot is not available, the query blocks until one becomes available.
- Bounded Staleness never waits for writes to propagate. It serves the freshest consistent snapshot within the staleness bound, and errors immediately when no such snapshot exists instead of blocking.
Setting isolation level
You can set the isolation level using the session-level configuration parameter
TRANSACTION_ISOLATION; for example:
SET TRANSACTION_ISOLATION TO 'STRICT SERIALIZABLE';
You can also set the isolation level for an explicit transaction block as part
of the BEGIN statement; for example:
BEGIN ISOLATION LEVEL STRICT SERIALIZABLE;
--- ...
--- ...
--- ...
COMMIT;
Learn more
Check out:
- PostgreSQL documentation for more information on isolation levels.
- Jepsen Consistency Models documentation for more information on consistency models.
-
Phenomenon descriptions adapted from ISO/IEC 9075-2:1999 (E), §4.32 “SQL-transactions.” ↩︎