Guide: Ingest from a dedicated PostgreSQL replica
View as MarkdownThis guide shows you how to stand up a dedicated PostgreSQL replica using native PostgreSQL logical replication, and then point a Materialize PostgreSQL source at that replica instead of your production primary. This is sometimes referred to as the sidecar pattern.
When to use this
Materialize connects to PostgreSQL using the replication protocol, which holds a replication slot open on the upstream database. For the vast majority of deployments, connecting directly to the primary is the right choice. Consider a dedicated replica only when you have a concrete concern such as:
-
You need to isolate WAL-retention risk from the primary. A replication slot pins WAL on whichever database it lives on, so a paused or lagging Materialize source can retain WAL and risk filling that database’s disk. Pointing Materialize at a replica keeps this risk off your production primary.
-
You can’t reconfigure the primary. Enabling
wal_level = logicalrequires a restart, and you may not be able to restart or modify the primary. You can instead enable it on a replica you control. -
You can’t set
REPLICA IDENTITY FULLon the primary’s tables. Materialize requiresREPLICA IDENTITY FULLto capture all column values in change events. If you can’t alter the primary’s tables, you can set it on the replica’s copies instead. See also the PostgreSQL source FAQ.
If none of these apply, prefer connecting directly to the primary — the replica adds an extra system to operate and its own replication lag on top of the primary.
How it works
There are two logical-replication hops to set up:
primary ──(native PG logical replication)──▶ replica ──(Materialize source)──▶ Materialize
- primary → replica: native PostgreSQL logical replication keeps the replica’s tables in sync with the primary. The primary is the publisher; the replica is the subscriber.
- replica → Materialize: the replica acts as a publisher for Materialize.
This means the replica also needs
wal_level = logical, its own publication, andREPLICA IDENTITY FULLon the tables you replicate.
Prerequisites
- A PostgreSQL primary on PostgreSQL 11+, with the tables you want to replicate.
- A separate PostgreSQL instance (11+) to act as the replica, with network connectivity from the replica to the primary.
- A superuser (or a role with
REPLICATIONand the relevant table privileges) on each instance.
A. Configure the primary
1. Enable logical replication on the primary
-
Set
wal_level = logicalinpostgresql.confon the primary.wal_level = logical max_replication_slots = 10 max_wal_senders = 10NOTE: On managed PostgreSQL services, enabling logical replication may differ. For example, Amazon RDS usesrds.logical_replication = 1in the parameter group. Refer to your service’s documentation. -
Restart the primary (reloading is not sufficient).
-
After the restart, verify the setting:
SHOW wal_level; -- should return: logical
2. Create a publication and replication user on the primary
-
On the primary, create a publication for the tables to replicate:
CREATE PUBLICATION repl_to_replica FOR TABLE orders;To add more tables later, you can use
ALTER PUBLICATION repl_to_replica ADD TABLE <other_table>;. -
On the primary, create a role for the replica to connect as. The role needs
REPLICATIONandSELECTon the tables:CREATE ROLE repuser WITH REPLICATION LOGIN PASSWORD '<strong_password>'; GRANT SELECT ON orders TO repuser; -
Update the primary’s
pg_hba.confto allow the connection (update with the replica’s IP/CIDR):host all repuser <replica_ip>/32 scram-sha-256 -
Reload the updated config:
SELECT pg_reload_conf();
B. Configure the replica
1. Enable logical replication on the replica
-
Set
wal_level = logicalinpostgresql.confon the replica. The replica also requires this because it will, in turn, publish to Materialize.wal_level = logical max_replication_slots = 10 max_wal_senders = 10NOTE: On managed PostgreSQL services, enabling logical replication may differ. For example, Amazon RDS usesrds.logical_replication = 1in the parameter group. Refer to your service’s documentation. -
Restart the replica (reloading is not sufficient).
-
After the restart, verify the setting:
SHOW wal_level; -- should return: logical
2. Create the tables on the replica
Logical replication does not copy DDL, so you must create the table structure on the replica yourself. Columns are matched by name: the replica table must contain every column the publication sends, but it may also have extra columns or a different column order. Replicated columns must have compatible data types.
On the replica, recreate each table you want to replicate. For example, the
following creates the table orders on the replica:
CREATE TABLE orders (
id bigint PRIMARY KEY,
customer text,
amount numeric,
created_at timestamptz
);
3. Set REPLICA IDENTITY FULL on the replica’s tables
On the replica, set REPLICA IDENTITY FULL on each replicated table, so
that Materialize can capture all column values.
ALTER TABLE orders REPLICA IDENTITY FULL;
UPDATE/DELETE. However, Materialize requires REPLICA IDENTITY FULL to
ingest unchanged
TOASTed values.
4. Create the subscription on the replica
On the replica, create a subscription to the primary’s publication:
CREATE SUBSCRIPTION orders_sub
CONNECTION 'host=<primary_host> port=5432 dbname=<db> user=repuser password=<strong_password>'
PUBLICATION repl_to_replica;
By default, this immediately creates a replication slot on the primary, copies the existing table data, and begins streaming ongoing changes.
5. Verify the primary → replica hop
-
On the replica:
SELECT subname, subenabled FROM pg_subscription; SELECT * FROM pg_stat_subscription; -- last_msg_receipt_time should advance -
On the primary:
SELECT slot_name, active, restart_lsn FROM pg_replication_slots; SELECT * FROM pg_stat_replication; -
To test end to end:
-
Insert a row on the primary:
-- primary INSERT INTO orders VALUES (1, 'acme', 99.50, now()); -
Confirm it appears on the replica.
-- replica SELECT * FROM orders WHERE id = 1;
-
6. Create a publication for Materialize on the replica
On the replica, create a separate publication that Materialize will use
(keep it distinct from the primary → replica publication):
-- replica
CREATE PUBLICATION mz_source FOR TABLE orders;
C. Connect Materialize to the replica
At this point, you can treat the replica as you would any self-hosted PostgreSQL source. Follow the self-hosted PostgreSQL guide to:
-
Create the source using the
mz_sourcepublication from your replica:CREATE SOURCE mz_source FROM POSTGRES CONNECTION pg_replica (PUBLICATION 'mz_source') FOR ALL TABLES;
Things to watch out for
- Don’t leave orphaned replication slots. Both hops use replication slots
that pin WAL while inactive. Drop subscriptions cleanly (
DROP SUBSCRIPTION, which also drops the remote slot) rather than just disabling them, and drop the Materialize source when you no longer need it. See Replication slot is active. - DDL is not replicated. When you add a column, apply it on the replica first, then the primary, to avoid replication errors. For handling schema changes in Materialize, see Handle upstream schema changes.
- Sequences are not replicated by native logical replication — only table data.
- The replica adds lag. Changes reach Materialize only after they’ve been applied on the replica, so end-to-end latency is the primary → replica lag plus the replica → Materialize lag.