Materialize Logo

CREATE SOURCE: PostgreSQL

BETA! This feature is in beta. It may have performance or stability issues and is not subject to our backwards compatibility guarantee.

New in v0.8.0.

CREATE SOURCE connects Materialize to an external data source and lets you interact with its data as if the data were in a SQL table.

This document details how to connect Materialize to a Postgres database for Postgres versions 10 and higher. Before you create the source in Materialize, you must perform some prerequisite steps in Postgres.

Conceptual framework

Sources represent connections to resources outside Materialize that it can read data from. For more information, see API Components: Sources.

Syntax

CREATE MATERIALIZED SOURCE IF NOT EXISTS src_name FROM POSTGRES CONNECTION connection_info PUBLICATION publication_name
Field Use
MATERIALIZED Materializes the source’s data, which retains all data in memory and makes sources directly selectable. Currently required for all Postgres sources. For more information, see Materialized source details.
src_name The name for the source.
IF NOT EXISTS Do nothing (except issuing a notice) if a source with the same name already exists. Default.
CONNECTION connection_info Postgres connection parameters. See the Postgres documentation on supported correction parameters for details.
PUBLICATION publication_name Postgres publication (the replication data set containing the tables to be streamed to Materialize).

Details

Materialized source details

Materializing a source keeps data it receives in an in-memory index, the presence of which makes the source directly queryable. In contrast, non-materialized sources cannot process queries directly; to access the data the source receives, you need to create materialized views that SELECT from the source.

For a mental model, materializing the source is approximately equivalent to creating a non-materialized source, and then creating a materialized view from all of the source’s columns:

CREATE SOURCE src ...;
CREATE MATERIALIZED VIEW src_view AS SELECT * FROM src;

The actual implementation of materialized sources differs, though, by letting you refer to the source’s name directly in queries.

For more details about the impact of materializing sources (and implicitly creating an index), see CREATE INDEX: Details — Memory footprint.

PostgreSQL source details

Materialize makes use of PostgreSQL’s native replication capabilities to create a continuously updated replica of the desired Postgres tables.

Before creating the source in Materialize, you must:

  1. Set up your Postgres database to allow logical replication.
  2. Ensure that the user for your Materialize connection has REPLICATION privileges.
  3. Create a Postgres publication, or replication data set, containing the tables to be streamed to Materialize. Since Postgres sources are materialized (kept in memory) in their entirety, we strongly recommend that you limit publications only to the data you need to query.

Once you create a materialized source from the publication, the source will contain the raw data stream of replication updates. You can then break the stream out into views that represent the publication’s original tables with CREATE VIEWS. You can treat these tables as you would any other source and create other views or materialized views from them.

Postgres schemas

CREATE VIEWS will attempt to create each upstream table in the same schema as Postgres. For example, if the publication contains tables "public"."foo" and "otherschema"."foo", CREATE VIEWS is the equivalent of

CREATE VIEW "public"."foo";
CREATE VIEW "otherschema"."foo";

Therefore, in order for CREATE VIEWS to succeed, all upstream schemas included in the publication must exist in Materialize as well, or you must explicitly specify the downstream schemas and rename the resulting tables. For example:

CREATE VIEWS FROM "mz_source"
("public"."foo" AS "foo", "otherschema"."foo" AS "foo2");

Postgres replication slots

WARNING! Make sure to delete any replication slots if you stop using Materialize or if either your Materialize or Postgres instances crash.

If you stop or delete Materialize without first dropping the Postgres source, the Postgres replication slot isn’t deleted and will continue to accumulate data. In such cases, you should manually delete the Materialize replication slot to recover memory and avoid degraded performance in the upstream database. Materialize replication slot names always begin with materialize_ for easy identification.

Restrictions on Postgres sources

Supported Postgres versions

Postgres sources in Materialize require that upstream Postgres instances be version 10 or greater.

Examples

Setting up PostgreSQL

Before you create a Postgres source in Materialize, you must complete the following prerequisite steps in Postgres.

  1. Ensure the database configuration allows logical replication. For most configurations, it should suffice to set wal_level = logical in postgresql.conf, but additional steps may be required for Amazon RDS. See the Amazon Relational Database Service Documentation for details.

  2. Assign the user REPLICATION privileges:

    ALTER ROLE "user" WITH REPLICATION;
    
  3. Set replica identity to full for all the tables that you wish to replicate:

    ALTER TABLE foo
    REPLICA IDENTITY FULL;
    
  4. Create a publication containing all the tables you wish to query in Materialize:

    For all tables in Postgres:

    CREATE PUBLICATION mz_source FOR ALL TABLES;
    

    For specific tables:

    CREATE PUBLICATION mz_source FOR TABLE table1, table2;
    

Creating a source

Once you have set up Postgres, you can create the source in Materialize.

CREATE MATERIALIZED SOURCE "mz_source" FROM POSTGRES
CONNECTION 'host=postgres port=5432 user=host sslmode=require dbname=postgres'
PUBLICATION 'mz_source';

This creates a source that…

Creating views

Once you have created the Postgres source, you need to create views that represent the upstream publication’s original tables.

Create views for all tables included in the Postgres publication

CREATE VIEWS FROM SOURCE "mz_source";
SHOW FULL VIEWS;

Create views for specific upstream tables

CREATE VIEWS FROM SOURCE "mz_source" ("a", "b");
SHOW FULL VIEWS;
Did this info help?
Yes No