Materialize v0.27
v0.27.0 is the first cloud-native release of Materialize. It contains substantial breaking changes from v0.26 LTS.
v0.27.0
- Add clusters and cluster replicas, which together allocate isolated, highly available, and horizontally scalable compute resources that incrementally maintain a “cluster” of indexes.
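  For example, you might create a cluster with two replicas so that its indexes stay up to date even if one replica fails. This is a sketch; the cluster name and replica sizes are illustrative:

  CREATE CLUSTER dashboard REPLICAS (
      r1 (SIZE 'xsmall'),
      r2 (SIZE 'xsmall')
  );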
- Add materialized views, which are views that are persisted in durable storage and incrementally updated as new data arrives.
  A materialized view is created in a cluster that is tasked with keeping its results up to date, but it can be referenced from any cluster.
  The result of a materialized view is not maintained in memory unless you create an index on it. However, intermediate state necessary for efficient incremental updates of the materialized view may be maintained in memory.
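  For example, you might maintain a materialized view in one cluster and index it in another for fast reads. This is a sketch; the cluster, view, table, and column names are illustrative:

  CREATE MATERIALIZED VIEW avg_price IN CLUSTER ingest AS
      SELECT symbol, AVG(price) AS avg_price FROM trades GROUP BY symbol;
  CREATE INDEX avg_price_idx IN CLUSTER serving ON avg_price (symbol);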
- Add connections, which describe how to connect to and authenticate with external systems. Once created, a connection is reusable across multiple `CREATE SOURCE` and `CREATE SINK` statements.
- Add secrets, which securely store sensitive credentials (like passwords and SSL keys) for reference in connections.
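  For example, a PostgreSQL connection might reference a secret for its password. This is a sketch; the host, user, and secret names are illustrative:

  CREATE SECRET pg_password AS '<PASSWORD>';
  CREATE CONNECTION pg FOR POSTGRES
      HOST 'db.example.com',
      DATABASE 'postgres',
      USER 'materialize',
      PASSWORD SECRET pg_password;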
- Durably record data ingested from sources.
  Once a source has acknowledged data upstream (e.g., by committing a Kafka offset or advancing a PostgreSQL replication slot), it will never re-read that data. As a result, PostgreSQL sources no longer have a “single materialization” limitation. All sources are directly queryable via `SELECT`.
- Allow provisioning the size of a source or sink.
  Each source and sink now runs with an isolated set of compute resources. You can adjust the size of the resource allocation with the `SIZE` parameter.
- Add load generator sources, which produce synthetic data for use in demos and performance tests.
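  For example, the `COUNTER` load generator emits a steadily increasing counter, and the `SIZE` parameter provisions the source's compute resources. This is a sketch; the tick interval and size are illustrative:

  CREATE SOURCE counter
      FROM LOAD GENERATOR COUNTER (TICK INTERVAL '500ms')
      WITH (SIZE = '3xsmall');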
- Add an HTTP API that supports executing SQL queries over HTTP.
- Breaking change. Require all indexes to be associated with a cluster.
- Breaking change. Require the use of connections with Kafka sources, PostgreSQL sources, and Kafka sinks.
- Breaking change. Rename `TAIL` to `SUBSCRIBE`.
- Breaking change. Change the meaning of `CREATE MATERIALIZED VIEW`. `CREATE MATERIALIZED VIEW` now creates a new type of object called a materialized view, rather than serving as shorthand for creating a view with a default index. To emulate the old behavior, explicitly create a default index after creating a view:
  CREATE VIEW <name> ...;
  CREATE DEFAULT INDEX ON <name>;
- Breaking change. Remove the `MATERIALIZED` option from `CREATE SOURCE`. `CREATE MATERIALIZED SOURCE` is no longer shorthand for creating a source with a default index. Instead, you must explicitly create a default index after creating a source:
  CREATE SOURCE <name> ...;
  CREATE DEFAULT INDEX ON <name>;
- Breaking change. Remove support for the following source types:
  - PubNub
  - Kinesis
  - S3
- Breaking change. Remove the `reuse_topic` option from Kafka sinks. The exactly-once semantics enabled by `reuse_topic` are now on by default.
- Breaking change. Remove the `consistency_topic` option from Kafka sinks.
- Breaking change. Do not default to the Debezium envelope in `CREATE SINK`. You must explicitly specify the envelope to use.
- Breaking change. Remove the `CREATE VIEWS` statement, which was used to separate the data in a PostgreSQL source into a single relation per upstream table. The PostgreSQL source now automatically creates a relation in Materialize for each upstream table.
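  For example, after creating a PostgreSQL source against a publication, each table in that publication is queryable as its own relation, with no `CREATE VIEWS` step. This is a sketch; the connection, publication, and table names are illustrative, and the exact option syntax may differ:

  CREATE SOURCE pg_tables
      FROM POSTGRES CONNECTION pg (PUBLICATION 'mz_source')
      WITH (SIZE = '3xsmall');
  SELECT * FROM some_upstream_table;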
- Breaking change. Overhaul the system catalog.
  The relations in the `mz_catalog` schema have been adjusted substantially to support the above changes. Many column and relation names were adjusted for consistency. The resulting relations are now part of Materialize’s stable interface. Relations that were not ready for stabilization were moved to a new `mz_internal` schema.
- Breaking change. Rename `mz_logical_timestamp()` to `mz_now()`.
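  For example, a temporal filter that previously called `mz_logical_timestamp()` now calls `mz_now()`. The table and column names here are illustrative:

  SELECT * FROM events
  WHERE mz_now() <= event_ts + INTERVAL '5 minutes';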
Upgrade guide
Following are several examples of how to adapt source and view definitions from Materialize v0.26 LTS for Materialize v0.27:
Authenticated Kafka source
Change from:
CREATE SOURCE kafka_sasl
FROM KAFKA BROKER 'broker.tld:9092' TOPIC 'top-secret' WITH (
security_protocol = 'SASL_SSL',
sasl_mechanisms = 'PLAIN',
sasl_username = '<BROKER_USERNAME>',
sasl_password = '<BROKER_PASSWORD>'
)
FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY 'https://schema-registry.tld' WITH (
username = '<SCHEMA_REGISTRY_USERNAME>',
password = '<SCHEMA_REGISTRY_PASSWORD>'
);
to:
CREATE SECRET kafka_password AS '<BROKER_PASSWORD>';
CREATE SECRET csr_password AS '<SCHEMA_REGISTRY_PASSWORD>';
CREATE CONNECTION kafka FOR KAFKA
    BROKER 'broker.tld:9092',
    SASL MECHANISMS 'PLAIN',
    SASL USERNAME '<BROKER_USERNAME>',
    SASL PASSWORD SECRET kafka_password;
CREATE CONNECTION csr
    FOR CONFLUENT SCHEMA REGISTRY
    URL 'https://schema-registry.tld',
    USERNAME = '<SCHEMA_REGISTRY_USERNAME>',
    PASSWORD = SECRET csr_password;
CREATE SOURCE kafka_top_secret
    FROM KAFKA CONNECTION kafka TOPIC 'top-secret'
    FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr
    WITH (SIZE = '3xsmall');
Materialized view
Change from:
CREATE MATERIALIZED VIEW v AS SELECT ...
to:
CREATE VIEW v AS SELECT ...;
CREATE DEFAULT INDEX ON v;
Materialized source
Change from:
CREATE MATERIALIZED SOURCE src ...
to:
CREATE SOURCE src ...
If you are performing point lookups on src directly, consider building an
index on it:
CREATE INDEX ON src (lookup_col1, lookup_col2);
TAIL
Change from:
COPY (TAIL t) TO STDOUT
to:
COPY (SUBSCRIBE t) TO STDOUT