A sink describes an external system you want Materialize to write data to, and provides details about how to encode that data. To create a sink, you must specify a connector, a format and an envelope.
Materialize bundles native sink connectors for the following external systems:
For details on the syntax, supported formats and features of each connector,
check out the dedicated
CREATE SINK documentation pages.
To write to an external data sink, you must specify the format Materialize
should use to encode ouput data. This is handled by specifying a
CREATE SINK statement.
Materialize can encode output data as Avro messages, and automatically publish a schema to a schema registry based on the columns and data types in the source, table or materialized view you want to send to the sink.
Materialize can encode output data as JSON messages. Publishing schemas to a schema registry is not supported yet for JSON-formatted sinks (#7186).
In addition to determining how to encode output data, Materialize also needs to
understand the desired behaviour of sinked events in the downstream external
system. Whether a change is simply emitted as an event with a diff structure,
or actively inserts, updates, or deletes existing data downstream depends on
ENVELOPE specified in the
CREATE SINK statement.
The upsert envelope treats all records as having a key and a value, and supports inserts, updates and deletes in the sink:
If the key does not match a preexisting record downstream, it inserts the record’s key and value.
If the key matches a preexisting record downstream and the value is non-null, Materialize updates the existing record with the new value.
If the key matches a preexisting record downstream and the value is null, Materialize deletes the record.
Materialize provides a dedicated envelope that describes the decoded records' old and new values with a Debezium-like diff structure, representing inserts, updates, or deletes to the underlying source, table or materialized view being written to the sink:
beforefield is null, the record represents an insert.
afterfields are non-null, the record represents an update.
afterfield is null, the record represents a delete.
Sizing a sink
Some sinks require relatively few resources to handle data ingestion, while
others are high traffic and require hefty resource allocations. You choose the
amount of CPU and memory available to a sink using the
SIZE option, and
adjust the provisioned size after sink creation using the
ALTER SINK command.
Sinks that specify the
SIZE option are linked to a single-purpose cluster
dedicated to maintaining that sink.
You can also choose to place a sink in an existing
cluster by using the
IN CLUSTER option.
Sinks in a cluster share the resource allocation of the cluster with all other
objects in the cluster.
Colocating multiple sinks onto the same cluster can be more resource efficient when you have many low-traffic sinks that occasionally need some burst capacity.