CREATE CLUSTER

CREATE CLUSTER creates a logical cluster, which contains dataflow-powered objects. By default, every environment contains a cluster named default with a single cluster replica.

To switch your active cluster, use the SET command:

SET cluster = other_cluster;

Conceptual framework

Clusters are logical components that let you express resource isolation for all dataflow-powered objects: sources, sinks, indexes, and materialized views. When creating dataflow-powered objects, you must specify which cluster you want to use.

For indexes and materialized views, if you do not explicitly name a cluster, Materialize uses your session’s default cluster.
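As a minimal sketch of the two cases, the statements below place one index in a named cluster and let a second fall back to the session's cluster. The IN CLUSTER clause comes from the CREATE INDEX and CREATE MATERIALIZED VIEW references rather than this page, and the names c1, orders, orders_by_customer, and the column names are hypothetical.

```sql
-- Explicitly place an index in cluster c1 (hypothetical cluster and table names).
CREATE INDEX orders_customer_idx IN CLUSTER c1 ON orders (customer_id);

-- Explicitly place a materialized view in the same cluster.
CREATE MATERIALIZED VIEW orders_by_customer IN CLUSTER c1 AS
    SELECT customer_id, count(*) AS order_count
    FROM orders
    GROUP BY customer_id;

-- No cluster named: the index is created in the session's cluster,
-- which you can change beforehand with SET cluster = ...;
CREATE INDEX orders_status_idx ON orders (status);
```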

WARNING!

A given cluster may contain any number of indexes and materialized views or any number of sources and sinks, but not both types of objects. For example, you may not create a cluster with a source and an index.

We plan to remove this restriction in a future version of Materialize.

Importantly, clusters are strictly a logical component; they rely on cluster replicas to run dataflows. Put another way, a cluster with no replicas does no computation. For example, if you create an index on a cluster with no replicas, you cannot select from that index because there is no physical representation of the index to read from.

Although a cluster only expresses which objects you want to group together, that grouping determines performance once you provision cluster replicas. Each object in a cluster is instantiated on every replica, so on any given replica the cluster's objects contend for the same physical resources. To achieve the performance you need, you might have to set up more than one cluster.
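One way to act on this is to split objects across clusters so they stop competing for the same replica. The sketch below assumes two hypothetical workloads and hypothetical object names (serving, reporting, orders, customer_totals); the sizes are the ones used in the examples on this page, and the IN CLUSTER clause comes from the CREATE INDEX reference.

```sql
-- Two clusters, each with its own replica, so their dataflows do not
-- contend for the same physical resources.
CREATE CLUSTER serving REPLICAS (
    r1 (SIZE = 'medium')
);

CREATE CLUSTER reporting REPLICAS (
    r1 (SIZE = 'xsmall')
);

-- Latency-sensitive index goes on the larger cluster.
CREATE INDEX orders_customer_idx IN CLUSTER serving ON orders (customer_id);

-- Less critical index goes on the smaller cluster.
CREATE INDEX customer_totals_idx IN CLUSTER reporting ON customer_totals (customer_id);
```

The trade-off is that each cluster needs its own replicas, so isolation comes at the cost of provisioning more resources.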

WARNING!

Clusters containing sources and sinks can have at most one replica.

We plan to remove this restriction in a future version of Materialize.

Syntax

CREATE CLUSTER name REPLICAS ( [ replica_definition [, ...] ] )

replica_definition

replica_name ( replica_option = value [, ...] )

| Field | Use |
|-------|-----|
| name | A name for the cluster. |
| replica_definition | Any replicas you want to immediately provision. |
| replica_name | A name for a cluster replica. |

Replica options

| Field | Value | Description |
|-------|-------|-------------|
| SIZE | text | The size of the replica. For valid sizes, see cluster replica sizes. |
| AVAILABILITY ZONE | text | The availability zone in which the replica should reside. You must specify an AWS availability zone ID in either us-east-1 or eu-west-1, e.g. use1-az1. Note that we expect the zone’s ID, rather than its name. |
| INTROSPECTION INTERVAL | interval | Default: 1s. The interval at which to collect introspection data. See Troubleshooting for details about introspection data. The special value 0 entirely disables the gathering of introspection data. |
| INTROSPECTION DEBUGGING | bool | Default: false. Whether to introspect the gathering of the introspection data. |
| IDLE ARRANGEMENT MERGE EFFORT | integer | The amount of effort the replica should exert on compacting arrangements during idle periods. This is an unstable option! It may be changed or removed at any time. |
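As an illustration of combining replica options, the sketch below pins two replicas of one cluster to different availability zones. The cluster name c_multi_az is hypothetical, and the zone IDs assume that both zones are available to your environment in us-east-1.

```sql
-- Each replica sets its own options; AVAILABILITY ZONE takes the zone ID
-- (for example use1-az1), not the zone name.
CREATE CLUSTER c_multi_az REPLICAS (
    r1 (SIZE = 'medium', AVAILABILITY ZONE = 'use1-az1'),
    r2 (SIZE = 'medium', AVAILABILITY ZONE = 'use1-az2')
);
```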

Details

Deployment options

When building your Materialize deployment, you can change its performance characteristics by…

| Action | Outcome |
|--------|---------|
| Adding clusters + decreasing dataflow density | Reduced contention among dataflows, decoupled dataflow availability |
| Adding replicas to clusters | See Cluster replica scaling |

Examples

Basic

Create a cluster with two medium replicas:

CREATE CLUSTER c1 REPLICAS (
    r1 (SIZE = 'medium'),
    r2 (SIZE = 'medium')
);

Introspection disabled

Create a cluster with a single replica with introspection disabled:

CREATE CLUSTER c REPLICAS (
    r1 (SIZE = 'xsmall', INTROSPECTION INTERVAL = 0)
);

Disabling introspection can yield a small performance improvement, but you lose the ability to run troubleshooting queries against that cluster replica.

Empty

Create a cluster with no replicas:

CREATE CLUSTER c1 REPLICAS ();

You can later add replicas to this cluster with CREATE CLUSTER REPLICA.
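A minimal sketch of that follow-up step, assuming the cluster_name.replica_name form described in the CREATE CLUSTER REPLICA reference (that syntax is not reproduced on this page, so check it against the reference for your version):

```sql
-- Provision a replica named r1 in the previously empty cluster c1.
CREATE CLUSTER REPLICA c1.r1 SIZE = 'medium';

-- List replicas to confirm that c1 now has one.
SHOW CLUSTER REPLICAS;
```

Until at least one replica exists, objects created in c1 have no physical representation and cannot be queried.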
