CREATE CLUSTER

CREATE CLUSTER creates a logical cluster, which contains indexes. By default, every environment contains a cluster named default with a single cluster replica.

To switch your active cluster, use the SET command:

SET cluster = other_cluster;

Conceptual framework

Clusters are logical components that let you express resource isolation for all dataflow-powered objects, e.g. indexes. Every dataflow-powered object is created in a specific cluster. You can name that cluster explicitly when you create the object; if you do not, the object is created in your session’s active cluster.
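A minimal sketch of both ways to pick a cluster for an index. It assumes a cluster named other_cluster already exists and uses a hypothetical table orders with a customer_id column; the index names are also illustrative. The IN CLUSTER clause is assumed to be the form CREATE INDEX accepts in your version, so check the CREATE INDEX reference before relying on it.

```sql
-- Hypothetical objects: table orders(customer_id), cluster other_cluster.

-- Name the cluster explicitly when creating the index.
CREATE INDEX orders_by_customer IN CLUSTER other_cluster ON orders (customer_id);

-- Or omit the cluster and let the session's active cluster be used.
SHOW cluster;  -- displays the session's active cluster
CREATE INDEX orders_by_customer_2 ON orders (customer_id);
```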

Importantly, clusters are strictly a logical component; they rely on cluster replicas to run dataflows. Put another way, a cluster with no replicas does no computation. For example, if you create an index on a cluster with no replicas, you cannot select from that index because there is no physical representation of the index to read from.

Though clusters only represent the logic of which objects you want to bundle together, this grouping affects performance once you provision cluster replicas. Each object in a cluster gets instantiated on every replica, meaning that on a given physical replica, objects in the cluster are in contention for the same physical resources. Achieving the performance you need might therefore require setting up more than one cluster.
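As an illustration of splitting objects across clusters so they do not compete for the same replica resources, the sketch below creates two clusters and places one index in each. The cluster names (serving, reporting), the tables (orders, events), and their columns are hypothetical, and the IN CLUSTER clause on CREATE INDEX is an assumption to verify against the CREATE INDEX reference.

```sql
-- Hypothetical names throughout; sizes follow the examples on this page.
CREATE CLUSTER serving REPLICAS (r1 (SIZE = 'medium'));
CREATE CLUSTER reporting REPLICAS (r1 (SIZE = 'medium'));

-- Each index is instantiated only on the replicas of its own cluster,
-- so the two indexes do not contend for the same physical resources.
CREATE INDEX orders_by_customer IN CLUSTER serving ON orders (customer_id);
CREATE INDEX events_by_day IN CLUSTER reporting ON events (day);
```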

Syntax

CREATE CLUSTER name REPLICAS ( [ replica_definition [, ...] ] )

replica_definition

replica_name ( replica_option = value [, ...] )

Field               Use
name                A name for the cluster.
replica_definition  Any replicas you want to immediately provision.
replica_name        A name for a cluster replica.

Replica options

Field                    Value     Description
SIZE                     text      The size of the replica. For valid sizes, see cluster replica sizes.
AVAILABILITY ZONE        text      The availability zone in which the replica should reside. Specify an AWS availability zone ID in either us-east-1 or eu-west-1, e.g. use1-az1. Note that this must be the zone’s ID, not its name.
INTROSPECTION INTERVAL   interval  The interval at which to collect introspection data. See Troubleshooting for details about introspection data. The special value 0 entirely disables the gathering of introspection data. Defaults to 1s.
INTROSPECTION DEBUGGING  bool      Whether to introspect the gathering of the introspection data. Defaults to false.
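For instance, the options can be combined within one replica definition. The sketch below pins two replicas to different availability zones; the cluster name c_multi_az is hypothetical, and the zone IDs are illustrative values that must exist in your environment's region.

```sql
-- Hypothetical cluster name; zone IDs are examples from us-east-1.
CREATE CLUSTER c_multi_az REPLICAS (
    r1 (SIZE = 'medium', AVAILABILITY ZONE = 'use1-az1'),
    r2 (SIZE = 'medium', AVAILABILITY ZONE = 'use1-az2')
);
```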

Details

Deployment options

When building your Materialize deployment, you can change its performance characteristics by…

Action                                         Outcome
Adding clusters + decreasing dataflow density  Reduced contention among dataflows, decoupled dataflow availability
Adding replicas to clusters                    See Cluster replica scaling

Examples

Basic

Create a cluster with two medium replicas:

CREATE CLUSTER c1 REPLICAS (
    r1 (SIZE = 'medium'),
    r2 (SIZE = 'medium')
);

Introspection disabled

Create a cluster with a single replica with introspection disabled:

CREATE CLUSTER c REPLICAS (
    r1 (SIZE = 'xsmall', INTROSPECTION INTERVAL = 0)
);

Disabling introspection can yield a small performance improvement, but you lose the ability to run troubleshooting queries against that cluster replica.

Empty

Create a cluster with no replicas:

CREATE CLUSTER c1 REPLICAS ();

You can later add replicas to this cluster with CREATE CLUSTER REPLICA.
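A sketch of that follow-up step, assuming CREATE CLUSTER REPLICA takes a cluster-qualified replica name and the same options listed under Replica options; the exact form may differ by version, so confirm it on the CREATE CLUSTER REPLICA page.

```sql
-- Assumed syntax: adds replica r1 to the existing cluster c1.
CREATE CLUSTER REPLICA c1.r1 SIZE = 'medium';
```

Once the replica is provisioned, objects in c1 have a physical representation and can serve queries.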
