CREATE CLUSTER
CREATE CLUSTER creates a logical cluster, which contains dataflow-powered objects. By default, a cluster named default with a single cluster replica will exist in every environment.
To switch your active cluster, use the SET command:
SET cluster = other_cluster;
Conceptual framework
Clusters are logical components that let you express resource isolation for all dataflow-powered objects: sources, sinks, indexes, and materialized views. When creating dataflow-powered objects, you must specify which cluster you want to use.
If you do not explicitly name a cluster when creating an index or materialized view, Materialize uses your session’s default cluster.
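As a rough sketch of naming a cluster explicitly, the statements below place a materialized view and an index in a cluster called c1. The IN CLUSTER clause is taken from the CREATE MATERIALIZED VIEW and CREATE INDEX references rather than from this page, and the orders table, its columns, and the object names are hypothetical placeholders.

```sql
-- Assumption: a cluster named c1 and a relation named orders already exist.
-- All object and column names here are illustrative only.
CREATE MATERIALIZED VIEW order_totals IN CLUSTER c1 AS
    SELECT customer_id, sum(amount) AS total
    FROM orders
    GROUP BY customer_id;

-- Index the view in the same cluster; both objects share that cluster's replicas.
CREATE INDEX order_totals_idx IN CLUSTER c1 ON order_totals (customer_id);
```

Omitting IN CLUSTER c1 from either statement would fall back to the session’s default cluster, as described above.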
A given cluster may contain any number of indexes and materialized views or any number of sources and sinks, but not both types of objects. For example, you may not create a cluster with a source and an index.
We plan to remove this restriction in a future version of Materialize.
Importantly, clusters are strictly a logical component; they rely on cluster replicas to run dataflows. Put another way, a cluster with no replicas does no computation. For example, if you create an index on a cluster with no replicas, you cannot select from that index because there is no physical representation of the index to read from.
Although a cluster only defines which objects are bundled together, that grouping determines performance once you provision cluster replicas. Every object in a cluster is instantiated on every replica, so on any given replica the cluster’s objects contend for the same physical resources. To achieve the performance you need, you might have to set up more than one cluster.
Clusters containing sources and sinks can have at most one replica.
We plan to remove this restriction in a future version of Materialize.
Syntax
replica_definition
Field | Use |
---|---|
name | A name for the cluster. |
inline_replica | Any replicas you want to immediately provision. |
replica_name | A name for a cluster replica. |
Replica options
Field | Value | Description |
---|---|---|
SIZE | text | The size of the replica. For valid sizes, see cluster replica sizes. |
AVAILABILITY ZONE | text | The availability zone in which the replica should reside, if you want a specific one. You must specify an AWS availability zone ID in either us-east-1 or eu-west-1, e.g. use1-az1. Note that we expect the zone’s ID, rather than its name. |
INTROSPECTION INTERVAL | interval | Default: 1s. The interval at which to collect introspection data. See Troubleshooting for details about introspection data. The special value 0 entirely disables the gathering of introspection data. |
INTROSPECTION DEBUGGING | bool | Default: false. Whether to introspect the gathering of the introspection data. |
IDLE ARRANGEMENT MERGE EFFORT | integer | The amount of effort the replica should exert on compacting arrangements during idle periods. This is an unstable option! It may be changed or removed at any time. |
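To illustrate how several replica options might be combined, the sketch below pins a replica to an availability zone using the same option = value form that appears in the examples on this page. The cluster name c_az is hypothetical, and use1-az1 is simply the example zone ID from the table above.

```sql
-- Hypothetical cluster name; the zone ID is the example value from the options table.
CREATE CLUSTER c_az REPLICAS (
    r1 (SIZE = 'medium', AVAILABILITY ZONE = 'use1-az1')
);
```

Options left unspecified, such as INTROSPECTION INTERVAL, keep the defaults listed in the table.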
Details
Deployment options
When building your Materialize deployment, you can change its performance characteristics by…
Action | Outcome |
---|---|
Adding clusters + decreasing dataflow density | Reduced contention among dataflows, decoupled dataflow availability |
Adding replicas to clusters | See Cluster replica scaling |
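As a minimal sketch of the first row, two unrelated workloads could be given separate clusters so that their dataflows no longer compete for the same replicas. The cluster names dashboards and reports are hypothetical, and the sizes reuse values shown elsewhere on this page.

```sql
-- Hypothetical cluster names: each workload gets its own replicas,
-- so dataflows in one cluster do not contend with those in the other.
CREATE CLUSTER dashboards REPLICAS (
    r1 (SIZE = 'medium')
);

CREATE CLUSTER reports REPLICAS (
    r1 (SIZE = 'xsmall')
);
```

Objects are then created in whichever cluster matches their workload, which lowers dataflow density per cluster.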
Examples
Basic
Create a cluster with two medium replicas:
CREATE CLUSTER c1 REPLICAS (
    r1 (SIZE = 'medium'),
    r2 (SIZE = 'medium')
);
Introspection disabled
Create a cluster with a single replica with introspection disabled:
CREATE CLUSTER c REPLICAS (
    r1 (SIZE = 'xsmall', INTROSPECTION INTERVAL = 0)
);
Disabling introspection can yield a small performance improvement, but you lose the ability to run troubleshooting queries against that cluster replica.
Empty
Create a cluster with no replicas:
CREATE CLUSTER c1 REPLICAS ();
You can later add replicas to this cluster with CREATE CLUSTER REPLICA.
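A sketch of that follow-up step is shown below. The exact option syntax of CREATE CLUSTER REPLICA is not documented on this page and may differ between Materialize versions, so treat the form here as an assumption and confirm it against the CREATE CLUSTER REPLICA reference.

```sql
-- Assumption: options follow the replica name directly in this version;
-- other versions may require a different form, e.g. parenthesized options.
CREATE CLUSTER REPLICA c1.r1 SIZE = 'xsmall';
```

Until at least one replica exists, objects in c1 perform no computation and cannot be queried.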