Materialize Documentation
s
Join the Community github/materialize

System catalog

New in v0.5.0.

Materialize exposes a system catalog that contains metadata about the running Materialize instance.

WARNING! The system tables are not part of Materialize’s stable interface. Backwards-incompatible changes to these tables may be made at any time.

Details

The system catalog consists of two schemas that are implicitly available in all databases:

These schemas contain sources, tables, and views that expose metadata like:

Whenever possible, applications should prefer to query mz_catalog over pg_catalog. The mapping between Materialize concepts and PostgreSQL concepts is not one-to-one, and so the data in pg_catalog cannot accurately represent the particulars of Materialize. For example, PostgreSQL has no notion of sinks, and therefore pg_catalog does not display information about the sinks available in a Materialize.

mz_catalog

The following sections describe the available objects in the mz_catalog schema.

mz_array_types

The mz_array_types table contains a row for each array type in the system.

Field Type Meaning
type_id text The ID of the array type.
element_id text The ID of the array’s element type.

mz_arrangement_sharing

The mz_arrangement_sharing source describes how many times each arrangement in the system is used.

Field Type Meaning
operator bigint The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
worker bigint The ID of the worker thread hosting the arrangement.
count bigint The number of operators that share the arrangement.

mz_arrangement_sizes

The mz_arrangement_sizes source describes the size of each arrangement in the system.

Field Type Meaning
operator bigint The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
worker bigint The worker hosting the arrangement.
records bigint The number of records in the arrangement.
batches bigint The number of batches in the arrangement.

mz_avro_ocf_sinks

The mz_avro_ocf_sinks table contains a row for each Avro OCF sink in the system.

Field Type Meaning
sink_id text The ID of the sink.
path bytea The path to the Avro OCF file into which the sink is writing.

mz_base_types

The mz_base_types table contains a row for each base type in the system.

Field Type Meaning
type_id text The ID of the type.

mz_columns

The mz_columns contains a row for each column in each table, source, and view in the system.

Field Type Meaning
id bigint The unique ID of the table, source, or view containing the column.
name text The name of the column.
position bigint The 1-indexed position of the column in its containing table, source, or view.
nullable boolean Can the column contain a NULL value?
type text The data type of the column.
default text The default expression of the column.
type_oid oid The OID of the type of the column (references mz_types).

mz_databases

The mz_databases table contains a row for each database in the system.

Field Type Meaning
id bigint Materialize’s unique ID for the database.
oid oid A PostgreSQL-compatible OID for the database.
name text The name of the database.

mz_dataflow_channels

The mz_dataflow_channels source describes the communication channels between dataflow operators. A communication channel connects one of the outputs of a source operator to one of the inputs of a target operator.

Field Type Meaning
id bigint The ID of the channel.
worker bigint The ID of the worker thread hosting the channel.
source_node bigint The ID of the source operator. Corresponds to mz_dataflow_operators.id.
source_port bigint The source operator’s output port.
target_node bigint The ID of the target operator. Corresponds to mz_dataflow_operators.id.
target_port bigint The target operator’s input port.

mz_dataflow_names

The mz_dataflow_names view describes the dataflows in the system.

Field Type Meaning
id bigint The ID of the dataflow.
worker bigint The ID of the worker thread hosting the dataflow.
local_id bigint The scope-local index of the dataflow.
name text The internal name of the dataflow.

mz_dataflow_operator_addresses

The mz_dataflow_operator_addresses source describes how the dataflow channels and operators in the system are nested into scopes.

Field Type Meaning
id bigint The ID of the channel or operator. Corresponds to mz_dataflow_channels.id or mz_dataflow_operators.id.
worker bigint The ID of the worker thread hosting the channel or operator.
address bigint list A list of scope-local indexes indicating the path from the root to this channel or operator.

mz_dataflow_operator_dataflows

The mz_dataflow_operator_dataflows view describes the dataflow to which each dataflow operator belongs.

Field Type Meaning
id bigint The ID of the operator.
name text The internal name of the operator.
worker bigint The ID of the worker thread hosting the operator.
dataflow_id bigint The ID of the dataflow hosting the operator.
dataflow_name text The name of the dataflow hosting the operator.

mz_dataflow_operators

The mz_dataflow_operators source describes the dataflow operators in the system.

Field Type Meaning
id bigint The ID of the operator.
worker bigint The ID of the worker thread hosting the operator.
name text The name of the operator.

mz_functions

The mz_functions table contains a row for each function in the system.

Field Type Meaning
id text Materialize’s unique ID for the function.
oid oid A PostgreSQL-compatible OID for the function.
schema_id bigint The ID of the schema to which the function belongs.
name text The name of the function.
arg_ids text array The function’s arguments' types. Elements refers to mz_types.id.
variadic_id text The variadic array parameter’s elements, or NULL if the function does not have a variadic parameter. Refers to mz_types.id.
ret_id text The returned value’s type, or NULL if the function does not return a value. Refers to mz_types.id. Note that for table functions with > 1 column, this type corresponds to [record].
ret_set [bool] Whether the returned value is a set, i.e. the function is a table function.

mz_indexes

The mz_indexes table contains a row for each index in the system.

Field Type Meaning
id text Materialize’s unique ID for the index.
oid oid A PostgreSQL-compatible OID for the index.
name text The name of the index.
on_id text The ID of the relation on which the index is built.
volatility text Whether the index is volatile. Either volatile, nonvolatile, or unknown.
enabled [bool] Whether or not the index represents an arrangement. false only in the case of Disabling user indexes.

mz_index_columns

The mz_index_columns table contains a row for each column in each index in the system. For example, an index on (a, b + 1) would have two rows in this table, one for each of the two columns in the index.

For a given row, if field_number is null then expression will be nonnull, or vice-versa.

Field Type Meaning
index_id text The ID of the index which contains this column.
index_position bigint The 1-indexed position of this column within the index. (The order of columns in an index does not necessarily match the order of columns in the relation on which the index is built.)
on_position bigint If not NULL, specifies the 1-indexed position of a column in the relation on which this index is built that determines the value of this index column.
on_expression text If not NULL, specifies a SQL expression that is evaluated to compute the value of this index column. The expression may contain references to any of the columns of the relation.
nullable boolean Can this column of the index evaluate to NULL?

mz_kafka_source_statistics

The mz_kafka_source_statistics table contains statistics for Kafka sources from the underlying librdkafka library.

Note that the contents of this table may evolve without warning when Materialize updates the version of librdkafka in use.

Field Type Meaning
source_id text The ID of the source. Corresponds to mz_source_info.source_id.
worker_id bigint The ID of the worker thread hosting the dataflow.
statistics jsonb A JSON object containing the statistics. See STATISTICS.md in the librdkafka documentation for details.

mz_kafka_sinks

The mz_kafka_sinks table contains a row for each Kafka sink in the system.

Field Type Meaning
sink_id text The ID of the sink.
topic text The name of the Kafka topic into which the sink is writing.
consistency_topic text The name of the Kafka topic into which the sink is writing consistency information. This is NULL when the sink does not write consistency information.

mz_list_types

The mz_list_types table contains a row for each list type in the system.

Field Type Meaning
type_id text The ID of the list type.
element_id text The IID of the list’s element type.

mz_message_counts

The mz_message_counts source describes the messages sent and received over the dataflow channels in the system.

Field Type Meaning
channel bigint The ID of the channel. Corresponds to mz_dataflow_channels.id.
source_worker bigint The ID of the worker thread sending the message.
target_worker bigint The ID of the worker thread receiving the message.
sent bigint The number of messages sent.
received bigint The number of messages received.

mz_materialization_dependencies

The mz_materialization_dependencies source describes the sources that each dataflow depends on.

Field Type Meaning
dataflow text The ID of the index that created the dataflow. Corresponds to mz_indexes.id.
source text The ID of the source. Corresponds to mz_sources.id.
worker bigint The ID of the worker thread hosting the dataflow.

mz_materialization_frontiers

The mz_materialization_frontiers view describes the frontier for each dataflow in the system across all workers. The frontier describes the earliest timestamp at which the output of the dataflow may change; data prior to that timestamp is sealed.

For per-worker frontier information, see mz_worker_materialization_frontiers.

Field Type Meaning
global_id text The ID of the index that created the dataflow. Corresponds to mz_indexes.id.
time bigint The next timestamp at which the materialization may change.

mz_materializations

The mz_materializations source describes the indexes in the system.

Field Type Meaning
name text The ID of the index. (name is a misnomer.)
worker bigint The ID of the worker thread hosting the index.

mz_map_types

The mz_map_types table contains a row for each map type in the system.

Field Type Meaning
type_id text The ID of the map type.
key_id text The ID of the map’s key type.
value_id text The ID of the map’s value type.

mz_metrics

The mz_metrics table contains recordings for each Prometheus metric in the system that is a counter or gauge. The values are recorded every 30 seconds and retained for five minutes.

Field Type Meaning
metric text The name of the metric.
time timestamp with time zone The time at which the metric was recorded.
labels jsonb The metric’s labels and their associated values as a JSON object.
value double precision The value of the counter or gauge.

mz_metrics_meta

The mz_metrics_meta table describes metadata about each Prometheus metric in the system.

Field Type Meaning
metric text The name of the metric.
type text The type of the metric (counter, gauge, or histogram).
help text The description of the metric.

mz_metric_histograms

The mz_metric_histograms table contains recordings for each Prometheus metric in the system that is a histogram. The values are recorded every 30 seconds and retained for five minutes. Each row of the table represents one bucket of one histogram from one recording. Note that Prometheus histograms are cumulative.

Field Type Meaning
metric text The name of the metric.
time timestamp The time at which the metric was recor
labels jsonb The metric’s labels and their associated values as a JSON object.
bound double precision The upper bound of the bucket.
count bigint The (cumulative) count of observations in the bucket.

mz_objects

The mz_objects view contains a row for each table, source, view, sink, and index in the system.

Field Type Meaning
id text Materialize’s unique ID for the object.
oid oid A PostgreSQL-compatible OID for the object.
schema_id bigint The ID of the schema to which the object belongs.
name text The name of the object.
type text The type of the object: either table, source, view, sink, or index.

mz_peek_active

The mz_peek_active source describes all read queries (“peeks”) that are pending in the dataflow layer.

Field Type Meaning
id uuid The ID of the peek request.
worker bigint The ID of the worker thread servicing the peek.
index_id text The ID of the index the peek is targeting.
time bigint The timestamp the peek has requested.

mz_peek_durations

The mz_peek_durations source describes a histogram of the duration of read queries (“peeks”) in the dataflow layer.

Field Type Meaning
worker bigint The ID of the worker thread servicing the peek.
duration_ns bigint The upper bound of the bucket in nanoseconds.
count bigint The (noncumulative) count of peeks in this bucket.

mz_perf_dependency_frontiers

The mz_perf_dependency_frontiers view describes the lag between a dataflow’s outputs and its sources. It contains a row for every dataflow in the system that depends on at least one source.

Field Type Meaning
dataflow text The name of the dataflow hosting the source.
source text The name of the source.
lag_ms bigint The amount of lag between when a record is emitted by the source and when the output of the dataflow reflects that record.

mz_pseudo_types

The mz_pseudo_types table contains a row for each pseudo type in the system.

Field Type Meaning
type_id text The ID of the type.

mz_records_per_dataflow

The mz_records_per_dataflow view describes the number of records in each dataflow on each worker in the system.

For the same information aggregated across all workers, see mz_records_per_dataflow_global.

Field Type Meaning
id bigint The ID of the dataflow. Corresponds to mz_dataflow_names.id.
name text The internal name of the dataflow.
worker bigint The ID of the worker thread hosting the dataflow.
records numeric The number of records in the dataflow.

mz_records_per_dataflow_global

The mz_records_per_dataflow_global view describes the number of records in each dataflow in the system.

For the same information broken down across workers, see mz_records_per_dataflow.

Field Type Meaning
id bigint The ID of the dataflow. Corresponds to mz_dataflow_names.id.
name text The internal name of the dataflow.
records numeric The number of records in the dataflow.

mz_records_per_dataflow_operator

The mz_records_per_dataflow_operator view describes the number of records in each dataflow operator in the system.

Field Type Meaning
id bigint The ID of the operator. Corresponds to mz_dataflow_operators.id.
name text The internal name of the dataflow.
worker bigint The ID of the worker thread hosting the dataflow.
dataflow_id bigint The ID of the dataflow. Corresponds to mz_dataflow_names.id.
records numeric The number of records in the dataflow.

mz_relations

The mz_relations view contains a row for each table, source, and view in the system.

Field Type Meaning
id text Materialize’s unique ID for the relation.
oid oid A PostgreSQL-compatible OID for the relation.
schema_id bigint The ID of the schema to which the relation belongs.
name text The name of the relation.
type text The type of the relation: either table, source, or view.

mz_roles

The mz_roles table contains a row for each role in the system.

Field Type Meaning
id bigint Materialize’s unique ID for the role.
oid oid A PostgreSQL-compatible OID for the role.
name text The name of the role.

mz_scheduling_elapsed

The mz_scheduling_elapsed source describes the total amount of time spent in each dataflow operator.

Field Type Meaning
id bigint The ID of the operator. Corresponds to mz_dataflow_operators.id.
worker bigint The ID of the worker thread hosting the operator.
elapsed_ns bigint The total elapsed time spent in the operator in nanoseconds.

mz_scheduling_histogram

The mz_scheduling_histogram source stores a histogram describing the duration of each invocation for each dataflow operator.

Field Type Meaning
id bigint The ID of the operator. Corresponds to mz_dataflow_operators.id.
worker bigint The ID of the worker thread hosting the operator.
duration_ns bigint The upper bound of the bucket.
count bigint The number of recordings in the bucket.

mz_scheduling_parks

The mz_scheduling_parks source stores a histogram describing dataflow worker park events. A park event occurs when a worker has no outstanding work.

Field Type Meaning
worker bigint The ID of the worker thread.
slept_for bigint The actual length of the park event.
requested bigint The requested length of the park event.
count bigint The number of park events in this bucket.

mz_schemas

The mz_schemas table contains a row for each schema in the system.

Field Type Meaning
id bigint Materialize’s unique ID for the schema.
oid oid A PostgreSQL-compatible oid for the schema.
database_id bigint The ID of the database containing the schema.
name text The name of the schema.

mz_sinks

The mz_sinks table contains a row for each sink in the system.

Field Type Meaning
id text Materialize’s unique ID for the sink.
oid oid A PostgreSQL-compatible OID for the sink.
schema_id bigint The ID of the schema to which the sink belongs.
name text The name of the sink.
connector_type text The type of the sink: avro-ocf or kafka.
volatility text Whether the sink is volatile. Either volatile, nonvolatile, or unknown.

mz_source_info

The mz_source_info table contains a row for each partition of each source in the system.

Field Type Meaning
source_name text Materialize’s internal name for the source.
source_id text Materialize’s unique ID for the source. Corresponds to mz_sources.id.
dataflow_id bigint The ID of the dataflow responsible for processing this source.
partition_id text The ID of the partition within the source. The concept of partition varies by source type. Not all sources types have multiple partitions, in which case there will be only one entry for partition ID 0.
offset bigint The highest offset processed by this source.
timestamp bigint The largest mz_timestamp processed by this source.

mz_sources

The mz_sources table contains a row for each source in the system.

Field Type Meaning
id text Materialize’s unique ID for the source.
oid oid A PostgreSQL-compatible OID for the source.
schema_id bigint The ID of the schema to which the source belongs.
name text The name of the source.
connector_type text The type of the source: avro-ocf, file, kafka, kinesis, s3, postgres, or pubnub.
volatility text Whether the source is volatile. Either volatile, nonvolatile, or unknown.

mz_tables

The mz_tables table contains a row for each table in the system.

Field Type Meaning
id text Materialize’s unique ID for the table.
oid oid A PostgreSQL-compatible OID for the table.
schema_id bigint The ID of the schema to which the table belongs.
name text The name of the table.
persisted_name text The name of the table’s persisted materialization, or NULL if the table is not being persisted.

mz_types

The mz_types table contains a row for each type in the system.

Field Type Meaning
id text Materialize’s unique ID for the type.
oid oid A PostgreSQL-compatible OID for the type.
schema_id bigint The ID of the schema to which the type belongs.
name text The name of the type.

mz_views

The mz_views table contains a row for each view in the system.

Field Type Meaning
id text Materialize’s unique ID for the view.
oid oid A PostgreSQL-compatible OID for the view.
schema_id bigint The ID of the schema to which the view belongs.
name text The name of the view.
volatility text Whether the view is volatile. Either volatile, nonvolatile, or unknown.
definition text The view definition (a SELECT query).

mz_worker_materialization_frontiers

The mz_worker_materialization_frontiers source describes each worker’s frontier for each dataflow in the system. The frontier describes the earliest timestamp at which the output of the dataflow may change; data prior to that timestamp is sealed.

For frontier information aggregated across all workers, see mz_materialization_frontiers.

Field Type Meaning
global_id text The ID of the index that created the dataflow. Corresponds to mz_indexes.id.
worker bigint The ID of the worker thread hosting the dataflow.
time bigint The next timestamp at which the dataflow may change.

pg_catalog

Materialize has compatibility shims for the following relations from PostgreSQL’s system catalog:

These compatibility shims are largely incomplete. Most are lacking some columns that are present in PostgreSQL, or if they do include the column the result set its value may always be NULL. The precise nature of the incompleteness is intentionally undocumented. New tools developed against Materialize should use the documented mz_catalog API instead.

If you are having trouble making a PostgreSQL tool work with Materialize, please file a GitHub issue. Many PostgreSQL tools can be made to work with Materialize with minor changes to the pg_catalog compatibility shim.

information_schema

Materialize has compatibility shims for the following relations from the SQL standard information_schema schema, which is automatically available in all databases:

These compatibility shims are largely incomplete. Most are lacking some columns that are present in the SQL standard, or if they do include the column the result set its value may always be NULL. The precise nature of the incompleteness is intentionally undocumented. New tools developed against Materialize should use the documented mz_catalog API instead.