mz_introspection

The following sections describe the available objects in the mz_introspection schema.

WARNING! The objects in the mz_introspection schema are not part of Materialize’s stable interface. Backwards-incompatible changes to these objects may be made at any time.
WARNING! SELECT statements may reference these objects, but creating views that reference these objects is not allowed.

Introspection relations are maintained by collecting internal logging information independently within each replica of a cluster. In a multi-replica cluster, queries against these relations must therefore be directed to a specific replica by issuing the command SET cluster_replica = <replica_name>. Once this command is issued, all subsequent SELECT queries, whether or not they reference introspection relations, are directed to the targeted replica. To cancel replica targeting, issue the command RESET cluster_replica.
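
As a minimal sketch of replica targeting, the session below pins queries to one replica, reads an introspection relation, and then restores the default behavior. The cluster name quickstart and the replica name r1 are assumptions for illustration; substitute the names used in your environment.

```sql
-- Assumption: a cluster named "quickstart" with a replica named "r1".
SET cluster = quickstart;
SET cluster_replica = r1;

-- This query, and every later SELECT in the session, runs on replica r1.
SELECT id, name
FROM mz_introspection.mz_dataflows;

-- Cancel replica targeting.
RESET cluster_replica;
```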

For each of the introspection relations below, there is also a variant with a _per_worker name suffix. Per-worker relations expose the same data as their global counterparts, but have an extra worker_id column that splits the information by Timely Dataflow worker.
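
For example, applying the suffix rule to mz_scheduling_elapsed gives a per-worker variant that can be queried as sketched below. The relation name is derived from the naming rule above, and the query assumes the remaining columns match the global view.

```sql
-- Per-worker variant of mz_scheduling_elapsed: same columns plus worker_id.
SELECT worker_id, id, elapsed_ns
FROM mz_introspection.mz_scheduling_elapsed_per_worker
ORDER BY elapsed_ns DESC
LIMIT 10;
```

Comparing elapsed_ns for the same operator across workers can help spot work that is unevenly distributed.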

mz_active_peeks

The mz_active_peeks view describes all read queries (“peeks”) that are pending in the dataflow layer.

Field Type Meaning
id uuid The ID of the peek request.
object_id text The ID of the collection the peek is targeting. Corresponds to mz_catalog.mz_indexes.id, mz_catalog.mz_materialized_views.id, mz_catalog.mz_sources.id, or mz_catalog.mz_tables.id.
type text The type of the corresponding peek: index if targeting an index or temporary dataflow; persist for a source, materialized view, or table.
time mz_timestamp The timestamp the peek has requested.

mz_arrangement_sharing

The mz_arrangement_sharing view describes how many times each arrangement in the system is used.

Field Type Meaning
operator_id uint8 The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
count bigint The number of operators that share the arrangement.

mz_arrangement_sizes

The mz_arrangement_sizes view describes the size of each arrangement in the system.

The size, capacity, and allocations values are approximations and may underestimate the actual memory usage. Specifically, reductions can use more memory than is shown here.

Field Type Meaning
operator_id uint8 The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
records numeric The number of records in the arrangement.
batches numeric The number of batches in the arrangement.
size numeric The utilized size in bytes of the arrangement.
capacity numeric The capacity in bytes of the arrangement. Can be larger than the size.
allocations numeric The number of separate memory allocations backing the arrangement.
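
One possible use of this view is to find the operators that hold the largest arrangements. The sketch below joins on operator_id as described in the table above; the LIMIT of 10 is an arbitrary choice.

```sql
-- Largest arrangements, with the name of the operator that created each one.
SELECT mdo.id, mdo.name, mas.records, mas.batches, mas.size, mas.capacity
FROM mz_introspection.mz_arrangement_sizes AS mas
JOIN mz_introspection.mz_dataflow_operators AS mdo
    ON mas.operator_id = mdo.id
ORDER BY mas.size DESC
LIMIT 10;
```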

mz_compute_error_counts

The mz_compute_error_counts view describes the counts of errors in objects exported by dataflows in the system.

Dataflow exports that don’t have any errors are not included in this view.

Field Type Meaning
export_id text The ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
count numeric The count of errors present in this dataflow export.

mz_compute_exports

The mz_compute_exports view describes the objects exported by dataflows in the system.

Field Type Meaning
export_id text The ID of the index, materialized view, or subscription exported by the dataflow. Corresponds to mz_catalog.mz_indexes.id, mz_catalog.mz_materialized_views.id, or mz_internal.mz_subscriptions.id.
dataflow_id uint8 The ID of the dataflow. Corresponds to mz_dataflows.id.
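
Because mz_compute_error_counts refers to exports by export_id, it can be combined with mz_compute_exports and mz_dataflows to see which dataflow produced the errors. This is an illustrative sketch, not a prescribed query.

```sql
-- Dataflow exports that currently contain errors, with their dataflow names.
SELECT mce.export_id, md.name AS dataflow_name, mcec.count AS error_count
FROM mz_introspection.mz_compute_error_counts AS mcec
JOIN mz_introspection.mz_compute_exports AS mce
    ON mcec.export_id = mce.export_id
JOIN mz_introspection.mz_dataflows AS md
    ON mce.dataflow_id = md.id
ORDER BY error_count DESC;
```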

mz_compute_frontiers

The mz_compute_frontiers view describes the frontier of each dataflow export in the system. The frontier describes the earliest timestamp at which the output of the dataflow may change; data prior to that timestamp is sealed.

Field Type Meaning
export_id text The ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
time mz_timestamp The next timestamp at which the dataflow output may change.

mz_compute_import_frontiers

The mz_compute_import_frontiers view describes the frontiers of each dataflow import in the system. The frontier describes the earliest timestamp at which the input into the dataflow may change; data prior to that timestamp is sealed.

Field Type Meaning
export_id text The ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
import_id text The ID of the dataflow import. Corresponds to mz_catalog.mz_sources.id, mz_catalog.mz_tables.id, or mz_compute_exports.export_id.
time mz_timestamp The next timestamp at which the dataflow input may change.
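
A sketch of how the two frontier views might be read together: placing each import frontier next to the frontier of the export it feeds shows whether an output is keeping up with its inputs. The time column is quoted here only as a precaution, since time is also a type name.

```sql
-- Import frontiers side by side with the frontier of the export they feed.
SELECT
    mcif.export_id,
    mcif.import_id,
    mcif."time" AS import_frontier,
    mcf."time" AS export_frontier
FROM mz_introspection.mz_compute_import_frontiers AS mcif
JOIN mz_introspection.mz_compute_frontiers AS mcf
    ON mcif.export_id = mcf.export_id
ORDER BY mcif.export_id, mcif.import_id;
```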

mz_compute_operator_durations_histogram

The mz_compute_operator_durations_histogram view describes a histogram of the duration in nanoseconds of each invocation for each dataflow operator.

Field Type Meaning
id uint8 The ID of the operator. Corresponds to mz_dataflow_operators.id.
duration_ns uint8 The upper bound of the duration bucket in nanoseconds.
count numeric The (noncumulative) count of invocations in the bucket.
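
To look for operators with slow individual invocations, the histogram can be joined to mz_dataflow_operators and sorted by bucket. The following is a minimal sketch; the LIMIT is arbitrary.

```sql
-- Operators that had invocations in the slowest duration buckets.
SELECT mdo.id, mdo.name, h.duration_ns, h.count
FROM mz_introspection.mz_compute_operator_durations_histogram AS h
JOIN mz_introspection.mz_dataflow_operators AS mdo
    ON h.id = mdo.id
ORDER BY h.duration_ns DESC, h.count DESC
LIMIT 10;
```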

mz_dataflows

The mz_dataflows view describes the dataflows in the system.

Field Type Meaning
id uint8 The ID of the dataflow.
name text The internal name of the dataflow.

mz_dataflow_addresses

The mz_dataflow_addresses view describes how the dataflow channels and operators in the system are nested into scopes.

Field Type Meaning
id uint8 The ID of the channel or operator. Corresponds to mz_dataflow_channels.id or mz_dataflow_operators.id.
address bigint list A list of scope-local indexes indicating the path from the root to this channel or operator.
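
The length of an address indicates how deeply an operator is nested. The sketch below lists operators with their addresses and nesting depth; it assumes the list_length function is available for list-typed columns.

```sql
-- Operators with their scope path; a shorter address means a shallower scope.
SELECT
    mdo.id,
    mdo.name,
    mda.address,
    list_length(mda.address) AS depth
FROM mz_introspection.mz_dataflow_addresses AS mda
JOIN mz_introspection.mz_dataflow_operators AS mdo
    ON mda.id = mdo.id
ORDER BY depth, mdo.id;
```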

mz_dataflow_arrangement_sizes

The mz_dataflow_arrangement_sizes view describes the total size of the arrangements in each dataflow.

Field Type Meaning
id uint8 The ID of the dataflow. Corresponds to mz_dataflows.id.
name text The name of the dataflow.
records numeric The number of records in all arrangements in the dataflow.
batches numeric The number of batches in all arrangements in the dataflow.
size numeric The utilized size in bytes of the arrangements.
capacity numeric The capacity in bytes of the arrangements. Can be larger than the size.
allocations numeric The number of separate memory allocations backing the arrangements.

mz_dataflow_channels

The mz_dataflow_channels view describes the communication channels between dataflow operators. A communication channel connects one of the outputs of a source operator to one of the inputs of a target operator.

Field Type Meaning
id uint8 The ID of the channel.
from_index uint8 The scope-local index of the source operator. Corresponds to mz_dataflow_addresses.address.
from_port uint8 The source operator’s output port.
to_index uint8 The scope-local index of the target operator. Corresponds to mz_dataflow_addresses.address.
to_port uint8 The target operator’s input port.

mz_dataflow_channel_operators

The mz_dataflow_channel_operators view associates dataflow channels with the operators that are their endpoints.

Field Type Meaning
id uint8 The ID of the channel. Corresponds to mz_dataflow_channels.id.
from_operator_id uint8 The ID of the source of the channel. Corresponds to mz_dataflow_operators.id.
from_operator_address uint8 list The address of the source of the channel. Corresponds to mz_dataflow_addresses.address.
to_operator_id uint8 The ID of the target of the channel. Corresponds to mz_dataflow_operators.id.
to_operator_address uint8 list The address of the target of the channel. Corresponds to mz_dataflow_addresses.address.

mz_dataflow_global_ids

The mz_dataflow_global_ids view associates dataflow IDs with global IDs (IDs of the form u8 or t5).

Field Type Meaning
id uint8 The dataflow ID.
global_id text A global ID associated with that dataflow.

mz_dataflow_operators

The mz_dataflow_operators view describes the dataflow operators in the system.

Field Type Meaning
id uint8 The ID of the operator.
name text The internal name of the operator.

mz_dataflow_operator_dataflows

The mz_dataflow_operator_dataflows view describes the dataflow to which each operator belongs.

Field Type Meaning
id uint8 The ID of the operator. Corresponds to mz_dataflow_operators.id.
name text The internal name of the operator.
dataflow_id uint8 The ID of the dataflow hosting the operator. Corresponds to mz_dataflows.id.
dataflow_name text The internal name of the dataflow hosting the operator.

mz_dataflow_operator_parents

The mz_dataflow_operator_parents view describes how dataflow operators are nested into scopes, by relating operators to their parent operators.

Field Type Meaning
id uint8 The ID of the operator. Corresponds to mz_dataflow_operators.id.
parent_id uint8 The ID of the operator’s parent operator. Corresponds to mz_dataflow_operators.id.
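
Since both columns refer to mz_dataflow_operators.id, joining that view twice resolves the names on each side of the relationship. A minimal sketch:

```sql
-- Each operator alongside the operator (scope) that contains it.
SELECT
    child.id,
    child.name,
    parent.id AS parent_id,
    parent.name AS parent_name
FROM mz_introspection.mz_dataflow_operator_parents AS mdop
JOIN mz_introspection.mz_dataflow_operators AS child
    ON mdop.id = child.id
JOIN mz_introspection.mz_dataflow_operators AS parent
    ON mdop.parent_id = parent.id
ORDER BY parent.id, child.id;
```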

mz_dataflow_shutdown_durations_histogram

The mz_dataflow_shutdown_durations_histogram view describes a histogram of the time in nanoseconds required to fully shut down dropped dataflows.

Field Type Meaning
duration_ns uint8 The upper bound of the bucket in nanoseconds.
count numeric The (noncumulative) count of dataflows in this bucket.

mz_expected_group_size_advice

The mz_expected_group_size_advice view provides advice on opportunities to set query hints. Query hints are applicable to dataflows maintaining MIN, MAX, or Top K query patterns. The maintenance of these query patterns is implemented inside an operator scope, called a region, through a hierarchical scheme for either aggregation or Top K computations.

Field Type Meaning
dataflow_id uint8 The ID of the dataflow. Corresponds to mz_dataflows.id.
dataflow_name text The internal name of the dataflow hosting the min/max aggregation or Top K.
region_id uint8 The ID of the root operator scope. Corresponds to mz_dataflow_operators.id.
region_name text The internal name of the root operator scope for the min/max aggregation or Top K.
levels bigint The number of levels in the hierarchical scheme implemented by the region.
to_cut bigint The number of levels that can be eliminated (cut) from the region’s hierarchy.
savings numeric A conservative estimate of the amount of memory in bytes to be saved by applying the hint.
hint double precision The hint value that will eliminate to_cut levels from the region’s hierarchy.
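
A straightforward way to review the advice is to list every region for which levels can be cut, largest estimated savings first. This sketch uses only the columns described above.

```sql
-- Regions where a query hint could reduce memory, largest savings first.
SELECT dataflow_name, region_name, levels, to_cut, savings, hint
FROM mz_introspection.mz_expected_group_size_advice
ORDER BY savings DESC, dataflow_name, region_name;
```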

mz_lir_mapping

The mz_lir_mapping view describes the low-level internal representation (LIR) plan that corresponds to global IDs. LIR is a higher-level representation than dataflows; this view is used for profiling and debugging indexes and materialized views. Note that LIR is not a stable interface and may change at any time. In particular, you should not attempt to parse operator descriptions.

LIR nodes are implemented by zero or more dataflow operators with sequential IDs. The half-open range [operator_id_start, operator_id_end) records this information. If an LIR node was implemented without any dataflow operators, operator_id_start is equal to operator_id_end.

Field Type Meaning
global_id text The global ID.
lir_id uint8 The LIR node ID.
operator text The LIR operator, in the format OperatorName INPUTS [OPTIONS].
parent_lir_id uint8 The parent of this LIR node. May be NULL.
nesting uint2 The nesting level of this LIR node.
operator_id_start uint8 The first dataflow operator ID implementing this LIR operator (inclusive).
operator_id_end uint8 The first dataflow operator ID after this LIR operator (exclusive).
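
The operator ID range makes it possible to attribute operator-level measurements to LIR nodes. As a sketch, the query below sums mz_scheduling_elapsed over the operators in each node's range; a LEFT JOIN keeps LIR nodes that have no dataflow operators, for which the sum is NULL.

```sql
-- Total scheduling time per LIR node, using the half-open operator ID range.
SELECT
    mlm.global_id,
    mlm.lir_id,
    mlm.operator,
    SUM(mse.elapsed_ns) AS elapsed_ns
FROM mz_introspection.mz_lir_mapping AS mlm
LEFT JOIN mz_introspection.mz_scheduling_elapsed AS mse
    ON mlm.operator_id_start <= mse.id
    AND mse.id < mlm.operator_id_end
GROUP BY mlm.global_id, mlm.lir_id, mlm.operator
ORDER BY mlm.global_id, mlm.lir_id DESC;
```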

mz_message_counts

The mz_message_counts view describes the messages and message batches sent and received over the dataflow channels in the system. It distinguishes between individual records (sent, received) and batches of records (batch_sent, batch_received).

Field Type Meaning
channel_id uint8 The ID of the channel. Corresponds to mz_dataflow_channels.id.
sent numeric The number of messages sent.
received numeric The number of messages received.
batch_sent numeric The number of batches sent.
batch_received numeric The number of batches received.
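
To see which operators exchange the most data, message counts can be joined to mz_dataflow_channel_operators on the channel ID. A minimal sketch with an arbitrary LIMIT:

```sql
-- Busiest channels, with the operators at each end.
SELECT
    mdco.id AS channel_id,
    mdco.from_operator_id,
    mdco.to_operator_id,
    mmc.sent,
    mmc.batch_sent
FROM mz_introspection.mz_dataflow_channel_operators AS mdco
JOIN mz_introspection.mz_message_counts AS mmc
    ON mdco.id = mmc.channel_id
ORDER BY mmc.sent DESC
LIMIT 10;
```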

mz_peek_durations_histogram

The mz_peek_durations_histogram view describes a histogram of the duration in nanoseconds of read queries (“peeks”) in the dataflow layer.

Field Type Meaning
type text The peek variant: index or persist.
duration_ns uint8 The upper bound of the bucket in nanoseconds.
count numeric The (noncumulative) count of peeks in this bucket.
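
Because the bucket counts are noncumulative, a running total is needed to answer questions such as "how many peeks finished within a given duration". The sketch below computes one with a window function, assuming window aggregates are available; type is quoted only as a precaution.

```sql
-- Running total of peeks per variant, ordered by bucket upper bound.
SELECT
    h."type",
    h.duration_ns,
    h.count,
    SUM(h.count) OVER (
        PARTITION BY h."type"
        ORDER BY h.duration_ns
    ) AS cumulative_count
FROM mz_introspection.mz_peek_durations_histogram AS h
ORDER BY h."type", h.duration_ns;
```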

mz_records_per_dataflow

The mz_records_per_dataflow view describes the number of records in each dataflow.

Field Type Meaning
id uint8 The ID of the dataflow. Corresponds to mz_dataflows.id.
name text The internal name of the dataflow.
records numeric The number of records in the dataflow.
batches numeric The number of batches in the dataflow.
size numeric The utilized size in bytes of the arrangements.
capacity numeric The capacity in bytes of the arrangements. Can be larger than the size.
allocations numeric The number of separate memory allocations backing the arrangements.

mz_records_per_dataflow_operator

The mz_records_per_dataflow_operator view describes the number of records in each dataflow operator in the system.

Field Type Meaning
id uint8 The ID of the operator. Corresponds to mz_dataflow_operators.id.
name text The internal name of the operator.
dataflow_id uint8 The ID of the dataflow. Corresponds to mz_dataflows.id.
records numeric The number of records in the operator.
batches numeric The number of batches in the operator.
size numeric The utilized size in bytes of the arrangement.
capacity numeric The capacity in bytes of the arrangement. Can be larger than the size.
allocations numeric The number of separate memory allocations backing the arrangement.

mz_scheduling_elapsed

The mz_scheduling_elapsed view describes the total amount of time spent in each dataflow operator.

Field Type Meaning
id uint8 The ID of the operator. Corresponds to mz_dataflow_operators.id.
elapsed_ns numeric The total elapsed time spent in the operator in nanoseconds.
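
A common way to read this view is to rank operators by elapsed time and include the dataflow each one belongs to. The following sketch joins to mz_dataflow_operator_dataflows for that context.

```sql
-- Operators that have consumed the most time, with their hosting dataflow.
SELECT mdod.id, mdod.name, mdod.dataflow_name, mse.elapsed_ns
FROM mz_introspection.mz_scheduling_elapsed AS mse
JOIN mz_introspection.mz_dataflow_operator_dataflows AS mdod
    ON mse.id = mdod.id
ORDER BY mse.elapsed_ns DESC
LIMIT 10;
```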

mz_scheduling_parks_histogram

The mz_scheduling_parks_histogram view describes a histogram of dataflow worker park events. A park event occurs when a worker has no outstanding work.

Field Type Meaning
slept_for_ns uint8 The actual length of the park event in nanoseconds.
requested_ns uint8 The requested length of the park event in nanoseconds.
count numeric The (noncumulative) count of park events in this bucket.