Materialize Documentation
s
Join the Community github/materialize

EXPLAIN

EXPLAIN displays the plan used for a SELECT statement or a view.

WARNING! EXPLAIN is not part of Materialize’s stable interface and is not subject to our backwards compatibility guarantee. The syntax and output of EXPLAIN may change arbitrarily in future versions of Materialize.

Conceptual framework

To execute SELECT statements, Materialize generates a plan consisting of operators that interface with our underlying Differential dataflow engine. EXPLAIN lets you see the plan for a given query, which can provide insight into Materialize’s behavior for specific queries, e.g. performance.

Syntax

EXPLAIN TYPED RAW DECORRELATED OPTIMIZED PLAN FOR select_stmt VIEW view_name
Field Use
TYPED Annotate the plan with column types and unique keys
RAW Display the raw plan
DECORRELATED Display the decorrelated plan
OPTIMIZED (Default) Display the optimized plan
VIEW Display the plan for an existing view

Changed in v0.4.0: Accept EXPLAIN <statement> as shorthand for EXPLAIN PLAN FOR <statement>.

Details

The job of the Materialize planner is to turn SQL code into differential dataflow. We get there via a series of progressively lower-level plans:

SQL -> raw plan -> decorrelated plan -> optimized plan -> dataflow

From SQL to raw plan

In this stage, the planner:

From raw plan to decorrelated plan

In this stage, the planner:

From decorrelated plan to optimized plan

In this stage, the planner:

From optimized plan to dataflow

In this stage, the planner:

For more information on using EXPLAIN, see the details below.

Reading decorrelated/optimized plans

Materialize plans are directed acyclic graphs of operators. Each operator in the graph receives inputs from zero or more other operators and produces a single output.

Most operators have only one input, so to make the displayed plan easier to read we group chains of single-input operators together into a single block of text in the display. Here is a single chain of operators:

%0 =
| Get materialize.public.customer (u15)
| ArrangeBy (#0)

Multiple chains are separated by blank lines.

%0 =
| Get materialize.public.customer (u15)
| ArrangeBy (#0)

%1 =
| Get materialize.public.orders (u18)
| ArrangeBy (#0) (#1)

Each chain is assigned a number eg %1. Operators which have multiple inputs refer to these numbers to specify their inputs. In the next example the Join operator use the outputs of chains %0, %1 and %2 as its inputs.

%0 =
| Get materialize.public.customer (u15)
| ArrangeBy (#0)

%1 =
| Get materialize.public.orders (u18)
| ArrangeBy (#0) (#1)

%2 =
| Get materialize.public.lineitem (u21)
| ArrangeBy (#0)

%3 =
| Join %0 %1 %2 (= #8 #17) (= #0 #9)
| Filter (#6 = "BUILDING"), (#12 < 1995-03-15), (#27 > 1995-03-15)
| Reduce group=(#8, #12, #15)
| | agg sum((#22 * (100dec - #23)))
| Project (#0, #3, #1, #2)

Many operators need to refer to columns in their input. These are displayed like #3 for column number 3. (Columns are numbered starting from column 0).

If you are explaining view creation for a view or one-off query that refers to unmaterialized sources, you may see Filter or Project operators attached to the unmaterialized source. This indicates that those operators will run before the upsert operator, and the upsert operator will not maintain state corresponding to data removed by the filter and project operators.

Here’s an example of a materialized query pulling from unmaterialized sources and the EXPLAIN PLAN command:

CREATE MATERIALIZED VIEW mv as SELECT a, c FROM data EXCEPT ALL SELECT a, c FROM data2 where d is null;

EXPLAIN PLAN FOR VIEW mv;

First, Materialize reads from two unmaterialized sources, u1 and u3. You can see the Filter and Project operators applied during this process.

Source materialize.public.data (u1):
| Project (#0, #2)

Source materialize.public.data2 (u3):
| Filter isnull(#3)
| Project (#0, #2, #3)

Next, Materialize runs the query on the ingested sources. Because the FILTER and PROJECT operators have already been applied during the ingestion, they don’t need be run again.

Query:
%0 =
| Get materialize.public.data (u1)
| Project (#0, #2)

%1 =
| Get materialize.public.data2 (u3)
| Filter isnull(#3)
| Project (#0, #2)
| Negate

%2 =
| Union %0 %1
| Threshold

Each operator can also be annotated with additional metadata. The most common example is the choice of implementation in the Join operator.

%3 =
| Join %0 %1 %2 (= #8 #17) (= #0 #9)
| | implementation = DeltaQuery
| |   delta %0 %1.(#1) %2.(#0)
| |   delta %1 %0.(#0) %2.(#0)
| |   delta %2 %1.(#0) %0.(#0)
| Filter (#6 = "BUILDING"), (#12 < 1995-03-15), (#27 > 1995-03-15)
| Reduce group=(#8, #12, #15) sum((#22 * (100dec - #23)))
| Project (#0, #3, #1, #2)

Finally, a plan can optionally have a finishing action which can sort, limit and project the data. This is executed outside of the dataflow engine, allowing many simple queries to just hit an existing index instead of installing a temporary dataflow.

Finish order_by=(#1 desc, #2 asc) limit=none offset=0 project=(#0..=#3)

Operators in decorrelated and optimized plans

Operator Meaning Example
Constant Always produces the same collection of rows Constant (1)
Get Produces rows from either an existing source/view or from a previous operator in the same plan Get materialize.public.ordered (u2)
Project Produces a subset of the columns in the input rows Project (#2, #3)
Map Appends the results of some scalar expressions to each row in the input Map (((#1 * 10000000dec) / #2) * 1000dec)
FlatMap Appends the result of some table function to each row in the input FlatMap jsonb_foreach(#3)
Filter Remove rows of the input for which some scalar predicates return false Filter (#20 < #21)
Join Returns combinations of rows from each input whenever some scalar predicates are true Join %1 %4 (= #0 #9)
Reduce Groups the input rows by some scalar expressions, reduces each groups using some aggregate functions and produce rows containing the group key and aggregate outputs Reduce group=(#5) countall(null)
TopK Groups the inputs rows by some scalar expressions, sorts each group using the group key, removes the top offset rows in each group and returns the next limit rows TopK group=() order=(#1 asc, #0 desc) limit=5 offset=0
Negate Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input. Negate
Threshold Removes any rows with negative counts. Threshold
Union Sums the rows counts of both inputs Union %2 %3
ArrangeBy Indicates a point that will become an arrangement in the dataflow engine ArrangeBy (#0) (#3)
Let Marks a branch of computation whose result is used later by other branches with %0. %0 = Let 10 =

Reading raw plans

Raw plans are similar to decorrelated/optimized plans, but may also contain subqueries. For example:

EXPLAIN RAW PLAN FOR SELECT (SELECT 1)
0 =
| Constant ()
| Map select(%1)
| |
| | 1 =
| | | Constant ()
| | | Map 1
| | | Project (#0)
| | | Map
| | | Project (#0)
| |
| Project (#0)
| Map

The select(%1) indicates that the subquery block %1 below is run for each row in the input. The outputs of each run are unioned together and then passed to the select operator.

Inside a subquery, scalar expressions can refer to columns of the outside rows by adding a ^ to the column number eg #^1.

Operators in raw plans

Operator Meaning Example
Constant Always produces the same collection of rows Constant (1)
Get Produces rows from either an existing source/view or from a previous operator in the same plan Get materialize.public.ordered (u2)
Project Produces a subset of the columns in the input rows Project (#2, #3)
Map Appends the results of some scalar expressions to each row in the input Map (((#1 * 10000000dec) / #2) * 1000dec)
CallTable Appends the result of some table function to each row in the input CallTable generate_series (#3)
Filter Remove rows of the input for which some scalar predicates return false Filter (#20 < #21)
Join Perform one of INNER / LEFT / RIGHT / FULL OUTER on the two inputs, using the given predicate InnerJoin %0 %1 on (#1 = #2)
Reduce Groups the input rows by some scalar expressions, reduces each group using some aggregate functions and produce rows containing the group key and aggregate outputs. In the case where the group key is empty and the input is empty, returns a single row with the aggregate functions applied to empty rows. Reduce group=(#5) agg countall(null)
Distinct Remove duplicate copies of input rows Distinct
TopK Groups the inputs rows by some scalar expressions, sorts each group using the group key, removes the top offset rows in each group and returns the next limit rows TopK group=() order=(#1 asc, #0 desc) limit=5 offset=0
Negate Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input. Negate
Threshold Removes any rows with negative counts. Threshold
Union Sums the rows counts of both inputs Union %2 %3
Did this info help?
Yes No