Big changes in EXPLAIN: new syntax and EXPLAIN ANALYZE 🔎
06.26.2025
We've made some big changes to how Materialize EXPLAINs itself.
New EXPLAIN format
First, we've redesigned the output of EXPLAIN to help users focus on the most important parts of their queries when optimizing performance. Materialize power users should find the new format more concise and easier to understand. In the name of making things accessible to non-power users, we've adopted syntactic conventions that look more like Postgres's output:
1 | |
2 | |
3 | |
4 | |
Yields the following plan:
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 | |
13 | |
14 | |
The *** full scan *** indicates that we'll need to fully scan the t_a_idx index. Looking at the plan, the Arrange on the second input of the join creates a new, local arrangement corresponding to t.b. If we add an index on b, we'll get a much better plan:
1 | |
2 | |
Yields the following, much better plan, where both uses of t will be Arranged, i.e., we won't be creating any local in-memory arrangements---we're able to re-use existing indices.
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
Concretely, the new format uses physical plans (which gives a more direct accounting of where memory is consumed by arrangements) rather than the higher-level plans we used to report (which are our optimizer's primary internal representation).
A quick reminder: the EXPLAIN syntax isn't a stable interface, and we'll be improving on the output here over time. We'd be happy to hear your feedback!
EXPLAIN ANALYZE your indexes and materialized views
We've already taken advantage of Materialize's "introspection sources" to build tools for understanding long-running dataflows, like indexes and materialized views, with a blogpost documenting the possibility. Now, we've made a top-level SQL statement that lets you take advantage of these queries without having to write a lick of SQL yourself: EXPLAIN ANALYZE.
Under the hood, EXPLAIN ANALYZE generates SQL that reports on your long-running dataflows. For example, you can EXPLAIN ANALYZE MEMORY FOR ... an index or materialized view to see total memory consumption and the number of records, like so:
The new statement supports explaining MEMORY and CPU; it can also break things down by worker (WITH SKEW), allowing you to see if your data are skewed. Skew happens when work is unevenly distributed across workers, leading to higher latency as some workers having to deal with much more than their fair share of data while other workers twiddle their thumbs. Finally, you can also EXPLAIN ANALYZE HINTS FOR ... to see hints on sizing TopK forms:
Since EXPLAIN ANALYZE is just running SQL queries under the hood, you can EXPLAIN ANALYZE ... AS SQL to see the query that it runs---which you can then tune to your liking. Read more about EXPLAIN ANALYZE in the docs!