Source Mapping and Introspection: Debugging Materialize with Materialize

We have a new way to understand the performance of views, indexes, and materialized views in Materialize. By mapping runtime data about low-level dataflow operators up to a sensible intermediate language, you'll be better able to identify and refine computationally expensive parts of your queries.
Databases have a lot of EXPLAINing to do
Databases typically offer some way to understand how queries run. Postgres, for example, has the EXPLAIN statement; running EXPLAIN ... query ... presents the user with a summary of how the plan will be run (what kind of joins, etc.) along with an estimate of the cost.1 Postgres's EXPLAIN ANALYZE statement does one better: it actually runs the query, collecting information as it goes; when the query terminates, it displays the plan, annotated with runtime data like memory used and time spent in each part of the plan.
Materialize has been able to EXPLAIN queries for a long time, but adding in runtime feedback is harder. What would it mean to EXPLAIN ANALYZE CREATE INDEX ...? When should Materialize stop reporting information? Indexes in Materialize don't have an 'end time'! What if you want information about an index that's already running?
We've implemented a new way to glean insights into how your indexes and materialized views are running. To understand how to use it, let's take a quick detour through how Materialize compiles your SQL queries down to dataflows.
Materialize's compilation pipeline
Materialize compiles SQL through a series of intermediate languages: a high-level intermediate language (HIR), a mid-level intermediate language (MIR), and a low-level intermediate language (LIR). A SQL query is translated to an HIR query, which is then translated into one or more MIR queries. Our optimizer does the bulk of its decision making in MIR: planning joins, removing redundancies, and identifying patterns Materialize can run particularly effectively. The compiler then lowers MIR into LIR, our final intermediate representation. LIR is abstract enough to still be a 'high-level' plan, but LIR is low-level enough to explicitly map out all of the details of the plan: how to aggregate, which indexes to use, etc. Having fixed the plan in LIR, translating the LIR to dataflows---the actual runtime engine of Materialize---is straightforward enough.
Materialize already tracks a variety of runtime information in the mz_introspection schema. But this runtime information is attributed to the dataflow operators that Materialize actually runs. After running our compiler, these dataflow operators don't look anything like the original SQL query! A SQL query might have hundreds of dataflow operators, and it takes real expertise to know which operators correspond to which parts of the query.
To bridge that expertise gap, we've created a source map, called mz_introspection.mz_lir_mapping. We map ranges of dataflow operators up to LIR operators---the fixed plans that are the last intermediate representation in our compiler. Using mz_lir_mapping, you can relate performance statistics---like total computation time and memory usage---to a high-level representation, as seen in our EXPLAIN or Postgres's EXPLAIN ANALYZE.
Mapping dataflow metrics up to LIR
It's easiest to get a feel for what mz_lir_mapping does for you by example. Let's start by generating a sample database tracking which customers bought which products.
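The original setup code isn't preserved in this copy of the post; here's a sketch of what it might have looked like. The table name, data sizes, and salt strings are all illustrative, and the sketch assumes Materialize's seahash builtin applied to a bytea cast:

```sql
-- Illustrative reconstruction: customers buying products, with
-- deterministic "random-feeling" assignments via a salted hash.
CREATE TABLE purchases (
    customer_id bigint,
    product_id  bigint
);

INSERT INTO purchases
SELECT
    -- salted hashes give stable, uniform-looking assignments
    (seahash(('customer' || n)::bytea) % 100)::bigint AS customer_id,
    (seahash(('product'  || n)::bytea) % 20)::bigint  AS product_id
FROM generate_series(1, 10000) AS n;
```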
Since Materialize is deterministic, we'll generate random-feeling data deterministically, using a hash with salt.2
The resulting distribution of purchases across products is fairly uniform:
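The exact query isn't preserved here, but a check along these lines shows the spread:

```sql
-- How evenly are purchases spread across products?
SELECT product_id, count(*) AS purchases
FROM purchases
GROUP BY product_id
ORDER BY purchases DESC;
```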
With our toy database populated, let's explore how an analytics query performs: who are the top 5 buyers of the top 5 products? First, let's create some views: one for the top 5 products, one to count purchases of those popular products, and one for the top 5 buyers of each of those top 5 products.
Having defined these views, let's index top_buyers by product_id, so our analytics dashboard can quickly look up who the top buyers of our top products are.
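The index definition, reconstructed from the prose:

```sql
CREATE INDEX idx_top_buyers ON top_buyers (product_id);
```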
Now that we have the idx_top_buyers index, let's understand its performance using the mz_lir_mapping source mapping.
Attributing memory usage
Materialize's incremental view maintenance trades space for time: we're able to give consistent, up-to-the-second (!) answers by caching appropriately. Since caching uses memory, if we're "optimizing a query", then managing memory is the name of the game. Let's combine the new source mapping mz_introspection.mz_lir_mapping (how is our query implemented?) with the metrics data mz_introspection.mz_arrangement_sizes (how much memory are we using?) and the catalog data in mz_catalog.mz_objects (what's defined?) to see how much memory each operator is consuming:
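The original query isn't preserved here; a sketch of what it might look like follows. It assumes mz_lir_mapping exposes an operator-id range per LIR operator (the exact column names may differ from this sketch):

```sql
-- Attribute arrangement sizes to LIR operators, indenting by nesting
-- depth and ordering by descending lir_id to recover plan order.
SELECT mo.name,
       lm.global_id,
       lm.lir_id,
       repeat('  ', lm.nesting) || lm.operator AS operator,
       pg_size_pretty(sum(mas.size)) AS size
FROM mz_introspection.mz_lir_mapping lm
JOIN mz_catalog.mz_objects mo
  ON (mo.id = lm.global_id)
LEFT JOIN mz_introspection.mz_arrangement_sizes mas
  ON (mas.operator_id >= lm.operator_id_start
  AND mas.operator_id <  lm.operator_id_end)
WHERE mo.name IN ('top_buyers', 'idx_top_buyers')
GROUP BY mo.name, lm.global_id, lm.lir_id, lm.nesting, lm.operator
ORDER BY lm.global_id, lm.lir_id DESC;
```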
The results show information for two high-level objects: the top_buyers view and the idx_top_buyers index. Each of these has a global_id---an internal identifier for that object. Each object has several LIR operators, with most of the operators living in the view.
Each LIR operator represents a high-level part of the plan; you can read more about it in our EXPLAIN docs. Operators form a tree, like expressions in any programming language. We've used mz_lir_mapping's nesting field to indent the operators nicely; ordering by lir_id descending puts the operators in the correct order.3
In mz_lir_mapping, these operators are always rendered as a single line, of the form OperatorName c1 c2 ..., where each ci is a "child ID"---the lir_id of one of the inputs of that operator. So TopK::Basic 10 indicates a TopK operator that reads from the operator with lir_id 10 (which, in our example, is a MapFilterProject). The Join::Differential 6 » 7 line indicates a differential join of the inputs with lir_ids 6 and 7.
The size column sums up the sizes in mz_arrangement_sizes for every dataflow operator used to implement a given LIR operator. Looking at the size column above, it seems that the outermost TopK is the expensive one.
It's worth getting a sense of how much runtime data we've just aggregated together when analyzing this relatively simple query. Most LIR operators correspond to quite a few dataflow operators; many have arrangements. Let's adapt our query to count dataflows and arrangements in the top_buyers view:
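The counting query isn't preserved in this copy; a sketch under the same assumptions about mz_lir_mapping's operator-id ranges:

```sql
-- Count dataflow operators and arrangements backing the view.
SELECT count(DISTINCT mdo.id)          AS dataflow_operators,
       count(DISTINCT mas.operator_id) AS arrangements
FROM mz_introspection.mz_lir_mapping lm
JOIN mz_catalog.mz_objects mo
  ON (mo.id = lm.global_id)
JOIN mz_introspection.mz_dataflow_operators mdo
  ON (mdo.id >= lm.operator_id_start AND mdo.id < lm.operator_id_end)
LEFT JOIN mz_introspection.mz_arrangement_sizes mas
  ON (mas.operator_id = mdo.id)
WHERE mo.name = 'top_buyers';
```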
If we tried to attribute memory at the dataflow level to top_buyers, we would have to poke through these hundreds of dataflow operators and dozens of arrangements. But with our source map above, it's easy to get a structured summary of dataflow metrics: the outermost TopK::Basic 10 operator uses the lion's share of memory.
Setting hints for TopK queries
The TopK operator works by building a tower of arrangements of decreasing size: even if you've only asked for the top 5 elements, Materialize can only incrementally maintain the view if somewhere it maintains a complete ranking. (The tower helps us keep latency low and incremental maintenance cheap.) By default, Materialize will allocate eight generously sized levels for the arrangements in a TopK. Our toy example is so small, we're surely wasting a lot of that space. Let's use the LIMIT INPUT GROUP SIZE hint to tell Materialize the expected group size on the input---which informs how tall to make the tower. But what hint should we give?
Materialize already uses runtime data to offer hints on group sizing for existing dataflows, via mz_introspection.mz_expected_group_size_advice. But if we simply take a peek, we'll see that there are two TopK operators (corresponding to TopK::Basic 10 and TopK::Basic 3 in mz_lir_mapping):
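Taking that peek is a one-liner:

```sql
SELECT * FROM mz_introspection.mz_expected_group_size_advice;
```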
Which TopK corresponds to which LIMIT clause in our query? An astute observer might have a guess from having attributed memory usage; an experienced field engineer might have a guess from the region_id. But with mz_lir_mapping, we don't have to guess:
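The join that resolves this isn't preserved here. A sketch of the idea: map each advice row's region back to the LIR operator whose dataflow-operator range contains it (the exact shape of the region/operator-id correspondence may differ from this sketch):

```sql
-- Tie group-size advice back to the LIR operator it concerns.
SELECT mo.name,
       lm.lir_id,
       lm.operator,
       egsa.levels,
       egsa.to_cut,
       egsa.hint
FROM mz_introspection.mz_expected_group_size_advice egsa
JOIN mz_introspection.mz_lir_mapping lm
  ON (egsa.region_id >= lm.operator_id_start
  AND egsa.region_id <  lm.operator_id_end)
JOIN mz_catalog.mz_objects mo
  ON (mo.id = lm.global_id)
ORDER BY lm.lir_id DESC;
```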
The outermost TopK---the one responsible for so much memory---should be sized a little larger than the inner one. Making the fix is not so hard: DROP the old definitions and recreate them with the limits in place:
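The recreated definitions aren't preserved in this copy. A sketch, re-creating the three views (names hypothetical, hint values illustrative; in practice you'd take the values from mz_expected_group_size_advice, and the OPTIONS clause follows Materialize's query-hint syntax):

```sql
-- Drop the old stack (the index depends on the views).
DROP INDEX idx_top_buyers;
DROP VIEW top_buyers;
DROP VIEW popular_purchases;
DROP VIEW top_products;

-- Inner TopK: only a handful of products, so a small group size.
CREATE VIEW top_products AS
SELECT product_id
FROM purchases
GROUP BY product_id
OPTIONS (LIMIT INPUT GROUP SIZE = 20)
ORDER BY count(*) DESC
LIMIT 5;

CREATE VIEW popular_purchases AS
SELECT customer_id, product_id, count(*) AS num_purchases
FROM purchases
WHERE product_id IN (SELECT product_id FROM top_products)
GROUP BY customer_id, product_id;

-- Outer TopK: sized a little larger than the inner one.
CREATE VIEW top_buyers AS
SELECT tp.product_id, buyer.customer_id, buyer.num_purchases
FROM top_products tp,
     LATERAL (
        SELECT customer_id, num_purchases
        FROM popular_purchases pp
        WHERE pp.product_id = tp.product_id
        OPTIONS (LIMIT INPUT GROUP SIZE = 100)
        ORDER BY num_purchases DESC
        LIMIT 5
     ) AS buyer;

CREATE INDEX idx_top_buyers ON top_buyers (product_id);
```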
Let's rerun our memory attribution query. We'll see a roughly 70% reduction in memory usage for both of the TopK operators. (Notice that the global_ids have changed, because we DROPped and recreated the VIEWs and index.)
What's next?
Eventually, we'll build syntax like EXPLAIN ANALYZE INDEX ... around queries like these---once we know which information helps the most. For now, we've documented these and some other common debugging queries; we expect our field engineering team and users to extend these queries and adapt them to their own ends.
There's a separate language design problem, too: what's the right level of abstraction for EXPLAIN? If you run EXPLAIN PLAN on a query today, we give a very detailed static plan, with many lines for each LIR operator. If you query mz_lir_mapping, you'll see a terse, one-line description for each LIR operator. What's the right level of detail? As users get more experience debugging their live queries, we'll get a better sense of what to show and what to hide.
These source maps have already turned arduous, manual tasks that took hours into quick glances that take minutes. Live debugging information makes it much easier to write better queries... so fire up the Materialize emulator and play around with these new features!
Footnotes
- These cost estimates are used in Postgres's query planning, but they should be taken with a grain of salt. ↩
- The values here should be stable across versions of Materialize---seahash values should only change at their major version bumps. ↩
- lir_ids number the nodes of the LIR abstract syntax tree in a left-to-right, post-order traversal. ↩
