We are pleased to announce the availability of incrementally maintained recursive SQL queries in Materialize for public preview.
WITH MUTUALLY RECURSIVE is now available in all Materialize environments.
Materialize excels at incrementally maintaining up-to-date results of SQL queries as source data changes over time. Today, we introduce support for recursive SQL queries, allowing you to express and run iterative computations that are maintained incrementally.
SQL’99 introduced the very useful common table expressions (CTEs).
A common table expression allows you to use the
WITH clause to name some expressions and then use them multiple times in your query, without resorting to copy/paste.
Recursive CTEs go a step further and allow you to write queries with CTEs that reference themselves, allowing you to formulate queries that perform the same computation until convergence.
Many important problems require iterative computation and can be handled entirely in the database instead of the application layer with the help of recursive CTEs. Let’s look at an example of what we are releasing today and discuss when it might be useful to you.
Recursive CTEs in Materialize
WITH MUTUALLY RECURSIVE flavor of recursive CTEs that is now publicly available in Materialize remains unchanged from our original proposal.
The blog post explains why and how we decided to depart from the SQL’99 standard.
To illustrate how recursive CTEs work in Materialize, consider the following high-level definition of the transitive closure
T(x int, y int) of a binary relation
R(x int, y int):
- For every combination of values
(x,y)must be also in
- For every combination of values
(y,z)are both in
(x,z)must be also in
Translating this definition into a recursive SQL query in Materialize is a straight-forward mapping where each case corresponds to exactly one
UNION branch in the recursive definition of
WITH MUTUALLY RECURSIVE
T(x int, y int) AS (
SELECT x, y FROM R
SELECT x, z FROM T as t1(x, y) JOIN T as t2(y, z) USING(y)
SELECT * FROM T
Recursive CTEs in Materialize are evaluated as if the CTE contents are fully replaced by the result of the CTE definition in every iteration until they stop changing. As a developer you can just assume this intuitive “bulk update” mental model when you formulate your queries. As usual, under the hood Materialize will maintain the computation incrementally — even across iterative updates of your recursive CTE definitions: we will do the heavy-lifting for you as we compile and execute your SQL queries.
WITH RECURSIVE queries,
WITH MUTUALLY RECURSIVE accepts any valid
SELECT query as a recursive CTE definition.
For example, the above query is not accepted by Postgres because the recursive CTE
T is referenced twice in the recursive case.
A compatible formulation is possible, but might take longer to converge.
Expressing a recursive CTE that is compliant with the SQL’99 standard is often a challenge.
WITH MUTUALLY RECURSIVE does not impose syntactic constraints on your recursive CTEs, this process is much easier in Materialize.
All details about the syntax and semantics of recursive
WITH MUTUALLY RECURSIVE blocks as well as some practical examples can be found in the reference docs.
Now that you’ve learned about the distinct flavor of recursive CTEs that Materialize offers, you might have some practical questions before you give them a shot.
How to identify use cases for recursive CTEs?
Use cases for recursive CTEs usually arise naturally in situations where the following conditions are met:
- Your data can be viewed or organized as a network (a graph) or as a hierarchy (a tree).
- You want to compute information that explains your data within the context of the above structure.
- Your business can benefit from having always-fresh, accurate, and consistent results of the above computation.
How to write recursive CTEs?
WITH MUTUALLY RECURSIVE definitions can be freely composed:
you can chain them sequentially or even nest them in each other — much like the loop constructs of any other programming language!
As a consequence, algorithms defined in terms of one or more (possibly nested) loops that update collections of data points until convergence can be mapped almost directly to
WITH MUTUALLY RECURSIVE blocks and recursive CTEs.
How are results maintained incrementally?
What is the secret sauce that makes recursive CTEs tick incrementally in Materialize? The answer is — it’s the very same secret sauce that we use elsewhere! The Materialize runtime is built on top of Differential Dataflow — a programming framework that allows users to incrementally maintain computations changing both over time and over multiple iterations. Differential Dataflow has always been capable of efficient incremental maintenance of iterative dataflows — recursive CTEs are the mechanism by which we expose this power to Materialize users!
Recursive CTEs allow you to express iterative computations using the
WITH MUTUALLY RECURSIVE CTE syntax.
Materialize maintains the results of these computations incrementally, keeping your results up to date as data comes in from your source systems.
Check out our reference docs on recursive CTEs if you are interested in getting started and trying them out yourself.