Lead over

Overview

The “lead over” query pattern accesses the field value of the next row as determined by some ordering.

For “lead over (order by)” queries whose ordering can be represented by some equality condition (such as when ordering by a field that increases at a regular interval), Materialize provides an idiomatic SQL as an alternative to the window function.

Materialize and window functions

For window functions, when an input record in a partition (as determined by the PARTITION BY clause of your window function) is added/removed/changed, Materialize recomputes the results for the entire window partition. This means that when a new batch of input data arrives (that is, every second), the amount of computation performed is proportional to the total size of the touched partitions.

For example, assume that in a given second, 20 input records change, and these records belong to 10 different partitions, where the average size of each partition is 100. Then, amount of work to perform is proportional to computing the window function results for 10*100=1000 rows.

As a rule of thumb, if the total size of all touched window partitions is at most 1000000 rows per second, then the system should be able to keep up with the input data as it arrives. However, if your use case has higher performance requirements, consider rewriting your query to not use window functions. If your query cannot be rewritten without the window functions and the performance of window functions is insufficient for your use case, please contact our team.

Idiomatic Materialize SQL

! Important: Do not use if the “lead over (order by)” ordering cannot be represented by an equality match.

Exclude the last row in results

Idiomatic Materialize SQL: To access the lead (next row’s field value) ordered by some field that increases in regular intervals, use a self join that specifies an equality condition on the order by field (e.g., WHERE t1.order_field = t2.order_field - 1, WHERE t1.order_field = t2.order_field * 2, etc.). The query excludes the last row in the results since it does not have a next row.

Idiomatic Materialize SQL

Use a self join that specifies an equality match on the lead’s order by field (e.g., fieldA). The order by field must increment in a regular pattern in order to be represented by an equality condition (e.g., WHERE t1.fieldA = t2.fieldA - ...). The query excludes the last row in the results since it does not have a next row.

! Important: The idiomatic Materialize SQL applies only to those “lead over” queries whose ordering can be represented by some equality condition.

-- Excludes the last row in the results --
SELECT t1.fieldA, t2.fieldB as next_row_value
FROM tableA t1, tableA t2
WHERE t1.fieldA = t2.fieldA - ...  -- or some other operand
ORDER BY fieldA;
Anti-pattern

Avoid the use of LEAD(fieldZ) OVER (ORDER BY ...) window function when the order by field increases in a regular pattern.


-- Anti-pattern. Avoid. --
SELECT fieldA, ...
    LEAD(fieldZ) OVER (ORDER BY fieldA) as next_row_value
FROM tableA;

Include the last row in results

Idiomatic Materialize SQL: To access the lead (next row’s field value) ordered by some field that increases in regular intervals, use a self LEFT JOIN/LEFT OUTER JOIN that specifies an equality condition on the order by field (e.g., ON t1.order_field = t2.order_field - 1, ON t1.order_field = t2.order_field * 2, etc.). The LEFT JOIN/LEFT OUTER JOIN query includes the last row, returning null as its lead value.

Idiomatic Materialize SQL

Use a self LEFT JOIN/LEFT OUTER JOIN (e.g., FROM tableA t1 LEFT JOIN tableA t2) that specifies an equality match on the lag’s order by field (e.g., fieldA). The order by field must increment in a regular pattern in order to be represented by an equality condition (e.g., ON t1.fieldA = t2.fieldA - ...). The query includes the last row, returning null as its lead value.

! Important: The idiomatic Materialize SQL applies only to those “lead over” queries whose ordering can be represented by some equality condition.
-- Includes the last row in the response --
SELECT t1.fieldA, t2.fieldB as next_row_value
FROM tableA t1
LEFT JOIN tableA t2
ON t1.fieldA = t2.fieldA - ... -- or some other operand
ORDER BY fieldA;
Anti-pattern

Avoid the use of LEAD(fieldZ) OVER (ORDER BY ...) window function when the order by field increases in regular intervals.


SELECT fieldA, ...
    LEAD(fieldZ) OVER (ORDER BY fieldA) as next_row_value
FROM tableA;

Examples

NOTE: The example data can be found in the Appendix.

Find next row’s value (exclude the last row in results)

Using idiomatic Materialize SQL, the following example finds the next day’s order total. That is, the example uses a self join on orders_daily_totals. The row ordering on the order_date field is represented by an equality condition using an interval of 1 DAY. The query excludes the last row in the results since the last row does not have a next row.

Materialize SQL
-- Excludes the last row in results --
SELECT o1.order_date, o1.daily_total,
    o2.daily_total as next_daily_total
FROM orders_daily_totals o1, orders_daily_totals o2
WHERE o1.order_date = o2.order_date - INTERVAL '1' DAY
ORDER BY order_date;
! Important: The idiomatic Materialize SQL applies only to those “lead over” queries whose ordering can be represented by some equality condition.
Anti-pattern

Avoid the use of LEAD() OVER (ORDER BY ...) window function to access next row’s value if the order by field increases in regular intervals.


-- Anti-pattern. Includes the last row's value. --
SELECT order_date, daily_total,
    LEAD(daily_total) OVER (ORDER BY order_date) as next_daily_total
FROM orders_daily_totals;

Find next row’s value (include the last row in results)

Using idiomatic Materialize SQL, the following example finds the next day’s order total. The example uses a self LEFT JOIN/LEFT OUTER JOIN on orders_daily_totals. The row ordering on the order_date field is represented by an equality condition using an interval of 1 DAY). The query includes the last row in the results, using null as the next row’s value.

Materialize SQL
-- Include the last row in the results --
SELECT o1.order_date, o1.daily_total,
    o2.daily_total as next_daily_total
FROM orders_daily_totals o1
LEFT JOIN orders_daily_totals o2
ON o1.order_date = o2.order_date - INTERVAL '1' DAY
ORDER BY order_date;
! Important: The idiomatic Materialize SQL applies only to those “lead over” queries whose ordering can be represented by some equality condition.
Anti-pattern

Avoid the use of LEAD() OVER (ORDER BY ...) window function to access next row’s value if the order by field increases in a regular pattern.


-- Anti-pattern. Includes the last row in results. --
SELECT order_date, daily_total,
    LEAD(daily_total) OVER (ORDER BY order_date) as next_daily_total
FROM orders_daily_totals;

See also

Back to top ↑