Last value in group
Overview
The “last value in each group” query pattern returns the last value, according to some ordering, in each group.
Materialize and window functions
For window functions, when an input record
in a partition (as determined by the PARTITION BY
clause of your window
function) is added/removed/changed, Materialize recomputes the results for the
entire window partition. This means that when a new batch of input data arrives
(that is, every second), the amount of computation performed is proportional
to the total size of the touched partitions.
For example, assume that in a given second, 20 input records change, and these records belong to 10 different partitions, where the average size of each partition is 100. Then, amount of work to perform is proportional to computing the window function results for 10*100=1000 rows.
As a rule of thumb, if the total size of all touched window partitions is at most 1000000 rows per second, then the system should be able to keep up with the input data as it arrives. However, if your use case has higher performance requirements, consider rewriting your query to not use window functions. If your query cannot be rewritten without the window functions and the performance of window functions is insufficient for your use case, please contact our team.
Idiomatic Materialize SQL
Idiomatic Materialize SQL: To find the last value in each group, use the MIN() or MAX() aggregate function in a subquery.
Use a subquery that uses the MIN() or MAX() aggregate function.
|
|
NOTE: Materialize does not support
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING .
|
Query hints
To further improve the memory usage of the idiomatic Materialize SQL, you can
specify a AGGREGATE INPUT GROUP SIZE
query hint in
the idiomatic Materialize SQL.
SELECT tableA.fieldA, tableA.fieldB, minmax.Z
FROM tableA,
(SELECT fieldA,
MAX(fieldZ),
MIN(fieldZ)
FROM tableA
OPTIONS (AGGREGATE INPUT GROUP SIZE = ...)
GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
For more information on setting AGGREGATE INPUT GROUP SIZE
, see
Optimization.
Examples
Use MAX() to find the last value
Using idiomatic Materialize SQL, the following example finds the highest item
price in each order and calculates the difference between the price of each item
in the order and the highest price. The example uses a subquery that groups by
the order_id
and selects MAX(price)
to find the
highest price (i.e., the last price if ordered by ascending price values):
|
|
NOTE: Materialize does not support
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING .
|
Use MIN() to find the last values
Using idiomatic Materialize SQL, the following example finds the lowest item
price in each order and calculates the difference between the price of each item
in the order and the lowest price. That is, use a subquery that groups by the
order_id
and selects MIN(price)
as the lowest price
(i.e., last price if ordered by descending price value)
|
|
NOTE: Materialize does not support
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING .
|
Use MIN() and MAX() to find the last values
Using idiomatic Materialize SQL, the following example finds the lowest and
highest item price in each order and calculate the difference for each item in
the order from these prices. That is, use a subquery that groups by the
order_id
and selects MIN(price)
as the lowest price
(i.e., last value if ordered by descending price values) and
MAX(price)
as the highest price (i.e., last value if
ordered by ascending price values).
|
NOTE: Materialize does not support
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING .
|