First value in group
Overview
The “first value in each group” query pattern returns the first value, according to some ordering, in each group.
Materialize and window functions
For window functions, when an input record
in a partition (as determined by the PARTITION BY
clause of your window
function) is added/removed/changed, Materialize recomputes the results for the
entire window partition. This means that when a new batch of input data arrives
(that is, every second), the amount of computation performed is proportional
to the total size of the touched partitions.
For example, assume that in a given second, 20 input records change, and these records belong to 10 different partitions, where the average size of each partition is 100. Then, amount of work to perform is proportional to computing the window function results for 10*100=1000 rows.
As a rule of thumb, if the total size of all touched window partitions is at most 1000000 rows per second, then the system should be able to keep up with the input data as it arrives. However, if your use case has higher performance requirements, consider rewriting your query to not use window functions. If your query cannot be rewritten without the window functions and the performance of window functions is insufficient for your use case, please contact our team.
Idiomatic Materialize SQL
Idiomatic Materialize SQL: To find the first value in each group, use MIN() or MAX() aggregate function in a subquery.
Use a subquery that uses the MIN() or MAX() aggregate function.
|
|
|
Query hints
To further improve the memory usage of the idiomatic Materialize SQL, you can
specify a AGGREGATE INPUT GROUP SIZE
query hint in
the idiomatic Materialize SQL.
SELECT tableA.fieldA, tableA.fieldB, minmax.Z
FROM tableA,
(SELECT fieldA,
MIN(fieldZ),
MAX(fieldZ)
FROM tableA
OPTIONS (AGGREGATE INPUT GROUP SIZE = ...)
GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
For more information on setting AGGREGATE INPUT GROUP SIZE
, see
Optimization.
Examples
Use MIN() to find the first value
Using idiomatic Materialize SQL, the following example finds the lowest item
price in each order and calculates the difference between the price of each item
in the order and the lowest price. The example uses a subquery that groups by
the order_id
and selects MIN(price)
to find the lowest price (i.e., first
value if ordered by ascending price values).
|
|
|
Use MAX() to find the first value
Using idiomatic Materialize SQL, the following example finds the highest item
price in each order and calculates the difference between the price of each item
in the order and the highest price. The example uses a subquery that groups by
the order_id
and selects MAX(price)
to find the highest price (i.e., first
value if ordered by descending price values).
|
|
|
Use MIN() and MAX() to find the first values
Using idiomatic Materialize SQL, the following example finds the lowest and the
highest item price in each order and calculates the difference between each item
in the order and these prices. The example uses a subquery that groups by the
order_id
and selects MIN(price)
as the lowest price (i.e., first
value if ordered by price values) and MAX(price)
as the
highest price (i.e., first
value if ordered by descending price values)
|
|
|