SimpleAggregateFunction Type
Description
The SimpleAggregateFunction data type stores the intermediate state of an
aggregate function, but not its full state as the AggregateFunction type does.
This optimization can be applied to functions for which the following property holds:
the result of applying a function f to a row set S1 UNION ALL S2 can be obtained by
applying f to parts of the row set separately, and then applying f again to the
results: f(S1 UNION ALL S2) = f(f(S1) UNION ALL f(S2)).
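The property can be tried out on a small example. The sketch below is not ClickHouse code: it uses plain Python lists as stand-ins for the row sets S1 and S2, with list concatenation playing the role of UNION ALL. The helper name `is_mergeable_on` and the sample values are illustrative assumptions. A passing check on one pair of row sets does not prove the property in general; it only shows what the property means.

```python
from statistics import mean

def is_mergeable_on(f, s1, s2):
    """Check f(S1 UNION ALL S2) == f([f(S1), f(S2)]) for one pair of row sets."""
    combined = f(s1 + s2)            # aggregate over the whole row set
    from_parts = f([f(s1), f(s2)])   # aggregate the two partial results again
    return combined == from_parts

s1 = [3, 1, 4]
s2 = [1, 5, 9, 2, 6]

min_ok = is_mergeable_on(min, s1, s2)   # property holds for this input
max_ok = is_mergeable_on(max, s1, s2)   # property holds for this input
sum_ok = is_mergeable_on(sum, s1, s2)   # property holds for this input
avg_ok = is_mergeable_on(mean, s1, s2)  # fails: an average of averages differs
```

Here min, max, and sum pass the check, while the average does not, because the two parts have different sizes and the average of the partial averages ignores that.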
This property guarantees that partial aggregation results are enough to compute
the combined one, so we do not have to store and process any extra data. For
example, the min and max functions require no extra steps to calculate the final
result from the intermediate results, whereas the avg function requires keeping
track of a sum and a count, which are divided to get the average in a final Merge
step that combines the intermediate states.
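The difference between the two kinds of functions can be sketched in Python. The function names below (`merge_min`, `avg_state`, `merge_avg_states`, `finalize_avg`) are hypothetical and only model the idea; they are not a description of how ClickHouse implements these functions internally.

```python
def merge_min(a, b):
    # For min, a partial result is already a value of the final type,
    # so merging two partial results is just another min.
    return min(a, b)

def avg_state(rows):
    # avg has to carry a (sum, count) pair as its intermediate state.
    return (sum(rows), len(rows))

def merge_avg_states(a, b):
    # Merging adds the sums and the counts component-wise.
    return (a[0] + b[0], a[1] + b[1])

def finalize_avg(state):
    # The division happens only once, in the final step.
    total, count = state
    return total / count

part1 = [10, 20]
part2 = [30, 40, 50]

merged_min = merge_min(min(part1), min(part2))
merged_avg = finalize_avg(merge_avg_states(avg_state(part1), avg_state(part2)))
```

In this sketch the min path never needs anything beyond the values themselves, whereas the avg path needs a separate state representation and a finalization step.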
Values of this type are commonly produced by calling an aggregate function
with the -SimpleState combinator appended to the function name.
Syntax
SimpleAggregateFunction(aggregate_function_name, Type)
Parameters
aggregate_function_name - The name of an aggregate function.
Type - Types of the aggregate function arguments.
Supported functions
The following aggregate functions are supported:
any
anyLast
min
max
sum
sumWithOverflow
groupBitAnd
groupBitOr
groupBitXor
groupArrayArray
groupUniqArrayArray
groupUniqArrayArrayMap
sumMap
minMap
maxMap
Values of the SimpleAggregateFunction(func, Type) type have the same Type,
so unlike with the AggregateFunction type there is no need to apply
-Merge/-State combinators.
The SimpleAggregateFunction type has better performance than the AggregateFunction
type for the same aggregate functions.