Search results

    Edit on GitHub

    Aggregations

    Aggregations are functions for grouping multiple events together to form a more significant result.

    Unlike databases, Upsolver runs continuous queries and not ad-hoc queries. Therefore, aggregation results are incrementally updated with every incoming event.

    Aggregation functions require windowing to split a stream into buckets of data that can be aggregated.

    Aggregation functions can return a single value or a hash table as explained in each function.COUNT_ALL(hash_result_key optional)

    Total number of rows within a time window.

    Input parameters:

    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – not ignored.

    Examples:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    COUNT_ALL() 4
    COUNT_ALL(id) {"A":1, "B":2}

    COUNT (field, hash_result_key optional)

    Number of times a field appeared within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Examples:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    COUNT(val) 3
    COUNT(val_arr) 6
    COUNT(val_arr, id) {"A":2, "B":4}

    COUNT_DISTINCT(field, hash_result_key optional)

    Distinct number of values for a field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    COUNT_DISTINCT(val) 3
    COUNT_DISTINCT(val_arr) 4
    COUNT_DISTINCT(val_arr {"A":2, "B":3}

    MIN(field, hash_result_key optional)

    Lowest value of a numeric field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    MIN(val) 1
    MIN(val_arr) 1
    MIN(val_arr, id) {"A":1, "B":2}

    MAX(field, hash_result_key optional)

    Highest value of a numeric field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    MAX(val) 3
    MAX(val_arr) 4
    MAX(val_arr, id) {"A":2", "B":4}

    SUM(field, hash_result_key optional)

    Sum of values for a numeric field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    SUM(val) 6
    SUM(val_arr) 15
    SUM((val_arr, id) {"A":3, "B":12}

    AVG(field, hash_result_key optional)

    Average value for a numeric field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    AVG(val) 2
    AVG(val_arr) 5
    AVG(val_arr, id) {"A":1.5, "B":3}

    LAST(field, hash_result_key optional)

    Last value of a field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    LAST(val) 3
    LAST(val_arr) 4
    LAST(val_arr, id) {"A":2, "B":4}

    FIRST(field, hash_result_key optional)

    First value of a field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    FIRST(val) 1
    FIRST(val_arr) 1
    FIRST(val_arr, id) {"A":1, "B":2}

    LAST_K(field, K, hash_result_key optional)

    Last K values of a field within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.
    • K parameter – max items returned by the aggregation. Default value=1000.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    LAST_K(val,2) [1,2]
    LAST_K(val_arr,2) [3,4]
    LAST_K(val_arr,2, id) {"A":[1,2], "B":[3,4]}

    CONCAT(field, delimiter, hash_result_key optional)

    Concatenating all field values within a time window.

    Input parameters:

    • field - field name in a Data Source.
    • delimiter – text for separating values. Default value – “,”.
    • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

    Note:

    • NULL and missing values – ignored.
    • Arrays - each item in an array is processed separately.
    • K parameter – max items returned by the aggregation. Default value=1000.

    Example:

    For the following stream of events

    {}
    {"id":"A", "val":1, "val_arr":[1,2]}
    {"id":"B", "val":2, "val_arr":[2,3]}
    {"id":"B", "val":3, "val_arr":[3,4]}
    

    The table below contains possible aggregations

    Aggregation Result
    CONCAT(val, “,”) “1,2,3”
    CONCAT(val_arr, “,”) “1,2,2,3,3,4”
    CONCAT(val_arr, “,”, id) {"A":1, "B":2}