Aggregations

Aggregations are functions for grouping multiple events together to form a more significant result.

Unlike databases, Upsolver runs continuous queries and not ad-hoc queries. Therefore, aggregation results are incrementally updated with every incoming event.

Aggregation functions require windowing to split a stream into buckets of data that can be aggregated.

Aggregation functions can return a single value or a hash table as explained in each function.COUNT_ALL(hash_result_key optional)

Total number of rows within a time window.

Input parameters:

  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – not ignored.

Examples:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
COUNT_ALL() 4
COUNT_ALL(id) {"A":1, "B":2}

COUNT (field, hash_result_key optional)

Number of times a field appeared within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Examples:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
COUNT(val) 3
COUNT(val_arr) 6
COUNT(val_arr, id) {"A":2, "B":4}

COUNT_DISTINCT(field, hash_result_key optional)

Distinct number of values for a field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
COUNT_DISTINCT(val) 3
COUNT_DISTINCT(val_arr) 4
COUNT_DISTINCT(val_arr {"A":2, "B":3}

MIN(field, hash_result_key optional)

Lowest value of a numeric field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
MIN(val) 1
MIN(val_arr) 1
MIN(val_arr, id) {"A":1, "B":2}

MAX(field, hash_result_key optional)

Highest value of a numeric field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
MAX(val) 3
MAX(val_arr) 4
MAX(val_arr, id) {"A":2", "B":4}

SUM(field, hash_result_key optional)

Sum of values for a numeric field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
SUM(val) 6
SUM(val_arr) 15
SUM((val_arr, id) {"A":3, "B":12}

AVG(field, hash_result_key optional)

Average value for a numeric field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
AVG(val) 2
AVG(val_arr) 5
AVG(val_arr, id) {"A":1.5, "B":3}

LAST(field, hash_result_key optional)

Last value of a field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
LAST(val) 3
LAST(val_arr) 4
LAST(val_arr, id) {"A":2, "B":4}

FIRST(field, hash_result_key optional)

First value of a field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
FIRST(val) 1
FIRST(val_arr) 1
FIRST(val_arr, id) {"A":1, "B":2}

LAST_K(field, K, hash_result_key optional)

Last K values of a field within a time window.

Input parameters:

  • field - field name in a Data Source.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.
  • K parameter – max items returned by the aggregation. Default value=1000.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
LAST_K(val,2) [1,2]
LAST_K(val_arr,2) [3,4]
LAST_K(val_arr,2, id) {"A":[1,2], "B":[3,4]}

CONCAT(field, delimiter, hash_result_key optional)

Concatenating all field values within a time window.

Input parameters:

  • field - field name in a Data Source.
  • delimiter – text for separating values. Default value – “,”.
  • hash_result_key parameter – used to return a hash table, mapping each key to an aggregation value.

Note:

  • NULL and missing values – ignored.
  • Arrays - each item in an array is processed separately.
  • K parameter – max items returned by the aggregation. Default value=1000.

Example:

For the following stream of events

{}
{"id":"A", "val":1, "val_arr":[1,2]}
{"id":"B", "val":2, "val_arr":[2,3]}
{"id":"B", "val":3, "val_arr":[3,4]}

The table below contains possible aggregations

Aggregation Result
CONCAT(val, “,”) “1,2,3”
CONCAT(val_arr, “,”) “1,2,2,3,3,4”
CONCAT(val_arr, “,”, id) {"A":1, "B":2}

results matching ""

    No results matching ""