Aggregations are functions for grouping multiple events together to form a more significant result.
Unlike a database, Upsolver runs continuous queries rather than ad hoc queries: aggregation results are updated incrementally with every incoming event.
Aggregation functions require windowing to split a stream into buckets of data that can be aggregated.
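As a rough illustration of the incremental model described above (not Upsolver's actual implementation), a per-key COUNT can be updated in place as each event arrives, rather than recomputed from scratch:

```python
from collections import defaultdict

# Hypothetical event stream; "id" acts as the primary key.
events = [
    {"id": "1", "data": 2},
    {"id": "1", "data": 3},
    {"id": "2", "data": 5},
    {"id": "3", "data": 8},
]

counts = defaultdict(int)
for event in events:
    # Each incoming event updates the aggregate for its key in place.
    counts[event["id"]] += 1

print(dict(counts))  # {'1': 2, '2': 1, '3': 1}
```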
The approximate number of distinct values in the time window.
Use this function instead of COUNT_DISTINCT to improve performance, but only when the result contains relatively few rows (under 1M).
The approximate count of distinct values per group in the time window.
The average value in the time window.
The average value in the time window grouped by the given key.
The average of the values per time interval.
A set of all values encountered in the time interval.
A set of all values encountered in the time interval grouped by the given key.
The number of values in the time window.
For the following stream of events:

{"id": "1", "data": 2}
{"id": "1", "data": 3}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and COUNT(data) produces the following data:

Primary Key   COUNT(data)
1             2
2             1
3             1
The number of items in the time window.
Counts the number of distinct values that appeared in the column per key value.
For the following stream of events:

{"id": "1", "data": "a"}
{"id": "1", "data": "b"}
{"id": "2", "data": "c"}
{"id": "2", "data": "c"}
{"id": "3", "data": "c"}

Using this aggregation with primary key id and COUNT_DISTINCT(data) produces the following data:

Primary Key   COUNT_DISTINCT(data)
1             2
2             1
3             1
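A minimal sketch of the per-key distinct-count semantics shown above (illustrative only, not Upsolver's implementation): track the set of values seen per key, and report each set's size.

```python
from collections import defaultdict

# The event stream from the example above.
events = [
    {"id": "1", "data": "a"},
    {"id": "1", "data": "b"},
    {"id": "2", "data": "c"},
    {"id": "2", "data": "c"},
    {"id": "3", "data": "c"},
]

# Track the set of values seen per key; the distinct count is the set size.
seen = defaultdict(set)
for event in events:
    seen[event["id"]].add(event["data"])

result = {key: len(values) for key, values in seen.items()}
print(result)  # {'1': 2, '2': 1, '3': 1}
```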
The number of items grouped by the given key.
The number of true values in the time window.

Syntax: COUNT_IF(expr)
expr: a BOOLEAN expression, either a calculated field or a column from the data stream
Returns: INT
Contents of Data Stream:

{"type": "event", "id": "1", "data": "sample data", "extendeddata": "apple"}
{"type": "event", "id": "2", "data": "sample data", "extendeddata": "watermelon"}
{"type": "event", "id": "3", "data": "sample data", "extendeddata": "cucumber"}
{"type": "event", "id": "4", "data": "sample data", "extendeddata": "Strawberry"}

Query:

SELECT type,
       COUNT_IF(data = 'sample data') AS data,
       COUNT_IF(data = 'sample data' AND extendeddata = 'apple') AS apples
FROM stream
GROUP BY type

Results:

type   data   apples
event  4      1
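The same filtering logic can be sketched in Python to check the expected counts (illustrative only; the authoritative behavior is the SQL query above):

```python
# The data stream from the example above.
events = [
    {"type": "event", "id": "1", "data": "sample data", "extendeddata": "apple"},
    {"type": "event", "id": "2", "data": "sample data", "extendeddata": "watermelon"},
    {"type": "event", "id": "3", "data": "sample data", "extendeddata": "cucumber"},
    {"type": "event", "id": "4", "data": "sample data", "extendeddata": "Strawberry"},
]

# COUNT_IF counts the events for which the boolean expression is true.
data_count = sum(1 for e in events if e["data"] == "sample data")
apples = sum(1 for e in events
             if e["data"] == "sample data" and e["extendeddata"] == "apple")

print(data_count, apples)  # 4 1
```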
Sums the values, decaying each contribution according to the decay factor and the age of the original data.
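One common form of a decayed sum, sketched here as an illustration (the exact decay formula Upsolver applies may differ), discounts each value by the decay factor raised to the value's age:

```python
def decayed_sum(events, decay_factor, now):
    """Illustrative only: each value is discounted by decay_factor
    raised to its age, so older events contribute less to the sum."""
    return sum(value * decay_factor ** (now - t) for t, value in events)

# (timestamp, value) pairs; with factor 0.5, a value observed
# 2 time units ago contributes only a quarter of its weight.
events = [(0, 100.0), (1, 100.0), (2, 100.0)]
print(decayed_sum(events, 0.5, now=2))  # 100*0.25 + 100*0.5 + 100*1.0 = 175.0
```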
Deletes the record when this is set.
Gets the current sessions, where a session is defined as events separated by no more than windowSize
time. If two window sizes are passed, the larger one is used.
The first value in the time window.
For the following stream of events:

{"id": "1", "data": 3}
{"id": "1", "data": 2}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and FIRST(data) produces the following data:

Primary Key   FIRST(data)
1             3
2             5
3             8
The first array of values in the time window.
The first value per group.
The first value per time interval.
The last value in the time window.
For the following stream of events:

{"id": "1", "data": 3}
{"id": "1", "data": 2}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and LAST(data) produces the following data:

Primary Key   LAST(data)
1             2
2             5
3             8
The last array of values in the time window.
The last value per group.
The last k values.
The last k values per group.
The last value per interval.
The maximum value in the time window.
For the following stream of events:

{"id": "1", "data": 2}
{"id": "1", "data": 3}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and MAX(data) produces the following data:

Primary Key   MAX(data)
1             3
2             5
3             8
The value corresponding to the maximum sort key in the time window.
The maximum value per group.
The maximum value per time interval.
The minimum value in the time window.
For the following stream of events:

{"id": "1", "data": 2}
{"id": "1", "data": 3}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and MIN(data) produces the following data:

Primary Key   MIN(data)
1             2
2             5
3             8
The value corresponding to the minimum sort key in the time window.
The minimum value per group.
The minimum value per time interval.
Stores the number of sessions.
A session includes all the events in the aggregation that are at most windowSize apart in their value.
For the following stream of events:

{"id": "John", "time": 1}
{"id": "John", "time": 4}
{"id": "John", "time": 2}
{"id": "John", "time": 7}

Using this aggregation with primary key id and SESSION_COUNT(time, 2) produces the following data:

Primary Key   SESSION_COUNT(time, 2)
"John"        2

With SESSION_COUNT(time, 2), a session is defined as a distance between two events of up to and including 2. Thus 1, 2, and 4 form one session, and 7 forms a second session.
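The session-splitting rule above can be sketched as follows (illustrative only, not Upsolver's implementation): sort the event times and start a new session whenever the gap between consecutive events exceeds the window size.

```python
def session_count(times, window_size):
    """Count sessions: consecutive (sorted) events at most
    window_size apart belong to the same session."""
    if not times:
        return 0
    ordered = sorted(times)
    sessions = 1
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > window_size:
            sessions += 1
    return sessions

# The events for key "John" from the example above:
# sorted times [1, 2, 4, 7]; the gap 4 -> 7 exceeds 2, so 2 sessions.
print(session_count([1, 4, 2, 7], 2))  # 2
```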
The standard deviation of values in the time window.
The standard deviation of the value per group.
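For reference, a quick sketch of a standard deviation over a window of values; whether Upsolver computes the population or sample form is not stated here, so this sketch uses the population form:

```python
import statistics

# Hypothetical values collected in one time window.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: sqrt(mean of squared deviations).
print(statistics.pstdev(values))  # 2.0
```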
The maximum string value in the time window, using case-sensitive lexicographic ordering.
Stores the maximum string value per group, using case-sensitive lexicographic ordering.
The minimum string value in the time window, using case-sensitive lexicographic ordering.
Stores the minimum string value per group, using case-sensitive lexicographic ordering.
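Note that case-sensitive lexicographic ordering sorts uppercase letters before lowercase ones. A small sketch of the effect (hypothetical values, assuming ASCII/UTF-8 code-point comparison):

```python
values = ["apple", "Banana", "cherry"]

# Uppercase sorts before lowercase ('B' is 0x42, 'a' is 0x61),
# so "Banana" is the minimum even though 'a' < 'b' case-insensitively.
print(min(values))  # Banana
print(max(values))  # cherry
```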
The sum of the values in the time window.
For the following stream of events:

{"id": "1", "data": 2}
{"id": "1", "data": 3}
{"id": "2", "data": 5}
{"id": "3", "data": 8}

Using this aggregation with primary key id and SUM(data) produces the following data:

Primary Key   SUM(data)
1             5
2             5
3             8
The sum of the values per group.
The sum of the values per time interval.
The weighted average of a value in the time window.
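A weighted average is typically the sum of value-times-weight divided by the sum of weights; a minimal sketch under that assumption (the actual function signature in Upsolver may differ):

```python
def weighted_average(pairs):
    """Illustrative sketch: sum(value * weight) / sum(weight)."""
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight

# (value, weight) pairs: 20.0 carries three times the weight of 10.0.
print(weighted_average([(10.0, 1.0), (20.0, 3.0)]))  # 17.5
```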