COLLECT_SET
A set of all values related to a field.
Syntax
COLLECT_SET([MAX VALUES,] VALUE)
Arguments
MAX VALUES:
The maximum number of entries that can be counted (default: 2,147,483,647).
MAX VALUES
is optional. If it is omitted there is, in effect, no limit.
VALUE:
The field that can be counted.
Returns
An ARRAY of the argument type.
Notes
The order of elements in the array is non-deterministic. NULL values are excluded.
Example
Data
[
{
"GROUP_ID":"G1",
"USER_ID":"U1",
"CONNECTION_TIME":"2020-06-26 02:31:29,573"
},
{
"GROUP_ID":"G1",
"USER_ID":"U2",
"CONNECTION_TIME":"2020-06-26 18:11:45,783"
},
{
"GROUP_ID":"G2",
"USER_ID":"Z1",
"CONNECTION_TIME":"2020-06-26 23:54:27,687"
},
{
"GROUP_ID":"G2",
"CONNECTION_TIME":"2020-07-26 23:54:27,687"
},
{
"GROUP_ID":"G1",
"USER_ID":"U2",
"CONNECTION_TIME":"2021-07-01 02:31:29,573"
},
{
"GROUP_ID":"G1",
"USER_ID":"U1",
"CONNECTION_TIME":"2021-07-01 18:11:45,783"
},
{
"GROUP_ID":"G2",
"USER_ID":"Z1",
"CONNECTION_TIME":"2021-07-01 23:54:27,687"
}
]
Query
List the distinct entries for each unique group.
SET
DATE_TIME = TO_DATE(data.CONNECTION_TIME);
SELECT
COLLECT_SET(data.USER_ID) AS collect_set_data_group_id:STRING,
data.GROUP_ID AS group_id:STRING
FROM
"SAMPLE_DATA_G1U1 - json"
GROUP BY
data.GROUP_ID
Results
{
"collect_set_data_group_id":[
"Z1"
],
"group_id":"G2"
}{
"collect_set_data_group_id":[
"U1",
"U2"
],
"group_id":"G1"
}
Related Functions
Dialog
Last updated
Was this helpful?