COLLECT_SET

A set of all values related to a field.

Syntax

COLLECT_SET([MAX VALUES,] VALUE)

Arguments

MAX VALUES: The maximum number of entries that can be counted (default: 2,147,483,647). ‌ MAX VALUES is optional. If it is omitted there is, in effect, no limit. VALUE: The field that can be counted.

Returns

An ARRAY of the argument type.

Notes

The order of elements in the array is non-deterministic. NULL values are excluded.

Example

Data

[
   {
      "GROUP_ID":"G1",
      "USER_ID":"U1",
      "CONNECTION_TIME":"2020-06-26 02:31:29,573"
   },
   {
      "GROUP_ID":"G1",
      "USER_ID":"U2",
      "CONNECTION_TIME":"2020-06-26 18:11:45,783"
   },
   {
      "GROUP_ID":"G2",
      "USER_ID":"Z1",
      "CONNECTION_TIME":"2020-06-26 23:54:27,687"
   },
   {
      "GROUP_ID":"G2",
      "CONNECTION_TIME":"2020-07-26 23:54:27,687"
   },
   {
      "GROUP_ID":"G1",
      "USER_ID":"U2",
      "CONNECTION_TIME":"2021-07-01 02:31:29,573"
   },
   {
      "GROUP_ID":"G1",
      "USER_ID":"U1",
      "CONNECTION_TIME":"2021-07-01 18:11:45,783"
   },
   {
      "GROUP_ID":"G2",
      "USER_ID":"Z1",
      "CONNECTION_TIME":"2021-07-01 23:54:27,687"
   }
]

Query

List the distinct entries for each unique group.

SET
   DATE_TIME = TO_DATE(data.CONNECTION_TIME);
SELECT
   COLLECT_SET(data.USER_ID) AS collect_set_data_group_id:STRING,
   data.GROUP_ID AS group_id:STRING 
FROM
   "SAMPLE_DATA_G1U1 - json" 
GROUP BY
   data.GROUP_ID

Results

{
   "collect_set_data_group_id":[
      "Z1"
   ],
   "group_id":"G2"
}{
   "collect_set_data_group_id":[
      "U1",
      "U2"
   ],
   "group_id":"G1"
}

COLLECT_SET_EACH

Dialog

Last updated