UNNEST
The
UNNEST
operator is useful for converting nested objects with arrays to flat tables. With Upsolver, UNNEST
allows you to flatten arrays based on a full SELECT
statement. In certain cases, using
UNNEST
may produce a Cartesian product of the column's array values in your result. This means that the values in the flattened arrays will appear in every possible combination within your result.For example. given the arrays
[ 1, 2, 3 ]
and [ 4, 5 ]
, their Cartesian product would be [ (1 , 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5) ]
.If this is the desired result, enable the job option
ALLOW_CARTESIAN_PRODUCTS
to allow the flattening of the arrays.{
"values": [ 1, 2, 3 ],
"name": "Oleg",
"id": 123
}
UNNEST(SELECT data.values[] as value,
data.name as name,
data.id as id
FROM my_data_source)
Since the data was flattened based on the
values[]
array that contained three values, the result contains three rows from one source event.value | name | id |
---|---|---|
1 | Oleg | 123 |
2 | Oleg | 123 |
3 | Oleg | 123 |
{
"values": ["apple", "NY"],
"type": ["fruit", "city"]
}
UNNEST(SELECT data.values[] as value,
data.type[] as type
FROM my_data_source)
Since the selected arrays are independent without any shared context, the result contains a Cartesian product.
value | type |
---|---|
apple | fruit |
NY | fruit |
apple | city |
NY | city |
In this case, we can see that it doesn't make sense to pair
NY
with fruit
and apple
with city
, so the ZIP
function should be used to first combine the arrays into a single context.UNNEST(SELECT zipped[].value as value
zipped[].type as type
FROM my_data_source
LET zipped = ZIP('type,value', data.type[], data.values[]))
Now we have achieved our desired output for our data.
value | type |
---|---|
apple | fruit |
NY | city |
{
"orders": [{ "order_lines": [ 1, 2, 3 ], "name": "a" },
{ "order_lines": [ 4, 5, 6 ], "name": "b" }]
"refunds": [ 1, 2 ]
}
UNNEST(SELECT orders[].name as name,
orders[].order_lines[] as line
FROM my_data_source)
Since there is a natural pairing between
orders[].name
and orders[].order_lines[]
where each name
has a corresponding order_lines
array, using UNNEST
on this query does not result in a Cartesian product.name | line |
---|---|
a | 1 |
a | 2 |
a | 3 |
b | 4 |
b | 5 |
b | 6 |
UNNEST(SELECT orders[].name as name,
orders[].order_lines[] as line,
refunds[] as refund
FROM my_data_source)
Since there is no relationship between
orders[]
and refunds[]
in our data, using UNNEST
on this query will result in a Cartesian product between the result of pairing orders[]
and refunds[]
.name | line | refund |
---|---|---|
a | 1 | 1 |
a | 2 | 1 |
a | 3 | 1 |
b | 4 | 1 |
b | 5 | 1 |
b | 6 | 1 |
a | 1 | 2 |
a | 2 | 2 |
a | 3 | 2 |
b | 4 | 2 |
b | 5 | 2 |
b | 6 | 2 |
{
"orders": [{ "order_lines": [ 1, 2, 3 ],
"name": "a",
"order_date": [ "07/30/2021", "11/27/2021" ]},
{ "order_lines": [ 4, 5, 6 ],
"name": "b",
"order_date": [ "03/21/2021", "09/13/2021" ]}]
}
UNNEST(SELECT orders[].name as name,
orders[].order_lines[] as line
FROM my_data_source)
Since there is a natural pairing between
orders[].name
and orders[].order_lines[]
where each name
has a corresponding order_lines
array, using UNNEST
on this query does not result in a Cartesian product.name | line |
---|---|
a | 1 |
a | 2 |
a | 3 |
b | 4 |
b | 5 |
b | 6 |
UNNEST(SELECT orders[].name as name,
orders[].order_lines[] as line,
orders[].order_date[] as order_date
FROM my_data_source)
While
orders[].order_lines[]
and orders[].order_date[]
are both part of orders[]
, there is no natural pairing between the values in the two respective arrays. As such, using UNNEST
on this query will result in a Cartesian product.name | line | order_date |
---|---|---|
1 | a | 07/30/2021 |
2 | a | 07/30/2021 |
3 | a | 07/30/2021 |
1 | a | 11/27/2021 |
2 | a | 11/27/2021 |
3 | a | 11/27/2021 |
4 | b | 03/21/2021 |
5 | b | 3/21/2021 |
6 | b | 3/21/2021 |
4 | b | 09/13/2021 |
5 | b | 09/13/2021 |
6 | b | 09/13/2021 |
Last modified 7mo ago