VALUE_INDEX_IN_ARRAY
Calculates a 1-based positional index for each non-null value within a sub-array, resetting the index with each new sub-array. This is particularly useful for maintaining context-specific indexes within nested array structures.
Syntax
VALUE_INDEX_IN_ARRAY(array)
Arguments
array
array
Type: any
An array of any value types.
Example
Transformation job example
In this example, column orders contains the following two rows:
[{"products":[{"name":"Apple"},{"name":"Banana"}]},{"products":[{"name":"Cherry"},{"name":"Cherry"}]}]
[{"products":[{"name":"Apple","type":"Fruit"},{"name":"Banana"}]},{"products":[{"name":"Cherry"},{"name":"Cherry"},{"type":"Berry"}]}]
If we use this in the following job, it will output the query result below:
CREATE JOB demo_item_index
RUN_INTERVAL = 1 MINUTE
START_FROM = BEGINNING
ADD_MISSING_COLUMNS = TRUE
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data
MAP_COLUMNS_BY_NAME
SELECT
*,
VALUE_INDEX_IN_ARRAY(orders[].products[].name) AS orders[].products[].name_index,
VALUE_INDEX_IN_ARRAY(orders[].products[].type) AS orders[].products[].type_index
FROM default_glue_catalog.upsolver_samples.orders_raw_data
WHERE time_filter();
Query result
The VALUE_INDEX_IN_ARRAY
function calculates the positional index of the value in the products array and returns the following result:
[{"products":
[{"name":"Apple","name_index":1},{"name":"Banana","name_index":2}]},{"products":[{"name":"Cherry","name_index":1},{"name":"Cherry","name_index":2}]
}]
[{"products":
[{"name":"Apple","name_index":1,"type":"Fruit","type_index":1},{"name":"Banana","name_index":2}]},{"products":[{"name":"Cherry","name_index":1},{"name":"Cherry","name_index":2},{"type":"Berry","type_index":1}]}]
Unlike VALUE_INDEX_IN_ROW
, VALUE_INDEX_IN_ARRAY
resets the count for each products array. Each product is a separate context that provides an index for the position of name or type field in that specific array.
Last updated