VALUE_INDEX_IN_ROW

Calculates a continuous, 1-based positional index for each value in an array, treating the entire row as a single context. It assigns a unique index to each non-null element across the arrays within the row.

Syntax

VALUE_INDEX_IN_ROW(array)

Arguments

`array`

Type: any

An array of any value types.

Example

Transformation job example

In this example, the orders column contains the following two rows:

[{"products":[{"name":"Apple"},{"name":"Banana"}]},{"products":[{"name":"Cherry"},{"name":"Cherry"}]}]
[{"products":[{"name":"Apple","type":"Fruit"},{"name":"Banana"}]},{"products":[{"name":"Cherry"},{"name":"Cherry"},{"type":"Berry"}]}]

If we use this in the following job, it will output the query result below:

Transformation job example

SQL

CREATE JOB index_products
   RUN_INTERVAL = 1 MINUTE
   START_FROM = BEGINNING
   ADD_MISSING_COLUMNS = TRUE
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data
  MAP_COLUMNS_BY_NAME
   SELECT
    *,
    VALUE_INDEX_IN_ROW(orders[].products[].name) AS orders[].products[].name_index,
    VALUE_INDEX_IN_ROW(orders[].products[].type) AS orders[].products[].type_index
  FROM default_glue_catalog.upsolver_samples.orders_raw_data 
  WHERE time_filter();

Query result

The function VALUE_INDEX_IN_ROW ("name_index") returns a unique index for each name field within the row, regardless of which products object it belongs to, and the VALUE_INDEX_IN_ROW ("type_index") returns a unique index for the type field.

Now the value of orders is:

[{"products":
[{"name":"Apple","name_index":1},{"name":"Banana","name_index":2}]},{"products":[{"name":"Cherry","name_index":3},{"name":"Cherry","name_index":4}]
}]
[{"products":
[{"name":"Apple","name_index":1,"type":"Fruit","type_index":1},"name":"Banana","name_index":2}]},{"products":[{"name":"Cherry","name_index":3},{"name":"Cherry","name_index":4},{"type":"Berry","type_index":2}]
}]

VALUE_INDEX_IN_ROW calculates the positional index of the value in the context of its own row by assigning the value to the same array containing the items we index. In the above example, where we have products inside orders), it will assign a unique index for each name and type.

Two similar rows have been added in order to demonstrate the behavior of nulls in records that don’t contain type.

The index is agnostic to the specific products object in which the name and type field is found, and provides a continuous count of name and type fields within the row. We can think of it as a global index in the row.

Last updated 1 year ago