Data Lake Tables

These job options are used when writing to data lake tables or Upsolver-managed tables.

Tables created within Upsolver using your metastore connection are considered Upsolver-managed tables. These tables can still be queried externally. For example, you can create an AWS Glue Data Catalog table within Upsolver and then query it either within Upsolver itself or from your Athena console.

Job options

[ ADD_MISSING_COLUMNS = { TRUE | FALSE } ]
[ AGGREGATION_PARALLELISM = <integer> ]
[ COMMENT = '<comment>' ]
[ COMPUTE_CLUSTER = <cluster_identifier> ]
[ END_AT = { NOW | timestamp } ]
[ FLATTEN_PATHS = (<array_path> [, ...]) ]
[ RUN_INTERVAL = <integer> { MINUTE[S] | HOUR[S] | DAY[S] } ]
[ RUN_PARALLELISM = <integer> ]
[ START_FROM = { NOW | BEGINNING | timestamp } ]

`AGGREGATION_PARALLELISM` — editable

Type: integer

Default: 1

(Optional) Only supported when the query contains aggregations. Formerly known as "output sharding."
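As a rough sketch of where this option sits, the following transformation job aggregates orders per customer. The job name, catalog, schema, table, and column names are all hypothetical, and the statement shape around the option (`CREATE JOB ... AS INSERT INTO ... MAP_COLUMNS_BY_NAME`) is assumed from typical Upsolver transformation jobs rather than taken from this page.

```sql
-- Hypothetical names throughout; only the job options themselves
-- come from the option list on this page.
CREATE JOB aggregate_orders_by_customer
    AGGREGATION_PARALLELISM = 2      -- only meaningful because the query aggregates
    RUN_INTERVAL = 1 MINUTE
    START_FROM = BEGINNING
AS INSERT INTO default_glue_catalog.sales.orders_by_customer MAP_COLUMNS_BY_NAME
    SELECT customer_id,
           COUNT(*)       AS order_count,
           SUM(net_total) AS total_spent
    FROM default_glue_catalog.sales.orders_raw
    WHERE $event_time BETWEEN run_start_time() AND run_end_time()
    GROUP BY customer_id;

-- Because the option is marked editable, it can presumably be changed later.
-- The ALTER JOB ... SET form below is an assumption, not confirmed by this page.
ALTER JOB aggregate_orders_by_customer
    SET AGGREGATION_PARALLELISM = 4;
```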

`ADD_MISSING_COLUMNS`

Type: Boolean

Default: false

(Optional) When true, columns that don't exist in the target table are added automatically when encountered.

When false, you cannot use SELECT * in the SELECT statement of your transformation job.
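The sketch below shows one way the option might be combined with `SELECT *`. The job, catalog, schema, and table names are hypothetical, and the surrounding statement syntax is assumed from typical Upsolver transformation jobs; only the option names come from the list above.

```sql
-- Hypothetical names; ADD_MISSING_COLUMNS is set to TRUE so that
-- SELECT * is permitted and new source columns are added to the target.
CREATE JOB copy_orders_all_columns
    ADD_MISSING_COLUMNS = TRUE
    START_FROM = BEGINNING
    RUN_INTERVAL = 1 MINUTE
AS INSERT INTO default_glue_catalog.sales.orders_copy MAP_COLUMNS_BY_NAME
    SELECT *
    FROM default_glue_catalog.sales.orders_raw
    WHERE $event_time BETWEEN run_start_time() AND run_end_time();
```

With the default of false, the same job would need an explicit column list in place of `SELECT *`.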

`FLATTEN_PATHS`

Type: Array<String>

Default: ()

(Optional) Specifies the array paths used to flatten the output rows. See the Flattening Arrays guide for details and examples.
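As a loose illustration only, a job that flattens a nested array of order items might look like the following. All names are hypothetical, and the array path notation (`data.items[]`) is an assumption; the Flattening Arrays guide remains the authoritative source for the exact syntax.

```sql
-- Hypothetical names and assumed array-path notation.
-- The intent: emit one output row per element of data.items[].
CREATE JOB flatten_order_items
    FLATTEN_PATHS = (data.items[])
    START_FROM = BEGINNING
    RUN_INTERVAL = 1 MINUTE
AS INSERT INTO default_glue_catalog.sales.order_items MAP_COLUMNS_BY_NAME
    SELECT order_id,
           data.items[].name     AS item_name,
           data.items[].quantity AS item_quantity
    FROM default_glue_catalog.sales.orders_raw
    WHERE $event_time BETWEEN run_start_time() AND run_end_time();
```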

Last updated 1 year ago
