Data lake tables

This page describes the job options used when writing data to data lake tables or Upsolver-managed tables.

Tables created within SQLake using your metastore connection are considered Upsolver-managed tables. These tables can still be queried externally. For example, a Glue Catalog table created within SQLake can be queried both from SQLake itself and from the Athena console.

Syntax

[ ADD_MISSING_COLUMNS = { TRUE | FALSE } ]
[ AGGREGATION_PARALLELISM = <integer> ]
[ COMMENT = '<comment>' ]
[ COMPUTE_CLUSTER = <cluster_identifier> ]
[ END_AT = { NOW | <timestamp> } ]
[ FLATTEN_PATHS = (<array_path> [, ...]) ]
[ RUN_INTERVAL = <integer> { MINUTE[S] | HOUR[S] | DAY[S] } ]
[ RUN_PARALLELISM = <integer> ]
[ START_FROM = { NOW | BEGINNING | <timestamp> } ]
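
As a rough illustration of how these options might appear in a job definition, the sketch below sets a few of them on a transformation job. Only the option names and value forms are taken from the syntax above. The surrounding CREATE JOB ... AS INSERT INTO statement, the MAP_COLUMNS_BY_NAME keyword, and the WHERE clause follow the usual SQLake transformation-job pattern and are assumptions here, and every job, catalog, table, and column name is hypothetical.

```sql
-- Minimal sketch, not a definitive job definition.
-- Hypothetical names: load_orders_to_lake, default_glue_catalog.my_db.*,
-- order_id, customer_id, total.
-- Assumed (not defined on this page): the CREATE JOB wrapper,
-- MAP_COLUMNS_BY_NAME, and the $event_time filter.
CREATE JOB load_orders_to_lake
    COMMENT = 'Load raw orders into the data lake'
    START_FROM = BEGINNING
    RUN_INTERVAL = 5 MINUTES
    AS INSERT INTO default_glue_catalog.my_db.orders MAP_COLUMNS_BY_NAME
    SELECT order_id, customer_id, total
    FROM default_glue_catalog.my_db.orders_raw
    WHERE $event_time BETWEEN run_start_time() AND run_end_time();
```

The columns are listed explicitly because ADD_MISSING_COLUMNS is left at its default of false in this sketch.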

Job options

ADD_MISSING_COLUMNS

Type: Boolean

Default: false

(Optional) When true, columns that don't exist in the target table are added automatically when encountered.

When false, you cannot use SELECT * in the SELECT statement of your transformation job.
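
The sketch below illustrates the rule above: with ADD_MISSING_COLUMNS set to TRUE, the job can use SELECT *, and columns missing from the target table are added as they are encountered. The job, catalog, and table names are hypothetical, and the CREATE JOB wrapper, MAP_COLUMNS_BY_NAME, and the WHERE clause are assumptions based on the usual SQLake transformation-job pattern rather than anything defined on this page.

```sql
-- Minimal sketch, assuming the usual SQLake transformation-job form.
-- Hypothetical names: copy_all_order_fields, default_glue_catalog.my_db.*.
-- ADD_MISSING_COLUMNS = TRUE is what permits SELECT * here.
CREATE JOB copy_all_order_fields
    ADD_MISSING_COLUMNS = TRUE
    AS INSERT INTO default_glue_catalog.my_db.orders_all_fields MAP_COLUMNS_BY_NAME
    SELECT *
    FROM default_glue_catalog.my_db.orders_raw
    WHERE $event_time BETWEEN run_start_time() AND run_end_time();
```

If the option were omitted or set to FALSE, the SELECT * would need to be replaced with an explicit column list.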

FLATTEN_PATHS

Type: Array<String>

Default: ()

(Optional) Specifies the arrays used to flatten the output rows. See the Flattening arrays guide for details and examples.
