Upsolver SQLake
Search…
⌃K

Content type options

When reading in your data, additional options can be configured for the following content types:

CSV

CONTENT_TYPE = (
TYPE = CSV
INFER_TYPES = { TRUE | FALSE }
[ HEADER = '<header>' ]
[ DELIMITER = '<delimiter>' ]
[ QUOTE_ESCAPE_CHAR = '<char>' ]
[ NULL_VALUE = '<null_value>' ]
[ MAX_COLUMNS = <integer> ]
[ ALLOW_DUPLICATE_HEADERS = { TRUE | FALSE } ]
)

INFER_TYPES

Type: boolean
(Optional) When true, each column's data types are inferred as one of the following types: string, integer, double, boolean.
When false, all data is treated as a string.
Type: string
Default: Empty string
(Optional) When the header is within the data itself, this option does not need to be required.
However, if the data contains no header, then it is required to provide it here.

DELIMITER

Type: text
Default: ,
(Optional) The delimiter used for columns in the CSV file

QUOTE_ESCAPE_CHAR

Type: text
Default: "
(Optional) Defines the character used for escaping quotes inside an already quoted value.

NULL_VALUE

Type: text
(Optional) Values in the CSV that match the provided value are interpreted as null.

MAX_COLUMNS

Type: integer
(Optional) The number of columns to allocate when reading a row. Note that larger values may perform poorly.

ALLOW_DUPLICATE_HEADERS

Type: boolean
Default: false
(Optional) When true, repeat headers are allowed. Numeric suffixes are added for disambiguation.

TSV

CONTENT_TYPE = (
TYPE = TSV
INFER_TYPES = { TRUE | FALSE }
[ HEADER = '<header>' ]
[ NULL_VALUE = '<null_value>' ]
[ MAX_COLUMNS = <integer> ]
[ ALLOW_DUPLICATE_HEADERS = { TRUE | FALSE } ]
)

INFER_TYPES

Type: boolean
(Optional) When true, each column's data types are inferred as one of the following types: string, integer, double, boolean.
When false, all data is treated as a string.

HEADER

Type: string
Default: Empty string
(Optional) When the header is within the data itself, this option does not need to be required.
However, if the data contains no header, then it is required to provide it here.

NULL_VALUE

Type: text
(Optional) Values in the TSV that match the provided value are interpreted as null.

MAX_COLUMNS

Type: integer
(Optional) The number of columns to allocate when reading a row. Note that larger values may perform poorly.

ALLOW_DUPLICATE_HEADERS

Type: boolean
Default: false
(Optional) When true, repeat headers are allowed. Numeric suffixes are added for disambiguation.

JSON

CONTENT_TYPE = (
TYPE = JSON
[ SPLIT_ROOT_ARRAY = { TRUE | FALSE } ]
[ STORE_JSON_AS_STRING = { TRUE | FALSE } ]
)

SPLIT_ROOT_ARRAY

Type: boolean
Default: true
(Optional) When true, a root object that is an array is parsed as separate events. When false, it is parsed as a single event which contains only an array.

STORE_JSON_AS_STRING

Type: boolean
Default: false
(Optional) When true, a copy of the original JSON is stored as a string value in an additional column.

SCHEMA_REGISTRY

Note that only Avro schemas are currently supported.
CONTENT_TYPE = (
TYPE = SCHEMA_REGISTRY
SCHEMA_REGISTRY_URL = '<url>'
)

SCHEMA_REGISTRY_URL

Type: text
The URL of the getting schema where {id} is the ID of the schema.

FIXED_WIDTH

CONTENT_TYPE = (
TYPE = FIXED_WIDTH
[ COLUMNS = ( (COLUMN_NAME = '<column_name>'
START_INDEX = <integer>
END_INDEX = <integer>) [,...] ) ]
[ INFER_TYPES = { TRUE | FALSE } ]
)

COLUMNS

Type: list
(Optional) An array of the name, start index, and end index for each column in the file.

INFER_TYPES

Type: boolean
Default: false
(Optional) When true, each column's data types are inferred. When false, all data is treated as a string.

REGEX

See: Java Pattern
CONTENT_TYPE = (
TYPE = REGEX
[ PATTERN = '<pattern>' ]
[ MULTILINE = { TRUE | FALSE } ]
[ INFER_TYPES = { TRUE | FALSE } ]
)

PATTERN

Type: text
(Optional) The pattern to match against the input. Named groups are extracted from the data.

MULTILINE

Type: boolean
Default: false
(Optional) When true, the pattern is matched against the whole input. When false, it is matched against each line of the input.

INFER_TYPES

Type: boolean
Default: false
(Optional) When true, each column's data types are inferred. When false, all data is treated as a string.

SPLIT_LINES

CONTENT_TYPE = (
TYPE = SPLIT_LINES
PATTERN = '<pattern>'
)

PATTERN

Type: text
(Optional) A regular expression pattern to split the data by. If left empty, the data is split by lines.

XML

CONTENT_TYPE = (
TYPE = XML
[ STORE_ROOT_AS_STRING = { TRUE | FALSE } ]
)

STORE_ROOT_AS_STRING

Type: boolean
Default: false
(Optional) When true, a copy of the XML is stored as a string in an additional column.