Kafka options
CREATE [SYNC] JOB <job_name>
[{ job_options }]
AS COPY FROM KAFKA
<connection_identifier>
[{ source_options }]
INTO <table_identifier>;
Job options
[ CONSUMER_PROPERTIES = '<properties>' ]
[ READER_SHARDS = <integer> ]
[ STORE_RAW_DATA = { TRUE | FALSE } ]
[ START_FROM = { NOW | BEGINNING } ]
[ END_AT = { NOW | <timestamp> } ]
[ COMPUTE_CLUSTER = <cluster_identifier> ]
[ RUN_PARALLELISM = <integer> ]
[ CONTENT_TYPE = { AUTO
| CSV
| JSON
| PARQUET
| TSV
| AVRO
| AVRO_SCHEMA_REGISTRY
| FIXED_WIDTH
| REGEX
| SPLIT_LINES
| ORC
| XML } ]
[ COMPRESSION = { AUTO
| GZIP
| SNAPPY
| LZO
| NONE
| SNAPPY_UNFRAMED
| KCL } ]
[ COMMENT = '<comment>' ]
CONSUMER_PROPERTIES
Type: text_area
(Optional) Additional properties to use when configuring the consumer. These properties override any settings in the Kafka connection.
READER_SHARDS
Type: integer
Default: 1
(Optional) Determines how many readers are used in parallel to read the stream.
This number does not need to equal the number of partitions in your Kafka topic.
As a guideline, increase it by 1 for every 70 MB/s sent to your topic.
STORE_RAW_DATA
Type: boolean
Default: false
(Optional) When true, stores an additional copy of the data in its original format.
START_FROM
Values: { NOW | BEGINNING }
Default: BEGINNING
(Optional) Configures the time to start ingesting data from. Data before the specified time is ignored.
END_AT
Values: { NOW | <timestamp> }
Default: Never
(Optional) Configures the time to stop ingesting data. Data after the specified time is ignored. Timestamps should be in UTC and use the following format: TIMESTAMP 'YYYY-MM-DD HH:MM:SS'.
COMPUTE_CLUSTER
Type: identifier
Default: The sole cluster in your environment
(Optional) The compute cluster that runs this job.
This option can be omitted only when there is just one cluster in your environment.
Once you have more than one compute cluster, you must specify which one to use through this option.
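As an illustration of START_FROM, END_AT, and COMPUTE_CLUSTER used together, the following is a minimal sketch of a bounded ingestion job. It assumes that job options are written directly after the job name, as the syntax at the top of this page suggests. The job, cluster, connection, topic, and table names are all hypothetical placeholders.

```sql
-- Sketch only: every identifier below is a hypothetical placeholder.
CREATE SYNC JOB backfill_orders_january
    START_FROM = BEGINNING                     -- read the topic from its earliest data
    END_AT = TIMESTAMP '2023-01-31 23:59:59'   -- UTC, in the documented timestamp format
    COMPUTE_CLUSTER = my_cluster               -- needed once more than one cluster exists
AS COPY FROM KAFKA my_kafka_connection
    TOPIC_NAME = 'orders'
INTO my_catalog.my_schema.orders_january;
```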
RUN_PARALLELISM
Type: integer
Default: 1
(Optional) The number of parser jobs to run in parallel per minute.
CONTENT_TYPE
Values: { AUTO | CSV | JSON | PARQUET | TSV | AVRO | AVRO_SCHEMA_REGISTRY | FIXED_WIDTH | REGEX | SPLIT_LINES | ORC | XML }
Default: AUTO
(Optional) The file format of the content being read.
Note that AUTO only works when reading Avro, JSON, or Parquet.
COMPRESSION
Values: { AUTO | GZIP | SNAPPY | LZO | NONE | SNAPPY_UNFRAMED | KCL }
Default: AUTO
(Optional) The compression of the source.
COMMENT
Type: text
(Optional) A description or comment regarding this job.
Source options
TOPIC_NAME = '<topic_name>'
TOPIC_NAME
Type: text
The topic to read from.
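Putting the pieces together, a complete job definition might look like the sketch below. It follows the syntax shown at the top of this page, with job options after the job name and the source option after the connection identifier. The job, connection, topic, and table names are hypothetical, and the option values are examples rather than recommendations.

```sql
-- Sketch only: every identifier below is a hypothetical placeholder.
CREATE SYNC JOB load_orders_from_kafka
    START_FROM = BEGINNING        -- default; shown here for clarity
    READER_SHARDS = 2             -- more than the default of 1 for a higher-throughput topic
    STORE_RAW_DATA = FALSE        -- default; do not keep an extra copy in the original format
    CONTENT_TYPE = JSON           -- set explicitly instead of relying on AUTO
    COMPRESSION = AUTO
    COMMENT = 'Load raw order events from Kafka'
AS COPY FROM KAFKA my_kafka_connection
    TOPIC_NAME = 'orders'
INTO my_catalog.my_schema.orders_raw;
```

Every job option here is optional, so the smallest valid statement under this syntax would contain only the job name, the connection identifier, TOPIC_NAME, and the target table.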