ClickHouse
Job options
Prerequisites
Ensure your ClickHouse connection is set up and accessible
Pre-define your target ClickHouse table, keeping in mind that schema evolution is not currently supported. Fields that might not always have values should be marked as nullable. Learn more about nullable fields in ClickHouse.
Currently tested with ClickHouse Server Version 24.1
Creating a target table in ClickHouse
Before executing your job, the target table in ClickHouse needs to be predefined.
Example:
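The following is a minimal sketch of a predefined target table, not a definitive schema. The database, table, and column names are hypothetical and chosen for illustration only; the point is that a field which might not always have a value is declared as Nullable.

```sql
-- Hypothetical names throughout: adjust the database, table, and columns to your data.
CREATE TABLE IF NOT EXISTS sales.orders
(
    order_id     String,
    customer_id  String,
    order_date   DateTime,
    net_total    Float64,
    -- This field may be missing from some events, so it is declared Nullable.
    coupon_code  Nullable(String)
)
ENGINE = MergeTree()
ORDER BY order_id;
```

Because schema evolution is not currently supported, any column the job writes needs to exist in this definition before the job runs.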
Supported ClickHouse data types
Upsolver supports writing to the following ClickHouse data types:
Basic Types:
Integers (Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64)
Floating-Point Numbers (Float32, Float64)
Decimal Numbers (Decimal32, Decimal64, Decimal128)
Date and Time (Date, DateTime, DateTime64)
Strings (String, FixedString)
Bool
UUID
Enums (Enum8, Enum16)
LowCardinality
Complex Types:
Arrays
Tuples
Maps
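As an illustration of how the types listed above appear in a table definition, here is a sketch with one column per type family. The table name, column names, precision and scale values, and enum labels are all assumptions made for the example.

```sql
-- Hypothetical table that exercises each supported type family.
CREATE TABLE IF NOT EXISTS demo.type_showcase
(
    id          UInt64,                              -- integer
    score       Float32,                             -- floating-point
    price       Decimal64(4),                        -- decimal with 4 fractional digits
    created_on  Date,                                -- date
    created_at  DateTime64(3),                       -- timestamp with millisecond precision
    country     FixedString(2),                      -- fixed-length string
    is_active   Bool,
    session_id  UUID,
    status      Enum8('new' = 1, 'shipped' = 2),     -- enum
    channel     LowCardinality(String),
    tags        Array(String),                       -- array
    location    Tuple(Float64, Float64),             -- tuple
    attributes  Map(String, String)                  -- map
)
ENGINE = MergeTree()
ORDER BY id;
```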
Parallelism support
To prevent data duplication during retries, Upsolver uses a deduplication token that relies on the target table's "deduplication window" setting. This setting must be defined on the target table and specifies how many of the most recently inserted blocks have their hash sums stored. ClickHouse checks each incoming insert against these hash sums, so a retried insert is recognized as a duplicate only while its hash sum is still inside the window. This limits the number of concurrent insert operations that can be safely supported.
The larger the deduplication window, the more concurrent inserts can be safely supported without risking duplicate data.
Required Setting:
Non-Replicated Tables: non_replicated_deduplication_window
Replicated Tables: replicated_deduplication_window
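A sketch of how these settings might be applied in ClickHouse follows. The table names are hypothetical and the window size of 1000 is an arbitrary illustrative value, not a recommendation; choose a value that matches the concurrency you need.

```sql
-- Non-replicated table: set the window when the table is created.
CREATE TABLE IF NOT EXISTS sales.orders_local
(
    order_id   String,
    net_total  Float64
)
ENGINE = MergeTree()
ORDER BY order_id
SETTINGS non_replicated_deduplication_window = 1000;

-- Existing replicated table (assumed to exist already): change the window in place.
ALTER TABLE sales.orders_replicated
    MODIFY SETTING replicated_deduplication_window = 1000;
```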
Upsert support
Selecting an engine such as ReplacingMergeTree, CollapsingMergeTree, or VersionedCollapsingMergeTree enables ClickHouse to perform upserts, where duplicates are merged based on the table's sorting key. While this ensures data integrity by preventing duplicate entries, it may limit parallelism as data must be inserted in sequence to maintain order.
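As a minimal sketch of an upsert-capable target, the table below uses ReplacingMergeTree with a version column. The table and column names are hypothetical, and using updated_at as the version column is an assumption made for the example.

```sql
-- Rows that share the same sorting key (order_id) are collapsed during merges,
-- keeping the row with the highest updated_at value.
CREATE TABLE IF NOT EXISTS sales.orders_latest
(
    order_id    String,
    status      String,
    updated_at  DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY order_id;

-- Merges run in the background, so older versions of a row can remain visible
-- for a while. FINAL applies the replacement logic at query time.
SELECT order_id, status, updated_at
FROM sales.orders_latest FINAL;
```

Querying with FINAL trades some read performance for a fully deduplicated result; without it, a query may return more than one row for a key until a merge has run.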
Examples
Ingestion job
Transformation job