July 2024
Upsolver new features, enhancements, and bug fixes for July 2024.
Release Notes Blog
For more detailed information on these updates, check out the Upsolver August 2024 Feature Summary blog.
⬆️ Enhancements
Reduced the overhead of task discovery, especially in compute clusters with many assigned entities or a large number of shards
Replication Jobs: Added support for configuring the primary key column name for each replication group using the new PRIMARY_KEY_COLUMN property
Cluster: Added support for creating a cluster with a STARTUP_SCRIPT and for altering an existing cluster's STARTUP_SCRIPT
🔧 Bug Fixes
Fixed a bug where jobs reading data from Hive tables could lose data because compacted files in the source table were committed at the wrong time
Fixed a rare bug where retention failed to delete some files in old partitions
Iceberg: Fixed a bug where a column transformation that overrides a field appearing with different cases in the data resulted in multiple fields of different types
Fixed a bug that caused not to be displayed for jobs writing to Iceberg with intervals longer than 1 minute
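To make the Iceberg case-sensitivity fix above more concrete, the following minimal Python sketch shows one way to check incoming records for field names that differ only by letter case. It is an illustration for users who want to inspect their own data, not Upsolver's implementation; the function name `find_case_collisions` and the sample field names are hypothetical.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that are identical except for letter case.

    Returns a dict mapping the lowercased name to the sorted list of
    distinct spellings, including only names with more than one spelling.
    """
    groups = defaultdict(set)
    for name in field_names:
        groups[name.lower()].add(name)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


# Hypothetical field names collected from source records.
collisions = find_case_collisions(["userId", "userid", "ts"])
print(collisions)
```

Fields reported by a check like this are the ones that could previously end up as multiple columns of different types when overridden by a column transformation.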
⬆️ Enhancements
Job Monitoring:
The job page has been redesigned, allowing you to track job statuses by executions, representing each data interval being processed
Apache Iceberg:
Iceberg Tables:
Users can now set table retention based on any column of type DATE, TIMESTAMP, TIMESTAMPZ, LONG, or INTEGER
[Breaking change] Retention configuration syntax properties have changed:
TABLE_DATA_RETENTION -> RETENTION_DURATION
RETENTION_DATE_PARTITION -> RETENTION_COLUMN
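Because this is a breaking change, existing scripts that still use the old property names need updating. The sketch below is one possible way to rewrite stored SQL text in bulk, assuming the scripts are available as plain strings. Only the two property renames come from these release notes; the helper name `migrate_retention_properties` and the sample fragment are hypothetical, and the fragment is not meant to show exact Upsolver statement syntax.

```python
import re

# Old-to-new property names, taken from the breaking change listed above.
RENAMED_PROPERTIES = {
    "TABLE_DATA_RETENTION": "RETENTION_DURATION",
    "RETENTION_DATE_PARTITION": "RETENTION_COLUMN",
}

# Match whole identifiers only, so longer names that merely contain
# one of the old property names are left untouched.
_OLD_NAME = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED_PROPERTIES) + r")\b",
    re.IGNORECASE,
)


def migrate_retention_properties(sql: str) -> str:
    """Return the SQL text with the old retention property names replaced."""
    return _OLD_NAME.sub(
        lambda match: RENAMED_PROPERTIES[match.group(1).upper()], sql
    )


# Hypothetical script fragment, for illustration only.
old_fragment = "TABLE_DATA_RETENTION = 30 DAYS RETENTION_DATE_PARTITION = event_date"
print(migrate_retention_properties(old_fragment))
```

A plain text substitution like this does not parse SQL, so it would also rewrite the old names inside comments or string literals; review the output before running migrated scripts.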
Alter Iceberg Tables:
is now supported for Iceberg tables. This allows users to dynamically change the sorting columns of an existing table. This will affect the data written from now on. New compactions will write data using the new sorting.
is now supported for Iceberg tables. This allows users to change the partition columns of an existing table dynamically. This change does not rebuild the existing data according to the new partitioning but will apply the new partitioning to data moving forward.
Performance improvements for high-loaded clusters
Reduced the number of requests to the catalog to prevent rate-exceeded errors when managing a large number of tables
🔧 Bug Fixes
Apache Iceberg:
Dangling files: Moved backup folder location into table root directory
Amazon Redshift and Snowflake jobs:
Fixed a bug in schema evolution (SELECT *) that caused redundant columns to be created when mapping fields to different types, such as mapping a field of type Timestamp to a column of type TimestampTZ
CDC to Amazon Redshift:
Fixed a bug that caused the creation of columns with incorrect types
Fixed a bug that caused internal files to be deleted before their usage, resulting in job stalling
⬆️ Enhancements
Apache Iceberg:
🔧 Bug Fixes
Fixed a bug that caused an additional column to be created when writing a field of type Timestamp to a column of type TimestampTZ in Redshift
Minor bug fixes
⬆️ Enhancements
Apache Iceberg:
🔧 Bug Fixes
Fixed a bug occurring when the Primary Key is of numeric type and the table is large enough to require chunk splitting
Fixed a bug where it was possible to schedule compactions for partitions in tables without a primary key, even after the retention period, leading to errors
Users can now and add tables
is now supported as a monitoring output
Users can now define or alter the mirror table refresh interval using MIRROR_INTERVAL
ALTER REPLICATION JOB supports RESNAPSHOT COLLECTION for tables
Jobs ingesting from cannot include START_FROM unless DATE_PATTERN is specified
Fixed a bug that prevented excluding system columns using the EXCLUDE_COLUMNS property in ingestion jobs
Fixed a bug that prevented the use of system columns in column transformations in jobs