July 2024

Upsolver new features, enhancements, and bug fixes for July 2024.

Release Notes Blog

For more detailed information on these updates, check out the Upsolver August 2024 Feature Summary blog.

⬆️ Enhancements

Reduced the overhead of task discovery, especially in compute clusters with many assigned entities or a large number of shards
Replication Jobs: Added support for configuring the primary key column name for each replication group using the new PRIMARY_KEY_COLUMN property
Cluster: Added support for creating a cluster with a STARTUP_SCRIPT and for altering an existing cluster's STARTUP_SCRIPT

🔧 Bug Fixes

Fixed a bug where jobs reading data from Hive tables could lose data because compacted files in the source table were committed at the wrong time
Fixed a rare bug where retention failed to delete some files in old partitions
Iceberg: Fixed a bug where a column transformation that overrides a field appearing with different letter casing in the data resulted in multiple fields of different types
Fixed a bug that caused job statistics not to be displayed for jobs writing to Iceberg with intervals longer than 1 minute

⬆️ Enhancements

Job Monitoring:
- The job Monitoring page has been redesigned, allowing you to track job statuses by executions, representing each data interval being processed
Apache Iceberg:
- Iceberg Tables:
  - Users can now set table retention based on any column of types DATE, TIMESTAMP, TIMESTAMPZ, LONG, or INTEGER
  - [Breaking change] The retention configuration properties have been renamed:
    - TABLE_DATA_RETENTION -> RETENTION_DURATION
    - RETENTION_DATE_PARTITION -> RETENTION_COLUMN
- Alter Iceberg Tables:
  - Sorting evolution is now supported for Iceberg tables. This allows users to dynamically change the sorting columns of an existing table. This will affect the data written from now on. New compactions will write data using the new sorting.
  - Partition evolution is now supported for Iceberg tables. This allows users to change the partition columns of an existing table dynamically. This change does not rebuild the existing data according to the new partitioning but will apply the new partitioning to data moving forward.
Performance improvements for heavily loaded clusters
Reduced the number of requests to the catalog to prevent rate-exceeded errors when managing a large number of tables
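
Because the retention property rename listed above is a breaking change, stored SQL scripts that still use the old names need updating. The following is a minimal sketch, not an official Upsolver tool, of a Python helper that rewrites the two property names in a script's text. The function name `migrate_retention_properties` is hypothetical, and the helper does plain text substitution, so it would also rewrite matches inside comments or string literals.

```python
import re

# Old -> new property names, taken from the breaking-change note in these release notes.
RENAMED_PROPERTIES = {
    "TABLE_DATA_RETENTION": "RETENTION_DURATION",
    "RETENTION_DATE_PARTITION": "RETENTION_COLUMN",
}

# Match whole identifiers only, so a longer name that merely contains
# one of the old property names is left untouched.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in RENAMED_PROPERTIES)
    + r")(?![A-Za-z0-9_])",
    re.IGNORECASE,
)


def migrate_retention_properties(sql: str) -> str:
    """Return the SQL text with the old retention property names replaced.

    Hypothetical helper: it performs text substitution only and does not
    parse or validate the statement.
    """
    return _PATTERN.sub(lambda m: RENAMED_PROPERTIES[m.group(1).upper()], sql)
```

Passing a script's contents through `migrate_retention_properties` and reviewing the result before running it is one way to catch leftover uses of the old names. Whether the property values themselves need changes is not covered here.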

🔧 Bug Fixes

Apache Iceberg:
- Dangling files: Moved backup folder location into table root directory
Amazon Redshift and Snowflake jobs:
- Fixed a bug in the schema evolution (SELECT *) that caused redundant columns to be created when mapping fields to different types, such as mapping a field of type Timestamp to a column of type TimestampTZ
CDC to Amazon Redshift:
- Fixed a bug that caused the creation of columns with incorrect types
Fixed a bug that caused internal files to be deleted before their usage, resulting in job stalling

⬆️ Enhancements

Apache Iceberg:
- Users can now create Iceberg REST catalog connections and add tables
Dynatrace is now supported as a monitoring output

🔧 Bug Fixes

Fixed a bug that caused an additional column to be created when writing a field of type Timestamp to a column of type TimestampTZ in Redshift
Minor bug fixes

⬆️ Enhancements

Apache Iceberg:
- Table Mirror - Users can now define or alter the mirror table refresh interval using MIRROR_INTERVAL
ALTER REPLICATION JOB supports RESNAPSHOT COLLECTION for MongoDB tables

🔧 Bug Fixes

Jobs ingesting from Amazon S3 can now include START_FROM only when DATE_PATTERN is also specified
Fixed a bug that prevented excluding system columns using the EXCLUDE_COLUMNS property in ingestion jobs
Fixed a bug that prevented the use of system columns in column transformations in ingestion jobs
Fixed a bug occurring when the Primary Key is of numeric type and the table is large enough to require chunk splitting
Fixed a bug where it was possible to schedule compactions for partitions in tables without a primary key, even after the retention period, leading to errors
