July 2024

Upsolver new features, enhancements, and bug fixes for July 2024.

Release Notes Blog

For more detailed information on these updates, check out the Upsolver August 2024 Feature Summary blog.

2024.07.30-06.58

⬆️ Enhancements

  • Reduced the overhead of task discovery, especially in compute clusters with many assigned entities or a large number of shards

  • Replication Jobs: Added support for configuring the primary key column name for each replication group using the new PRIMARY_KEY_COLUMN property

  • Cluster: Added support for creating a cluster with a STARTUP_SCRIPT and for altering an existing cluster's STARTUP_SCRIPT

🔧 Bug Fixes

  • Fixed a bug where jobs reading data from Hive tables could lose data because compacted files in the source table were committed at the wrong time

  • Fixed a rare bug where retention failed to delete some files in old partitions

  • Iceberg: Fixed a bug where a column transformation that overrides a field appearing with different letter cases in the data resulted in multiple fields of different types

  • Fixed a bug that caused job statistics not to be displayed for jobs writing to Iceberg with intervals longer than 1 minute

2024.07.29-08.06

⬆️ Enhancements

  • Job Monitoring:

    • The job Monitoring page has been redesigned, allowing you to track job status by execution, where each execution represents a data interval being processed

  • Apache Iceberg:

    • Iceberg Tables:

      • Users can now set table retention based on any column of type DATE, TIMESTAMP, TIMESTAMPTZ, LONG, or INTEGER

      • [Breaking change] Retention configuration property names have changed:

        • TABLE_DATA_RETENTION -> RETENTION_DURATION

        • RETENTION_DATE_PARTITION -> RETENTION_COLUMN

    • Alter Iceberg Tables:

      • Sorting evolution is now supported for Iceberg tables, allowing users to change the sorting columns of an existing table dynamically. The change applies to data written from that point on, and new compactions write data using the new sorting.

      • Partition evolution is now supported for Iceberg tables. This allows users to change the partition columns of an existing table dynamically. This change does not rebuild the existing data according to the new partitioning but will apply the new partitioning to data moving forward.

  • Performance improvements for heavily loaded clusters

  • Reduced the number of requests to the catalog to prevent rate-exceeded errors when managing a large number of tables
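
Because the retention property rename in this release is a breaking change, existing scripts that set the old properties need updating. The Python sketch below shows one way to rewrite the old names in SQL text. The helper name `migrate_retention_properties` and the sample statement are hypothetical; only the two renames themselves come from these notes.

```python
import re

# Old -> new property names, as listed in the 2024.07.29-08.06 notes.
RETENTION_RENAMES = {
    "TABLE_DATA_RETENTION": "RETENTION_DURATION",
    "RETENTION_DATE_PARTITION": "RETENTION_COLUMN",
}

# Match whole identifiers only, case-insensitively, so that a longer
# name such as MY_TABLE_DATA_RETENTION is left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, RETENTION_RENAMES)) + r")\b",
    flags=re.IGNORECASE,
)


def migrate_retention_properties(sql: str) -> str:
    """Return sql with old retention property names replaced (hypothetical helper)."""
    return _PATTERN.sub(lambda m: RETENTION_RENAMES[m.group(1).upper()], sql)


if __name__ == "__main__":
    # Illustrative fragment only; the option value syntax is an assumption.
    old = "TABLE_DATA_RETENTION = 30 DAYS RETENTION_DATE_PARTITION = event_date"
    print(migrate_retention_properties(old))
```

The replacement is purely textual, so review the output before running it against a real environment.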

🔧 Bug Fixes

  • Apache Iceberg:

    • Dangling files: Moved backup folder location into table root directory

  • Amazon Redshift and Snowflake jobs:

    • Fixed a bug in the schema evolution (SELECT *) that caused redundant columns to be created when mapping fields to different types, such as mapping a field of type Timestamp to a column of type TimestampTZ

  • CDC to Amazon Redshift:

    • Fixed a bug that caused the creation of columns with incorrect types

  • Fixed a bug that caused internal files to be deleted before they were used, causing jobs to stall

2024.07.11-12.30

🔧 Bug Fixes

  • Fixed a bug that caused an additional column to be created when writing a field of type Timestamp to a column of type TimestampTZ in Redshift

  • Minor bug fixes

2024.07.08-11.10

⬆️ Enhancements

  • Apache Iceberg:

    • Table Mirror - Users can now define or alter the mirror table refresh interval using MIRROR_INTERVAL

  • ALTER REPLICATION JOB supports RESNAPSHOT COLLECTION for MongoDB tables

🔧 Bug Fixes

  • Jobs ingesting from Amazon S3 can no longer include START_FROM unless DATE_PATTERN is also specified

  • Fixed a bug that prevented the use of system columns in column transformations in ingestion jobs

  • Fixed a bug occurring when the Primary Key is of numeric type and the table is large enough to require chunk splitting

  • Fixed a bug where it was possible to schedule compactions for partitions in tables without a primary key, even after the retention period, leading to errors
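
The START_FROM and DATE_PATTERN rule for Amazon S3 ingestion jobs noted in this release can also be checked on the client side before a job definition is submitted. The following is a minimal Python sketch, assuming job options are held in a plain dictionary. The function name `validate_s3_job_options` and the sample option values are hypothetical and are not part of Upsolver.

```python
def validate_s3_job_options(options: dict) -> None:
    """Raise ValueError if START_FROM is set without DATE_PATTERN.

    Hypothetical pre-flight check; option names are compared
    case-insensitively.
    """
    keys = {name.upper() for name in options}
    if "START_FROM" in keys and "DATE_PATTERN" not in keys:
        raise ValueError(
            "START_FROM requires DATE_PATTERN for Amazon S3 ingestion jobs"
        )


if __name__ == "__main__":
    # Sample values are illustrative assumptions.
    validate_s3_job_options({"DATE_PATTERN": "yyyy/MM/dd", "START_FROM": "BEGINNING"})
    try:
        validate_s3_job_options({"START_FROM": "BEGINNING"})
    except ValueError as err:
        print(f"rejected: {err}")
```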
