August 2024

Upsolver new features, enhancements, and bug fixes for August 2024.

2024.08.26-09.46

⬆️ Enhancements

  • Improved performance of Iceberg compactions

🔧 Bug Fixes

  • Fixed a bug where JSON data files in jobs writing to Snowflake were not being deleted

  • Fixed an issue where using a classic data source in a job caused errors during job creation when a field had multiple types.

2024.08.19-12.07

🔧 Bug Fixes

  • Minor bug fixes

2024.08.15-10.03

⬆️ Enhancements

  • CDC job monitoring enhancements - The monitoring page for Replication (CDC) jobs has been enhanced to improve tracking of table statuses. The page is now divided into two tabs, allowing for more accurate monitoring of each status:

    • Tables in the 'Pending Snapshot' or 'Snapshotting' status can be tracked in the Snapshots tab.

    • Tables running incrementally can be found in the Syncing Tables tab.

  • Use JSON as the intermediate format when writing to Redshift.

  • Support for adding new primitive columns when writing to Snowflake Iceberg tables.

🔧 Bug Fixes

  • Iceberg:

    • Schema Evolution: Skip empty field names as query engines do not support them

    • Reading from Iceberg: Fixed errors, and the resulting delays, that could occur when reading data from a table after columns had been dropped from it

    • Fixed a rare case where some task executions would stop running until the server is restarted

    • Fixed an issue where files could be committed twice to an Iceberg table if the server crashed while committing

2024.08.07-11.23

⬆️ Enhancements

  • Polaris Catalog Support

    • You can now configure Polaris Catalog as your default Iceberg Lakehouse catalog and begin ingesting data from databases, streams, and files into a high-performance Iceberg lake.

    • What is Polaris Catalog?

      • Polaris Catalog is open source under the Apache 2.0 license and available on GitHub. It is also offered as a Snowflake managed service, currently in public preview.

      • Polaris Catalog enables open, secure lakehouse architectures with broad read-and-write interoperability and cross-engine access controls.

  • Support credential vending from Iceberg REST catalogs

    • Previously, connecting to Iceberg REST catalogs required setting your Amazon S3 connection through Upsolver. Now, Upsolver will use the credentials already configured in your catalog for access.

  • Independently create clusters for accounts with multiple VPC connections

  • Configure Snapshot Parallelism for CDC jobs via UI

    • When creating a CDC job, you'll be able to set the snapshot parallelism via the UI.

    • When the CDC job begins, it initially takes a snapshot (full historical load) of each table before loading changes incrementally. Snapshot parallelism allows you to configure the number of snapshots performed concurrently. Increasing the number of concurrent snapshots can speed up the table streaming process. However, higher parallelism also increases the load on the source database.

    • After starting the job, you can adjust the parallelism setting while the job runs. The default parallelism is set to 1.
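
The trade-off described above can be pictured with a small, generic sketch. This is not Upsolver's implementation or API: `run_snapshots`, its parameters, and the sleep that stands in for a table snapshot are all hypothetical, chosen only to show how a parallelism setting caps the number of snapshots running at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_snapshots(tables, parallelism=1, snapshot_seconds=0.01):
    """Snapshot every table, running at most `parallelism` snapshots at once.

    Hypothetical illustration only. Returns (completed_tables, peak_concurrency).
    """
    lock = threading.Lock()
    active = 0      # snapshots currently in progress
    peak = 0        # highest number of snapshots seen in progress at once
    completed = []  # tables whose snapshot has finished

    def snapshot(table):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(snapshot_seconds)  # stand-in for the full historical load
        with lock:
            active -= 1
            completed.append(table)

    # The pool size is the parallelism setting: no more than this many
    # snapshots can hit the (imaginary) source database at the same time.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        list(pool.map(snapshot, tables))
    return completed, peak
```

Calling `run_snapshots(["orders", "customers", "items"], parallelism=1)` in this sketch processes the tables one at a time, matching the default of 1. A larger value lets several snapshots overlap, which shortens the total time but raises the concurrent load on the source.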

🔧 Bug Fixes

  • Fixed a bug in jobs writing to a table where using a JOIN expression with an uppercase alias caused the joined row to return nulls in all fields.

  • Fixed job monitoring for jobs writing to Iceberg tables with JOIN expressions.
