August 2024
Upsolver new features, enhancements, and bug fixes for August 2024.
⬆️ Enhancements
Improved performance of Iceberg compactions
🔧 Bug Fixes
Fixed duplicate data in jobs writing to Iceberg. In rare cases, Iceberg would drop delete files prematurely (before compaction), causing old rows to remain in the table.
Fixed incorrect information shown in the system table and on the monitoring page
Fixed a bug where JSON data files in jobs writing to Snowflake were not being deleted
Fixed an issue where using a classic data source in a job caused errors during job creation when a field had multiple types.
🔧 Bug Fixes
Minor bug fixes
⬆️ Enhancements
CDC job monitoring enhancements - The monitoring page for Replication (CDC) jobs has been enhanced to improve tracking of table statuses. The page is now divided into two tabs, allowing for more accurate monitoring of each status:
Tables in the 'Pending Snapshot' or 'Snapshotting' status can be tracked in the Snapshots tab.
Tables running incrementally can be found in the Syncing Tables tab.
Utilize JSON as the intermediate format when writing to Redshift.
Support for adding new primitive columns when writing to Snowflake Iceberg tables.
🔧 Bug Fixes
Iceberg:
Schema Evolution: Skip empty field names, as query engines do not support them
Reading from Iceberg: Fixed errors, and the delays they caused, that could occur when reading from a table after columns had been dropped from it
Fixed a rare case where some task executions would stop running until the server is restarted
Fixed an issue where files could be committed twice to an Iceberg table if the server crashed while committing
⬆️ Enhancements
Polaris Catalog Support
You can now configure Polaris Catalog as your default Iceberg Lakehouse catalog and begin ingesting data from databases, streams, and files into a high-performance Iceberg lake.
What is Polaris Catalog?
Polaris Catalog is open source under the Apache 2.0 license and available on GitHub. It is also offered by Snowflake as a managed service, in public preview.
Polaris Catalog enables open, secure lakehouse architectures with broad read-and-write interoperability and cross-engine access controls.
Support credential vending from Iceberg REST catalogs
Previously, connecting to an Iceberg REST catalog required setting up your Amazon S3 connection through Upsolver. Upsolver now uses the credentials already configured in your catalog for access.
Independently create cluster for accounts with multiple VPC connections
Configure Snapshot Parallelism for CDC jobs via UI
When creating a CDC job, you can now set the snapshot parallelism via the UI.
When a CDC job starts, it first takes a snapshot (full historical load) of each table before loading changes incrementally. Snapshot parallelism sets the number of snapshots performed concurrently. Increasing it can speed up the table streaming process, but higher parallelism also increases the load on the source database.
After starting the job, you can adjust the parallelism setting while the job runs. The default parallelism is set to 1.
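The trade-off described above can be illustrated with a minimal Python sketch. This is not Upsolver's implementation: `run_snapshots`, `fake_snapshot`, and the table names are hypothetical, and a thread pool merely stands in for whatever scheduling the service actually uses. It shows only the general idea that the parallelism value caps how many table snapshots are in flight at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_snapshot(name):
    """Hypothetical stand-in for a full historical load of one table."""
    time.sleep(0.01)
    return f"{name}: snapshot complete"


def run_snapshots(tables, parallelism=1, snapshot_table=fake_snapshot):
    """Snapshot every table, with at most `parallelism` running concurrently.

    Returns (results, peak), where peak is the highest number of snapshots
    observed in flight at the same time.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")

    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def tracked(name):
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        try:
            return snapshot_table(name)
        finally:
            with lock:
                in_flight -= 1

    # The pool size is the cap on concurrent snapshots, and therefore on
    # the extra load placed on the source database.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        results = list(pool.map(tracked, tables))
    return results, peak


if __name__ == "__main__":
    tables = ["orders", "customers", "invoices", "payments"]
    results, peak = run_snapshots(tables, parallelism=2)
    print(results)
    print("peak concurrent snapshots:", peak)
```

With the default of 1, tables are snapshotted one after another; raising the value shortens the total wall-clock time at the cost of more simultaneous reads against the source.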
🔧 Bug Fixes
Fixed a bug in jobs writing to a table where using a JOIN expression with an uppercase alias caused the joined row to return nulls in all fields.
Fixed job monitoring for jobs writing to Iceberg tables with JOIN expressions.
View the full changes documentation.
Organizations with multiple VPC connections can independently create a cluster. A new parameter, VPC_CONNECTION, has been added to the command, allowing you to select the relevant VPC connection.
Fixed a bug in jobs where the ON condition had a different case than the mapped column in the clause.
Fixed a bug when creating with large CSV files to Iceberg tables.