September 2024

Upsolver new features, enhancements, and bug fixes for September 2024.

⬆️ Enhancements

Replication Jobs: Implemented deletion of internal files once they are no longer used
Sync Jobs Reading from Iceberg Tables: Reduced the number of scanned blocks and rows by leveraging statistics in the Iceberg metadata for more efficient reads
Multi Job now supports using partitioned Iceberg tables as a source
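
The statistics-based read optimization noted above for sync jobs can be illustrated with a minimal sketch. It assumes each data file records the minimum and maximum value of the filtered column, and that a reader skips any file whose range cannot match the filter. The `DataFile` class and `files_to_scan` function are hypothetical names for illustration only and do not represent Upsolver's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry in table metadata."""
    path: str
    min_value: int  # smallest value of the filtered column in this file
    max_value: int  # largest value of the filtered column in this file


def files_to_scan(files, lower, upper):
    """Keep only files whose [min, max] range overlaps [lower, upper]."""
    return [f for f in files if f.max_value >= lower and f.min_value <= upper]


files = [
    DataFile("a.parquet", 1, 100),
    DataFile("b.parquet", 101, 200),
    DataFile("c.parquet", 201, 300),
]

# A filter such as `value BETWEEN 150 AND 180` can only match rows in b.parquet,
# so the other two files are never opened.
selected = files_to_scan(files, 150, 180)
print([f.path for f in selected])
```

Skipping files at the metadata level is what reduces the number of scanned blocks and rows: the decision is made before any data file is read.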

🔧 Bug Fixes

API: Fixed incorrect validation when creating or editing an Output that writes Parquet files
Minor bug fixes

⬆️ Enhancements

Pause Job Support for Iceberg Table Targets
- Pause Job is now fully supported for all jobs writing to Iceberg table targets. Previously these jobs could not be paused; users can now pause and resume them as needed, which gives more control over long-running data operations.
Introduced a retry mechanism when committing data to the Iceberg table from a job, specifically handling cases where the table is modified by another process
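
As a rough illustration of the retry behavior described above, the following sketch shows a generic retry loop around a commit that can fail when another writer changes the table first. `CommitConflict`, `commit_with_retry`, and the `commit`/`refresh` callables are hypothetical names chosen for this example; the actual retry policy used by Upsolver is not documented here.

```python
import random
import time


class CommitConflict(Exception):
    """Hypothetical error: the table changed since the commit was prepared."""


def commit_with_retry(commit, refresh, max_attempts=5, base_delay=0.01):
    """Call commit(); on conflict, refresh table state and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return commit()
        except CommitConflict:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            refresh()  # reload the latest table metadata before retrying
            # Exponential backoff with jitter to avoid colliding again.
            time.sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.0))


# Simulated commit that conflicts twice before succeeding.
attempts = {"count": 0}


def flaky_commit():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise CommitConflict("table was modified by another process")
    return "committed"


result = commit_with_retry(flaky_commit, refresh=lambda: None)
print(result, attempts["count"])
```

The key design point is that the writer refreshes its view of the table before each retry, so the new attempt is based on the state left by the other process.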

🔧 Bug Fixes

Fixed an issue where auto-sharding tasks could fail with a NullPointerException
Fixed a bug in CDC jobs where empty unrelated system columns were added to the target tables

⬆️ Enhancements

Iceberg Partition Clustering
- New Partition Clustering Feature: You can now efficiently manage large datasets partitioned on high-cardinality columns using partition clustering. This feature optimizes storage by merging small files, improving performance, reducing query times, and minimizing S3 API costs.
- Improved Query Performance: Clustering partitions reduces the number of small files, which makes full table scans and data refresh processes significantly faster.
- When to Use: Partition clustering is ideal for datasets with high cardinality, frequent data arrival, and skewed data distribution.
- How to Use: When creating a table with partition clustering, use the CLUSTERED BY clause instead of PARTITIONED BY.
- Please see the complete documentation for more details, including usage scenarios, limitations, and syntax options.
Data Lineage Enhancements
- Improved Visual Distinction: Previously, job source tables and lookup tables (materialized views) looked similar, which caused confusion. Arrows connecting jobs and materialized views are now drawn differently from arrows coming from source tables.
- Additional UX Improvements: Various user experience enhancements have been made to further improve the overall workflow and usability.
Support adding tables to the STOPPED_TABLES list in replication jobs
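
To make the small-file merging mentioned under Iceberg Partition Clustering more concrete, here is a generic sketch of grouping small files into larger merged files. The `plan_merges` function, the 128 MB target, and the sample file sizes are all assumptions made for illustration; this is not a description of how Upsolver performs clustering or compaction.

```python
def plan_merges(file_sizes_mb, target_mb=128):
    """Greedily group small files into batches of at most roughly target_mb."""
    batches, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb):
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Hypothetical file sizes (in MB) written across many small partitions.
small_files = [4, 8, 16, 60, 2, 90, 30, 12]
plan = plan_merges(small_files)
print(len(small_files), "files ->", len(plan), "merged files")
```

Fewer, larger files mean fewer object-store requests per scan, which is where the reduction in query time and S3 API costs comes from.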

🔧 Bug Fixes

SQL Server CDC: Columns of type DateTime2 are now parsed as Timestamp
Fixed an issue where Expire Snapshots left dangling files
Fixed an issue where using LIMIT in a sync job reading from Iceberg caused the job to never process data

⬆️ Enhancements

🔧 Bug Fixes

Fixed duplicate data in jobs to Iceberg. In rare cases, Iceberg would drop delete files prematurely (before compaction), causing old rows to remain in the table
Fixed incorrect information in system table and monitoring page
Fixed a bug where JSON data files in jobs writing to Snowflake were not being deleted
Fixed an issue where using a classic data source in a job caused errors during job creation when a field had multiple types
