September 2024

Upsolver new features, enhancements, and bug fixes for September 2024.

2024.09.10-07.45

⬆️ Enhancements

  • Iceberg Partition Clustering

    • New Partition Clustering Feature: You can now efficiently manage large datasets partitioned on high-cardinality columns using partition clustering. This feature merges small files, which improves performance, shortens query times, and reduces S3 API costs.

    • Improved Query Performance: Clustering partitions reduces the number of small files, which makes full table scans and data refresh processes significantly faster.

    • When to Use: Partition clustering is ideal for datasets with high cardinality, frequent data arrival, and skewed data distribution.

    • How to Use: When creating a table with partition clustering, use the CLUSTERED BY clause instead of PARTITIONED BY.

    • Please see the complete documentation for more details, including usage scenarios, limitations, and syntax options.
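
    As a rough illustration of the "How to Use" note above, the sketch below swaps PARTITIONED BY for CLUSTERED BY in a table definition. The catalog, database, table, and column names are hypothetical, and the exact form of the clause and its options is an assumption here; check it against the complete documentation before use.

```sql
-- Hypothetical names throughout; clause form assumed from the note above.

-- Conventional partitioning on a high-cardinality column, for comparison:
-- CREATE ICEBERG TABLE default_glue_catalog.demo_db.events (
--     customer_id string,
--     event_time timestamp
-- )
-- PARTITIONED BY customer_id;

-- The same table using partition clustering instead:
CREATE ICEBERG TABLE default_glue_catalog.demo_db.events (
    customer_id string,
    event_time timestamp
)
CLUSTERED BY customer_id;
```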

  • Data Lineage Enhancements

    • Improved Visual Distinction: Previously, job source tables and lookup tables (materialized views) had similar visual representations, which led to confusion. Arrows between jobs and materialized views are now visually distinct from arrows from source tables.

    • Additional UX Improvements: Various user experience enhancements improve the overall workflow and usability.

  • Support adding tables to the STOPPED_TABLES list in replication jobs

🔧 Bug Fixes

  • SQL Server CDC: Parse columns of type DateTime2 as Timestamp

  • Fixed an issue where Expire Snapshots left dangling files

  • Fixed an issue where using LIMIT in a sync job reading from Iceberg caused the job to never process data

2024.09.01-09.29

⬆️ Enhancements

  • Improved performance of Iceberg compactions

🔧 Bug Fixes

  • Fixed duplicate data in MERGE jobs to Iceberg. In rare cases, Iceberg would drop delete files prematurely (before compaction), causing old rows to remain in the table.

  • Fixed incorrect information in recent_compactions system table and monitoring page

  • Fixed a bug where JSON data files in jobs writing to Snowflake were not being deleted

  • Fixed an issue where using a classic data source in a job caused errors during job creation when a field had multiple types
