September 2024
Upsolver new features, enhancements, and bug fixes for September 2024.
2024.09.10-07.45
⬆️ Enhancements
Iceberg Partition Clustering
New Partition Clustering Feature: You can now efficiently manage large datasets partitioned on high-cardinality columns using partition clustering. This feature optimizes storage by merging small files, improving performance, reducing query times, and minimizing S3 API costs.
Improved Query Performance: By clustering partitions and reducing the number of small files, full table scans and data refresh processes are significantly faster.
When to Use: Partition clustering is ideal for datasets with high cardinality, frequent data arrival, and skewed data distribution.
How to Use: When creating a table with partition clustering, use the CLUSTERED BY clause instead of PARTITIONED BY.
Please see the complete documentation for more details, including usage scenarios, limitations, and syntax options.
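As a rough illustration of the "How to Use" note above, the following sketch shows a table definition that uses CLUSTERED BY where PARTITIONED BY would otherwise appear. The catalog, database, table, and column names are hypothetical, and the exact form of the clause (for example, whether the column list is parenthesized) is an assumption; the complete documentation remains the authority on syntax and limitations.

```sql
-- Sketch only: all object and column names here are hypothetical,
-- and the precise clause syntax is assumed, not confirmed.
CREATE ICEBERG TABLE my_catalog.my_database.events (
    customer_id STRING,     -- high-cardinality column
    event_time  TIMESTAMP,
    payload     STRING
)
-- CLUSTERED BY takes the place of PARTITIONED BY for this column.
CLUSTERED BY customer_id;
```

A column such as customer_id is the kind of high-cardinality key for which ordinary partitioning tends to produce many small files, which is the situation partition clustering is meant to address.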
Data Lineage Enhancements
Improved Visual Distinction: Previously, job source tables and lookup tables (materialized views) had similar visual representations, leading to confusion. The arrows connecting materialized views to jobs are now drawn differently from the arrows connecting source tables to jobs.
Additional UX Improvements: Various user experience enhancements have been made to further improve the overall workflow and usability.
Support adding tables to the STOPPED_TABLES list in replication jobs
🔧 Bug Fixes
SQL Server CDC: Parse columns of type DateTime2 as Timestamp
Fixed Expire Snapshots leaving dangling files
Fixed an issue where using LIMIT with a sync job reading from Iceberg caused the job to never process data
2024.09.01-09.29
⬆️ Enhancements
Improved performance of Iceberg compactions
🔧 Bug Fixes
Fixed duplicate data in MERGE jobs to Iceberg. In rare cases, Iceberg would drop delete files prematurely (before compaction), causing old rows to remain in the table
Fixed incorrect information in the recent_compactions system table and monitoring page
Fixed a bug where JSON data files in jobs writing to Snowflake were not being deleted
Fixed an issue where using a classic data source in a job caused errors when a field had multiple types during job creation.