April 2024

Upsolver new features, enhancements, and bug fixes for April 2024.

Release Notes Blog

For more detailed information on these updates, check out the Upsolver May 2024 Feature Summary blog.

2024.04.25-12.36

⬆️ Enhancements

  • Iceberg:

    • Added support for writing to hidden partitions

    • Enabled changing the partition specification of existing tables, even while a job is actively writing to them

    • Added support for writing to external Iceberg tables

    • Added support for altering Iceberg table properties via SQL
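
Altering a table property could look like the following sketch. It assumes Iceberg's standard SET TBLPROPERTIES DDL and an invented table name; the exact syntax accepted by Upsolver may differ.

```sql
-- Hypothetical example: change an Iceberg table property via SQL.
-- 'write.target-file-size-bytes' is a standard Iceberg table property;
-- the catalog/table names are made up for illustration.
ALTER TABLE my_catalog.my_db.events
  SET TBLPROPERTIES ('write.target-file-size-bytes' = '134217728');
```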

🔧 Bug Fixes

  • Worksheet tree: show replication jobs under dynamically created tables

  • MongoDB CDC:

    • Corrected the parsing of Decimal types to Double

    • Resolved errors encountered when replicating collections containing fields with types Regex, Min Key, and Max Key

2024.04.16-12.06

⬆️ Enhancements

  • Introduced the PARSE_DEBEZIUM_JSON_TYPE property to the Avro Schema Registry content format, which controls whether JSON columns from Debezium sources are dynamically parsed into Upsolver records or kept as JSON strings. For Snowflake outputs with schema evolution, these fields are written to columns of type Variant.

  • Added support for Iceberg table retention using the TABLE_DATA_RETENTION property

  • Upgraded the Snowflake driver to 3.15.0

  • UI: ClickHouse wizard cosmetic changes
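
As an illustration of the new retention option, a table might declare TABLE_DATA_RETENTION at creation time along these lines. The table name, columns, and retention window are invented, and the surrounding DDL is a sketch of Upsolver's property-style syntax rather than a verified reference.

```sql
-- Hypothetical sketch: retain only the last 30 days of data
-- in an Iceberg table via the TABLE_DATA_RETENTION property.
CREATE ICEBERG TABLE my_db.my_schema.events (
  event_time timestamp,
  user_id    string
)
TABLE_DATA_RETENTION = 30 DAYS;
```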

🔧 Bug Fixes

  • Fixed a bug preventing the pausing of ingestion jobs to Snowflake

  • Iceberg schema evolution:

    • Nested fields were added without the field docs that are later used to determine which field evolved from which. Affected tables may need to be recreated if jobs writing to them are producing errors

    • Fixed handling of fields that can have multiple types (e.g., a field that can be both a record and an array of strings)

2024.04.04-09.33

New Features

  • The data lineage diagram is now accessible from the Job Status, Datasets, and materialized view pages, allowing users to easily view real-time job status and dependencies

  • Ingestion wizard:

    • ClickHouse is now supported as a target (CDC sources are not supported at this point)

⬆️ Enhancements

  • For new entities, you can now use the updated Parquet list structure (parquet.avro.write-old-list-structure = false) when writing Parquet files to S3 and Upsolver tables

  • Support casting strings to JSON in jobs writing to Iceberg tables

  • Previewing Classic Data Sources is now supported (SELECT * FROM "classic data source name")

  • COLUMN_TRANSFORMATIONS are now supported by replication jobs

  • Cost reduction:

    • Reduced S3 API costs of replication jobs and single entity jobs

    • Reduced S3 API costs of Iceberg tables

    • Reduced S3 API costs of Hive tables

  • The OPTIMIZE option for external Iceberg tables now supports optimizing tables that are not partitioned

  • The cluster system table (system.monitoring.clusters) now shows data that is aligned with the Cluster Monitoring page
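
To illustrate column transformations in a replication job, a masking rule might be declared roughly as follows. This is a loose sketch: the job, connection, and column names are invented, and a real replication job definition includes additional clauses not shown here.

```sql
-- Hypothetical sketch: hash a sensitive column while replicating.
-- All identifiers below are invented for illustration.
CREATE REPLICATION JOB replicate_orders
  COLUMN_TRANSFORMATIONS = (email = MD5(email))
AS COPY FROM POSTGRES my_postgres_connection
   TO SNOWFLAKE my_snowflake_connection;
```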

🔧 Bug Fixes

  • Fixed a bug that could skip data when reading from CDC sources

  • Fixed a bug where the Events Written graph wouldn't display for single entity jobs that contain many sub-jobs, or when the job list page contains many jobs

  • The CDC event log is now deleted immediately after the log events are parsed

  • Fixed a bug where replication and single entity jobs would fail when creating a table with a name that had existed before

  • Improved the performance of the VPC integration experience

  • Fixed a rare bug where the "Lifetime" statistics on the Datasets page would not display

  • Fixed a bug where jobs reading from system.information_schema.columns would time out when there were tables with a large number of columns

  • Fixed a bug where it was possible to drop a table that a replication or single entity job was writing into; the job must now be dropped first

  • Fixed a bug where a single entity job that reads data from a table that is partitioned by time wouldn't read from the start of the table

  • Fixed a bug where the first point in the Datasets graph would have a timestamp earlier than the start time of the first job writing to the table