Change log

Change log for SQLake (
Release notes for can be found here.


  • Support creating query clusters and attaching materialized views to query clusters in order to query them via HTTP API
  • Add IF EXISTS syntax to DROP statements, e.g. DROP TABLE IF EXISTS "my_table", to prevent the statement from failing if an entity does not exist. Applies to DROP CLUSTER, DROP CONNECTION, DROP TABLE, DROP JOB, and DROP MATERIALIZED VIEW.
  • Snowflake Jobs:
    • Create the Snowflake table when there are no dynamic columns and the CREATE_TABLE_IF_MISSING option is TRUE


  • Support is now available for using an external Hive Metastore as a catalog
  • PostgreSQL CDC:
    • Tables that aren't included in the publication will not be part of the snapshot
  • Apache Kafka Jobs:
    • When copying data from Kafka topics, names are now treated as globs (stars match any number of chars, and question marks match one char)
  • Elasticsearch Jobs:
    • Write timestamp and date types as ISO-8601 strings in jobs that write to Elasticsearch
  • Support added for il-central-1 region. This region is currently only supported with private VPC deployments
  • Reduced the number of Amazon S3 API calls to lower S3 costs
Bug Fixes
  • Synced transformation jobs with an interval smaller than one of the jobs writing to a source table, that did not read the respective data
  • Minor bug fixes


  • Snowflake Jobs:
    • SELECT * will preserve the original case of field names in variant columns
  • SQL: Allow altering EXPOSE_IN_CATALOG property in tables
  • Performance Improvement: Reduce the number of file operations when coordinating future table operations
  • Write Timestamp and Date types as ISO-8601 strings in jobs that write to Elasticsearch
  • Ingestion wizard:
Bug Fixes
  • Jobs:
    • When using MAP_COLUMNS_BY_NAME, the EXCEPT columns list was fixed to be case-insensitive


  • Write Timestamp and Date types as ISO-8601 strings in string output jobs, for example: job to Amazon S3 with format JSON/CSV
  • Write Timestamp and Date types as ISO-8601 in RECORD_TO_JSON function
Bug Fixes
  • Performance improvements in CDC jobs
  • Performance improvements when querying the Upsolver Query Engine
  • Minor bug fixes


  • Users can now omit the connection type when specifying a source or target in jobs (e.g. INSERT INTO S3 catalog LOCATION = '...' can be replaced in INSERT INTO catalog LOCATION = '...')
  • Improved the performance of CDC jobs reading from databases with a large number of tables
  • Querying the information schema tables now returns Jobs and Materialized Views in deleting state
  • Elasticsearch Jobs:
    • MERGE jobs now support deleting documents by using the WHEN MATCHED AND ... THEN DELETE syntax
  • Upgraded Avro and Parquet libraries to the latest versions
Bug Fixes
  • Major improvements when reading from a table with a large number of partitions
  • Minor bug fixes


Bug Fixes
  • Minor bug fixes


Bug Fixes
  • Minor bug fixes


Bug Fixes
  • Snowflake Jobs:
    • Fixed an issue with a custom insert/update expression causing the job to fail if the field is also mapped in the select statement
    • On auto-managed tables, Upsolver will not create an extra column if the following conversion happens:
      • Original column is Double and got a value of type Long
      • Original column is Timestamp and got a value of type Date
      • Original column is Varchar
      • Original column is Variant In all other cases, we will create an extra column with the new type as the column name suffix. For example: if a column col was of type Bigint and got a Double value, we will create a column COL_DOUBLE in the Snowflake table
    • Fixed a delay in Materialized View on Job List/Index page


Bug Fixes
  • Fixed tree on fields containing dots, e.g. turning {"a\.b": 1} to {"a.b": 1}.
  • Snowflake jobs: changed the file format to copy from Avro to JSON. This fixed an issue when ingesting records with sub-fields that have special characters.


  • Added VALUE_INDEX_IN_ROW() - this function receives an element of an array of records and returns the 1-based index of the element position (incrementing regardless of whether the array is nested). Null values are not counted.
  • Added VALUE_INDEX_IN_ARRAY() - this function receives an element of an array of records and returns the 1-based index of the element position (index resets to 1 for each sub-array). Null values are not counted.
  • Ingestion wizard - support added for creating a heartbeat table within the wizard.
Bug Fixes
  • Reduce the frequency of metadata queries to Snowflake in order to reduce the cost of COMPUTE SERVICES charged by Snowflake.
  • [BREAKING CHANGE] Fixed RECORD_TO_JSON on fields containing dots, e.g. turning {"a.b": 1} to {"a\.b": 1}.


Bug Fixes
  • Minor bug fixes and improvements


  • New UUID() function returns a unique identifier (UUID) string.
  • PostgreSQL CDC: ignore rows from the heartbeat table
  • Upgraded Debezium version from 2.1.3 to 2.2.1
  • Ingestion wizard:
    • Supports compute cluster input (in case the organization has more than one compute cluster)
    • Supports basic expectation
  • The cluster version appears in the UI on the clusters page
  • Snowflake table: Show Variant columns statistics on field level
  • Sign-out is now available from the main screen
Bug Fixes
  • Fixed the conversion of float to double to preserve the perceived semantic value in CDC sources and in data sources that get Avro or Parquet
  • Minor bug fixes and improvements


  • Added $row_number system column to transformation jobs
  • [BREAKING CHANGE] Changed $row_number system field from 0-based to 1-based
  • Added $item_index system column, representing the source batch's row index. For example, in S3 sources, it will be the row index in a file
  • S3 outputs now support the inclusion of the shard number in the target path. This allows the use of output shards without overwriting the output files
  • User information and organization name now displayed on the main pages with the ability to switch between organizations
Bug Fixes
  • Fixed a bug in the IS_DUPLICATE function that caused the wrong results when the job is running with an interval higher than 1 minute
  • Fixed a bug reading Avro and Parquet files that caused fields of type Date to be ignored
  • Minor bug fixes


Bug Fixes
  • Fixed an issue reading from empty Kafka topics that contain empty partitions
  • Fixed a bug reading Avro files that use a named type more than once
  • Minor bug fixes
  • Snowflake table statistics are now available


Bug Fixes
  • Snowflake Merge Jobs: enforce the ON clause expression to prevent creating an array
  • Minor bug fixes


Bug Fixes
  • Job status page improvements
  • Minor bug fixes
  • CDC: PostgreSQL with partitioned tables - expose data.full_partition_table_name field specifying the name of the event's original partition
  • Error messages improvements


Bug Fixes
  • CASE WHEN now handles NULL as input and returns the ELSE value
  • CDC: Fixed the bug that caused the ingestion of a decimal type column to be converted to binary base64 string


Bug Fixes
  • COLUMN_TRANSFORMATIONS with dependencies between them created the wrong name for the nested column
  • Fixed target name column value for Snowflake outputs in the table
  • Validate that the first parameter in an ARRAY_JOIN is not a literal
  • Ingestion wizard now supports Amazon Kinesis source


Bug Fixes
  • Fixed the bug where TABLE_DATA_RETENTION could be disabled by disabling compactions
  • Dropping a table while specifying DELETE_DATA = true did not delete data files written by jobs with RUN_PARALLELISM > 1
  • Parquet Files are now distributed more evenly when ingesting data from Amazon S3 with high execution parallelism
  • Fixed a bug when selecting from large Materialized Views with predicates on key columns would return "Query exceeded input row limit"
  • Fixed a bug where a job reading from information_schema.columns does not write data into a table
  • Fixed a bug where querying can result in an error
  • BYTES_SUBSTRING position now starts from 1 as SUBSTRING (previously started from 0)
  • New SQL syntax is now supported:
    • SHOW CREATE JOB "Job name"
    • SHOW CREATE TABLE "Table name"
    • SHOW CREATE CLUSTER "Cluster name"
  • PostgreSQL CDC: Support reading 14+ partitioned tables by the root table name instead of the underlying partition table names
  • Snowflake: Added query tag to queries executed by Upsolver for easier cost tracking


Bug Fixes
  • Improved statistics in Job Status
  • Fixed the issue of inviting a member to the organization not working
  • Prevented the creation of sync jobs that read from system tables
  • Fixed a bug in jobs when writing to Amazon S3 with a dynamic location
  • Fixed a bug that caused some columns to be missing when reading from a table
  • Support querying all system tables using the syntax: SELECT $*
  • Information Schema: added atype_evolution column to the system table system.information_schema.columnsto show all previous types of the column


Bug Fixes
  • Job Status page bug fixes
  • Improved error messages in the Ingestion Wizard
  • Added $event_date column to all transformation jobs that write to a Managed Upsolver table
  • SQL/AutoComplete: Show aggregation result fields
  • System Tables: added elastic IPs column to system.information_schema.clusters


Bug Fixes
  • Fixed an issue collecting field statistics and metadata for large data files with a large number of unique field names
  • Snowflake output job now supports SELECT *: creating and managing the snowflake table.
    • CDC to Snowflake SELECT *: Support ingested JSONB as a variant
  • Allow syntax in JOB: START_FROM = NOW - INTERVAL '6' HOURS
  • Delete intermediate files after copy to Redshift
  • Copy From Features: Add a Deduplication option to the COPY FROM job
  • Added PARSE_JSON_COLUMNS option to CDC COPY FROM jobs. This will parse any JSON typed columns in the database as nested objects in the target table.
  • SQL/AutoComplete: Show aggregation result fields
  • Support partial flattening of arrays in jobs that write to Upsolver tables: FLATTEN_PATHS = (A)
  • Ingestion wizard - Easy Ingest to Snowflake:
    • Step-by-step wizard, no SQL, no data lake tables. Supports significant data quantities, streaming data, and strong ordering of data. Comes with deduplication and field hashing capabilities
  • Execution results and event log experience will store outside the worksheet page. Users can return to the worksheet later and start where they left off
  • Job Status (beta)


Bug Fixes
  • Fixed the system catalog name from System to system
  • AvroRegistry content type: Support URL encoded authentication information
  • Snowflake: Support keeping old values on partial updates
  • Fixed "deleting" entities showing up in information_schema tables
  • Show all JDBC jobs on the tree
  • Show "Staging Location" in inspection panel of S3 Copy From jobs with enabled DELETE_AFTER_LOAD option
  • Add editor shortcuts to increase/decrease the font size CMD+/- on mac
  • S3 output file Type options (set delimiter for S3 outputs)
  • SQLake S3 output: Allow overwrite
  • Expose editor shortcuts in the help panel widget
  • Display the original file path in the "copy from job" info
  • Functions: New function: RECORD_TO_JSON


Bug Fixes
  • JDBC Outputs: delete intermediate files after being written to the DB
  • The cluster catalog is now visible in the tree
  • Gather all system entities in the tree under a catalog named "System"
  • Improved AS OF syntax
  • Auto complete on Jobs
  • Daily usage graph and report are available
  • Ability to decide which query engine to use to run a select statement (Athena/Upsolver)
  • Put all System entities under a catalog named "System"
  • Support AS of syntax
  • Information Schema: Add a table for columns


Bug Fixes
  • JDBC Outputs: delete intermediate files after being written to the DB
  • Cluster catalog is now visible in the tree
  • Gather all system entities in the tree under a catalog named "System"
  • Improved AS OF syntax
  • Auto complete on Jobs


Bug Fixes
  • Fixed Kafka batcher tasks getting stuck when reading with a wildcard topic and deleting all the topics in Kafka
  • Show information schema catalog on the tree
  • Auto Complete on Information Schema tables and columns
  • Allow creating jobs from Information Schema tables
  • Add support for Timestamp, Date, and Decimal types in CDC and AVRO sources
  • Added support for bigserial in Postgres outputs
  • Support EXCLUDE_COLUMNS for a COPY FROM (ingestion) job
  • Memory allocation optimizations in Lookup Table Query servers


Bug fixes
  • Fixed an issue when creating a Kafka Data Source with glob pattern that doesn't match any topics would cause no response in the API.
  • Upgrade debezium to V2.1.2
  • Support transformation job to PostgreSQL
  • Expose security information within the app to allow easier AWS configuration to connect your own data
  • Memory allocation optimizations in Lookup Table Query servers
  • Allow to alter Materialized View COMMENT.
  • Display managed entities in the tree even when can't connect to Athena.
  • Support ignoring fields in COPY FROM jobs by specifying the EXCLUDE_COLUMNS option.


Bug fixes
  • Fixed a rare issue that can cause duplicate data to be loaded into Redshift after copy failures
  • Use regional STS endpoints if available
  • Indication on an executed statement in the editor, successs/failed
  • Column appears immediately on the tree on creating transformation job
  • Support CAST expression in the language
  • Renamed function TO_LONG to TO_BIGINT
  • SELECT * now returns columns from joined Materialized View


  • Enhancements
    • Add support for information_schema queries
    • Add support for SKIP_VALIDATION and SKIP_ALL_VALIDATION options.
      • DEPRECATION: ALLOW_EMPTY_SOURCES will be deprecated in favor of the new options.
    • Added validation to prevent explicitly mapping fields with different data type to the defined output table columns.
    • Bug Fixes
    • Add support for hierarchical system columns in the tree


  • Enhancements
    • Support using non fully qualified names for tables and materialized views
    • Improve error message when trying to create a table with the same name as existing one
    • Support querying without WHERE statement (Infinite Window)
    • S3 Output Job: Support split files to folders
    • Home page redesign
    • Cluster management tab in the UI
  • Bug Fixes
    • Alter Cluster: Fix Alter to null not set default values
    • Transformation Jobs: Fix missing columns mapping validations for partition and key columns
    • API: Support join materialized view with array


  • Enhancements
    • Jobs monitoring will now show materialized views
    • API: Join with materialized view is no longer requiring alias for the mv
    • Auto completion for CREATE TABLE options
    • Added a new System Table jobs.transform_job_state that provides a summary status of all running transform jobs.
    • Remove RETENTION property from all transformation jobs
    • Improved query results tab
  • Bug Fixes
    • Fixed slow loading of Schemas under Athena connections in the tree
    • Job SQL Statement in Inspection Pane doesn't omit parenthesis when they're required
    • Monitoring: Fixed the 'job_name' of aggregation stages to be the original 'job_name' instead of "Output Aggregation". This means logs in the System Table 'logs.task_executions' will now have a correct 'job_name' for aggregation stages.
    • Fixed MAP_COLUMNS_BY_NAME is not needed for S3, Elasticsearch targets


  • Enhancements
    • New system columns added: $source_id , $shard_number , $row_number
    • Support Time Travel in joins!
    • Support running a SELECT query without a FROM clause
    • Support running a SELECT query reading from an Upsolver Classic Data Source
    • Support running UNNEST queries
    • Support selecting columns by their fully qualified name (e.g. catalog.schema.table.column)
    • Support select System Columns with glob patterns
  • Bug Fixes
    • Fixed slow replay progress for Snowflake and PostgreSQL outputs


  • Enhancements
    • Event log improvements + Present informative diagram for copy/transform jobs
    • Pipeline Monitoring: Expose filtered rows due to missing PK or Partition Column
    • CLI: Show only message on DDL commands success
    • Support using Classic Data Sources


  • Enhancements
    • Inviting a user to an organization is now supported
    • Expose the column data type in the tree
    • Improved CLI experience
    • New SQLake templates added (CDC MySQL, CDC PostgreSQL, Elasticsearch, Snowflake, Redshift)


  • Enhancements
    • Support Transformation Jobs to Elastic Search


  • Enhancements
    • PostgreSQL CDC: Moved TABLE_INCLUDE_LIST and COLUMN_EXCLUDE_LIST from job options to source definition
    • MySQL CDC: Moved TABLE_INCLUDE_LIST and COLUMN_EXCLUDE_LIST from job options to source definition
    • API token management


  • Enhancements
    • Support Transformation Jobs to S3
    • Support Copy From PostgreSQL Jobs
    • New Home Page
    • Support private VPC integration


  • Enhancements
    • New system tables: running_tasks, failing_tasks
    • Support Copy From MySQL Jobs
    • Support Transformation Jobs to Redshift


  • Features
    • New system tables were added:
      • running_tasks
      • failing_tasks
      • copy_from_job_status
  • Changes
    • Preview is now limited to a fixed amount of input rows. Queries that are too large for preview will be aborted
  • Bug Fixes
    • Jobs: transformation jobs with an interval larger than one minute did not handle cases where the start time or end time of the job was not fully aligned with that interval
Last modified 7d ago