
Apache Iceberg

Discover how Upsolver can ingest your data to Iceberg, and analyze and optimize your lakehouse for reduced storage costs and optimized data scans.

Last updated 10 months ago

Welcome to the Lakehouse

Data lakes are cost-effective for storing unlimited volumes of data, which makes them ideal for streaming big data and scaling at pace. However, a data lake has no inherent organization, no understanding of what it holds, and no systematic way to develop relational mappings, so it is no wonder that 85% of self-managed lakes encounter issues leading to failure within the first year.

Apache Iceberg introduces a new open table format standard to overcome the limitations of the lake. By tracking a canonical list of files rather than a directory, it brings database-like features to data lakes, including transactional concurrency, support for schema evolution, and time-travel and rollbacks through the use of snapshots.

Leveraging many of the features of a data warehouse on top of a data lake, Apache Iceberg builds a lakehouse that combines cheap storage with concurrent and fast transactions.

With a standard that is increasingly being adopted, Iceberg’s open table format allows any engine to read from, and write to, Iceberg tables, without adversely impacting other concurrent operations.
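To make the idea of tracking a canonical list of files more concrete, the following is a minimal sketch in plain Python. It is a toy model, not Iceberg's metadata format or API: the `ToyTable` and `Snapshot` classes and the file names are hypothetical and exist only for illustration. Each commit records a new immutable snapshot with its own explicit file list, which is what makes time travel and rollback cheap.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """One immutable table version: an explicit list of data files."""
    snapshot_id: int
    files: Tuple[str, ...]


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for table metadata (not Iceberg's API)."""
    snapshots: List[Snapshot] = field(default_factory=list)
    current_id: Optional[int] = None

    def files(self, snapshot_id: Optional[int] = None) -> Tuple[str, ...]:
        """Files visible at a snapshot; defaults to the current snapshot."""
        target = self.current_id if snapshot_id is None else snapshot_id
        if target is None:
            return ()
        for snap in self.snapshots:
            if snap.snapshot_id == target:
                return snap.files
        raise ValueError(f"unknown snapshot {target}")

    def commit(self, add: Sequence[str] = (), remove: Sequence[str] = ()) -> int:
        """Record a new snapshot; earlier snapshots are left untouched."""
        removed = set(remove)
        kept = tuple(f for f in self.files() if f not in removed)
        snap = Snapshot(len(self.snapshots) + 1, kept + tuple(add))
        self.snapshots.append(snap)
        self.current_id = snap.snapshot_id
        return snap.snapshot_id

    def rollback(self, snapshot_id: int) -> None:
        """Point the table back at an earlier snapshot without rewriting data."""
        self.files(snapshot_id)  # raises ValueError if the snapshot is unknown
        self.current_id = snapshot_id


table = ToyTable()
v1 = table.commit(add=["a.parquet", "b.parquet"])
v2 = table.commit(add=["c.parquet"], remove=["a.parquet"])
print(table.files())    # current view of the table
print(table.files(v1))  # time travel: the view as of the first commit
table.rollback(v1)
print(table.files())    # the current view after rolling back
```

Because readers resolve the file list from a snapshot rather than listing a directory, a writer can prepare a new snapshot without disturbing anyone reading an older one. The real format layers manifests, statistics, and atomic catalog updates on top of this idea.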

Upsolver & Apache Iceberg

While Apache Iceberg is a step forward for the data lake, it still requires intervention to compact and tune the files and partitions that comprise your tables.

Managing your lakehouse ensures that you are not overpaying for storage and that data scans return results to the query engine as quickly as possible.

Not only does Upsolver support ingesting your data to Iceberg tables, we also offer tools for auditing your Iceberg tables and an optimizer to compact and tune them.


Ingest Data to Apache Iceberg

We support ingesting your data from the major data platforms into Iceberg. After creating your pipelines, Upsolver takes care of managing your tables and maintaining performance so your users can query the data without experiencing the delays caused by long-running scans.

Upsolver automatically manages your tables by running a compaction process based on industry best practice. Compaction operations run at the optimal time to deliver the best results. By reducing the size of your tables and the number of files, you save money on storage and benefit from faster data scans.
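As a rough illustration of why compaction reduces the number of files, the sketch below groups small files into bins no larger than a target size using first-fit decreasing bin packing. This is not Upsolver's compaction algorithm, which this page does not describe. The function name, the 128 MB target, and the sample file sizes are all assumptions chosen for the example.

```python
from typing import List, Sequence


def plan_compaction(file_sizes_mb: Sequence[int], target_mb: int = 128) -> List[List[int]]:
    """Group files smaller than target_mb into bins of at most target_mb.

    Uses first-fit decreasing: largest small files are placed first, each into
    the first bin that still has room. Files at or above the target are skipped.
    """
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)
    bins: List[List[int]] = []
    for size in small:
        for current in bins:
            if sum(current) + size <= target_mb:
                current.append(size)
                break
        else:
            # No existing bin has room, so start a new output file.
            bins.append([size])
    return bins


# Hypothetical file sizes in MB for one table.
sizes = [4, 8, 16, 120, 200, 2, 60, 60]
bins = plan_compaction(sizes)
untouched = [s for s in sizes if s >= 128]
files_before = len(sizes)
files_after = len(bins) + len(untouched)
print(f"{files_before} files before, {files_after} files after")
```

The sketch only plans groupings and counts files. A real compaction job also rewrites the data, which can change the total size on disk through better encoding and compression, and it commits the result as a new snapshot.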

Ingest your data to Apache Iceberg from streaming, database, and file sources.



Optimize Your Iceberg Tables

Reduce costs and accelerate your queries for any Iceberg table. Our Iceberg Analyzer continuously monitors and optimizes Iceberg tables, whether created by Upsolver or another tool. We automatically apply data engineering best practices to reduce storage costs and accelerate query performance - no managing optimization jobs or custom code needed!

Run the Iceberg Analyzer to discover the tables that can be tuned and compacted.

Upsolver compacts your files to reduce the size of your tables, which lowers your storage costs and speeds up data scans for better query performance.



Find Iceberg Tables for Optimization

Quickly find tables in your lakehouse that need compaction to reduce storage and speed up data scans. If you have already built your Iceberg lakehouse, you can install the Upsolver Iceberg Table Analyzer CLI tool to analyze your existing lakehouse and identify problematic Iceberg tables.

The tool shows the percentage of improvement that you can gain for each of your tables.

Download and run our open source CLI tool and uncover tables that can benefit from optimization.
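The exact metric the CLI reports is not specified on this page. As one hedged example of what a percentage of improvement could look like, the sketch below estimates how many fewer files a table would have if its data were rewritten into files of a target size, and ranks tables by that estimate. The function name, the 128 MB target, and the table names and sizes are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def estimated_file_reduction_pct(file_sizes_mb: Sequence[float], target_mb: float = 128) -> float:
    """Percent fewer files if the data were rewritten into target-sized files."""
    if not file_sizes_mb:
        return 0.0
    current = len(file_sizes_mb)
    # Fewest files that could hold the same volume at the target size.
    ideal = max(1, math.ceil(sum(file_sizes_mb) / target_mb))
    return max(0.0, 100.0 * (current - ideal) / current)


# Hypothetical tables mapped to the sizes (in MB) of their data files.
tables: Dict[str, List[float]] = {
    "events": [1] * 1000,   # many tiny files
    "orders": [128] * 10,   # already at the target size
    "clicks": [16] * 64,    # moderately fragmented
}

ranked = sorted(
    tables,
    key=lambda name: estimated_file_reduction_pct(tables[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {estimated_file_reduction_pct(tables[name]):.1f}% fewer files")
```

A table with many tiny files ranks first because nearly all of its files could be merged, while a table already written at the target size shows no potential gain.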



To ingest your data into Apache Iceberg, begin with the Learning Paths and follow the step-by-step guide to setting up your environment and building a pipeline.

Upsolver analyzes your Iceberg tables to uncover where storage and performance improvements can be made.

Our standalone optimization tool can help you tune your existing lakehouse. All you need is a connection to AWS Glue Data Catalog or Tabular, and you can be analyzing and optimizing your tables in minutes.

Read the how-to guide, which walks you end to end through connecting to your catalog, analyzing your tables, and running the optimizer.

Use this quickstart guide to install the Iceberg Table Analyzer CLI and begin analyzing your Iceberg tables.

Further Learning

Blogs

Iceberg 101: Ten Tips to Optimize Performance
Iceberg 101: Working with Iceberg Tables
Iceberg 101: Better Data Lakes with Apache Iceberg
Iceberg 101: What is the Iceberg Table Format?
An Introduction to the Data Lakehouse for Data Warehouse Users
Ingesting Operational Data into an Analytics Lakehouse on Iceberg
Iceberg Architecture Examples: How Iceberg Powers Data and ML Applications
Lakehouse vs. Data Lake: The Ultimate Guide
How Apache Iceberg is Reshaping Data Lake File Management
Apache Iceberg vs Parquet – File Formats vs Table Formats

Videos

Write Iceberg tables with Upsolver!
How to: Create an Iceberg Lakehouse for Snowflake using Upsolver