Statistics

The Statistics tab is available for Iceberg tables managed by Upsolver and offers a comprehensive overview of your table's current state and historical trends. This page is crucial for monitoring the health, performance, and maintenance of your table by tracking key metrics and processes, including storage usage, scan overhead, file count, snapshot expiration, compaction, and more.

While the Statistics tab provides a high-level summary, you can explore specific processes in greater detail—such as compaction, snapshot expiration, and orphan file clean-up—within the Maintenance tab.

The Statistics tab displays real-time metrics on the current snapshot, maintenance, and trends within your Iceberg table.

Current Snapshot

This widget provides an overview of the current snapshot of the table, including the size of the current snapshot, total storage used by all snapshots, the number of files, and average file size. It also displays key statistics like data row count, delete rows, and the number of columns.

| Category | Metric | Description |
| --- | --- | --- |
| Storage | Current Snapshot Size | The total size of the latest table snapshot, including Iceberg metadata size. |
| Storage | All Snapshots Size | The cumulative size of the table, including all its snapshots and associated data. This figure also includes the Iceberg metadata size. |
| Scan | Full Table Scan Overhead | The total request overhead of S3 API calls to perform a full table scan. |
| Scan | Files | The total number of data files and metadata files in the current snapshot of the table. |
| Scan | Avg. File Size | The average size of files in the current snapshot. Computed as (Current Snapshot Size) / (Files). |
| Table | Data Rows | The total number of rows present in the data files of the current snapshot of the table. |
| Table | Position Delete Rows | The total number of position delete rows in the current snapshot of the table. A position delete corresponds to a specific data row that has been removed. Note: Position delete rows may overlap with equality delete rows or may reference a non-existent row. |
| Table | Equality Delete Rows | The total number of equality delete rows in the current snapshot of the table. An equality delete can be applied to multiple data rows. |
| Table | Columns | The total number of columns in the current snapshot of the table. |
| Partitions | Partitions | The total number of partitions in the current snapshot of the table. |
| Partitions | Avg. Partition Size | The average size of each partition in the current snapshot. |
| Partitions | Max. Partition Size | The size of the largest partition in the current snapshot. |
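The Avg. File Size and partition metrics above are simple aggregates, and the arithmetic can be sketched in a few lines. The following is a minimal illustration, not Upsolver's implementation: the function names, the byte units, and the handling of empty inputs are assumptions made for this example.

```python
from statistics import mean
from typing import Dict


def average_file_size(current_snapshot_size: int, files: int) -> float:
    """Avg. File Size = (Current Snapshot Size) / (Files), in bytes."""
    if files <= 0:
        return 0.0  # assumption: report 0 for a snapshot with no files
    return current_snapshot_size / files


def partition_summary(partition_sizes: Dict[str, int]) -> Dict[str, float]:
    """Summarize partition sizes (bytes) keyed by a hypothetical partition name."""
    if not partition_sizes:
        # assumption: an unpartitioned or empty table reports zeros
        return {"partitions": 0, "avg_partition_size": 0.0, "max_partition_size": 0}
    sizes = list(partition_sizes.values())
    return {
        "partitions": len(sizes),
        "avg_partition_size": mean(sizes),
        "max_partition_size": max(sizes),
    }


# Example with made-up numbers: a 1 MB snapshot stored in 4 files.
print(average_file_size(1_000_000, 4))
print(partition_summary({"2024-01-01": 100, "2024-01-02": 300}))
```

A large gap between Max. Partition Size and Avg. Partition Size in such a summary would suggest skewed partitions.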

Maintenance

This widget highlights various metrics related to maintenance tasks performed by Upsolver, including compaction, snapshot expiration, orphan file clean-up, and data retention.

| Category | Metric | Description |
| --- | --- | --- |
| Time Travel | Oldest Snapshot | The timestamp of the oldest snapshot available on the main branch for time travel queries, allowing users to query the table's state at that specific time. |
| Time Travel | Snapshots | The number of available snapshots. |
| Data Retention | Oldest Retained Time | The oldest date for which data is retained in the current snapshot. Note that data will be completely deleted once all snapshots referencing that data have expired. If no retention policy is defined for the table, this value displays as Indefinitely. |
| Compaction | Compaction Score | The table's compaction level as a percentage (0-100%). 100%: the table is near its optimal state, with minimal size and high scan efficiency. 0%: there is significant room for improvement in reducing table size or enhancing scan efficiency. Calculation: the score is the minimum of two ratios, the Size Ratio (Projected Table Size / Current Table Size) and the Scan Overhead Ratio (Projected Scan Overhead / Current Scan Overhead). |
| Compaction | Last Compaction Time | The date and time when the last compaction operation was performed. |
| Compaction | Files Reduced - Lifetime | The total number of files reduced through compaction operations over the lifetime of the table. |
| Orphan Files | Last Clean Up Time | The last date and time a clean-up operation was run to delete dangling files. |
| Orphan Files | Files Deleted - Lifetime | The total number of files deleted through clean-up operations over the lifetime of the table. |
| Orphan Files | Storage Reduced - Lifetime | The total amount of storage space saved through clean-up operations over the lifetime of the table. |
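The Compaction Score calculation described above, the minimum of the size ratio and the scan overhead ratio expressed as a percentage, can be sketched as follows. This is an illustrative reading of the documented formula only: the function name, the clamping to the 0-100 range, and the treatment of a zero denominator are assumptions, and how Upsolver derives the projected values is not covered here.

```python
def compaction_score(
    projected_table_size: float,
    current_table_size: float,
    projected_scan_overhead: float,
    current_scan_overhead: float,
) -> float:
    """Return min(size ratio, scan overhead ratio) as a percentage."""

    def ratio(projected: float, current: float) -> float:
        if current <= 0:
            return 1.0  # assumption: nothing to improve on an empty table
        # assumption: clamp so the score stays within 0-100%
        return max(0.0, min(projected / current, 1.0))

    size_ratio = ratio(projected_table_size, current_table_size)
    scan_ratio = ratio(projected_scan_overhead, current_scan_overhead)
    return 100 * min(size_ratio, scan_ratio)


# Made-up example: compaction would shrink the table from 100 GB to 80 GB
# (ratio 0.8) and cut scan overhead from 2000 to 500 requests (ratio 0.25).
print(compaction_score(80, 100, 500, 2000))
```

Because the score takes the minimum, the weaker of the two ratios determines the result: in this example the table is fairly compact by size, but the scan overhead ratio pulls the score down.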

Trends

In this section, you can monitor trends over time for the following metrics. You can adjust the timeframe to view trends across different time ranges.

  1. Storage - This graph displays the trend of table size over time using a stacked area chart. It comprises the size of the table's current snapshot and the historical snapshots' storage size over time. The cumulative graph represents the total storage of the table. You can toggle between viewing only the Current State or Historical Snapshots trends separately.

  2. Full Table Scan Overhead - Shows the total S3 API request overhead required to perform a full table scan over time.

  3. Files - Displays the number of files in the current snapshot over time.
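To make the stacked Storage chart concrete, the sketch below splits each data point into a current-snapshot layer and a historical-snapshots layer whose sum is the total table storage. It assumes the historical layer equals All Snapshots Size minus Current Snapshot Size; the function name and the tuple layout are hypothetical and do not reflect an Upsolver API.

```python
from typing import Dict, List, Tuple, Union

# (timestamp, current_snapshot_size, all_snapshots_size), sizes in bytes
Point = Tuple[str, int, int]


def storage_layers(points: List[Point]) -> List[Dict[str, Union[str, int]]]:
    """Split each point into the two stacked layers of the Storage graph."""
    layers = []
    for timestamp, current, all_snapshots in points:
        # assumption: historical storage is whatever the total holds
        # beyond the current snapshot, never negative
        historical = max(all_snapshots - current, 0)
        layers.append(
            {
                "timestamp": timestamp,
                "current": current,
                "historical": historical,
                "total": current + historical,
            }
        )
    return layers


# Made-up sample: the current snapshot is 60 bytes of a 100-byte total.
for row in storage_layers([("2024-01-01", 60, 100), ("2024-01-02", 70, 90)]):
    print(row)
```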
