Statistics
The Statistics tab is available for Iceberg tables managed by Upsolver and offers a comprehensive overview of your table's current state and historical trends. This page is crucial for monitoring the health, performance, and maintenance of your table by tracking key metrics and processes, including storage usage, scan overhead, file count, snapshot expiration, compaction, and more.
While the Statistics tab provides a high-level summary, you can explore specific processes in greater detail—such as compaction, snapshot expiration, and orphan file cleanup—within the Maintenance tab.
This widget provides an overview of the current snapshot of the table, including the size of the current snapshot, total storage used by all snapshots, the number of files, and average file size. It also displays key statistics like data row count, delete rows, and the number of columns.
Category | Metric | Description |
---|---|---|
Storage | Current Snapshot Size | The total size of the latest table snapshot, including Iceberg metadata size. |
Storage | All Snapshots Size | The cumulative size of the table, including all its snapshots and associated data. This figure also includes the Iceberg metadata size. |
Scan | Full Table Scan overhead | The total request overhead of S3 API calls to perform a full table scan. |
Scan | Files | The total number of data files and metadata files in the current snapshot of the table. |
Scan | Avg. File Size | The average size of files in the current snapshot, computed as (Current Snapshot Size) / (Files). |
Table | Data Rows | The total number of rows present in the data files of the current snapshot of the table. |
Table | Position Delete Rows | The total number of position delete rows in the current snapshot of the table. A position delete corresponds to a specific data row that has been removed. Note: Position delete rows may overlap with equality delete rows or may reference a non-existent row. |
Table | Equality Delete Rows | The total number of equality delete rows in the current snapshot of the table. An equality delete can be applied to multiple data rows. |
Table | Columns | The total number of columns in the current snapshot of the table. |
Partitions | Partitions | The total number of partitions in the current snapshot of the table. |
Partitions | Avg. Partition Size | The average size of each partition in the current snapshot. |
Partitions | Max. Partition Size | The size of the largest partition in the current snapshot. |
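
The arithmetic behind the derived metrics in this table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Upsolver's implementation: the `FileEntry` record, the sample sizes, and the idea that a partition's size is the sum of its file sizes are all hypothetical, and the sum of the sample file sizes stands in for Current Snapshot Size.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    """Hypothetical record for one file in the current snapshot."""
    partition: str
    size_bytes: int


def overview_metrics(files: list[FileEntry]) -> dict[str, float]:
    """Derive the size-related overview metrics from a list of files (illustrative only)."""
    if not files:
        raise ValueError("snapshot has no files")
    snapshot_size = sum(f.size_bytes for f in files)
    # Assumption: a partition's size is the sum of the sizes of its files.
    partition_sizes: dict[str, int] = {}
    for f in files:
        partition_sizes[f.partition] = partition_sizes.get(f.partition, 0) + f.size_bytes
    return {
        "files": len(files),
        # Avg. File Size = (Current Snapshot Size) / (Files)
        "avg_file_size": snapshot_size / len(files),
        "partitions": len(partition_sizes),
        "avg_partition_size": snapshot_size / len(partition_sizes),
        "max_partition_size": max(partition_sizes.values()),
    }


# Made-up sample: two files in one partition, one file in another.
sample = [
    FileEntry("2024-01-01", 100),
    FileEntry("2024-01-01", 300),
    FileEntry("2024-01-02", 200),
]
metrics = overview_metrics(sample)
print(metrics)
```

A large gap between the average and maximum partition size in such a calculation would point to skewed partitions.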
This widget highlights various metrics related to maintenance tasks performed by Upsolver, including compaction, snapshot expiration, orphan file cleanup, and data retention.
Category | Metric | Description |
---|---|---|
Time Travel | Oldest Snapshot | The timestamp of the oldest snapshot available on the main branch for time travel queries, allowing users to query the table's state at that specific time. |
Time Travel | Snapshots | The number of available snapshots. |
Data Retention | Oldest Retained Time | The oldest date for which data is retained in the current snapshot. Note that data is completely deleted only once all snapshots referencing that data have expired. If no retention policy is defined for the table, this value is displayed as 'Indefinitely'. |
Compaction | Compaction Score | The compaction score represents the table's compaction level as a percentage (0-100%). Calculation: the score is determined as the minimum of two ratios. |
Compaction | Last Compaction Time | The date and time when the last compaction operation was performed. |
Compaction | Files Reduced - Lifetime | The total number of files reduced through compaction operations over the lifetime of the table. |
Orphan Files | Last Clean Up Time | The last date and time a cleanup operation was run to delete dangling files. |
Orphan Files | Files deleted - lifetime | The total number of files deleted through cleanup operations over the lifetime of the table. |
Orphan Files | Storage reduced - lifetime | The total amount of storage space saved through cleanup operations over the lifetime of the table. |
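
To make the Time Travel metrics concrete, here is a small Python sketch of how they relate to a list of snapshot timestamps. The timestamps and the helper names `time_travel_summary` and `can_time_travel_to` are hypothetical and serve illustration only. The sketch relies on one general Iceberg behavior: a point-in-time query resolves to the latest snapshot committed at or before the requested time, so requests earlier than the oldest snapshot cannot be served.

```python
from datetime import datetime, timezone

# Hypothetical commit times of the snapshots still available on the main branch.
snapshot_times = [
    datetime(2024, 5, 1, 8, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 2, 8, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 3, 8, 0, tzinfo=timezone.utc),
]


def time_travel_summary(times: list[datetime]) -> dict:
    """Return the 'Oldest Snapshot' and 'Snapshots' values for a list of commit times."""
    return {"oldest_snapshot": min(times), "snapshots": len(times)}


def can_time_travel_to(times: list[datetime], query_time: datetime) -> bool:
    """True if at least one snapshot was committed at or before query_time."""
    return query_time >= min(times)


summary = time_travel_summary(snapshot_times)
print(summary)
print(can_time_travel_to(snapshot_times, datetime(2024, 4, 30, tzinfo=timezone.utc)))
```

As snapshot expiration removes older snapshots, the oldest timestamp moves forward and the window available for time travel shrinks accordingly.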
In this section, you can monitor trends over time for the following metrics. You can adjust the timeframe to view trends across different time ranges.
Storage - This graph displays the trend of table size over time using a stacked area chart. It comprises the size of the table's current snapshot and the historical snapshots' storage size over time. The cumulative graph represents the total storage of the table. You can toggle between viewing only the "Current State" or "Historical Snapshots" trends separately.
Full Table Scan Overhead - Shows the total S3 API request overhead required to perform a full table scan over time.
Files - Displays the number of files in the current snapshot over time.
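
The stacking in the Storage graph can be illustrated with a short Python sketch: the cumulative line at each point in time is the sum of the two stacked series. The sample values and variable names below are made up for illustration and do not come from a real table.

```python
# Hypothetical storage samples in bytes, one value per point in time.
current_state = [100, 120, 90]
historical_snapshots = [40, 60, 110]

# The top edge of the stacked area chart is the element-wise sum of both series.
total_storage = [c + h for c, h in zip(current_state, historical_snapshots)]
print(total_storage)
```

In this made-up series, the current snapshot shrinks at the last point while total storage still grows, because files referenced only by older snapshots continue to occupy storage until those snapshots expire.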