Orphan Files

In distributed processing environments, tasks or jobs can fail and leave behind files that are not referenced in the table metadata. These files, referred to as orphan files, accumulate over time and can consume significant storage space.

A file is considered an orphan if it is not associated with any valid snapshot in the table metadata. Regular cleanup of these files is essential for optimizing storage and maintaining efficient table operations.
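The definition above amounts to a set difference: every file found in storage, minus every file referenced by a valid snapshot. The following is a minimal sketch of that idea, not the product's actual implementation. The `Snapshot` class, the `find_orphan_files` function, and the file paths are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot and the files it references."""
    snapshot_id: int
    referenced_files: FrozenSet[str]


def find_orphan_files(storage_files: Iterable[str],
                      valid_snapshots: Iterable[Snapshot]) -> Set[str]:
    """Return the files in storage that no valid snapshot references."""
    referenced: Set[str] = set()
    for snapshot in valid_snapshots:
        referenced.update(snapshot.referenced_files)
    # Anything in storage that is not referenced is an orphan candidate.
    return set(storage_files) - referenced


# Example: part-003 was written by a failed task and never committed.
snapshots = [
    Snapshot(1, frozenset({"data/part-001.parquet"})),
    Snapshot(2, frozenset({"data/part-001.parquet", "data/part-002.parquet"})),
]
files_in_storage = [
    "data/part-001.parquet",
    "data/part-002.parquet",
    "data/part-003.parquet",
]
orphans = find_orphan_files(files_in_storage, snapshots)
print(sorted(orphans))  # ['data/part-003.parquet']
```

A real cleanup job would likely need an additional safeguard, such as skipping recently written files, so that files belonging to writes still in progress are not treated as orphans. That safeguard is an assumption here and is not shown in the sketch.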

Orphan Files Tab

The Orphan Files tab allows you to monitor the cleanup jobs that remove orphan files. The following metrics provide insight into the cleanup process:

Metric | Description
------ | -----------
Job Start time | The timestamp indicating when the cleanup job started.
Status | The current status of the cleanup job. Possible values include "Running", "Completed", and "Failed (Retrying)".
Duration | The total run time of the job.
Files Deleted | The total number of orphan files that were deleted.
Storage Size Deleted | The total amount of storage space (in bytes) freed by deleting orphan files.
Errors | The error text, if any errors were detected.

Last updated