Orphan Files

In distributed processing environments, tasks or jobs sometimes fail, leaving behind files that are not referenced in the table metadata. These files, referred to as orphan files, can accumulate over time and consume significant storage space.

A file is considered an orphan if it is not associated with a valid snapshot in the table metadata. Regular clean-up of these files is essential for optimizing storage and maintaining efficient table operations. The Orphan Files tab enables you to monitor clean-up jobs that remove these files.
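The definition above amounts to a set difference: files present in storage minus files referenced by a valid snapshot. The following is a minimal sketch of that idea only, not the product's clean-up implementation. The function name `find_orphan_files` and the example paths are hypothetical, and it assumes both file listings are already available as plain lists of paths.

```python
def find_orphan_files(storage_files, referenced_files):
    """Return files that exist in storage but are not referenced by table metadata.

    Both arguments are iterables of file paths. How they are obtained
    (listing storage, reading snapshot metadata) is outside this sketch.
    """
    referenced = set(referenced_files)
    # Deduplicate the storage listing, keep unreferenced paths, sort for stable output.
    return sorted(path for path in set(storage_files) if path not in referenced)


# Hypothetical example data: b.parquet was left behind by a failed job.
storage = ["data/a.parquet", "data/b.parquet", "data/c.parquet"]
referenced = ["data/a.parquet", "data/c.parquet"]

print(find_orphan_files(storage, referenced))  # ['data/b.parquet']
```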

The Orphan Files tab displays all files that are not referenced in the table metadata.

The following metrics provide insights into the clean-up process:

| Metric | Description |
| --- | --- |
| Job Start Time | The timestamp indicating when the clean-up job started. |
| Status | The current status of the clean-up job. Possible values include: Running, Completed, and Failed (Retrying). |
| Duration | The total run time of the job. |
| Files Deleted | The total number of orphan files that were deleted. |
| Storage Size Deleted | The total amount of storage space (in bytes) freed by deleting orphan files. |
| Errors | The error text, if any errors were detected. |

Last updated
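To illustrate how the metrics above could be combined, the sketch below totals the files deleted and storage freed across completed jobs and converts the byte count into a readable unit. It is an assumption-laden example: the `CleanupJob` record, its field names, and the sample values are hypothetical and do not represent a documented schema or export format.

```python
from dataclasses import dataclass


@dataclass
class CleanupJob:
    # Hypothetical record mirroring the metrics in the table; not a documented schema.
    status: str
    duration_seconds: float
    files_deleted: int
    bytes_deleted: int
    errors: str = ""


def format_bytes(num_bytes):
    """Convert a byte count into a human-readable string using binary units."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def summarize(jobs):
    """Aggregate the metrics of jobs whose status is Completed."""
    completed = [job for job in jobs if job.status == "Completed"]
    return {
        "completed_jobs": len(completed),
        "files_deleted": sum(job.files_deleted for job in completed),
        "storage_freed": format_bytes(sum(job.bytes_deleted for job in completed)),
    }


# Hypothetical sample data for three clean-up jobs.
jobs = [
    CleanupJob("Completed", 12.0, 10, 1024 * 1024),
    CleanupJob("Failed (Retrying)", 3.0, 0, 0, errors="timeout"),
    CleanupJob("Completed", 5.0, 5, 1024 * 1024),
]

print(summarize(jobs))
```

Jobs that are still running or retrying are excluded here so that the totals reflect only finished clean-up work.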