Ingested Data

The Ingested Data tab applies to source and target datasets that Upsolver ingests and transforms.

Table Last Update

The timestamp of the most recent write by any job writing to the table, indicating when the table was last modified. This applies only to datasets in your data lake.

Overview

The Overview card provides instant visibility into the following:

Number of Fields: the total count of all fields in the dataset, including system columns that are automatically added by Upsolver.
Number of Unique Fields: the number of columns in which every value is unique. This number corresponds to the Appears Unique column in the Written Data Statistics table.
Number of Arrays Fields: the count of columns that contain array data out of the total count.
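To make the three Overview counts concrete, the sketch below derives them from a small in-memory sample. It only illustrates the definitions above and is not how Upsolver computes them; the function name `overview_counts`, the result keys, and the sample rows are hypothetical.

```python
from typing import Any


def overview_counts(rows: list[dict[str, Any]]) -> dict[str, int]:
    """Derive the three Overview counts from a list of row dictionaries."""
    # Every field name that appears in at least one row.
    fields = {name for row in rows for name in row}

    unique_fields = 0
    array_fields = 0
    for name in fields:
        values = [row[name] for row in rows if row.get(name) is not None]

        # A field counts as an array field if any of its values is a list.
        if any(isinstance(v, list) for v in values):
            array_fields += 1

        # Flat lists are converted to tuples so they can be compared in a set
        # (assumption: no nested lists in the sample).
        hashable = [tuple(v) if isinstance(v, list) else v for v in values]

        # A field is unique when none of its values repeats.
        if hashable and len(set(hashable)) == len(hashable):
            unique_fields += 1

    return {
        "fields": len(fields),
        "unique_fields": unique_fields,
        "array_fields": array_fields,
    }


sample_rows = [
    {"order_id": "a1", "customer": "ann", "items": ["x", "y"]},
    {"order_id": "a2", "customer": "bob", "items": ["z"]},
    {"order_id": "a3", "customer": "ann", "items": ["x", "y"]},
]
print(overview_counts(sample_rows))
```

In this sample only `order_id` has no repeated values, and only `items` holds array data.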

Written Rows Over Time

This graph shows the number of rows ingested over the selected timespan, so you can quickly spot spikes or drops in your pipeline and troubleshoot unexpected volumes. By default, the chart shows written rows from the initial ingestion through to now. To change the timespan displayed in the dataset report, click the Time View button, which displays Lifetime by default.

You can select an area on the chart to drill into the statistics for that period. To trace an issue back in time, keep narrowing the selected date and time range until you see the data exactly as it was at a specific point.

Written Data Statistics

This table provides metadata about each column in your dataset. Click a column name in the table to drill into the Column details, which include statistics for the Upsolver system columns appended to your dataset. To filter the columns in the dataset, type your search string into the Search box. Click the x icon in the Search box to clear the results.

The system columns that Upsolver appends to your datasets are included in the column list.

The Written Data Statistics table provides the following information:

| Measurement | Description |
| --- | --- |
| Column | The name of the ingested column. Click on the column name to drill through to the Column page. |
| Type | The column data type, e.g. string, date, long. This is the type that Upsolver automatically inferred on ingestion. |
| Appears Unique | A Boolean value, displayed as a tick icon for TRUE if all values are distinct, or a cross icon for FALSE if there are repeated values. |
| Top Values | The three most common values in this column. |
| Distinct Values | The count of distinct values within the column. |
| Density | The percentage of rows in which this column has a value. |
| Min | The minimum data value in this column. |
| Max | The maximum data value in this column. |
| First Seen | The first date and time that data was seen in this column. |
| Last Seen | The last date and time that data was seen in this column. |
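As a rough illustration of what several of these measurements mean, the sketch below computes them for a single column of sample values. It is not Upsolver's implementation: the function name `column_stats` and the result keys are hypothetical, `None` stands in for a missing value, and Density is assumed to mean the share of rows that have a value in the column. First Seen and Last Seen are left out because they depend on event timestamps that this sample does not carry.

```python
from collections import Counter
from typing import Any, Optional


def column_stats(values: list[Optional[Any]]) -> dict[str, Any]:
    """Compute per-column statistics; None marks a row with no value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        # True only when no value occurs more than once.
        "appears_unique": len(counts) == len(present),
        # The three most frequent values, most frequent first.
        "top_values": [value for value, _ in counts.most_common(3)],
        "distinct_values": len(counts),
        # Assumed definition: percentage of rows that have a value.
        "density": 100.0 * len(present) / len(values) if values else 0.0,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


country = ["US", "DE", "US", None, "FR", "US", "DE", None]
print(column_stats(country))
```

Here six of the eight rows have a value, so the density is 75 percent, and the repeated `US` and `DE` entries mean the column does not appear unique.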

For datasets written to the data lake, you can download the statistics as a table inspection report in CSV format. Click the Download icon to the right of the Search box to download the contents of the Written Data Statistics table, then run your own analysis on the data.
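Once you have the CSV, one possible starting point for your own discovery is to list sparsely populated columns. The sketch below is a minimal example using Python's standard `csv` module. The header names `Column` and `Density` are assumptions based on the measurement names above and may not match the headers in the actual export, so adjust them to your file; the function name `sparse_columns` and the sample data are hypothetical.

```python
import csv
import io
from typing import Iterable


def sparse_columns(
    lines: Iterable[str],
    threshold: float = 50.0,
    name_field: str = "Column",      # assumed header name
    density_field: str = "Density",  # assumed header name
) -> list[str]:
    """Return the names of columns whose density is below the threshold."""
    result = []
    for row in csv.DictReader(lines):
        # Accept values written as either "12.5" or "12.5%".
        density = float(row[density_field].strip().rstrip("%"))
        if density < threshold:
            result.append(row[name_field])
    return result


# Hypothetical sample standing in for a downloaded report.
sample = io.StringIO(
    "Column,Type,Density\n"
    "order_id,string,100\n"
    "coupon,string,12.5\n"
    "notes,string,40%\n"
)
print(sparse_columns(sample))
```

To run it against a downloaded file, pass an open file object instead of the sample, for example `with open(path, newline="") as f: sparse_columns(f)`, where `path` points to your report.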
