The Schema tab applies to source and target datasets that Upsolver ingests and transforms.
The timestamp of the last update by jobs writing to the table, indicating when the table was last modified. This only applies to datasets in your data lake:
The Overview card provides instant visibility into the following:
Number of Fields: the total count of all fields in the dataset, including system columns that are automatically added by Upsolver.
Number of Unique Fields: the number of columns containing unique values. This number maps to the Has Value Repetition column in the Ingested Data Statistics table.
Number of Arrays Fields: the number of columns that contain array data, out of the total number of fields.
This graph shows the number of rows ingested over the selected timespan, so you can quickly spot spikes or drops in your pipeline and troubleshoot unexpected volumes. By default, the chart shows written rows from the initial ingestion through to now. To change the timespan displayed in the dataset report, click the Time View button, which displays Lifetime by default:
You can select an area on the chart to drill into the statistics for that period. If you need to go back in time to find an issue, you can keep narrowing the selected date and time range to view the data exactly as it was at a specific point:
This table provides metadata about your dataset. Click a column name in the table to drill into the Column details, which include statistics for the Upsolver system columns appended to your dataset. To filter the columns in the dataset, type your search string into the Search box. Click the x icon in the Search box to clear the results.
The system columns that Upsolver appends to your datasets are included in the column list:
The Written Data Statistics table provides the following information:
Measurement | Description |
---|---|
Column | The name of the ingested column. Click on the column name to drill through to the Column page. |
Type | The column data type, e.g. |
Appears Unique | Returns a Boolean value displayed as a tick icon for |
Top Values | The three most common values within this column. |
Distinct Values | The count of distinct values within the column. |
Density | The percentage of rows in which this column has a value. |
Min | The minimum data value in this column. |
Max | The maximum data value in this column. |
First Seen | The first date and time that data was seen in this column. |
Last Seen | The last date and time that data was seen in this column. |
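
To make the measurements above concrete, the following Python sketch computes comparable statistics for one column of a small in-memory dataset. It is an illustration only and not how Upsolver calculates these values. The function name `column_stats`, the sample rows, and the reading of density as the share of rows with a non-null value are all assumptions made for this example.

```python
from collections import Counter

def column_stats(rows, column):
    """Compute illustrative per-column statistics over a list of dict rows.

    Hypothetical helper: the definitions below are assumptions, not
    Upsolver's actual implementation.
    """
    # Keep only rows where the column is present and not null.
    values = [row[column] for row in rows if row.get(column) is not None]
    counts = Counter(values)
    return {
        # Up to three most frequent values, most frequent first.
        "top_values": [value for value, _ in counts.most_common(3)],
        "distinct_values": len(counts),
        # Assumed definition: percentage of rows that have a value.
        "density": 100.0 * len(values) / len(rows) if rows else 0.0,
        # True only when every non-null value occurs exactly once.
        "appears_unique": len(values) > 0 and len(counts) == len(values),
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }

# Made-up sample data for the example.
rows = [
    {"order_id": 1, "country": "US"},
    {"order_id": 2, "country": "US"},
    {"order_id": 3, "country": "DE"},
    {"order_id": 4, "country": None},
]

country_stats = column_stats(rows, "country")
order_id_stats = column_stats(rows, "order_id")
```

In this sample, `country` has a value in three of four rows and repeats `US`, so it is neither fully dense nor unique, while `order_id` is both.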
For datasets written to the data lake, you can download the statistics to a table inspection report in CSV format for your own consumption. Click the Download icon to the right of the Search box to download the contents of the Written Data Statistics table to CSV. You can then perform your own discovery using this data.
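
As one example of this kind of discovery, the sketch below reads a report with Python's standard `csv` module and lists the columns whose density falls below a threshold. The exact layout of the downloaded file is not described here, so the header names (`Column`, `Density`), the plain numeric density format, and the function name `sparse_columns` are assumptions; adjust them to match the file you download.

```python
import csv
import io

def sparse_columns(report_file, max_density=50.0):
    """Return names of columns whose density is below max_density.

    Hypothetical helper: assumes the report has 'Column' and 'Density'
    headers and that density is a plain number such as 12.5.
    """
    reader = csv.DictReader(report_file)
    return [row["Column"] for row in reader if float(row["Density"]) < max_density]

# Made-up sample standing in for a downloaded report; with a real file,
# pass open("report.csv", newline="") instead.
sample = io.StringIO(
    "Column,Type,Density,Distinct Values\n"
    "order_id,bigint,100,4\n"
    "coupon_code,string,12.5,3\n"
)

result = sparse_columns(sample)
```

Columns that are mostly empty are often worth reviewing first, since they may indicate an optional field or a problem upstream.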