Mapping data to a desired schema

This page goes over how to map your data to a desired schema using Transform with SQL in Upsolver.

Transform with SQL enables you to map your data into a desired schema. The process of mapping in Upsolver is done in two simple steps:

Ingest your raw data into the data lake by configuring a data source in Upsolver.
Map your data to a desired schema by configuring an output using SQL.

To demonstrate, assume we have the following data in CSV format in a data source called Purchases (all values in this table are strings).

purchase_id

customer_id

product_name

quantity

unit_price

Orange

0.25

Apple

0.5

Banana

0.25

Define simple schema

If we define a table as:

SELECT customer_id, purchase_id, product_name
FROM Purchases

The resulting table reflects that query as events stream in. The final table contains the data:

customer_id

purchase_id

product_name

“1”

“Orange”

“1”

“2”

“Apple”

“1”

“3”

“Banana”

Rename column names

Renaming of column names is being done as follows using the AS statement:

SELECT customer_id AS Customer, purchase_id AS Purchase, 
       product_name AS Product
FROM Purchases

The column names in the resulting table have been renamed:

Customer

Purchase

Product

“1”

“Orange”

“1”

“2”

“Apple”

“1”

“3”

“Banana”

Data type conversion

Conversion of data types is done using : next to the selected column name.

We will demonstrate a conversion of the column quantity from the example data source Purchases (which is in string format) into BIGINT format.

SELECT customer_id, purchase_id, product_name, quantity:BIGINT
FROM Purchases

The resulting table reflects the conversion:

customer_id

purchase_id

product_name

quantity

“1”

“Orange”

“1”

“2”

“Apple”

“1”

“3”

“Banana”

Perform calculations

It is possible to perform inline calculations when defining the schema.

If we define a table as:

SELECT customer_id, purchase_id, product_name, 
       quantity:BIGINT * unit_price:BIGINT as total_cost:BIGINT
FROM Purchases

The result of the query is the following table which contains the calculated field total_cost:

customer_id

purchase_id

product_name

total_cost

“1”

“Orange”

0.75

“1”

“2”

“Apple”

0.5

“1”

“3”

“Banana”

0.5

It is also possible to first calculate the field and then just use it in the query:

SET total_cost = quantity:BIGINT * unit_price:BIGINT;
SELECT customer_id, purchase_id, product_name, total_cost
FROM Purchases

PreviousTransform with SQL NextTransforming data with SQL

Last updated 4 years ago

Was this helpful?