Run an output
This page provides a general guide on how to run an output in Upsolver.
Once you run an output, it becomes immutable (as it is now a production task); this preserves data lineage.
You can also:
- Edit an output. When you run the edited output, it automatically writes to a different location, making it possible to identify which files were created by each version of the output.
- Duplicate an output. Once duplicated, the output can be modified as required (e.g. change the type of the destination output).
1. From the Outputs page, select the output that you want to run.
2. Click Run.
3. If applicable, select the S3 Storage to store the data of the table.
The access key of this storage must belong to the same AWS account as the access key of the connection.
4. Complete any other fields (these vary according to the output type):
- Hive Metastores: Select Qubole, Athena, or Redshift Spectrum.
- Connection: The required connection.
- Database Name: The database name.
- Schema: The schema within the database.
- Table Name: The table to create. Table names are case-insensitive. Note: Apache Spark requires lowercase table names.
5. (Optional) To add an additional table, click Add and specify the options as described in the previous step.
6. Click Next and complete the following:
- Compute Cluster: Select the compute cluster to run the calculation on, or click the drop-down to create a new compute cluster.
- Processing Time Range: The range of data to process. The range can start from the beginning of the data source, from now, or from a custom date and time; it can end never, now, or at a custom date and time.
7. Click Deploy.
Once the output shows as Running in the output panel, it is live in production and consuming compute resources.
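The same-account requirement from step 3 can be checked before you run the output. A minimal sketch using the AWS STS GetCallerIdentity call (via boto3, imported lazily; the helper names and all credential values are illustrative, not part of Upsolver):

```python
def account_id_for(access_key: str, secret_key: str) -> str:
    """Look up the 12-digit AWS account ID that owns a pair of credentials.

    Uses STS GetCallerIdentity; boto3 is imported lazily so the comparison
    helper below remains usable without AWS credentials configured.
    """
    import boto3  # third-party AWS SDK

    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    return sts.get_caller_identity()["Account"]


def same_account(storage_account_id: str, connection_account_id: str) -> bool:
    """True when the S3 storage and the connection belong to one AWS account."""
    return storage_account_id == connection_account_id
```

Comparing `account_id_for(...)` for the storage's access key against the connection's access key confirms both belong to the same AWS account, as step 3 requires.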