Once you run an output, it becomes immutable (as it is a production task); this ensures data lineage.

You can also:

- Edit an output. If you then run the edited output, it automatically writes to a different location, so you can identify which files were created by a specific output (illustrated in the sketch below). See: Edit an output
- Duplicate an output. Once duplicated, the output can be modified as required (e.g. change the type of the destination output). See: Duplicate an output
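For instance, because each output (and each edited version of an output) writes to its own location, listing that location is enough to trace which files it produced. The sketch below is an illustration of the idea, not part of the product: the bucket and prefix names are hypothetical placeholders, and only standard boto3 calls are used.

```python
# Illustration only: trace which files a given output wrote by listing its
# dedicated S3 location. Bucket and prefixes are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")

def files_written_by(bucket: str, output_prefix: str) -> list[str]:
    """Return the S3 keys found under a single output's write location."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=output_prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

# The original output and its edited copy write to different locations,
# so their files never mix.
original_files = files_written_by("my-data-lake", "outputs/orders/v1/")
edited_files = files_written_by("my-data-lake", "outputs/orders/v2/")
```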
1. From the Outputs page, select the output that you want to run.
2. Click Run.
3. If applicable, select the S3 Storage to store the data of the table.
The access key of this storage must belong to the same AWS account as the access key of the connection (see the account-check sketch at the end of this section).
4. Complete any other fields (these vary according to the output type):
- Select Qubole, Athena, or Redshift Spectrum.
- The required connection. See: Connections
- The database name.
- The schema within the database.
- The table to create. Table names are case-insensitive; note that Apache Spark requires lowercase table names (see the table-name sketch at the end of this section).
For details regarding a specific output type, see: Data outputs.
5. (Optional) To add an additional table, click Add and specify the options as described in the previous step.
6. Click Next and complete the following:
- Select the compute cluster to run the calculation on. Alternatively, click the drop-down and create a new compute cluster.
- The range of data to process. This can start from the beginning of the data source, from now, or from a custom date and time, and it can end never, now, or at a custom date and time.
7. Click Deploy.
Once the output shows as Running in the output panel, it is live in production and consuming compute resources.
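Regarding the note in step 3: a quick way to confirm that two access keys belong to the same AWS account is to compare the account IDs they resolve to. The sketch below uses standard boto3/STS calls; the key values are placeholders, and the helper itself is not part of the product.

```python
# Sketch: verify that the storage access key and the connection access key
# resolve to the same AWS account (the requirement noted in step 3).
import boto3

def account_id(access_key: str, secret_key: str) -> str:
    """Return the AWS account ID that a key pair belongs to."""
    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    return sts.get_caller_identity()["Account"]

# Placeholder credentials; substitute the storage and connection keys.
storage_account = account_id("AKIA...STORAGE", "<storage secret>")
connection_account = account_id("AKIA...CONNECTION", "<connection secret>")
assert storage_account == connection_account, "Keys belong to different AWS accounts"
```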
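Regarding the table field in step 4: although table names are case-insensitive, Apache Spark requires lowercase names, so it can help to normalize a name before creating the table. The helper below is a hypothetical illustration, not a product API, and the allowed-character pattern is an assumption.

```python
# Sketch: normalize a table name to lowercase (Apache Spark requirement).
# The allowed-character pattern below is an assumption, not a documented rule.
import re

def normalized_table_name(name: str) -> str:
    lowered = name.lower()
    if not re.fullmatch(r"[a-z][a-z0-9_]*", lowered):
        raise ValueError(f"Table name not supported: {name!r}")
    return lowered

print(normalized_table_name("DailyOrders"))  # -> "dailyorders"
```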