Creating an Amazon S3 data source on Upsolver takes only a few steps.
Upsolver auto-detects the file format and date pattern and shows you a preview of the top files to be ingested with their corresponding dates. It also auto-detects the time from which to start the ingestion.
By default, data will be ingested into the default target storage.
1. From the Data Sources page, click New.
2. Select Amazon S3.
3. Select the Amazon S3 bucket to read from.
4. (Optional) Enter the path to the data folder in the Amazon S3 bucket (e.g. billing data). If no path is specified, the data is assumed to be at the top level of the hierarchy.
5. In Glob File Pattern, specify the file set to ingest by using a file pattern with wildcard characters (where * means ingest all files).
6. Select or enter the date pattern of the files to be ingested. This is auto-detected but can be modified if required.
7. (Optional) Select the time from which to start ingesting files. This is usually auto-detected, but if there is no preview, the date cannot be established and the start time defaults to today's date. In this case, set the required start date.
8. Name this data source.
9. (Optional) Customize the data source by:
specifying a file name pattern
specifying the content format and its associated content format options (e.g. custom delimiter for CSV)
selecting a specific compute cluster
selecting a target storage option
10. Click Continue.
11. In the S3 Bucket Integration window, click Launch Integration to launch the AWS CloudFormation page in a new tab.
12. Check the I acknowledge statement and click Create Stack.
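
To see how the folder path, glob file pattern, date pattern, and start time from steps 4 to 7 combine to decide which files are ingested, the following Python sketch may help. It is only an illustration and not Upsolver's implementation: the object keys, the function name select_files, and the folder layout are hypothetical, and Python's strptime format (%Y/%m/%d) stands in for the date pattern, whose syntax in Upsolver may differ.

```python
from datetime import datetime
from fnmatch import fnmatch

# Hypothetical object keys under the folder path "billing/" (illustrative only).
KEYS = [
    "billing/2023/01/14/events-0001.json",
    "billing/2023/01/15/events-0002.json",
    "billing/2023/01/15/README.txt",
    "billing/2023/01/16/events-0003.json",
]


def select_files(keys, prefix, glob_pattern, date_format, start):
    """Return (key, date) pairs whose file name matches the glob pattern
    and whose folder date falls on or after the start time."""
    selected = []
    for key in keys:
        # Step 4: only consider keys under the chosen folder path.
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        # Assumed layout: <date folders>/<file name>.
        date_part, _, file_name = relative.rpartition("/")
        # Step 5: apply the glob file pattern to the file name.
        if not fnmatch(file_name, glob_pattern):
            continue
        # Step 6: read the file's date from its folder path.
        try:
            file_date = datetime.strptime(date_part, date_format)
        except ValueError:
            continue  # The path does not follow the date pattern.
        # Step 7: skip files dated before the start time.
        if file_date >= start:
            selected.append((key, file_date))
    return selected


preview = select_files(KEYS, "billing/", "*.json", "%Y/%m/%d", datetime(2023, 1, 15))
for key, file_date in preview:
    print(file_date.date(), key)
```

With these sample keys, the pattern *.json excludes README.txt and the start time excludes the file dated 2023-01-14, so only the two later JSON files are listed. Using * as the pattern with an earlier start time would list all four keys.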