I can't ingest my Amazon S3 data

If you are trying to ingest data from an Amazon S3 bucket that is not partitioned by date, note that the START_FROM option is set to NOW.
This means the job only picks up files that arrive after it starts running: if no new data has arrived since then, no data is ingested into your staging table.
You can instead set the START_FROM option to BEGINNING or to a specific timestamp from which data exists, but this is only possible when reading from a bucket that is partitioned by date, with the folder layout described by DATE_PATTERN.

Example:

If the list of files is:
  • s3://bucket/input/a/2019/01/01/00/00/file.json
  • s3://bucket/input/a/2019/01/01/00/01/file.json
  • s3://bucket/input/a/2019/01/01/00/02/file.json
  • s3://bucket/input/a/2019/01/01/00/03/file.json

You can read your data from these files as follows:

CREATE JOB copy_from_s3
    CONTENT_TYPE = JSON
    START_FROM = TIMESTAMP '2019-01-01'
    DATE_PATTERN = 'yyyy/MM/dd/HH/mm'
AS COPY FROM S3 my_s3_connection
    BUCKET = 'bucket'
    PREFIX = 'input/a'
INTO default_glue_catalog.schema_name.table_name;
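
If you would rather not pick a timestamp, the same job can be sketched with START_FROM set to BEGINNING, which should read from the earliest date-partitioned folder under the prefix. This is a minimal variant of the example above, not a verified configuration: the job name copy_from_s3_beginning is hypothetical, and the connection, bucket, prefix, and target table are the same placeholders used above.

```sql
-- Hypothetical job name; all other identifiers are the placeholders from the example above.
CREATE JOB copy_from_s3_beginning
    CONTENT_TYPE = JSON
    -- BEGINNING replaces the explicit timestamp; it still requires DATE_PATTERN.
    START_FROM = BEGINNING
    -- Must match the folder layout after the prefix, e.g. 2019/01/01/00/00.
    DATE_PATTERN = 'yyyy/MM/dd/HH/mm'
AS COPY FROM S3 my_s3_connection
    BUCKET = 'bucket'
    PREFIX = 'input/a'
INTO default_glue_catalog.schema_name.table_name;
```

As with the timestamp form, this only applies when the folders under the prefix follow the date layout given in DATE_PATTERN.
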
If you are still experiencing issues, please file a ticket via the Upsolver Support Portal.