Problem Ingesting Amazon S3 Data

If you are ingesting data from an Amazon S3 bucket that is not partitioned by date, note that the START_FROM option is set to NOW.

This means that if no new data has arrived since the job started running, no data has been ingested into your staging table.

To resolve this, set the START_FROM option to BEGINNING or to a specific timestamp for which data exists. Note that this is only possible when reading from a bucket partitioned by date, with the layout described by the DATE_PATTERN option.

Example:

If the list of files is:

  • s3://bucket/input/a/2019/01/01/00/00/file.json
  • s3://bucket/input/a/2019/01/01/00/01/file.json
  • s3://bucket/input/a/2019/01/01/00/02/file.json
  • s3://bucket/input/a/2019/01/01/00/03/file.json

You can read your data from these files as follows:

CREATE JOB copy_from_s3
    CONTENT_TYPE = JSON
    -- Start reading from this timestamp rather than from NOW
    START_FROM = timestamp '2019-01-01'
    -- The date layout of the keys under the prefix
    DATE_PATTERN = 'yyyy/MM/dd/HH/mm'
AS COPY FROM S3 my_s3_connection
    BUCKET = 'bucket'
    PREFIX = 'input/a'
INTO default_glue_catalog.schema_name.table_name;
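
Alternatively, if you want to ingest all files already present under the prefix rather than starting from a specific timestamp, START_FROM also accepts BEGINNING. Below is a minimal sketch of the same job using this option; the job name is hypothetical, and the connection, bucket, and table names are reused from the example above:

CREATE JOB copy_from_s3_from_beginning
    CONTENT_TYPE = JSON
    -- BEGINNING ingests all existing files under the prefix;
    -- like a specific timestamp, it requires a date-partitioned bucket
    START_FROM = BEGINNING
    DATE_PATTERN = 'yyyy/MM/dd/HH/mm'
AS COPY FROM S3 my_s3_connection
    BUCKET = 'bucket'
    PREFIX = 'input/a'
INTO default_glue_catalog.schema_name.table_name;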

If you are still experiencing issues, please raise a ticket via the Upsolver Support Portal.