MongoDB
This article describes how to ingest CDC data from your MongoDB database.
Prerequisites
Ensure you have a connection to your database and that it has been enabled for CDC.
Please read the relevant connection guide for more information.
You can create a job to ingest your data from MongoDB into a staging table in the data lake.
After completing the prerequisites, you can create your staging tables. The example below creates a table without defining columns or data types, as these will be inferred automatically by Upsolver, though you can define columns if required:
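A minimal sketch of such a statement follows. The catalog, schema, and table names (default_glue_catalog, upsolver_samples, orders_raw_data) are assumptions used for illustration; substitute the names from your own environment.

```sql
-- Hypothetical catalog, schema, and table names; replace with your own.
-- The empty parentheses declare no columns, so the schema is left to be inferred.
CREATE TABLE default_glue_catalog.upsolver_samples.orders_raw_data()
    PARTITIONED BY $event_date;
```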
Upsolver recommends partitioning by the system column $event_date or another date column within the data in order to optimize your query performance.
Next, create an ingestion job as follows:
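As a sketch only, an ingestion job might look like the statement here. The job name, the connection name my_mongodb_connection, the collection sales.orders, and the target table are all hypothetical, and the COLLECTION_INCLUDE_LIST option is assumed to be the way to restrict the job to specific collections; check the SQL command reference for the exact options your version supports.

```sql
-- Hypothetical job, connection, collection, and table names.
CREATE SYNC JOB load_raw_data_from_mongodb
AS COPY FROM MONGODB my_mongodb_connection
    -- Assumed option: limit ingestion to the listed database.collection pairs.
    COLLECTION_INCLUDE_LIST = ('sales.orders')
INTO default_glue_catalog.upsolver_samples.orders_raw_data;
```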
The example above only uses a small subset of all job options available when reading from MongoDB. Depending on your use case, you may want to configure different options.
Transformations can be applied to your ingestion job, for example, to exclude columns, correct issues, or mask data, before it lands in the target. Furthermore, you can use expectations to define data quality rules on your data stream and take appropriate action.
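The sketch here illustrates how these features could be combined in one job. It assumes that the EXCLUDE_COLUMNS and COLUMN_TRANSFORMATIONS job options and the WITH EXPECTATION clause are available for MongoDB ingestion jobs in your version; the job name, connection, collection, and the field names customer.password, customer.email, and orderid are hypothetical.

```sql
-- Hypothetical names throughout; verify each option against the SQL command reference.
CREATE SYNC JOB load_masked_data_from_mongodb
    -- Assumed option: drop a sensitive field before it lands in the target.
    EXCLUDE_COLUMNS = ('customer.password')
    -- Assumed option: mask a field by replacing it with its MD5 hash.
    COLUMN_TRANSFORMATIONS = (customer.email = MD5(customer.email))
AS COPY FROM MONGODB my_mongodb_connection
    COLLECTION_INCLUDE_LIST = ('sales.orders')
INTO default_glue_catalog.upsolver_samples.orders_raw_data
-- Data quality rule: discard rows where the hypothetical orderid field is missing.
WITH EXPECTATION exp_orderid_not_null
    EXPECT orderid IS NOT NULL ON VIOLATION DROP;
```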
You can alter some of the options of an existing job. For example, if you want to keep the job as is, but only change the cluster that is running the job, execute the following command:
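A minimal sketch of such a command, assuming a job named load_raw_data_from_mongodb and a target cluster named my_new_cluster (both hypothetical names):

```sql
-- Hypothetical job and cluster names; only the compute cluster changes,
-- all other job options stay as they were.
ALTER JOB load_raw_data_from_mongodb
    SET COMPUTE_CLUSTER = my_new_cluster;
```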
Note that some options, such as COMPRESSION, cannot be altered once the job has been created.
If you no longer need a job, you can easily drop it using the following SQL command:
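For illustration, assuming the job is named load_raw_data_from_mongodb (a hypothetical name), the command would look like this:

```sql
-- Hypothetical job name; replace with the job you want to remove.
DROP JOB load_raw_data_from_mongodb;
```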
Learn More
Please see the SQL command reference for full details of the available job options and further examples.