MySQL (AWS RDS) data output

Process streaming data from end-to-end within 10 minutes. The data source is an Amazon S3 bucket (updated continuously). Upsolver processes the data and loads to MySQL hosted by AWS.

To help you get started with Upsolver, you can try it out for free. You can choose between Upsolver Dedicated Compute and Community subscriptions. Both options give you free Upsolver units (UUs), units of processing capability per hour based on VM instance type.

Create a data source

1. Click on DATA SOURCES > NEW to connect to various data sources. Upsolver works with both streaming and static data sources.

2. Click on SELECT next to Amazon S3 data source. There are many data sources to choose from. The Quickstart uses pre-populated mobile data.

3. Choose BUCKET > upsolver-tutorials-mobile-users by clicking on the dropdown box. Fill in FOLDER > data and DATE PATTERN > yyyy/MM/dd/HH/ click on CONTINUE. Upsolver automatically verifies the files in the bucket and displays blue checkmarks next to the file name.

4. Upsolver automatically parses the data and displays a sample of the data that you’re working with. You can expand each record to view each event in a formatted view.

5. Click on CREATE to complete the Amazon S3 Data Source. Notice the schema is automatically generated and data statistics are also displayed.

Create a data output

1. Click on NEW OUTPUT on the upper right corner. (You can also get to the output screen by clicking on OUTPUTS > NEW)

2. Choose MySQL output by clicking on SELECT. We’re going to use MySQL as our output for this tutorial. You can output to any Upsolver data outputs.

3. Enter gaming as the name of the output. Leave everything as default and click on NEXT.

4. Expand the payload field. Add a field to your output by clicking on the + sign next to eventName. The field that we’re adding is. data.payload.eventName.

Data transformation

1. Rename the fields by clicking on the field under SCHEMA COLUMN and rename payload.eventname to event_name

2. Transform data.clientEventTime to from unix epoch to human readable format in UTC. Click on Add Calculated Field on the upper left and find the TO_DATE function, then click on SELECT. Enter data.clientEventTime in the DATETIME box and give the field a NAME called event_time. Click on PREVIEW to make sure the date looks right and click on SAVE.

3. Add event_time to your output by clicking on + next to the field.

4. Add a count for each event by clicking on Make Aggregated > OK > Add Aggregation. Click on SELECT next to COUNT (*) and click on SAVE.

Click over to the SQL tab on the upper right hand corner. Keep in mind all changes that are made in the SQL view will also be represented in the UI view and vice-versa. You can see the work from the UI automatically generated a SELECT statement. Click on PREVIEW to ensure data looks correct.

Connect to MySQL and start streaming

1. Click on RUN on the upper right corner and choose Create a new MySQL Connection from the MySQL Connection dropdown.

2. (Optional) Setup a RDS MySQL environment if you don’t have an existing environment.

If you don’t have an existing MySQL connection, you can either start up a MySQL instance following the instructions here or you can install it locally on your computer by following instructions here.

Make sure you have the correct inbound rule configured for the MySQL (RDS) security group. The inbound rule allowing all IPs (0.0.0.0/0) to access 3306 need to be created.

3. Define MySQL Connection in the following format and click on CREATE.

CONNECTION STRING (make sure the schema name exists in your MySQL environment): jdbc:mysql://database-2.eyb4e3j72ydseds.us-east-1.rds.amazonaws.com:3306/dev
USER NAME: <mysql username> (AWS RDS default username is admin)
PASSWORD: <mysql password>
NAME: <name of the connection>

4. Select OUTPUTS > gaming > Run and choose the connection that you just defined as MYSQL CONNECTION. Choose dev (make sure this database exists in your MySQL environment) under SCHEMA and TABLE NAME > gaming. Click on NEXT.

5. Use the sliding bar to only load from the data from last day click on DEPLOY. Keep in mind that ENDING AT set as Never means it's a continuous data stream. Click on OK if a warning appears.

6. Upsolver starts outputting the data to MySQL. You can keep track of the output status by clicking on PROGRESS. Wait for a moment for the data to output.

Verify your data in MySQL

PreviousAmazon Athena data output NextLocal MySQL data output

Last updated 4 years ago

Was this helpful?