This project was part of the WBS Coding School lessons, and the content is also from WBS Coding School.

Amazon RDS: set up a cloud MySQL instance

  1. Set Up Your AWS Account: If you haven’t done so yet, create your AWS account.
  2. Log Into AWS Console: Use your account credentials to sign in to the AWS Console.
  3. Access RDS Service: Find the “Services” section and click on “RDS.”
  4. Create a Database: Look for the “Create a database” option and click on it.
  5. Choose MySQL Version: Select “MySQL” as your database type and pick the latest MySQL version, usually shown as “8.0.XX.”
  6. Select “Free Tier” Template: This step is crucial. Opt for the “Free tier” template, which will give you a small, slower, but cost-free database.
Image from WBS Coding School

7. Name Your Instance: Give your database instance a distinctive name, like “first-project3-db.”

8. Set Up Master User: Assign a username and password for the “master user.” This user often has all permissions and is commonly referred to as “admin.” Remember to jot down the password, since you’ll need it to connect to this instance from MySQL Workbench on your local machine. Make sure the password only contains letters and numbers.

9. Choose Instance Class: If you’re using the “Free tier” template, the default setting (typically “micro”) is suitable. If you’ve chosen a different instance class, ensure you’ve selected the “Free tier” option before proceeding.

Important steps to avoid extra charges on your account

  1. Disable Storage Autoscaling: Prevent any unexpected charges by turning off the default setting “Enable storage autoscaling.”
  2. VPC Settings: Leave the Virtual Private Cloud (VPC) settings unchanged.
Image from WBS Coding School
  3. Public Access and Security Group: Enable public access to your instance and create a new VPC security group with a recognizable name. Keep the default values for Availability Zone and Database port.
  4. Database Authentication: Stick with the default “Password authentication” for database authentication.
  5. Disable Automated Backups: To avoid additional charges, make sure to switch off the pre-selected option “Enable automated backups.”
  6. Create Database: Click the “Create database” button and give it a little time. Wait until the database status changes from “Creating” to “Available.”

By following these clear steps, you’ll be well-prepared to set up your AWS RDS instance. This will pave the way for a smoother, more efficient approach to storing and managing your data in the cloud.
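
If you would rather script the console choices above than click through them, the sketch below collects them as arguments for boto3’s create_db_instance call. It is a minimal sketch under assumptions, not part of the original lesson: the instance class db.t3.micro, the 20 GiB of storage, the region, the placeholder password, and the helper names build_rds_params and create_instance are all illustrative choices.

```python
def build_rds_params(identifier, username, password):
    """Collect the console choices from the steps above as keyword arguments."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t3.micro",  # assumption: a free-tier eligible class
        "AllocatedStorage": 20,            # GiB; assumption for a small instance
        "MasterUsername": username,
        "MasterUserPassword": password,
        "Port": 3306,
        "PubliclyAccessible": True,
        "BackupRetentionPeriod": 0,        # 0 days switches automated backups off
        # MaxAllocatedStorage is left out on purpose: without it,
        # storage autoscaling is not enabled.
    }


def create_instance(params, region="eu-central-1"):
    """Send the request to AWS. Needs boto3 and configured credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    rds = boto3.client("rds", region_name=region)
    return rds.create_db_instance(**params)


params = build_rds_params("first-project3-db", "admin", "ChooseAPassword123")
print(params["DBInstanceClass"], params["BackupRetentionPeriod"])
```

Nothing is sent to AWS until create_instance(params) is called, so you can review the dictionary first.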

Connect to your Amazon RDS MySQL instance

If you’re new to AWS and its terminology, it’s important to clarify some aspects. While AWS might refer to the process as “creating a database” when setting up an RDS instance, the accurate MySQL terminology indicates that you’ve essentially established an “instance.”

An “instance,” sometimes termed a “connection,” serves as a container that can house multiple databases or “schemas.” It’s worth noting that different SQL vendors (such as MySQL, SQL Server, PostgreSQL) employ varying terminologies. In this context, we’ll follow the sequence: Instance — Database — Table. Thus, your next step involves connecting to the instance you’ve set up. This will allow you to create a database complete with the necessary tables to store your project’s information. It’s important to remember that you can create additional databases within the same instance if needed.

To establish this connection, you’ll need the host address or “endpoint” of your instance. This can be obtained from AWS Console > RDS > Databases. Click on your specific database, and you’ll find an overview of its settings and status under the “Connectivity & security” tab. Copy the endpoint from here.

Image from WBS Coding School

Open MySQL Workbench and create a new connection:

  1. Click the small “+” icon on the MySQL home page to initiate the “Setup New Connection” wizard.
  2. Assign a name to this connection — this is simply for your identification purposes.
  3. Paste the instance endpoint copied from your AWS Console into the “Hostname” field.
  4. Keep the port as 3306, which is the standard for MySQL.
  5. The username should be “admin” unless you altered it during the database creation.
  6. For the password, select “Store in Keychain” or “Store in Vault,” and then enter the password you set when creating the AWS RDS instance. This is distinct from the password used to connect to your local server.

With these settings in place, you can proceed with testing the connection.

Image from WBS Coding School

Upon successful setup, a new MySQL connection, bearing the name you chose, will appear on the MySQL Workbench Home screen. Now, clicking on this connection enables you to connect to it, similar to how you’d connect to your Local or “root” instance. If everything is working as intended, you’ll be presented with a query editor where you can craft your initial database and tables.
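
If the connection test fails instead, it helps to know whether the endpoint is reachable at all before you double-check the username and password. The sketch below is one hedged way to do that with Python’s standard library: it only tries to open a TCP connection to the port and does not log in to MySQL. The function name can_reach and the endpoint placeholder are illustrative.

```python
import socket


def can_reach(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unknown host names.
        return False


# Example (replace the placeholder with the endpoint copied from the AWS Console):
# print(can_reach("your-instance-endpoint"))
```

A result of False usually points to networking (public access or security group rules) rather than to wrong credentials.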

You’re now ready to proceed with creating a database and tables within it. If you’d like to insert data, you can use the following code as an example:

CREATE DATABASE test_1;
USE test_1;

CREATE TABLE test_table (
    test_id int AUTO_INCREMENT,
    FirstName varchar(255),
    City varchar(255),
    PRIMARY KEY (test_id)
);

INSERT INTO test_table (FirstName, City)
VALUES
    ('Joan', 'Barcelona'),
    ('Tim', 'Berlin');

SELECT * FROM test_table;

Assuming everything has gone smoothly, you’re all set. Should the connection encounter issues, don’t worry — just proceed to the next step. At this point, avoid dropping the test database and table, as they’ll prove useful for testing AWS Lambda in the next lesson.

Finally, allow all traffic to your database. It’s possible that you might face difficulty establishing a connection, or the connection might falter when attempting to connect from different locations (as you’ll do in the next lesson). It’s sensible that there are initial restrictions for security reasons, even in the cloud. However, for your current project, you’ll want to enable all traffic both in and out of the database.

Remember when you created a new “VPC security group” during the initial database setup? Your instance is now linked to this security group. Should you wish to adjust access permissions for the instance, you can edit or add rules to this security group. Follow these steps:

  1. In your AWS console, navigate to RDS > Databases.
  2. Click on the instance you’ve created.
  3. Under the “Connectivity & security” tab, locate and click on the VPC security group.
Image from WBS Coding School

4. Click on the “Security group ID” linked to that group.

5. Choose “Edit inbound rules.”

6. Click “Add rule.”

7. Fill in the fields as shown in the screenshot provided. It’s perfectly fine if the text in the Source column changes from “Custom” to something else like “0.0.0.0/0” once you input it. It should revert to “Custom” once the rule is saved.

8. Confirm by clicking “Save rules.”

And there you have it! Should you create another RDS instance in the future and require open access, you can use this existing VPC Security Group. This practice is quite common in AWS: rather than setting individual security settings and permissions for every user and service, you establish groups of rules (Security Groups), groups of policies (Roles), and groups of users, then link them together.
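
The same inbound rule can also be added from code. The sketch below assumes the rule from the steps above is “All traffic” from any IPv4 address (0.0.0.0/0); the screenshot is not reproduced here, so treat those values as an assumption. The helper names, the description text, and the region are illustrative; authorize_security_group_ingress is the boto3 EC2 call for adding inbound rules.

```python
def all_traffic_rule(description="open access for the course project"):
    """Inbound permission allowing every protocol and port from any IPv4 address."""
    return {
        "IpProtocol": "-1",  # -1 stands for all protocols and all ports
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": description}],
    }


def add_inbound_rule(group_id, rule, region="eu-central-1"):
    """Attach the rule to a security group. Needs boto3 and credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=[rule]
    )


rule = all_traffic_rule()
print(rule)
```

To actually apply it, you would pass the Security group ID from step 4 as group_id.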

With that, get ready to insert data into the cloud database — right from the cloud itself!

Yes, you got it right! Serverless computing

Image from Xenostack

It’s like having a genie 🧞‍♂️that grants your coding wishes without you having to worry about any technical stuff. All you need to do is write your script, tell the genie 🧞‍♂️ which programming language you’re using, and boom! Your code runs smoothly. No fretting about hardware or operating systems — the genie handles all that behind the scenes.

Of course, even with magic, there’s a bit of setup involved. Think of it like preparing your workspace before you start crafting something amazing. You need to make sure that the libraries and tools you’re using in your script are readily available. After all, even genies 🧞‍♂️ need to know what spells to cast!🧙‍♀️

Now, in the realm of Amazon Web Services (AWS), this magical genie🧞‍♂️ goes by the name “AWS Lambda.” It’s like your very own programming assistant. You hand it your code, tell it what language you’re speaking (programming-wise), and it takes care of executing your script. All those nitty-gritty details like hardware, operating systems, and server management? Lambda’s got them covered.

So, next time you’re envisioning your dream code running effortlessly, just remember the power of serverless computing, embodied in AWS Lambda. It’s like having your own coding genie — granting your wishes with a simple script and a sprinkle of technical magic.

AWS Lambda

Image from allcode

Think of AWS Lambda like a tool that only understands “functions.” These are snippets of code that do specific tasks. So, when we talk about AWS Lambda, we often call these functions “Lambda functions.” But hold on, these are not the same as Python’s tiny “lambda functions.” Those are something completely different!

Now, the cool thing is, you can write Lambda functions using various programming languages, including Python. But remember, these are not those little Python lambda things — these are full-fledged functions that can do all sorts of tasks.

Now, here’s the word you’ll hear a lot: “triggers.” It’s like a cue that tells your Lambda function to start running. Think of it like a trigger that sets off a chain reaction. For example, imagine an app where people upload pictures of things they want to buy online. When they upload a picture, a Lambda function can be triggered to kickstart the process of finding that item in online stores.

So, Lambda functions are these smart pieces of code that get triggered by events, which can be anything from someone uploading a picture to some other action happening in the digital world. It’s like setting up a bunch of little helpers that jump into action whenever something happens. Cool, right?
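
To make the idea of a trigger concrete, here is a small sketch of a handler for the picture-upload example. It assumes the pictures land in an S3 bucket (the text does not name a storage service, so that part is an assumption) and that the event follows the shape of an S3 notification, where each record carries a bucket name and an object key. The bucket and file names are made up.

```python
def lambda_handler(event, context):
    """Collect the bucket and key of every uploaded file named in the event."""
    uploads = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        uploads.append(
            {
                "bucket": s3_info["bucket"]["name"],
                "key": s3_info["object"]["key"],
            }
        )
    # A real function would now start looking for the item in online stores.
    return {"statusCode": 200, "uploads": uploads}


# A hand-written event to try the handler locally (hypothetical names).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-uploads"}, "object": {"key": "shoes.jpg"}}}
    ]
}
result = lambda_handler(sample_event, None)
print(result)
```

The trigger decides when the function runs; the event argument tells the function what happened.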

Move your scripts to the cloud

Before we dive into making our very first Lambda function, there’s a preliminary step we need to handle — setting up what’s known as a “role.” This role acts like a permission slip that grants our Lambda function access to our RDS instance. Permissions in the cloud might seem a bit pesky, but they’re super important.

Now, typically, the best approach is to grant only the bare minimum permissions required. But for learning purposes, follow the steps below.

Image from WBS Coding School learning platform

Create Role

Follow these steps to get your role set up:

  1. Sign in to AWS Console: Get started by logging into your AWS console.
  2. Find IAM: Use the search bar at the top and look for “IAM.” When you spot it, give it a click.
  3. Roles Section: On the left-side menu, you’ll see “Roles.” Give it a click too.
  4. Create Role: Now, hit that “Create role” button.
  5. Type of Trusted Entity: For the trusted entity type, choose “AWS service.”
  6. Use Case: Select “Lambda” as your use case.
  7. Moving Forward: Click “Next” to move along.
  8. Policy Selection: In this step, check the box next to the “AdministratorAccess” policy.
  9. Advancing to the Next Step: Click “Next” to proceed.
  10. Role Name: Time to name your role. Let’s go with “LambdaAdminAccess.”
  11. Creation Time: Now, just tap that “Create role” button.

And there you have it — you’ve successfully created the role. This role is like a permission ticket that’ll let your Lambda function connect with different services. Whether you’re using your personal AWS account or an instructor’s setup, you’re all set for the next steps.
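
Behind the console steps, a role for Lambda is a trust policy (who may use the role) plus attached permission policies. The sketch below shows that structure with boto3’s create_role and attach_role_policy calls. It is a hedged illustration rather than the course’s method: the helper names are made up, and the role name and the AdministratorAccess policy simply mirror the steps above.

```python
import json

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_role(role_name="LambdaAdminAccess"):
    """Create the role and attach the policy. Needs boto3 and credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=ADMIN_POLICY_ARN)


policy = lambda_trust_policy()
print(json.dumps(policy, indent=2))
```

Choosing “AWS service” and “Lambda” in the console corresponds to the Principal entry in this policy.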

Create Lambda function

Follow these steps to create a test function:

  1. Sign in to AWS Console: Start by logging into your AWS console.
  2. Access Lambda Service: Navigate to “Services” and choose “Lambda.”
  3. Create a Function: Click on “Create function.”
  4. Starting From Scratch: Opt for “Author from scratch.”
  5. Name Your Function: Give your function a name that makes sense to you. For testing purposes, something like “test” is recommended.
  6. Choose Runtime: Pick “Python 3.9” from the Runtime options.
  7. Setting Permissions: In the “Permissions” section, click on “Change default execution role.” Then, select “Use an existing role” and choose the “LambdaAdminAccess” role you created earlier.
  8. Creation Time: Click on “Create function” to bring your Lambda function to life.

With these steps, you’ve successfully set up your first Lambda function.

Connect Lambda function to RDS instance

Scroll down on your Lambda function dashboard to find “Code Source.” Click on lambda_function.py to reveal some code.

Image from WBS Coding School learning platform

Here, we’ll create a function that connects to the RDS instance we set up earlier and inserts a row of data into our testing table. Paste the following code, making sure to replace user, password, and host with the appropriate parameters for your cloud instance:

import json
import pandas as pd

def lambda_handler(event, context):
    # Connection details for the RDS instance (replace with your own values)
    schema = "test_1"
    host = "your-instance-endpoint"
    user = "admin"
    password = "your-instance-password"
    port = 3306
    con = f"mysql+pymysql://{user}:{password}@{host}:{port}/{schema}"

    # One-row DataFrame to insert into the test table
    df = pd.DataFrame({"FirstName": ["Lambda"], "City": ["Cloud"]})

    # Append the row to test_table; pandas connects through SQLAlchemy and PyMySQL
    df.to_sql("test_table", if_exists="append", con=con, index=False)

    return {
        "statusCode": 200,
        "body": json.dumps("Hello from Lambda!")
    }
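
Hard-coding the password works for a test, but a connection string like the one above breaks if the password contains characters such as “@” or “/”, which is one reason to stick to letters and numbers. As a hedged alternative, the sketch below reads the details from environment variables (Lambda lets you set these under Configuration > Environment variables) and escapes the credentials with quote_plus. The variable names DB_HOST, DB_USER, DB_PASSWORD, and DB_PORT and the sample password are assumptions for illustration.

```python
import os
from urllib.parse import quote_plus


def build_connection_string(schema="test_1"):
    """Build the SQLAlchemy URL from environment variables."""
    host = os.environ["DB_HOST"]
    user = os.environ.get("DB_USER", "admin")
    password = os.environ["DB_PASSWORD"]
    port = int(os.environ.get("DB_PORT", "3306"))
    # quote_plus escapes characters that would otherwise confuse the URL parser.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{schema}"
    )


# Local demo values; in Lambda these would come from the function configuration.
os.environ.setdefault("DB_HOST", "your-instance-endpoint")
os.environ.setdefault("DB_PASSWORD", "p@ss/word")
print(build_connection_string())
```

The returned string can be passed as the con argument of to_sql in the same way as before.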

To test the Lambda function:

  1. Click “Deploy” to save the code.
  2. Click “Test” to create a “test event.”
  3. Give the test event a name (it doesn’t matter).
  4. Click “Save.”
  5. Click “Test” again (expect an error).

If you see an error like the one below, it’s because the required modules aren’t available. We’ll fix that next.

Response
{
"errorMessage": "Unable to import module 'lambda_function': No module named 'pandas'",
"errorType": "Runtime.ImportModuleError",
"requestId": "9fc15990-0df9-4de9-8a31-1236596e3ed1",
"stackTrace": []
}
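
If you are unsure which modules are missing, one option is a throwaway function that imports only from the standard library and reports what it cannot find. This is a minimal sketch, assuming the three packages the example code relies on (pandas, sqlalchemy, pymysql); the helper name missing_modules is made up.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def lambda_handler(event, context):
    # Top-level packages the example function needs at import or run time.
    required = ["pandas", "sqlalchemy", "pymysql"]
    return {"missing": missing_modules(required)}
```

Because this handler does not import pandas itself, it runs even when the layers are not attached yet, and the list should be empty once they are.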

Uploading Python Modules to Lambda

Option 1: Using AWS Data Wrangler + KLayers

  1. Go back to your Lambda function in the AWS console.
  2. Scroll to the bottom, click “Add a layer.”
  3. Select AWS Layers, choose AWSSDKPandas-Python39, and the latest version.
  4. Add this layer.
  5. Visit Keith Rozario’s GitHub repository.
  6. Find the ARN for SQLAlchemy that matches your Python version.
  7. In your Lambda function, add another Layer and select “Specify an ARN.” Paste the SQLAlchemy ARN from Keith’s repository.

Option 2: Creating Layers Manually

  1. Create a conda environment with required packages.
  2. Open your Anaconda PowerShell or Terminal.
  3. Create the environment with the same Python version as your Lambda runtime (3.9 in this guide): conda create --prefix /path-to-your-project-folder/mysql-req-env python=3.9
  4. Activate the environment: conda activate /path-to-your-environment/mysql-req-env
  5. Install packages: pip install requests and pip install mysql-connector-python (or use conda). Install whatever your own script imports; the example function above connects through SQLAlchemy and PyMySQL, so it would need sqlalchemy and pymysql.
  6. Locate package folders under lib > python3.9 > site-packages.
  7. Copy contents to a new “python” folder and compress it.
  8. Back in your Lambda function, add a custom layer and upload the compressed “python.zip” file.
  9. Choose “Python 3.9” for Compatible runtimes.
  10. Create the layer and attach it to your Lambda function.

With these steps, your Lambda function should be ready to execute your code, and it has the necessary modules to connect with your RDS instance.
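
The copy-and-compress step of Option 2 can be scripted. The sketch below copies a site-packages folder into a top-level folder named “python” and zips it, which is the layout the steps above describe. The function names and the example path are illustrative assumptions; only Python’s standard library is used.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def build_layer_zip(site_packages, output_dir):
    """Copy site_packages into a top-level 'python' folder and zip it."""
    staging = Path(tempfile.mkdtemp())
    shutil.copytree(site_packages, staging / "python")
    base_name = str(Path(output_dir).resolve() / "python")  # ".zip" is appended
    archive = shutil.make_archive(base_name, "zip", root_dir=str(staging))
    shutil.rmtree(staging)  # remove the temporary copy
    return Path(archive)


def list_archive(archive):
    """Return the entries of the zip file, to check the folder layout."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


# Example (hypothetical path, adjust to your environment):
# archive = build_layer_zip("/path-to-your-environment/lib/python3.9/site-packages", ".")
# print(list_archive(archive)[:5])
```

Every entry in the archive should start with “python/”; if the packages sit at the top level of the zip instead, Lambda will not find them.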

Test your Lambda function

The test results should look like the screenshot below. If your function has no return statement, the response is “null”, which is a good outcome here; the example function above returns its dictionary with status code 200 instead. If you see an error, it is possible your Lambda function could not connect to the database. Review the connection details and make sure you correctly followed all the steps above for allowing the connection.

Image by WBS Coding School learning platform

More importantly, the best way to check whether the test worked is to connect to the database in MySQL Workbench, select all the rows from the table test_table, and look for a row with the values you inserted from Lambda (FirstName “Lambda”, City “Cloud”).

If this worked, you are ready to bring all of your code for the different data collection sources to new Lambda functions! Remember to adapt your code to the structure of Lambda functions, with the event and context arguments (even if you don’t use them). If your scripts include other packages (like beautifulsoup), you will have to create and add new layers.

Troubleshooting:

1. Increase the timeout

Either now or, most likely, when you build a Lambda function that connects to an API, you might get an error message like this: Task timed out after 3.01 seconds. Here we explain why it appears and how to solve it.

Why?

AWS charges you for each millisecond the code in Lambda is being executed. If you left a Lambda function running for too long, the costs could be huge. This is why, by default, it limits the time the function is allowed to run. This limitation is called timeout.

How to solve it?

To increase the timeout, go to your Lambda function on the AWS Console and follow these steps (shown on the screenshots below):

  1. Click on “Configuration.”
  2. Click on “General configuration.”
  3. Click on “Edit.”
  4. Change the Timeout from 3 seconds to a greater value (between 30 seconds and 5 minutes should be more than enough).
  5. Click “Save.”
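
The timeout can also be changed from code with boto3’s update_function_configuration call. The sketch below is a hedged illustration: the helper names and the region are assumptions, and the range check reflects Lambda’s limit of 900 seconds (15 minutes).

```python
def validated_timeout(seconds):
    """Check that the value is within Lambda's allowed range of 1 to 900 seconds."""
    if not 1 <= seconds <= 900:
        raise ValueError("Lambda timeout must be between 1 and 900 seconds")
    return int(seconds)


def set_timeout(function_name, seconds, region="eu-central-1"):
    """Update the timeout of an existing function. Needs boto3 and credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    client = boto3.client("lambda", region_name=region)
    return client.update_function_configuration(
        FunctionName=function_name, Timeout=validated_timeout(seconds)
    )


print(validated_timeout(60))
```

For the test function from earlier, the call would be set_timeout("test", 60).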

2. Use a return statement

Your Lambda functions are not supposed to return anything on the testing console when they’re executed: you only need them to (silently) retrieve data from APIs and insert it.

However, if you are experiencing errors and you don’t know why (for example, if the Lambda function runs without errors but no data is inserted into the database), a good strategy when debugging is to ask the function to return certain information as it runs the code.

Returning, for example, the response code from an API call (a 200 means the call went through) and some of the data that you think you should have received from it are good ways to rule out the API calls as causes of the error.
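
As a sketch of this debugging strategy, the handler below returns the status code and a small sample of the data it received. The function fetch_weather is a stand-in that returns fixed values so the example runs anywhere; in your own function it would be the real API call, and the field names are made up.

```python
import json


def fetch_weather(city):
    """Stand-in for a real API call; returns (status_code, payload)."""
    return 200, {"city": city, "forecast": [{"temp": 21.5}, {"temp": 19.0}]}


def lambda_handler(event, context):
    status_code, payload = fetch_weather("Berlin")
    rows = payload.get("forecast", [])
    # ... the database insert would happen here ...
    debug_info = {
        "api_status": status_code,     # 200 means the call went through
        "rows_received": len(rows),    # 0 would explain an empty table
        "first_row": rows[:1],         # a small sample, not the full payload
    }
    return {"statusCode": status_code, "body": json.dumps(debug_info)}


result = lambda_handler({}, None)
print(result["body"])
```

Once the function behaves as expected, the extra fields can be removed again.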

3. Start over

As you have seen, setting up cloud services is a lengthy process. You might have missed one of the dozens of steps along the way. Check closely that you followed every step. Some usual suspects are:

  • You forgot to add inbound rules that allow access to your database.
  • You have created your Lambda function and the layers in different AWS Regions.
  • You have entered the wrong credentials to connect to the RDS MySQL instance (the default user for the RDS instance is admin, not root!).

If you have thoroughly checked everything you can think of and the Lambda function still does not work as expected, start over. Delete the function and create a new one. A fresh start, in the cloud as well as on earth, can do wonders!

Automate the data pipeline

You’re making great progress!

At this point, you should have accomplished the following:

  1. Created an AWS RDS instance containing all the relevant tables and, potentially, supplementary tables that connect them.
  2. Developed multiple Lambda functions capable of gathering data from the internet. These functions can acquire data through methods like web scraping or sending API requests, and subsequently insert this data into your RDS cloud instance.

Up until now, you’ve primarily been running your Lambda functions during testing. To ensure that your code runs automatically and on demand, you need to implement trigger events. In this scenario, we will schedule these events using AWS EventBridge.

Our main goal is to schedule your project’s functions for daily data collection related to tomorrow’s weather and flight arrivals. To learn AWS EventBridge, it’s a good idea to use a test Lambda function that doesn’t require API calls, ensuring you save your free request quota. You can repurpose the existing test Lambda function you made here by inserting data into a test MySQL table or create a new function with an easy-to-monitor task.

You’ll need to refer to the AWS documentation on EventBridge for the details.

There is a tutorial with more explanation of cron expressions. The AWS EventBridge service used to be called CloudWatch Events, but the structure is basically still the same.
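
To give a feel for the schedule format, the sketch below builds a daily cron expression in the six-field form EventBridge uses (minutes, hours, day of month, month, day of week, year) and shows how a rule could be created with boto3’s put_rule call. The helper names, the rule name, and the region are assumptions for illustration.

```python
def daily_cron(hour_utc, minute=0):
    """Build an EventBridge cron expression that fires once a day (UTC)."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # '?' in the day-of-week field means "no specific value".
    return f"cron({minute} {hour_utc} * * ? *)"


def schedule_rule(rule_name, expression, region="eu-central-1"):
    """Create or update the rule. Needs boto3 and credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    events = boto3.client("events", region_name=region)
    return events.put_rule(
        Name=rule_name, ScheduleExpression=expression, State="ENABLED"
    )


print(daily_cron(6))
```

A rule on its own does nothing yet: the Lambda function still has to be added as the rule’s target, which the console does for you when you add an EventBridge trigger to the function.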

Hope you enjoyed reading this.
BYE 👋

By Apoorva