Relational databases are the backbone of data science, and the language we use to communicate with them is SQL. SQL is a useful tool that some argue spawned the field of data science: before Big Data was a thing, Knowledge Discovery in Databases (KDD) used simple SQL queries to investigate and understand the large amounts of data being collected by governments and companies. The SQL test is now a common component of hiring for data-adjacent jobs in industry, government and the education sector, and it torments the budding data scientist as a rite of passage in the job search process.
- Be able to give an overview of relational databases and the purpose of SQL
- Be able to spin up an AWS instance and load a SQL database into it
- Be able to connect to the database through RStudio/R using the DBI package
- Be able to run basic SQL commands in RStudio using the RMySQL package
- Log into your AWS Management Console
- Locate **RDS** under the **Databases** heading
- Within Amazon RDS click **Create database**
- Under **Choose a database creation method** click **Standard Create**
- Under **Engine options** choose **MySQL**
- Under **Templates** choose **Free tier**
- Under **Settings** name your **DB instance identifier** as `database-1`
- Under **Credential settings** create a username and password combination and write it down (you will need it later)
- Under **Connectivity** expand **Additional connectivity configuration** to show additional menu items and make sure that **Publicly accessible** is checked **Yes**
- Expand the **Additional configuration** menu
- Under **Initial database name** write `oudb`
- Uncheck **Automatic backups**
- Click **Create database**
- Once the database is created, take a screenshot and add it to your repository
- Under **Security Groups** click **Inbound** and then **Edit**
- Add the rule **MYSQL/Aurora** on **Port 3306** with the source set to **My IP**
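
With the instance running and port 3306 open to your IP, the connection from R can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the helper names `rds_connection_args()` and `connect_rds()` and the environment variables `RDS_HOST`, `RDS_USER` and `RDS_PASSWORD` are assumptions made for illustration. `RDS_HOST` is assumed to hold the endpoint shown for `database-1` in the RDS console, and the user and password are the credentials you wrote down earlier.

```r
# Hypothetical helper: gather connection settings in one place.
# The environment variable names are assumptions, not part of the project.
rds_connection_args <- function() {
  list(
    dbname   = "oudb",                     # initial database name chosen above
    host     = Sys.getenv("RDS_HOST"),     # RDS endpoint of database-1
    port     = 3306L,                      # port opened in the security group
    username = Sys.getenv("RDS_USER"),
    password = Sys.getenv("RDS_PASSWORD")
  )
}

# Hypothetical helper: open a DBI connection through the RMySQL driver.
connect_rds <- function(args = rds_connection_args()) {
  if (!nzchar(args$host)) {
    stop("RDS_HOST is not set; copy the endpoint from the RDS console.")
  }
  do.call(DBI::dbConnect, c(list(drv = RMySQL::MySQL()), args))
}

# Usage (needs the running instance, so it is left commented out):
# con <- connect_rds()
# DBI::dbListTables(con)
# DBI::dbDisconnect(con)
```

Reading the credentials from environment variables keeps the password out of the Rmd file and out of the repository.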
- sql-project.Rmd - connects to the AWS MySQL database and practices SQL commands.
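
To give a feel for the kind of basic SQL commands involved, here is a minimal sketch that runs without AWS. It is not the contents of `sql-project.Rmd`: the `students` table and its columns are made up, and an in-memory SQLite database (via the RSQLite package, an assumption) stands in for the MySQL instance. Because DBI is backend-agnostic, the same `dbGetQuery()` and `dbExecute()` calls should work on a connection opened with RMySQL.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the AWS MySQL database
# so the example can run offline.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example table, not taken from the project.
students <- data.frame(
  id    = 1:4,
  name  = c("Ana", "Ben", "Chloe", "Dev"),
  score = c(88, 72, 95, 60),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "students", students)

# SELECT with a filter and an ordering.
top <- dbGetQuery(
  con,
  "SELECT name, score FROM students WHERE score >= 80 ORDER BY score DESC"
)

# Aggregate over the whole table.
summary_row <- dbGetQuery(
  con,
  "SELECT COUNT(*) AS n, AVG(score) AS mean_score FROM students"
)

# INSERT through dbExecute(), which returns the number of affected rows.
n_inserted <- dbExecute(
  con,
  "INSERT INTO students (id, name, score) VALUES (5, 'Eli', 81)"
)

dbDisconnect(con)
```

`dbGetQuery()` is used for statements that return rows, and `dbExecute()` for statements that modify data.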

