We were hired by Ironhack to perform an analytics consulting project to understand Ironhack's competitive landscape: which other coding schools are there and what drives their success or lack thereof relative to Ironhack.
Our mission is to design, create and populate an appropriate database with information about the coding schools we compete with, and to design suitable queries that answer business questions of interest. The first question is which course we could launch, based on market trends and demand. The second is which country we could set up our next campus in.
- Understand the context and environment that Ironhack operates in to determine our goal.
In this first part, we analysed our main competitors. We tried to understand the courses they offer, the countries they are located in and their reviews.
- Decision-making process
To gather more data, we used web scraping to compare information from Switch Up and Course Report. We also looked for relevant datasets that could support our analysis.
- Results
Competition wise, we compared Ironhack with schools that have more than 2000 reviews, an average rating above 4.7 and more than 5 courses on offer. We could clearly identify Le Wagon as our number one competitor and decided to have a look at their strategy. By looking closely at the bootcamps offered, we realized that "Data Science" was very popular according to the comments and to Google Trends.
THE COURSE WE SHOULD LAUNCH IS DATA SCIENCE.
The data suggests that it could attract leads and students, develop our course content and improve our satisfaction score.
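The competitor filter described above (more than 2000 reviews, an average rating above 4.7, more than 5 courses) can be sketched in a few lines of Python. This is only an illustration: the school names, figures, field names and the `is_comparable` helper are hypothetical placeholders, not the scraped data or the project's actual code.

```python
# Hypothetical records: names and figures are placeholders, not scraped values.
schools = [
    {"school": "School A", "reviews": 2500, "avg_rating": 4.8, "courses": 6},
    {"school": "School B", "reviews": 1800, "avg_rating": 4.9, "courses": 7},
    {"school": "School C", "reviews": 3000, "avg_rating": 4.7, "courses": 8},
    {"school": "School D", "reviews": 2100, "avg_rating": 4.75, "courses": 5},
]


def is_comparable(school, min_reviews=2000, min_rating=4.7, min_courses=5):
    """Return True when a school exceeds all three thresholds (strictly)."""
    return (
        school["reviews"] > min_reviews
        and school["avg_rating"] > min_rating
        and school["courses"] > min_courses
    )


# Keep only the schools that pass every threshold.
comparable = [s["school"] for s in schools if is_comparable(s)]
print(comparable)
```

The comparisons are strict because the criteria are stated as "more than", so a school sitting exactly on a threshold is excluded.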
Location wise, we started with EUROPE. We based our analysis on four criteria: the size of the market (active population), the expected growth in the high-tech economy by 2035, the expected share of employment with above-basic digital skills in 2035 and the GDP of the country. Using a weighted score approach in MySQL, we obtained the following ranking:
- SWEDEN
- IRELAND
- FINLAND
Comparing this ranking with our competitors' locations, we found that they have little to no presence in these countries, which makes them a strong match for us.
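The weighted score approach behind the ranking was run in MySQL. The same idea can be sketched in Python as follows. Everything specific here is an assumption for illustration: the weights, the min-max normalisation step, the criterion keys and the country values are hypothetical and are not the figures used in the analysis.

```python
# Hypothetical weights per criterion (they sum to 1); not the project's weights.
CRITERIA_WEIGHTS = {
    "active_population": 0.4,
    "high_tech_growth_2035": 0.3,
    "digital_skills_share_2035": 0.2,
    "gdp": 0.1,
}

# Placeholder values only, chosen to make the arithmetic easy to follow.
countries = {
    "Country A": {"active_population": 5.0, "high_tech_growth_2035": 3.0,
                  "digital_skills_share_2035": 60.0, "gdp": 500.0},
    "Country B": {"active_population": 10.0, "high_tech_growth_2035": 1.0,
                  "digital_skills_share_2035": 80.0, "gdp": 300.0},
    "Country C": {"active_population": 2.0, "high_tech_growth_2035": 2.0,
                  "digital_skills_share_2035": 70.0, "gdp": 400.0},
}


def min_max_normalise(values):
    """Rescale a {name: value} mapping to the 0..1 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


def weighted_ranking(data, weights):
    """Return (name, score) pairs sorted from best to worst score."""
    scores = {name: 0.0 for name in data}
    for criterion, weight in weights.items():
        column = {name: row[criterion] for name, row in data.items()}
        for name, value in min_max_normalise(column).items():
            scores[name] += weight * value
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = weighted_ranking(countries, CRITERIA_WEIGHTS)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Normalising each criterion first keeps a large-valued column such as GDP from dominating the score purely because of its units.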
Outside of Europe, our goal would be to target the APAC region, and more specifically India, due to its market size and the fact that there is no competition there. However, we are aware that this would be risky and would require a search for a local partner, a local team and a pricing analysis.
The main challenges were choosing the right primary key for our SQL tables, creating the ER diagram with those primary and foreign keys, and finding relevant data for our analysis (web scraping Course Report).
The notebook attached to the project connects to a bootcamp review website (www.switchup.com) and scrapes some information into dataframes. This will be the basis of the information to design your database. Read the script and get a general understanding of each function. Comment the code appropriately.
- Populate the list of schools with a wider variety of schools (how are you going to get the school ID?)
- Take a look at the obtained dataframes. What dimensions do you have? What can work as useful metrics? What keys do you have? How could the different dataframes be connected?
- Go back to the drawing board and try to create an entity relationship diagram for tables available
- Once you have the schemas you want, you will need to:
  - Create the suitable SQL queries to create the tables and populate them
  - Run these queries using the appropriate Python connectors
  - Crucial hint: check out the following tutorial: https://www.dataquest.io/blog/sql-insert-tutorial/
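The create-and-populate step can be sketched as below. To keep the example self-contained it uses Python's built-in `sqlite3` module instead of a MySQL connector, which is a substitution made for illustration. The table names, column names and rows are hypothetical and do not come from the project's schema.

```python
import sqlite3

# Hypothetical rows standing in for the scraped dataframes.
schools = [(1, "School A"), (2, "School B")]
courses = [(1, 1, "Course X"), (2, 1, "Course Y"), (3, 2, "Course Z")]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Parent table first, then the child table that references it.
conn.execute(
    "CREATE TABLE schools ("
    "school_id INTEGER PRIMARY KEY, "
    "school_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE courses ("
    "course_id INTEGER PRIMARY KEY, "
    "school_id INTEGER NOT NULL REFERENCES schools(school_id), "
    "course_name TEXT NOT NULL)"
)

# Parameterised inserts: the driver handles quoting of the values.
conn.executemany(
    "INSERT INTO schools (school_id, school_name) VALUES (?, ?)", schools
)
conn.executemany(
    "INSERT INTO courses (course_id, school_id, course_name) VALUES (?, ?, ?)",
    courses,
)
conn.commit()

# Quick sanity check: number of courses per school.
rows = conn.execute(
    "SELECT s.school_name, COUNT(c.course_id) "
    "FROM schools s LEFT JOIN courses c ON c.school_id = s.school_id "
    "GROUP BY s.school_id ORDER BY s.school_id"
).fetchall()
print(rows)
```

With a MySQL driver the overall flow is similar, but the placeholder style differs (for example `%s` rather than `?` in `mysql-connector-python`), so the insert statements would need adjusting.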
We will henceforth list the requirements for each project in three groupings to help you prioritize your work
- MVP (Minimum Viable Product): these are the absolute minimum requirements that you will have to achieve for your project to be considered completed. They should absolutely be your priority, as failure to meet these requirements means an insufficient delivery, even if you go above and beyond on other requirements. Plan around unforeseeable situations to make sure you have time to at least deliver the MVP. A good way of doing this is to plan on having the MVP ready well in advance of the project deadline.
- Expected improvements: these are suggestions on how to improve your product. They are features that are not critical, but we expect most students to be able to deliver some of them. They will often be stated in more open-ended descriptions so that you can customize and differentiate your project and make it a tailored part of your portfolio.
- Nice-to-haves: these are suggestions on how to go above and beyond. We do not expect your products to contain these features / use these technologies (but we will not actively discourage you from pursuing them either). The nice-to-haves exist more to help you find resources that may not be taught in class and put some icing on your product, potentially even after the bootcamp.
The Deliverables for this project are:
[ ] Files that contain your solution submitted via a GitHub repo
- .py or .ipynb files to extract and transform the data scraped in the attached notebook as well as running the business analysis
- An exported .sql file with the final schema
- A README.md file with an explanation of the project goals, methodology and ERD
[ ] A presentation that showcases your product
- The presentation includes a business analysis built on top of your database where clear business hypotheses should be tested and some actionable conclusion must be presented
- The presentation includes a component about design choices for your database, with at least a presentation of the final ERD
- The presentation includes a component about technical challenges faced
[ ] Additional depth in business analysis
- Deeper data gathering: more of the same datapoints (schools, locations, comments) AND/OR different data points (prices, recommendations, etc)
- Enriching data gathering: more sources of data (e.g. demographics by city, salaries per country etc.)
- Multi-layered questions: use your answers to basic hypotheses to generate more refined hypotheses (which may require more sophisticated scraping/ETL)
- Charting: use visual intuition to drive your analysis
[ ] Improved engineering and design of your solution
- Deployment of the solution to a cloud database
- Creation of auxiliary functions that test the database for data quality issues
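One possible shape for such data quality functions is sketched below, again using the built-in `sqlite3` module so the example runs on its own. The `schools` and `comments` tables, their columns, the 0 to 5 rating range and the `quality_report` helper are all assumptions made for illustration.

```python
import sqlite3

# Each check is a query that returns a single count of offending rows.
CHECKS = {
    "duplicate_school_names": (
        "SELECT COUNT(*) FROM ("
        "SELECT school_name FROM schools "
        "GROUP BY school_name HAVING COUNT(*) > 1) AS d"
    ),
    "comments_without_school": (
        "SELECT COUNT(*) FROM comments c "
        "LEFT JOIN schools s ON s.school_id = c.school_id "
        "WHERE s.school_id IS NULL"
    ),
    "ratings_missing_or_out_of_range": (
        "SELECT COUNT(*) FROM comments "
        "WHERE rating IS NULL OR rating NOT BETWEEN 0 AND 5"
    ),
}


def quality_report(conn):
    """Run every check and return {check_name: number_of_offending_rows}."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


# Tiny hypothetical dataset with one problem of each kind (two for ratings).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (school_id INTEGER PRIMARY KEY, school_name TEXT)")
conn.execute(
    "CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, school_id INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [(1, "School A"), (2, "School A"), (3, "School B")],
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, 4.5), (2, 9, 3.0), (3, 3, 7.0), (4, 3, None)],
)

report = quality_report(conn)
print(report)
```

Keeping each check as a plain count makes it easy to fail a pipeline run whenever any value in the report is non-zero.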
[ ] Improved engineering of solution
- Encoding of primary key - foreign key relation in database design
- Differential update of database (include only most recent data when you re-run the script)
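A differential update can be approached in several ways. The sketch below shows one of them, a "high-water mark" on the comment date: only rows newer than the latest stored date are inserted on a re-run. It uses `sqlite3` for self-containment, and the `comments` table, its columns, the sample rows and the `insert_new_comments` helper are hypothetical.

```python
import sqlite3


def insert_new_comments(conn, scraped_rows):
    """Insert only rows dated after the newest comment already stored.

    Rows are (comment_id, school_id, created_at) tuples, with created_at as
    an ISO date string so that string comparison matches date order.
    """
    latest = conn.execute("SELECT MAX(created_at) FROM comments").fetchone()[0]
    fresh = [row for row in scraped_rows if latest is None or row[2] > latest]
    conn.executemany(
        "INSERT INTO comments (comment_id, school_id, created_at) VALUES (?, ?, ?)",
        fresh,
    )
    conn.commit()
    return len(fresh)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_id INTEGER PRIMARY KEY, "
    "school_id INTEGER NOT NULL, "
    "created_at TEXT NOT NULL)"
)

# Hypothetical scrapes: the second run returns the old rows plus one new one.
first_scrape = [(1, 10, "2023-01-05"), (2, 10, "2023-02-11")]
second_scrape = first_scrape + [(3, 11, "2023-03-02")]

first_run = insert_new_comments(conn, first_scrape)
second_run = insert_new_comments(conn, second_scrape)
total = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(first_run, second_run, total)
```

One limitation of this approach is that a new comment carrying the same date as the latest stored one would be skipped. Filtering on the primary key instead of the date avoids that, at the cost of comparing against all stored IDs.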