- Project Description
- Project Goals
- Data Workflow
- Requirements & Deliverables
- Mentoring
- Schedule
- Presentation
- Tips & Tricks
- Resources
## Project Description
The objective of this project is to choose a topic that interests you, find related data, complete an end-to-end analysis and present your findings, all by yourself. Start by understanding what research has already been done in that area and which questions are still unanswered. Then look for data that could help you answer those questions and analyze it, using visualizations to support your reasoning. Finally, present your results in a presentation and a technical paper.
When choosing your topic, remember that this project allows you to use data from your colleagues' past projects, BUT NOT FROM YOUR OWN!
## Project Goals
During this project you will:
- Research and analyze data related to a topic of your interest.
- Apply the statistical techniques you have learned.
- Create useful and easily understandable plots.
- Communicate the results of your analysis clearly, accurately and engagingly.
- Learn to adapt your communication style to your audience.
## Data Workflow
In this project you will focus on Data Analysis, but you will continue to develop your Data Wrangling and Visualization skills.
## Requirements & Deliverables
The mandatory requirements that this project needs to satisfy are:
- The project must be planned. That is why creating a Kanban board is important. You can find a template for Trello here. Remember that you CAN'T CODE until your project is planned.
- Your repository must be clean and organized; this means that it must include a .gitignore file and a README file and also have a functional file structure.
- The project needs to be presented to your colleagues on the day of the presentation.
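
One way to get a clean starting layout is a small scaffolding script. The sketch below is only an illustration: the folder names (`data`, `notebooks`, `src`, `figures`) and the `.gitignore` entries are assumptions, not a structure this project prescribes, so adapt them to your own work.

```python
from pathlib import Path

# Hypothetical layout: these folder names are an assumption, not a requirement.
FOLDERS = ["data", "notebooks", "src", "figures"]

# Common ignore patterns for a Python/Jupyter project; extend as needed.
GITIGNORE_LINES = [
    "__pycache__/",
    ".ipynb_checkpoints/",
    ".DS_Store",
    ".env",
]


def scaffold(root: str) -> Path:
    """Create the folders, a .gitignore and a README.md under root."""
    base = Path(root)
    for name in FOLDERS:
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        # Git does not track empty folders, so add a placeholder file.
        (folder / ".gitkeep").touch()

    gitignore = base / ".gitignore"
    if not gitignore.exists():  # never overwrite an existing file
        gitignore.write_text("\n".join(GITIGNORE_LINES) + "\n")

    readme = base / "README.md"
    if not readme.exists():
        readme.write_text("# Project title\n")
    return base
```

Calling `scaffold(".")` from the root of your repository creates any missing folders and leaves an existing `.gitignore` or `README.md` untouched.
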
The mandatory deliverables that you must turn in are:
- Link to the repository you used while working on your project. The repository must include all the files you used to complete your analysis. Remember to commit often to avoid trouble in case you mess up: this means more than 1 commit!
- Link to Trello or picture of your Kanban Board. Include the link or picture in the README file.
- A short technical paper illustrating your analysis in whatever format you prefer (e.g. Medium article, Jupyter notebook, etc.)
## Mentoring
One of the TAs will be your mentor!
Your mentor will:
- Keep track of your project in general terms. After you, your mentor will be the person who knows the most about the project.
- Check if you are following your plan: are you keeping up with your tasks and deadlines? Do you have any obstacles blocking you?
- Help/support you with specific questions.
Your mentor is not meant to:
- Know everything.
- Be your manager. You are responsible for your own tasks!
## Schedule
Please note that the following schedule is simply a guideline. Feel free to organize your work as you see fit.
Wednesday
- Choose a topic for your project.
- Find interesting questions related to your topic.
- Brainstorm to find out what kind of data you can use to answer those questions.
- Research and look for the data you need.
- Fork the project repository and edit the README overview. You can find a template for your README file in this repository. Remember to keep the README up-to-date.
Thursday
- Plan your project. Remember that we are providing you with a Trello template. Remember that you CAN'T CODE until your project is planned.
- CHECKPOINT: Topic and general idea validation with the Lead Teacher and TAs at 4PM.
- Once you finish, start coding!
Thursday - Friday
- Clean your data.
- Start working on your analysis and plots.
- Prepare a draft of the non-analysis slides of your presentation: title, motivation, context, etc.
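
As a rough illustration of the cleaning step, the sketch below uses only the Python standard library to strip whitespace, drop incomplete and duplicated rows, and convert a numeric column. The column names (`city`, `price`) and the sample rows are made up for the example; your own data will need its own rules.

```python
import csv
import io

# Made-up sample standing in for a raw CSV file.
RAW = """city,price
Madrid,1200
Madrid,1200
Barcelona,
 Lisbon ,950
"""


def clean_rows(text: str) -> list[dict]:
    """Strip whitespace, drop rows with missing values, remove duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Short rows yield None for missing fields, so treat None as empty.
        row = {key: (value or "").strip() for key, value in row.items()}
        if not all(row.values()):
            continue  # skip rows with any empty field
        fingerprint = tuple(row.items())
        if fingerprint in seen:
            continue  # skip exact duplicates
        seen.add(fingerprint)
        row["price"] = float(row["price"])  # convert once the row is valid
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW)
```

With this sample, `rows` keeps one Madrid row and the Lisbon row; the duplicate and the row with a missing price are dropped.
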
Monday
- CHECKPOINT: General follow-up and resolution of any doubts or blockers.
Tuesday
- Complete the analysis and slides.
Thursday
- Full rehearsal at 3:00 PM. Listen to the feedback you receive and use it to improve your presentation.
- Final improvements!
Friday
- Presentation time at 3:00 PM!
## Presentation
Presentations for this project will be in the auditorium! Presentations will be EXACTLY 3 minutes long, with 2 additional minutes for questions. We will stop you at 3 minutes, so make sure to rehearse your presentation to avoid exceeding the time limit!
The audience will evaluate the presentation by indicating how well they understood what you were trying to explain and how you presented it. This feedback will help you in future presentations!
## Tips & Tricks
- Organize your work (don't get lost!).
- Remember that Google has the answers to most of your questions. If you cannot find the answer on your own, the TAs are here to help you.
- Define a simple approach first. You never know how the data can betray you.
- Learn about your subject and understand what other research has been done about the topic.
- Think about what kind of concept or information you want to convey before using a visualization.
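
A simple first approach can be as small as comparing summary statistics between groups before trying anything more elaborate. The sketch below uses Python's `statistics` module on made-up numbers; the group names and values are purely illustrative.

```python
from statistics import mean, median, stdev

# Made-up measurements for two hypothetical groups.
groups = {
    "group_a": [12.0, 15.0, 11.0, 14.0, 13.0],
    "group_b": [18.0, 21.0, 17.0, 30.0, 19.0],
}


def summarize(values: list[float]) -> dict:
    """Return basic descriptive statistics for one group."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


summary = {name: summarize(values) for name, values in groups.items()}
for name, stats in summary.items():
    print(name, stats)
```

Here the mean of `group_b` sits above its median, a hint that one large value is pulling it up. A quick check like this tells you where to look before you build anything more complex.
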
## Resources
Here are some data sources that could be interesting to you: