Final project for CS 3654 Data Viz
Team Members:
| Name | PID |
|---|---|
| Peter Murphy | petermurphy |
| Joseph McAlister | josephrm |
| ??? | ??? |
Goal: You will conduct a complete data science project of your choosing. Your goal is to undertake the whole data science process:
- Define problem: Identify a problem topic of interest to your team, and identify some interesting research questions to ask about that topic.
- Collect Data: Gather a bunch of data on the topic that will help answer your research questions.
- Process Data: Clean and process the data into usable form to be able to answer the research questions.
- Visualize Data: View the data to gain insight into potential answers to the questions, and identify even deeper questions.
- Analyze Data: Analyze and simplify big data, construct models relevant to your research questions, and formalize your answers.
- Report: Report your results in written and visual form using QQQ.
Problem:
- Creative, interesting, and non-trivial questions, on a domain topic of interest to the team.
Data:
- Should involve data scraped or gathered from the web, or derived from a client problem.
- Should be a sufficiently large amount of data to answer interesting non-trivial questions.
- Should involve bringing together multiple kinds of data, to enable analysis of interesting novel relationships.
Process:
- Should involve some exploratory analysis, and should demonstrate creativity, depth, and thoroughness in your analysis.
- Should make use of a variety of visual and analytical methods (it is not required that you use all the methods covered in class).
- Should formalize answers quantitatively.
Report:
- Should present insightful answers to interesting non-trivial questions.
- Should justify your findings with evidence.
- Should present findings clearly and convincingly. Should make use of visuals.
- Should use the QQQ format.
Project Part 1 focuses on the early steps 1-3 of the data science process above. You will report on your initial progress on defining the problem, and collecting and processing data for your Project. Next week you will move on to the later steps, but you will be able to revisit earlier steps as needed.
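As a rough illustration of the "Process Data" step, here is a minimal sketch of cleaning a small table using only the Python standard library. The column names (`name`, `tuition`, `state`) and the sample rows are made up for the example; your own data and cleaning rules will differ.

```python
import csv
import io

# Hypothetical raw data: the columns and values are invented for illustration.
RAW = """name, tuition ,state
Alpha College, 12000 ,VA
Beta University,,MD
 Gamma Tech ,9500,va
"""


def clean_rows(text):
    """Strip whitespace, drop rows with no tuition value, and normalize types."""
    reader = csv.DictReader(io.StringIO(text))
    cleaned = []
    for row in reader:
        # Remove stray whitespace from both the header names and the cell values.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        if not row["tuition"]:
            continue  # a record without tuition cannot answer a tuition question
        cleaned.append({
            "name": row["name"],
            "tuition": int(row["tuition"]),   # store numbers as numbers
            "state": row["state"].upper(),    # use one consistent spelling
        })
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Whatever cleaning rules you choose, record them in the notebook annotations so a reader can see which records were dropped and why.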
Be creative and find interesting topics and data to study. Ideas might involve: problems from another class you are taking or a research project you are involved in, news and events, social phenomena and social media effects, university rankings and their characteristics, tuition costs and value, demographic or transportation trends, government spending, election craziness, technology trends, sports analysis, music, finance and stock markets, product reviews, ... the list goes on and on.
Submittable
- Jupyter Notebook Project1.ipynb and Project1.html report in draft QQQ format. The report will contain only the parts of the QQQ that are completed in Project Part 1, including:
  - Title, team name, team member names and PIDs.
  - Definition of the problem, description of the topic, research questions you hope to answer, and other elements of Q1.
  - Detailed description of the data you gathered and how you processed it, with annotated code associated with collecting and processing data, and any other relevant elements of Q2.
  - Listing of credits of what each team member contributed to completing Project Part 1.
- Processed data files (in compressed form if large), preferably in CSV format.
- Any additional Jupyter Notebooks containing other material that you created but did not include in the primary notebook, such as ideas or code that didn't work out or that you decided not to use.
Submit the materials on Canvas as a team (one submission for the whole team).
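For the processed data files, one possible way to produce a compressed CSV is sketched below using Python's standard `csv` and `gzip` modules. The file name `processed.csv.gz`, the helper function names, and the sample rows are all hypothetical; this is only one option, not a required approach.

```python
import csv
import gzip
import os
import tempfile


def write_compressed_csv(path, fieldnames, rows):
    """Write a list of dicts to a gzip-compressed CSV file."""
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_compressed_csv(path):
    """Read a gzip-compressed CSV back into a list of dicts (values are strings)."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical processed records, invented for illustration.
sample = [
    {"name": "Alpha College", "tuition": 12000},
    {"name": "Gamma Tech", "tuition": 9500},
]

# Write to a temporary directory so the example leaves the working folder untouched.
out_path = os.path.join(tempfile.mkdtemp(), "processed.csv.gz")
write_compressed_csv(out_path, ["name", "tuition"], sample)

# Read the file back to confirm that nothing was lost.
round_trip = read_compressed_csv(out_path)
print(round_trip)
```

Reading the file back is a cheap check that the submitted data matches what the notebook produced. If you work in pandas, `pandas.read_csv` can typically open a `.csv.gz` file directly, since it infers the compression from the file extension by default.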