Course materials for General Assembly's Data Science course in San Francisco, CA (10/4/16 - 12/13/16).
Instructors: Sinan Ozdemir
Teaching Assistants: George McIntire, Cari Levay
Course Times
Tuesday/Thursday: 6:30pm - 9:30pm
Office hours:
TBD
All courses / office hours will be held in the student center at GA, 225 Bush Street
| Tuesday | Thursday | Project Milestone | HW |
|---|---|---|---|
| 10/4: Introduction / Expectations / Intro to Data Science | 10/6: Pandas | ||
| 10/11: 10/13: APIs / Web Scraping 101 | 10/15: Intro to Machine Learning / KNN | HW 1 Assigned (W) | |
| 10/18: Model Evaluation / Linear Regression Part 1 | 10/20: Linear Regression Part 2 / Logistic Regression | Three Potential Project Ideas (W) | |
| 10/25: Natural Language Processing | 10/27: Naive Bayes Classification | HW 1 Due (W) | |
| 11/1: Advanced Sklearn (Pipeline and Feature Unions) | 11/3: Review | ||
| 11/8: Decision Trees | 11/10: Ensembling Techniques | HW 2 Assigned (W) | |
| 11/15: Dimension Reduction | 11/17: Clustering / Topic Modelling | First Draft Due (W) | |
| 11/22: Stochastic Gradient Descent | 11/23: No Class | Peer Review Due (M) | |
| 11/29: Neural Networks / Deep Learning | 12/1: Recommendation Engines | HW 2 Due (W) | |
| 12/6: Web Development with Flask | 12/8: Projects | ||
| 12/13: Projects | | | |
- Install the Anaconda distribution of Python 2.7.x.
- Set up a conda virtual environment
- Install Git and create a GitHub account.
- Once you receive an email invitation from Slack, join our "SFDAT28 team" and add your photo!
- PEP 8 - Style Guide for Python
- Learn How to Think Like a Computer Scientist
- Potential book for course? :)
##Introduction / Expectations / Intro to Data Science
Agenda
- Introduction to General Assembly slides
- Course overview: our philosophy and expectations (slides)
- Ice Breaker
Break -- Command Line Tutorial
- Figure out office hours
- Intro to Data Science: slides
Homework
- Set up a conda virtual environment
- Install Git and create a GitHub account.
- Read my intro to Git, and be sure to come back on Monday with your very own repository called "sfdat28-lastname"
- Once you receive an email invitation from Slack, join our "SFDAT28 team" and add your photo!
- Tutorial: an introduction to reading and writing IPython notebooks
####Goals
- Feel comfortable importing, manipulating, and graphing data using Python's Pandas
- Be able to find missing values and begin to have a sense of how to deal with them
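As a rough illustration of the second goal, here is a minimal sketch of finding and handling missing values with pandas. The toy DataFrame and its column names (`city`, `temp`) are made up for this example and are not taken from the class datasets. Dropping rows and filling with the column mean are only two of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "city": ["SF", "NYC", None, "LA"],
    "temp": [61.0, np.nan, 55.0, 72.0],
})

# Count the missing values in each column.
missing_counts = df.isnull().sum()

# Option 1: drop every row that contains at least one missing value.
dropped = df.dropna()

# Option 2: keep all rows and fill numeric gaps with the column mean.
filled = df.copy()
filled["temp"] = filled["temp"].fillna(filled["temp"].mean())

print(missing_counts)
print(dropped)
print(filled)
```

Which option is appropriate depends on the data: dropping rows is simple but discards information, while filling keeps every row at the cost of inventing values.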
####Agenda
- Don't forget to `git pull` in the sfdat26 repo in your command line
- Intro to Pandas walkthrough here
- Pandas Lab 2 Solutions here
####Homework
- Go through the Python class/lab work and finish any exercises you weren't able to complete in class
- Make sure you have all of the repos cloned and ready to go
- You should have both "sfdat28" and "sfdat28-lastname"
- Read Greg Reda's Intro to Pandas
- Take a look at Kaggle's Titanic competition
- I will be using a module called `tweepy` next time.
  - To install, please type into your console: `pip install tweepy`
- Another Git tutorial here
- In-depth Git/GitHub tutorial series made by a GA_DC Data Science Instructor here
- Another Intro to Pandas (written by Wes McKinney and adapted from his book)
- Here is a video of Wes McKinney going through his IPython notebook!
- Examples of joins in Pandas
- For more on Pandas plotting, read the visualization page from the official Pandas documentation.
- Maria finds out that Sancho has been cheating on her with her... mother!
- We will use Python to programmatically obtain data via open sources on the internet
  - We will be scraping the National UFO Reporting Center
  - We will be collecting tweets regarding Donald Trump and Hillary Clinton
  - We will be examining what people are really looking for in a data scientist
- We will continue to use pandas to investigate missing values in data and have a sense of how to deal with them
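To preview the scraping idea in the list above, here is a minimal sketch that pulls table cells out of an HTML snippet using only the standard library. The `SAMPLE_HTML` string and the `CellCollector` class are hypothetical and made up for illustration. This is not the actual markup of any real site, and the class may well use a different library for scraping.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7

# Made-up HTML standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><td>10/1/16</td><td>San Francisco</td><td>Light</td></tr>
  <tr><td>10/2/16</td><td>Oakland</td><td>Disk</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Start a new row.
            self.rows.append([])
        elif tag == "td":
            # Start a new, empty cell in the current row.
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


parser = CellCollector()
parser.feed(SAMPLE_HTML)
rows = parser.rows
print(rows)
```

In practice the HTML would come from an HTTP request rather than a string literal, and the resulting rows could be loaded into a pandas DataFrame for cleaning.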