giuperrotti/sfdat28

 
 

SF DAT 28 Course Repository

Course materials for General Assembly's Data Science course in San Francisco, CA (10/4/16 - 12/13/16).

Instructors: Sinan Ozdemir

Teaching Assistants: George McIntire, Cari Levay

Course Times

Tuesday/Thursday: 6:30pm - 9:30pm

Office hours:

TBD

All classes and office hours will be held in the student center at GA, 225 Bush Street.

Course Project Information

Course Project Examples

| Tuesday | Thursday | Project Milestone | HW |
| --- | --- | --- | --- |
| 10/4: Introduction / Expectations / Intro to Data Science | 10/6: Pandas | | |
| 10/11: | 10/13: APIs / Web Scraping 101 10/15: Intro to Machine Learning / KNN | | HW 1 Assigned (W) |
| 10/18: Model Evaluation / Linear Regression Part 1 | 10/20: Linear Regression Part 2 / Logistic Regression | Three Potential Project Ideas (W) | |
| 10/25: Natural Language Processing | 10/27: Naive Bayes Classification | | HW 1 Due (W) |
| 11/1: Advanced Sklearn (Pipeline and Feature Unions) | 11/3: Review | | |
| 11/8: Decision Trees | 11/10: Ensembling Techniques | | HW 2 Assigned (W) |
| 11/15: Dimension Reduction | 11/17: Clustering / Topic Modelling | First Draft Due (W) | |
| 11/22: Stochastic Gradient Descent | 11/23: No Class | Peer Review Due (M) | |
| 11/29: Neural Networks / Deep Learning | 12/1: Recommendation Engines | | HW 2 Due (W) |
| 12/6: Web Development with Flask | 12/8: Projects | | |
| 12/13: Projects | | | |

Installation and Setup

Resources

Introduction / Expectations / Intro to Data Science

Agenda

  • Introduction to General Assembly slides
  • Course overview: our philosophy and expectations (slides)
  • Ice Breaker

Break -- Command Line Tutorial

  • Figure out office hours
  • Intro to Data Science: slides

Homework

  • Set up a conda virtual environment
  • Install Git and create a GitHub account.
    • Read my intro to Git and be sure to come back on Monday with your very own repository called "sfdat28-lastname"
  • Once you receive an email invitation from Slack, join our "SFDAT28 team" and add your photo!
  • Tutorial: an introduction to reading and writing IPython notebooks

Class 2: Introduction to Pandas

Goals

  • Feel comfortable importing, manipulating, and graphing data using Python's Pandas
  • Be able to find missing values and begin to have a sense of how to deal with them
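
As a rough illustration of these goals, here is a minimal sketch of importing a small dataset, counting its missing values, and trying two common ways of handling them. The passenger data and column names are invented for this example and do not come from the course materials; median filling and row dropping are only two of several possible strategies.

```python
import io

import pandas as pd

# Hypothetical passenger data, invented for illustration only.
CSV_TEXT = """name,age,fare,embarked
Ann,29,7.25,S
Bob,,71.28,C
Cid,41,,S
Dee,35,8.05,
"""

# Import: read_csv accepts any file-like object, so no file on disk is needed.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Find missing values: count the NaN entries in each column.
missing_per_column = df.isnull().sum()

# One way to deal with them: fill numeric gaps with the column median.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["fare"] = filled["fare"].fillna(filled["fare"].median())

# Another way: drop any row that still has a missing value.
complete_rows = filled.dropna()

print(missing_per_column)
print(complete_rows)
```

Which strategy is appropriate depends on the dataset: filling keeps every row but invents values, while dropping keeps only observed values but shrinks the data.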

Agenda

  • Don't forget to git pull in the sfdat28 repo in your command line
  • Intro to Pandas walkthrough here

Homework

  • Go through the Python class/lab work and finish any exercises you weren't able to complete in class
  • Make sure you have all of the repos cloned and ready to go
    • You should have both "sfdat28" and "sfdat28-lastname"
  • Read Greg Reda's Intro to Pandas
  • Take a look at Kaggle's Titanic competition
  • I will be using a module called tweepy next time.
    • To install it, type this into your console: pip install tweepy

Resources:

  • Another Git tutorial here
  • In-depth Git/GitHub tutorial series made by a GA DC Data Science Instructor here
  • Another Intro to Pandas (written by Wes McKinney and adapted from his book)
    • Here is a video of Wes McKinney going through his IPython notebook!
  • Examples of joins in Pandas
  • For more on Pandas plotting, read the visualization page from the official Pandas documentation.

Next Time on SFDAT28...

  • Maria finds out that Sancho has been cheating on her with her.. mother!

  • We will use Python to programmatically obtain data via open sources on the internet

    • We will be scraping the National UFO Reporting Center
    • We will be collecting tweets regarding Donald Trump and Hillary Clinton
    • We will be examining what people are really looking for in a data scientist..
  • We will continue to use pandas to investigate missing values in data and have a sense of how to deal with them
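
To give a feel for what scraping involves, here is a minimal sketch that pulls the rows of an HTML table into Python dictionaries. It uses only the standard library's html.parser; the class itself may well use other tools. The HTML snippet, the TableParser class, and the sample records are all invented for illustration and are not real reports from any site.

```python
from html.parser import HTMLParser

# Hypothetical HTML, invented for illustration; a real page would be
# downloaded first and will have a different layout.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>City</th><th>Shape</th></tr>
  <tr><td>10/1/16</td><td>Oakland</td><td>Light</td></tr>
  <tr><td>10/2/16</td><td>Fresno</td><td>Disk</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new table row starts a new list of cells.
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:
                # Tolerate a cell that appears outside any row.
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


parser = TableParser()
parser.feed(SAMPLE_HTML)

# Treat the first row as the header and pair it with each later row.
header, *records = parser.rows
reports = [dict(zip(header, record)) for record in records]
print(reports)
```

A list of dictionaries like this can be passed straight to pandas.DataFrame, which is where the missing-value checks from the Pandas class come back in.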
