QTM 340: Current Class-by-Class Schedule

Introduction and Overview

8/25 - What can you do with text?

  • In class: syllabus overview, intro/transcription exercise, Voyant

8/30 - What should you do with text?

9/1 - Intro to Python, Colab

  • Before class:
    • HW0 due: Video intro (on Canvas)
  • In class: GPT-2 exercise; intro to Python, Colab

Unit 1: Turning Text into Data

9/6 - Web Scraping

  • Before class:
    • HW1 due: Strings, lists, and dictionaries
  • In class: web scraping and HTML parsing using BeautifulSoup

9/8 - APIs

9/13 - No class meeting, professor at Turing Institute

9/15 - Text Parsing / Regular Expressions

9/20 - The People Behind the Text

Unit 2: Introductory Data Science with Text

9/22 - Sentiment Analysis

9/27 - Natural Language Processing (NLP) 101

9/29 - Turning Words into Numbers

10/4 - No class meeting, professor at conference

  • Quiz 2 due: Sentiment analysis of Yelp reviews

10/6 - Data and Context

[ FALL BREAK ]

Unit 3: Modeling Text as Data I

October 13 - Topic Models

October 18 - Word Embedding Models

  • Before class:
    • Optional (will be discussed in class): Lauren Klein and Sandeep Soni, “How Words Lead to Justice”; Laura K. Nelson, “Leveraging the Alignment Between Machine Learning and Intersectionality” (Canvas)
    • HW4 due: word embeddings
  • In class: word embeddings, discussion of papers

October 20 - Pandas, Papers

  • Before class:
    • Quiz 3 due: Exploratory research exercise
  • In class: Pandas, paper catch-up, more project brainstorming (research questions)

October 25 - Guest Lecture, Lucy Li

October 27 - Project Brainstorming

  • Before class:
    • Final project prep (FPP) #1 due: Datasheet
  • In class: more project brainstorming (methods)

Unit 4: Modeling Text as Data II

November 1 - Classification, day 1

  • Before class:
    • No reading or homework for this class meeting; start working on your project proposals!
  • In class: classification

November 3 - No class meeting, professor at conference

  • Final project prep (FPP) #2 due: Project proposal

November 8 - Classification, day 2

November 10 - Clustering

  • Before class:
    • Read: Ben Schmidt, "Genre, Manifolds, and AI"; Matthew Wilkens, "Genre, Computation, and the Varieties of 20th Century U.S. Fiction" (Canvas)
  • In class: clustering

November 15 - BERT (Bidirectional Encoder Representations from Transformers), day 1

  • Before class:
  • In class: sentiment analysis with BERT and next sentence prediction

November 17 - BERT, day 2

  • Before class:
    • FPP #3 due: Final project first pass
  • In class: classification with BERT

[ THANKSGIVING BREAK ]

Unit 5: Final Projects and Course Wrap-Up

November 30 - Project Presentations

December 2 - Project Presentations

December 7 - Course wrap-up and assessment

FINAL PROJECTS DUE DECEMBER 8TH, 5:30 PM ET

This syllabus draws on previous iterations of QTM 340 taught by Dan Sinykin and me. It also incorporates materials and resources developed by Melanie Walsh, Jinho Choi, Allison Parrish, David Mimno, David Bamman, Ryan Cordell, and Ben Schmidt, as well as suggestions and other input from Heather Froehlich, Ted Underwood, Jacob Eisenstein, Jim Casey, Taylor Arnold, Lauren Tilton, Lisa Rhody, Eileen Clancy, and the Colored Conventions Project Team.