QTM340-Fall22/docs/schedule.md at main · laurenfklein/QTM340-Fall22

QTM 340: Current Class-by-Class Schedule

Introduction and Overview

8/25 - What can you do with text?

In class: syllabus overview, intro/transcription exercise, Voyant

8/30 - What should you do with text?

Before class:
- Read: Farhad Manjoo, "How Do You Know a Human Wrote This?"
- Read: Emily M. Bender and Timnit Gebru et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”
- Spend at least 30 minutes playing AI Dungeon
In class: Discussion of readings

9/1 - Intro to Python, Colab

Before class:
- HW0 due: Video intro (on Canvas)
In class: GPT-2 exercise; intro to Python, Colab

Unit 1: Turning Text into Data

9/6 - Web Scraping

Before class:
- HW1 due: Strings, lists, and dictionaries
In class: web scraping and HTML parsing using BeautifulSoup

9/8 - APIs

Before class:
- Read: Xavier Adam, “An Illustrated Introduction to APIs” and “API Whispering 101”
- HW2 due: APIs
In class: Scraping song lyrics using the Genius API

9/13 - No class meeting, professor at Turing Institute

9/15 - Text Parsing / Regular Expressions

Before class:
- Read: David Zentgraf, "What Every Programmer Absolutely, Positively Needs to Know about Encodings and Character Sets to Work with Text"; Aditya Mukerjee, “I Can Text You A Pile of Poo But I Can’t Write My Name”
- Optional but interesting: Miriam Sweeney and Kelsea Whaley, “Technically White: Emoji Skin-Tone Modifiers as American Technoculture”
In class: text parsing and regex with song lyrics

9/20 - The People Behind the Text

Before class:
- Watch: Andrew Norman Wilson, "Workers Leaving the Googleplex"
- Optional (will be covered in class): Lilly Irani, “Justice for ‘Data Janitors’”; Ishan Misra et al., "Seeing Through the Human Reporting Bias"
- Quiz 1 due: Creating a dataset using an API
In class: Discussion of readings

Unit 2: Introductory Data Science with Text

9/22 - Sentiment Analysis

Before class:
- Read: Ethan Reed, “Poems with Pattern and VADER, Part 1: Quincy Troupe and Part 2: Nikki Giovanni”; Maria Antoniak et al., “Narrative Paths and Negotiation of Power in Birth Stories”
In class: sentiment analysis

9/27 - Natural Language Processing (NLP) 101

Before class:
- Read: Leonardo Nicoletti and Sahiti Sarva, “When Women Make Headlines”; Maarten Sap et al., “Connotation Frames of Power and Agency in Modern Films”
In class: word counts, n-grams, lexicons

9/29 - Turning Words into Numbers

Before class:
- Read: Matt Daniels, “The Language of Hip Hop”; Sara Kay, “Yelp Reviewers’ Authenticity Fetish is White Supremacy in Action”
- HW3 due: NLP 101
In class: intro to scikit-learn and TF-IDF

10/4 - No class meeting, professor at conference

Quiz 2 due: Sentiment analysis of Yelp reviews

10/6 - Data and Context

Before class:
- Read: Catherine D’Ignazio and Lauren Klein, “The Numbers Don’t Speak for Themselves,” from Data Feminism; Timnit Gebru et al., “Datasheets for Datasets”
In class: Discussion of data and context, final project brainstorming session (datasets)

[ FALL BREAK ]

Unit 3: Modeling Text as Data I

October 13 - Topic Models

Before class:
- Read: Lucy Li and David Bamman, “Gender and Representation Bias in GPT-3 Generated Stories”; Richard Jean So, “Consecration: The Canon and Racial Inequality,” from Redlining Culture (Canvas)
In class: topic modeling

October 18 - Word Embedding Models

Before class:
- Optional (will be discussed in class): Lauren Klein and Sandeep Soni, “How Words Lead to Justice”; Laura K. Nelson, “Leveraging the Alignment Between Machine Learning and Intersectionality” (Canvas)
- HW4 due: word embeddings
In class: word embeddings, discussion of papers

October 20 - Pandas, Papers

Before class:
- Quiz 3 due: Exploratory research exercise
In class: Pandas, paper catchup, more project brainstorming (research questions)

October 25 - Guest Lecture, Lucy Li

Before class:
- Read: Lucy Li and David Bamman, “Characterizing English Variation across Social Media Communities with BERT”
In class: Guest lecture, Lucy Li, UC Berkeley

October 27 - Project Brainstorming

Before class:
- Final project prep (FPP) #1 due: Datasheet
In class: more project brainstorming (methods)

Unit 4: Modeling Text as Data II

November 1 - Classification, day 1

Before class:
- No reading or homework for this class meeting; start working on your project proposals!
In class: classification

November 3 - No class meeting, professor at conference

Final project prep (FPP) #2 due: Project proposal

November 8 - Classification, day 2

Before class:
- Read: Catherine D’Ignazio et al., “Feminicide and Machine Learning”; Terra Blevins et al., “Automatically Processing Tweets from Gang-Involved Youth: Towards Detecting Loss and Aggression”; Dan Sinykin, “How Capitalism Changed American Literature”
In class: discussion of papers

November 10 - Clustering

Before class:
- Read: Ben Schmidt, "Genre, Manifolds, and AI"; Matthew Wilkens, "Genre, Computation, and the Varieties of 20th Century U.S. Fiction" (Canvas)
In class: clustering

November 15 - BERT (Bidirectional Encoder Representations from Transformers), day 1

Before class:
- Revisit Li and Bamman (from 10/13 class meeting); Hoyt Long, “Learning to Live with Machine Translation” (PDF); Suchin Gururangan et al., “Whose Language Counts as High Quality?”
In class: sentiment analysis with BERT and next sentence prediction

November 17 - BERT, day 2

Before class:
- FPP #3 due: Final project first pass
In class: classification with BERT

[ THANKSGIVING BREAK ]

Unit 5: Final Projects and Course Wrap-Up

November 30 - Project Presentations

December 2 - Project Presentations

December 7 - Course wrap-up and assessment

FINAL PROJECTS DUE DECEMBER 8TH, 5:30PM ET

This syllabus draws from previous iterations of QTM 340 taught by Dan Sinykin and me. It also incorporates materials and resources developed by Melanie Walsh, Jinho Choi, Allison Parrish, David Mimno, David Bamman, Ryan Cordell, and Ben Schmidt, as well as suggestions and other input from Heather Froehlich, Ted Underwood, Jacob Eisenstein, Jim Casey, Taylor Arnold, Lauren Tilton, Lisa Rhody, Eileen Clancy, and the Colored Conventions Project Team.
