Jinho D. Choi edited this page Mar 14, 2017 · 14 revisions

Sentiment Analysis

Your task is to develop a statistical model that takes a document and classifies it into one of five sentiment classes: 0 - very negative, 1 - negative, 2 - neutral, 3 - positive, 4 - very positive.

  • Download the following files: sst.trn.tsv, sst.dev.tsv, sst.tst.tsv.

  • Each line in the file represents a document, where the format is as follows:

    line ::= <label><tab><document>
    document ::= <token>(<space><token>)*
    
  • Convert each document into a vector with your favorite method.

  • Implement the perceptron and train statistical models using different learning rates. Use the training set for training and the development set for validation.

  • Run your best model on the evaluation set and print the predicted output. Save the output to hw3.out, where each line contains the predicted label for the corresponding document.

  • Improve your model using different learning algorithms or lexicons if you can. Every submission will be ranked, and the ranking score will be reflected in your grade.

  • Write a report describing your approach, results, and analysis. Use the ACL LaTeX template.
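
The file format and the vectorization step above could be handled roughly as follows. This is only a minimal sketch, assuming labels are integers and using a plain bag-of-words representation as one possible "favorite method"; the helper names `parse_line`, `read_tsv`, `build_vocab`, and `vectorize` are hypothetical and not part of the assignment.

```python
from collections import Counter


def parse_line(line):
    """Split '<label><tab><document>' into (int label, list of tokens)."""
    label, document = line.rstrip("\n").split("\t", 1)
    return int(label), document.split(" ")


def read_tsv(path):
    """Read one (label, tokens) pair per non-empty line of a TSV file."""
    with open(path, encoding="utf-8") as fin:
        return [parse_line(line) for line in fin if line.strip()]


def build_vocab(documents, min_count=1):
    """Map each token seen at least min_count times to a feature index."""
    counts = Counter(tok for doc in documents for tok in doc)
    vocab = {}
    for tok, count in counts.items():
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def vectorize(tokens, vocab):
    """Sparse bag-of-words vector {feature index: count}; unknown tokens are dropped."""
    vec = Counter()
    for tok in tokens:
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] += 1
    return dict(vec)
```

The vocabulary would typically be built from the training set only (for example from `read_tsv("sst.trn.tsv")`), so that tokens seen only in the development or evaluation set are ignored at prediction time.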

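One way to approach the perceptron step is sketched below. It is a simple multiclass perceptron, assuming each document has already been converted to a sparse vector stored as a dict from feature index to value; the function names, the default hyperparameters, and the absence of a bias term are illustrative choices, not requirements of the assignment.

```python
import random


def predict(weights, vec):
    """Return the label whose weight vector scores the sparse input highest."""
    scores = [sum(w.get(i, 0.0) * v for i, v in vec.items()) for w in weights]
    return max(range(len(weights)), key=scores.__getitem__)


def train_perceptron(data, n_labels=5, learning_rate=0.1, epochs=10, seed=0):
    """Train on (label, sparse vector) pairs; returns one weight dict per label."""
    rng = random.Random(seed)
    weights = [dict() for _ in range(n_labels)]
    data = list(data)  # copy so the caller's list is not shuffled in place
    for _ in range(epochs):
        rng.shuffle(data)
        for gold, vec in data:
            guess = predict(weights, vec)
            if guess != gold:
                # Move the gold label's weights toward the input
                # and the wrongly predicted label's weights away from it.
                for i, v in vec.items():
                    weights[gold][i] = weights[gold].get(i, 0.0) + learning_rate * v
                    weights[guess][i] = weights[guess].get(i, 0.0) - learning_rate * v
    return weights


def accuracy(weights, data):
    """Fraction of (label, vector) pairs classified correctly."""
    correct = sum(predict(weights, vec) == gold for gold, vec in data)
    return correct / len(data) if data else 0.0


def write_predictions(weights, vectors, path="hw3.out"):
    """Write one predicted label per line, in document order."""
    with open(path, "w", encoding="utf-8") as fout:
        for vec in vectors:
            fout.write(f"{predict(weights, vec)}\n")
```

A typical loop would call `train_perceptron` on the training vectors for several values of `learning_rate`, compare `accuracy` on the development set, and pass the evaluation-set vectors of the chosen model to `write_predictions`. Note that with zero-initialized weights and this plain update rule, the learning rate only rescales the weights, so variants such as averaging or a margin may be needed before different rates yield different predictions.
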
Submission

CS571: Natural Language Processing

Instructor


Emory University
