Homework 1 old
Jinho D. Choi edited this page Dec 31, 2016 · 1 revision
Your task is to implement a dependency parser that reaches state-of-the-art accuracy and speed. You may work in groups of at most 2. Submit your work by Oct. 7th, before class.
- Download the standard dataset.
- Implement a transition-based dependency parser based on Nivre's arc-eager algorithm within our architecture.
- Improve your parser in various ways.
- Evaluate the accuracy of your parser (UAS and LAS) for both Malt and Stanford dependency formats.
- Evaluate the speed of your parser (tokens/sec. and sentences/sec.).
- Write a report (4-8 pages) in the ACL format. Your report must include an abstract, introduction, related work, approach, experiments, and conclusion.
- Commit all your work to your GitHub repository.
- Create a wiki page, Dependency Parsing, showing instructions on how to run your parser. You must provide a pre-trained model that is ready to run.
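The arc-eager transition system named in the task list can be sketched as follows. This is only an illustrative, unlabeled version driven by a static oracle that reads the gold heads; an actual submission would replace the oracle with a trained classifier, add dependency labels, and live inside the course architecture. The function names `oracle` and `parse` are hypothetical, and the sketch assumes projective trees.

```python
from typing import List, Optional, Tuple

SHIFT, LEFT_ARC, RIGHT_ARC, REDUCE = "SHIFT", "LEFT-ARC", "RIGHT-ARC", "REDUCE"


def oracle(stack: List[int], buffer: List[int],
           heads: List[Optional[int]], gold: List[Optional[int]]) -> str:
    """Pick the next arc-eager transition from gold heads (projective trees only)."""
    s, b = stack[-1], buffer[0]
    if s != 0 and gold[s] == b:
        return LEFT_ARC   # buffer front is the head of the stack top
    if gold[b] == s:
        return RIGHT_ARC  # stack top is the head of the buffer front
    # Reduce when the stack top already has a head and the buffer front
    # still needs an arc to or from something deeper in the stack.
    if heads[s] is not None and any(gold[b] == k or gold[k] == b for k in stack[:-1]):
        return REDUCE
    return SHIFT


def parse(gold: List[Optional[int]]) -> Tuple[List[str], List[Optional[int]]]:
    """Run arc-eager over token IDs 1..n; index 0 is the artificial root."""
    n = len(gold) - 1
    stack, buffer = [0], list(range(1, n + 1))
    heads: List[Optional[int]] = [None] * (n + 1)
    transitions = []
    while buffer:
        t = oracle(stack, buffer, heads, gold)
        transitions.append(t)
        if t == SHIFT:
            stack.append(buffer.pop(0))
        elif t == LEFT_ARC:
            heads[stack.pop()] = buffer[0]
        elif t == RIGHT_ARC:
            heads[buffer[0]] = stack[-1]
            stack.append(buffer.pop(0))
        else:  # REDUCE
            stack.pop()
    return transitions, heads


# Malt heads of a five-token sentence (Ms. Haag plays Elianti .); slot 0 is the root.
transitions, heads = parse([None, 2, 3, 0, 3, 3])
print(transitions)
print(heads[1:])
```

Each token is pushed and popped at most once, so the number of transitions grows linearly with sentence length, which is what makes the speed targets of this homework realistic for a greedy parser.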
1 Ms. ms. NNP _ 2 NMOD 2 nn
2 Haag haag NNP _ 3 SUB 3 nsubj
3 plays play VBZ _ 0 ROOT 0 root
4 Elianti elianti NNP _ 3 OBJ 3 dobj
5 . . . _ 3 P 3 punct
Each column represents:
- 0: ID.
- 1: word-form.
- 2: lemma (predicted).
- 3: POS tag (predicted).
- 4: extra features (blank).
- 5: head ID from Malt (gold).
- 6: dependency label from Malt (gold).
- 7: head ID from Stanford (gold).
- 8: dependency label from Stanford (gold).
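The nine columns above could be read with a small helper like the one below. This is a minimal sketch, not the reader provided by the course architecture: the names `Token` and `read_sentences` are hypothetical, and it assumes that columns are separated by whitespace, that no field contains whitespace, and that sentences are separated by blank lines.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Token(NamedTuple):
    id: int              # column 0
    form: str            # column 1
    lemma: str           # column 2 (predicted)
    pos: str             # column 3 (predicted)
    feats: str           # column 4 (blank, shown as "_")
    malt_head: int       # column 5 (gold)
    malt_label: str      # column 6 (gold)
    stanford_head: int   # column 7 (gold)
    stanford_label: str  # column 8 (gold)


def read_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of tokens per sentence (blank line = sentence boundary, assumed)."""
    sentence: List[Token] = []
    for line in lines:
        cols = line.split()
        if not cols:
            if sentence:
                yield sentence
                sentence = []
            continue
        if len(cols) != 9:
            raise ValueError(f"expected 9 columns, got {len(cols)}: {line!r}")
        sentence.append(Token(int(cols[0]), cols[1], cols[2], cols[3], cols[4],
                              int(cols[5]), cols[6], int(cols[7]), cols[8]))
    if sentence:
        yield sentence


SAMPLE = """\
1 Ms. ms. NNP _ 2 NMOD 2 nn
2 Haag haag NNP _ 3 SUB 3 nsubj
3 plays play VBZ _ 0 ROOT 0 root
4 Elianti elianti NNP _ 3 OBJ 3 dobj
5 . . . _ 3 P 3 punct
"""

for token in next(read_sentences(SAMPLE.splitlines())):
    print(token.id, token.form, token.malt_head, token.malt_label)
```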
- Your dependency parser must extend `NLPComponent`.
- No 3rd-party library, including any implementation of a dependency parser, may be used for this homework.
- Your final model must not be tuned for the evaluation set.
- Your report must clearly state all of your approaches and findings including:
  - Machine learning algorithm.
  - Parsing algorithm.
  - Search strategy.
  - Feature set.
- In your wiki page, indicate which code is used for your dependency parser.
- You must report the following for at least your baseline and final models.
  - UAS and LAS for the Malt dependency format on the development and evaluation sets.
  - UAS and LAS for the Stanford dependency format on the development and evaluation sets.
- Use `SpeedTest` for measuring the speed of your parser.
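For reference, the accuracy and speed figures requested above can be computed along the following lines. This is a rough sketch under stated assumptions, not the course's `SpeedTest` or its official scorer: `attachment_scores` and `throughput` are hypothetical names, every token (including punctuation) is counted, and `parse_fn` stands in for whatever callable runs your parser on one sentence.

```python
import time
from typing import Callable, List, Sequence, Tuple

Arc = Tuple[int, str]  # (head ID, dependency label) for one token


def attachment_scores(gold: Sequence[Arc], pred: Sequence[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) in percent over aligned gold and predicted tokens."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head only
    correct_arcs = sum(g == p for g, p in zip(gold, pred))         # head and label
    return 100 * correct_heads / len(gold), 100 * correct_arcs / len(gold)


def throughput(parse_fn: Callable[[list], object],
               sentences: List[list]) -> Tuple[float, float]:
    """Return (tokens/sec, sentences/sec) for parse_fn over the given sentences."""
    start = time.perf_counter()
    for sentence in sentences:
        parse_fn(sentence)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    tokens = sum(len(sentence) for sentence in sentences)
    return tokens / elapsed, len(sentences) / elapsed


# Malt-format gold arcs for a five-token sentence and a made-up prediction.
gold = [(2, "NMOD"), (3, "SUB"), (0, "ROOT"), (3, "OBJ"), (3, "P")]
pred = [(2, "NMOD"), (3, "OBJ"), (0, "ROOT"), (3, "OBJ"), (2, "P")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.1f}, LAS = {las:.1f}")  # UAS = 80.0, LAS = 60.0
```

Running the same scorer once on the Malt columns and once on the Stanford columns gives the two sets of numbers the report asks for.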
Copyright © 2015-2019 Emory University - All Rights Reserved.
