This dataset has been made publicly available as part of the Dalhousie Natural Language Processing Lab (DNLP) research, which focuses on incorporating Structural Embedding of Constituency Trees in the Attention-Based Model for Machine Comprehension.

mayankanand111/TreeSQuAD

TreeSQuAD2.0 Dataset

Welcome to the TreeSQuAD2.0 dataset, a public resource created by the Dalhousie Natural Language Processing Lab (DNLP). This dataset is the result of my master's thesis research, which focuses on incorporating Structural Embedding of Constituency Trees in the Attention-Based Model for Machine Comprehension. The thesis can be accessed here.

Contents in the 'Processed' Folder

The 'Processed' folder contains the following:

  1. Parsed Trees:

  2. Simplified Trees:

  3. Vocabulary:

    • Vocabulary of Tokens.
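
This README does not state how the trees are serialized, so the following is only a minimal sketch under an assumption: that each tree is stored as a Penn Treebank-style bracketed string, a common output format for constituency parsers. The function names `parse_bracketed` and `leaves` and the example sentence are hypothetical and are not part of the dataset; check the actual files in the 'Processed' folder before relying on this.

```python
# Sketch only: assumes Penn Treebank-style bracketed trees, which this
# README does not confirm. All names here are hypothetical.

def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    # Pad the parentheses with spaces so a plain split() tokenizes the string.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1                      # consume "("
        label = tokens[pos]           # constituent or part-of-speech label
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the end of the tree")
    return tree


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


example = "(S (NP (DT The) (NN cat)) (VP (VBD sat)))"
tree = parse_bracketed(example)
print(tree[0])        # root label
print(leaves(tree))   # words recovered from the tree
```

Nested tuples keep the sketch dependency-free. If the files do use this notation, an established reader such as NLTK's `Tree.fromstring` would be a more robust choice.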

Usage

Feel free to explore and utilize the dataset for your NLP and machine comprehension projects. If you find this resource helpful, consider citing this work or providing feedback.

Acknowledgments

I am sincerely grateful to Dr. Vlado Keselj for his invaluable guidance and support throughout this research.

Happy coding!
