ctr4si/sentence-completion

Word RNN for Sentence Completion

A PyTorch implementation of a word-level recurrent neural network for sentence completion. The code is based on Word-level language modeling RNN, and the importance sampling module is taken from PyTorch Large-Scale Language Model.

Requirements

  • torchvision >= 0.2.0
  • torch >= 0.3.0.post4
  • numpy >= 1.13.3
  • pandas >= 0.21.0
  • nltk >= 3.2.5
  • tqdm >= 4.19.5
  • Cython >= 0.27.3

pip3 install -r requirements.txt

Setup

  • Build the Log_Uniform Sampler by following the instructions at Link.
  • Download the punkt package for nltk.
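The punkt download can be done once from Python. The following is a minimal sketch, not part of the repository: the helper name `ensure_punkt` is hypothetical, and it relies only on `nltk.data.find` and `nltk.download` from nltk itself.

```python
def ensure_punkt():
    """Download nltk's punkt tokenizer data only if it is not already installed.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    import nltk  # imported lazily so the helper can be defined before nltk is installed

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt")
```

Call `ensure_punkt()` once before running the training or completion scripts; it needs network access the first time.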

Datasets

  • Microsoft Research Sentence Completion Challenge - The training and test datasets can be downloaded from Link. Store the downloaded test data in ./data/completion/.
  • Scholastic Aptitude Test sentence completion questions - Collected questions are provided in ./data/completion/SAT_set_filled.csv.
  • Nineteenth century novels (19C novels) - Extract the preprocessed files from ./data/prepro/guten.tgz.
  • One Billion Word Benchmark (1B word) - Link

Run

Training

python3 train.py --cuda --save_dir mynet

Default arguments are set for training with 19C novels. Argument settings for training with the 1B word benchmark are presented in the following table.

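| Argument | 19C novels | 1B word |
| --- | --- | --- |
| corpus | guten | gbw |
| emsize | 200 | 500 |
| nhid | 600 | 2000 |
| outsize | 400 | 500 |
| lr | 0.5 | 1.0 |
| decay_after | 5 | 1 |
| decay_rate | 0.5 | 0.8 |
| batch_size | 20 | 100 |
| nsampled | -1 | 8192 |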
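As an illustration, the 1B word settings can be assembled into a single training command. This sketch assumes each argument in the table is passed to train.py as `--name value`, in the same style as `--save_dir`. That flag syntax is an assumption and should be checked against the argument parser in train.py. The helper `build_train_command` is hypothetical.

```python
import shlex

# Values copied from the "1B word" column of the argument table.
GBW_ARGS = {
    "corpus": "gbw",
    "emsize": 500,
    "nhid": 2000,
    "outsize": 500,
    "lr": 1.0,
    "decay_after": 1,
    "decay_rate": 0.8,
    "batch_size": 100,
    "nsampled": 8192,
}


def build_train_command(args, save_dir="mynet"):
    """Build the argv list for train.py (assumes `--name value` flag syntax)."""
    cmd = ["python3", "train.py", "--cuda", "--save_dir", save_dir]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


command = build_train_command(GBW_ARGS)
print(shlex.join(command))
```

Building the command from a dictionary keeps the table values in one place, so switching back to the 19C novels defaults only requires dropping the extra arguments.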

Sentence completion

python3 sent_cmplt.py --cuda --save_dir mynet
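How sent_cmplt.py ranks the candidate words is not described here. A common approach for this kind of task is to fill the blank with each candidate, score every resulting sentence with the language model, and pick the highest-scoring one. The sketch below illustrates that idea only. The function `complete`, the blank marker `_____`, and the toy scorer are all hypothetical stand-ins; the toy scorer takes the place of a trained network.

```python
import math


def complete(question, candidates, score_fn, blank="_____"):
    """Return the candidate whose filled-in sentence scores highest, plus all scores."""
    filled = [question.replace(blank, candidate) for candidate in candidates]
    scores = [score_fn(sentence) for sentence in filled]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


# Toy stand-in for a language model: made-up per-word log probabilities.
TOY_LOGPROB = {"sat": math.log(0.1), "flew": math.log(0.001)}
UNKNOWN_LOGPROB = math.log(1e-6)


def toy_score(sentence):
    """Sum per-word log probabilities; unseen words share one low value."""
    return sum(TOY_LOGPROB.get(word, UNKNOWN_LOGPROB) for word in sentence.lower().split())


best, scores = complete("the cat _____ on the mat", ["sat", "flew"], toy_score)
print(best)
```

In practice `score_fn` would wrap the trained network, for example by summing the log probabilities it assigns to each word of the sentence.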

Results

| corpus | bidirec | MSR accuracy | SAT accuracy |
| --- | --- | --- | --- |
| guten | False | 69.4 (0.8)* | 29.6 (1.5)* |
| guten | True | 72.3 (1.1)* | 33.3 (2.0)* |
| gbw | False | 63.2 | 66.5 |
| gbw | True | 64.1 | 69.1 |

*The mean accuracy of five networks trained with different random initializations is shown with the standard deviation in parentheses.
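The "mean (standard deviation)" entries can be produced from per-run accuracies as sketched below. The five input values are placeholders chosen for easy arithmetic, not results from this repository. It is also not stated whether the reported figure is the sample or the population standard deviation; this sketch assumes the sample version (`statistics.stdev`).

```python
import statistics


def format_mean_std(accuracies):
    """Format accuracies as 'mean (std)' with one decimal place each."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation (assumption)
    return f"{mean:.1f} ({std:.1f})"


# Placeholder accuracies for five hypothetical runs, NOT measured results.
placeholder_runs = [70.0, 71.0, 72.0, 73.0, 74.0]
print(format_mean_std(placeholder_runs))  # prints "72.0 (1.6)"
```

Swap in `statistics.pstdev` if the population standard deviation is wanted instead.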
