- Machine required:
  - GPU (>= 10GB); if you do not have one, modify train.py and set ASTNN(...use_gpu=False...)
  - 32GB main memory
  - 2GB disk
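
The machine requirements above can be checked before starting. The following is only a minimal sketch and not part of the project: the helper names (free_disk_gb, total_ram_gb, gpu_memory_gb, print_report) are hypothetical, the RAM query relies on os.sysconf and so assumes a Unix-like system, and the GPU query assumes PyTorch is installed with CUDA support (it returns None otherwise).

```python
import os
import shutil


def free_disk_gb(path="."):
    """Free disk space at `path`, in GB."""
    return shutil.disk_usage(path).free / 1e9


def total_ram_gb():
    """Total physical memory in GB, or None if it cannot be queried."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        return None  # e.g. on platforms without os.sysconf


def gpu_memory_gb():
    """Memory of the first CUDA device in GB, or None if no usable GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1e9


def print_report():
    gpu = gpu_memory_gb()
    # Thresholds follow the list above: 10GB GPU, 32GB RAM, 2GB disk.
    print("GPU memory (GB):", gpu, "-> use_gpu =", gpu is not None and gpu >= 10)
    print("RAM (GB):", total_ram_gb())
    print("Free disk (GB):", free_disk_gb())
```

If print_report() shows no usable GPU, use_gpu=False is the setting to pass to ASTNN in train.py, as noted in the list.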
- Download the dataset from one of the links here:
- Extract the data directory to the project root; it should contain data/train.json, data/valid.json and data/test.json
- Configure environment:
  - Python 3.7
  - PyTorch 1.3 (the newest release at the time of writing)
  - pandas
  - gensim
  - javalang (install with pip)
  - possibly other packages
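
Before preprocessing, the data layout and the environment described above can be sanity-checked. This is a hedged sketch rather than a script shipped with the project: the function names are hypothetical, and the module list assumes the packages are imported under the names torch, pandas, gensim and javalang.

```python
import importlib.util
from pathlib import Path

# File names taken from the extraction step above.
EXPECTED_FILES = ["data/train.json", "data/valid.json", "data/test.json"]
# Import names of the listed packages (PyTorch is imported as "torch").
REQUIRED_MODULES = ["torch", "pandas", "gensim", "javalang"]


def missing_data_files(project_root="."):
    """Return the expected data files that are absent under project_root."""
    root = Path(project_root)
    return [f for f in EXPECTED_FILES if not (root / f).is_file()]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [m for m in modules if importlib.util.find_spec(m) is None]
```

Calling missing_data_files() and missing_modules() from the project root should both return empty lists once the setup is complete. Since the list above notes that other packages may be needed, an empty result does not guarantee that every dependency is present.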
- Run python ./preprocess_data.py to preprocess the data; this takes several hours.
- Run python ./train.py to train the model. By default it trains for 50 epochs, at 2-3 hours per epoch. The model is saved after every epoch.
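
To put the default training settings in perspective, the total run time follows directly from the figures above. The short calculation below is only an illustration (the function name is hypothetical); it multiplies the 50 default epochs by the 2-3 hours quoted per epoch, giving roughly 100 to 150 hours, or about four to six days.

```python
def total_hours(epochs, hours_per_epoch):
    """Total training time in hours for a fixed per-epoch cost."""
    return epochs * hours_per_epoch


EPOCHS = 50                    # default number of epochs stated above
low = total_hours(EPOCHS, 2)   # optimistic: 2 hours per epoch
high = total_hours(EPOCHS, 3)  # pessimistic: 3 hours per epoch

print(f"{low}-{high} hours, about {low / 24:.2f}-{high / 24:.2f} days")
```

Because a full run spans several days, running the job in the background and relying on the per-epoch saved models is worthwhile.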
Hint: to run xxx.py in the background and write its output to a file (for example, xxx.log), use the following command on Linux:
nohup python -u xxx.py > xxx.log 2>&1 &
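
The same effect can be approximated from Python with the standard subprocess module. This is a sketch under assumptions, not an exact equivalent of nohup: run_in_background is a hypothetical helper, and start_new_session=True (POSIX only) detaches the child from the current terminal session instead of ignoring the hangup signal as nohup does.

```python
import subprocess
import sys


def run_in_background(script, log_path):
    """Start `script` with unbuffered output, sending stdout and stderr to log_path."""
    log = open(log_path, "w")
    proc = subprocess.Popen(
        [sys.executable, "-u", script],  # -u: unbuffered, like `python -u`
        stdout=log,
        stderr=subprocess.STDOUT,        # same as `2>&1`
        stdin=subprocess.DEVNULL,
        start_new_session=True,          # detach from the terminal session
    )
    log.close()  # the child process keeps its own handle to the file
    return proc


# Example (not executed here): run_in_background("xxx.py", "xxx.log")
```

The returned Popen object exposes the process ID as proc.pid, which is useful for checking on or stopping the job later.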