pNLP-Mixer: an Efficient all-MLP Architecture for Language
Implementation of pNLP-Mixer in PyTorch and PyTorch Lightning.
pNLP-Mixer is the first successful application of the MLP-Mixer architecture in NLP. With a novel embedding-free projection layer, pNLP-Mixer achieves performance comparable to transformer-based models (e.g. mBERT, RoBERTa) with a significantly smaller parameter count and no expensive pretraining procedure.
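For orientation, here is a minimal sketch of the MLP-Mixer block that such an all-MLP model is built around: a token-mixing MLP applied across the sequence, followed by a channel-mixing MLP. This is an illustration only; the class names and dimensions are assumptions and do not mirror this repository's exact modules.

```python
# Minimal sketch of an MLP-Mixer block (illustrative, not this repo's code).
import torch
from torch import nn


class MlpBlock(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class MixerBlock(nn.Module):
    """One mixer block: token mixing across the sequence, then channel mixing."""

    def __init__(self, seq_len: int, hidden_dim: int,
                 token_mlp_dim: int, channel_mlp_dim: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(hidden_dim)
        self.token_mlp = MlpBlock(seq_len, token_mlp_dim)
        self.norm2 = nn.LayerNorm(hidden_dim)
        self.channel_mlp = MlpBlock(hidden_dim, channel_mlp_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        y = self.norm1(x).transpose(1, 2)          # mix across tokens
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))    # mix across channels
        return x


# Example: a batch of 8 sequences, 64 tokens, 256 features (dimensions assumed).
block = MixerBlock(seq_len=64, hidden_dim=256, token_mlp_dim=128, channel_mlp_dim=512)
out = block(torch.randn(8, 64, 256))  # -> (8, 64, 256)
```

Because every operation is a LayerNorm or a Linear layer, there is no attention and no embedding table, which is what keeps the parameter count small.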
Requirements:
- Python >= 3.6.10
- PyTorch >= 1.8.0
- PyTorch Lightning >= 1.4.3
- All other requirements are listed in the `requirements.txt` file.
Please check the example configurations and the comments in the `cfg` directory.
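The commands below read their settings from a configuration file in `cfg/`. As a quick sanity check, a config can be loaded and inspected with plain PyYAML; the file name and keys shown here are placeholders rather than the repository's actual schema.

```python
# Hedged sketch: load and inspect a configuration file with PyYAML.
# "cfg/example.yml" is a placeholder; see the files and comments in the
# cfg directory for the actual schema used by this repository.
import yaml

with open("cfg/example.yml") as f:
    cfg = yaml.safe_load(f)

print(sorted(cfg.keys()))  # top-level sections, e.g. model / data / training settings
```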
`python projection.py -v VOCAB_FILE -c CFG_PATH -g NGRAM_SIZE -o OUTPUT_FILE`

- `VOCAB_FILE`: path to the vocab file that contains the wordpiece vocabulary
- `CFG_PATH`: path to the configurations file
- `NGRAM_SIZE`: size of n-grams used during hashing
- `OUTPUT_FILE`: path where the resulting `.npy` file will be stored
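The cached projection file stores precomputed hash features for every vocabulary entry so they do not have to be recomputed during training. The snippet below is a toy illustration of that idea (MinHash-style minima over character n-grams, saved as a `.npy` array); it is not the exact algorithm implemented in `projection.py`, and the hash function, sizes, and file names are assumptions.

```python
# Toy illustration of caching n-gram hash features for a vocabulary, in the
# spirit of an embedding-free projection. NOT the exact algorithm of
# projection.py; hash choice, sizes, and file layout here are assumptions.
import numpy as np

NGRAM_SIZE = 2      # corresponds to the -g argument
NUM_HASHES = 64     # number of hash values kept per token (assumed)

def char_ngrams(token: str, n: int):
    token = f"#{token}#"  # simple boundary markers (illustrative)
    return [token[i:i + n] for i in range(len(token) - n + 1)]

def min_hashes(token: str, n: int, num_hashes: int) -> np.ndarray:
    # One MinHash-style value per seed: the minimum hash over all n-grams.
    grams = char_ngrams(token, n)
    return np.array(
        [min(hash((seed, g)) & 0xFFFFFFFF for g in grams) for seed in range(num_hashes)],
        dtype=np.uint32,
    )

# Read a (hypothetical) vocab file with one token per line and cache the result.
with open("vocab.txt", encoding="utf-8") as f:
    vocab = [line.strip() for line in f if line.strip()]

cache = np.stack([min_hashes(tok, NGRAM_SIZE, NUM_HASHES) for tok in vocab])
np.save("projection_cache.npy", cache)  # analogous to the -o OUTPUT_FILE
```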
`python run.py -c CFG_PATH -n MODEL_NAME -m MODE -p CKPT_PATH`

- `CFG_PATH`: path to the configurations file
- `MODEL_NAME`: model name to be used for PyTorch Lightning logging
- `MODE`: `train` or `test` (default: `train`)
- `CKPT_PATH`: (optional) checkpoint path to resume training from or to use for testing
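Internally, the training/testing entry point follows the usual PyTorch Lightning pattern. The sketch below shows a hedged version of that dispatch; class names, arguments, and trainer settings are illustrative, so refer to `run.py` and the `cfg` files for the actual interface.

```python
# Hedged sketch of the train/test dispatch that a script like run.py typically
# performs with PyTorch Lightning. Names and settings are illustrative.
from typing import Optional
import pytorch_lightning as pl

def run(model: pl.LightningModule, datamodule: pl.LightningDataModule,
        mode: str = "train", ckpt_path: Optional[str] = None) -> None:
    trainer = pl.Trainer(max_epochs=10)  # real settings come from the cfg file
    if mode == "train":
        # Resuming via fit(ckpt_path=...) requires a recent Lightning release;
        # older versions used Trainer(resume_from_checkpoint=...) instead.
        trainer.fit(model, datamodule=datamodule, ckpt_path=ckpt_path)
    else:
        # Evaluation on the test split using the given checkpoint.
        trainer.test(model, datamodule=datamodule, ckpt_path=ckpt_path)
```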
The checkpoints used for evaluation are available here.
Results on MTOP:

| Model Size | Reported | Ours |
|---|---|---|
| pNLP-Mixer X-Small | 76.9% | 79.3% |
| pNLP-Mixer Base | 80.8% | 79.4% |
| pNLP-Mixer X-Large | 82.3% | 82.1% |
Results on MultiATIS:

| Model Size | Reported | Ours |
|---|---|---|
| pNLP-Mixer X-Small | 90.0% | 91.3% |
| pNLP-Mixer Base | 92.1% | 92.8% |
| pNLP-Mixer X-Large | 91.3% | 92.9% |
* Note that the paper reports performance on the MultiATIS dataset using an 8-bit quantized model, whereas our performance was measured with a 32-bit float model.
Results on IMDB:

| Model Size | Reported | Ours |
|---|---|---|
| pNLP-Mixer X-Small | 81.9% | 81.5% |
| pNLP-Mixer Base | 78.6% | 82.2% |
| pNLP-Mixer X-Large | 82.9% | 82.9% |
Citation:

@article{fusco2022pnlp,
title={pNLP-Mixer: an Efficient all-MLP Architecture for Language},
author={Fusco, Francesco and Pascual, Damian and Staar, Peter},
journal={arXiv preprint arXiv:2202.04350},
year={2022}
}

Implementation by:
- Tony Woo @ MINDsLab Inc. (shwoo@mindslab.ai)
Special thanks to:
- Hyoung-Kyu Song @ MINDsLab Inc.
- Kang-wook Kim @ MINDsLab Inc.
To do:
- 8-bit quantization (see the sketch below)
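One straightforward route to the 8-bit setting referenced in the MultiATIS note above is PyTorch's post-training dynamic quantization, which replaces `nn.Linear` layers with int8 equivalents. This is a hedged sketch of that option, not the procedure behind the paper's reported numbers.

```python
# Hedged sketch: post-training dynamic quantization of Linear layers to int8
# using PyTorch's built-in API. Since an MLP-Mixer consists almost entirely of
# Linear layers, most of the model is covered by this single call.
import torch
from torch import nn

def quantize_linear_layers(model: nn.Module) -> nn.Module:
    # Returns a copy of the model with nn.Linear replaced by dynamically
    # quantized int8 versions; activations are quantized on the fly.
    return torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
```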
