An LLM-based recommender system with user/item tokenizers and a generative retrieval paradigm. The overall framework of the proposed TokenRec consists of a masked vector-quantized tokenizer with a K-way encoder for user/item ID tokenization, and a generative retrieval paradigm for recommendation generation. Our paper is available at arXiv-TokenRec.
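To illustrate the idea of K-way item tokenization, here is a minimal, self-contained sketch (not the repository's implementation): each item embedding is mapped to one discrete token per codebook by residual quantization, so with `--n_book=3` and `--n_token=256` every item ID becomes a 3-token code over a 256-entry vocabulary per book. The function and variable names below are illustrative only, and the codebooks are random rather than learned.

```python
import random

def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def tokenize(embedding, codebooks):
    """Map one item embedding to a tuple of discrete tokens,
    one token per codebook, quantizing the residual at each level."""
    residual = list(embedding)
    tokens = []
    for book in codebooks:
        idx = nearest(book, residual)
        tokens.append(idx)
        # subtract the chosen code so the next book quantizes the remainder
        residual = [r - c for r, c in zip(residual, book[idx])]
    return tuple(tokens)

random.seed(0)
dim, n_book, n_token = 8, 3, 256  # mirrors --n_book=3 --n_token=256
codebooks = [[[random.gauss(0, 1) for _ in range(dim)]
              for _ in range(n_token)] for _ in range(n_book)]
item_embedding = [random.gauss(0, 1) for _ in range(dim)]
tokens = tokenize(item_embedding, codebooks)
print(tokens)  # a 3-token discrete item ID
```

In TokenRec these discrete IDs are what the LLM generates at inference time, turning recommendation into generative retrieval over token sequences.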
Please download the checkpoints from Google Drive and put them under the path "checkpoints/".
- Go to the path of "code"
cd code
- Whole Pipeline
python main.py --dataset=LastFM --vq --train_vq --vq_model=MQ --n_token=256 --n_book=3
- Train from checkpoint (LLM)
python main.py --dataset=LastFM --n_token=256 --n_book=3 --train_from_checkpoint
- Evaluation
python main.py --dataset=LastFM --no_train
If this project is helpful to your research, please cite our paper:
Qu, Haohao, Wenqi Fan, Zihuai Zhao, and Qing Li. "TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation." arXiv preprint arXiv:2406.10450 (2024).
@article{qu2024tokenrec,
title={{TokenRec}: Learning to Tokenize {ID} for {LLM}-based Generative Recommendation},
author={Qu, Haohao and Fan, Wenqi and Zhao, Zihuai and Li, Qing},
journal={arXiv preprint arXiv:2406.10450},
year={2024}
}