This repository provides a simulation of Group Relative Policy Optimization (GRPO) that maximizes the surrogate objective function introduced in the DeepSeekMath paper. The approach combines symbolic programming with natural language processing (NLP) models to simulate the optimization of the target function efficiently.
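For orientation, below is a minimal sketch of that clipped, group-relative surrogate with a KL penalty toward a reference policy, collapsing the paper's per-token terms into a single scalar log-probability per sampled output for brevity. The function name, tensor shapes, and default hyperparameters are illustrative only and are not the API of `GRPO_Sim.py`:

```python
import torch

def grpo_surrogate(logp_new, logp_old, logp_ref, rewards, eps=0.2, beta=0.04):
    """Illustrative GRPO surrogate for a group of G sampled outputs.

    logp_new / logp_old / logp_ref: (G,) summed log-probs per output under the
    current, sampling, and frozen reference policies; rewards: (G,) scalars.
    """
    # Group-relative advantage: normalize rewards within the sampled group
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Importance ratio between the current and old (sampling) policy
    ratio = torch.exp(logp_new - logp_old)
    # PPO-style clipped surrogate term
    surrogate = torch.min(ratio * adv, torch.clamp(ratio, 1 - eps, 1 + eps) * adv)
    # Unbiased per-sample KL estimate against the reference policy
    kl = torch.exp(logp_ref - logp_new) - (logp_ref - logp_new) - 1
    # Maximize the surrogate minus the KL penalty, averaged over the group
    return (surrogate - beta * kl).mean()
```

Because advantages are computed from the group's own reward statistics, GRPO needs no learned value function, which is the point of the group-relative formulation.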
Before running the code, ensure that the following dependencies are installed:
- PyTorch: The framework used for deep learning computations.
- BERT Base Cased: A pre-trained transformer model required for tokenization.
Follow the official PyTorch installation guide for your system configuration: [PyTorch Installation](https://pytorch.org/get-started/locally/)

```bash
pip install torch torchvision torchaudio
```
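To confirm the installation, you can check the version and GPU availability:

```python
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable
```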
To use the BERT tokenizer, install the transformers library:

```bash
pip install transformers
```

Then download and load the BERT Base Cased model:
```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
```
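As a quick check, the tokenizer can encode a sample sentence (the input text here is just an example):

```python
# Encode an example sentence into token IDs as a PyTorch tensor
encoded = tokenizer("Maximize the surrogate objective.", return_tensors="pt")
print(encoded["input_ids"])
```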
Run the main script to execute the optimization process:

```bash
python GRPO_Sim.py
```

If you use this repository in your work, please cite the corresponding paper:
Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. *DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models*. 2024. [arXiv:2402.03300](https://arxiv.org/abs/2402.03300)
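Or, in BibTeX form (the entry key is a suggestion):

```bibtex
@article{shao2024deepseekmath,
  title   = {DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models},
  author  = {Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Bi, Xiao and Zhang, Haowei and Zhang, Mingchuan and Li, Y. K. and Wu, Y. and Guo, Daya},
  journal = {arXiv preprint arXiv:2402.03300},
  year    = {2024}
}
```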
This project is licensed under the MIT License.