See project report
This repository contains the code and resources for the project "Evaluating the Robustness of rStar: A Novel Framework for Enhanced Reasoning in Small Language Models", conducted as part of the 02456 Deep Learning course at DTU Compute, Fall 2024. In the framework, reinforcement strategies are applied in two stages. First, a target small language model (SLM) augments Monte Carlo Tree Search (MCTS) with a rich set of human-like reasoning actions to construct higher-quality reasoning trajectories. Next, a second SLM with capabilities similar to the target SLM acts as a discriminator and verifies each trajectory the target SLM generated. Trajectories that both models agree on are considered mutually consistent and are therefore more likely to be correct.
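The agreement step described above can be illustrated with a minimal sketch. It is not the implementation in this repository: the `Trajectory` class, the `Discriminator` callable, the `prefix_fraction` parameter, and the idea of showing the discriminator only the first part of each trajectory are all assumptions made for illustration, and the toy discriminator stands in for a real second SLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    steps: List[str]  # intermediate reasoning steps from the target SLM
    answer: str       # final answer the trajectory arrives at


# Hypothetical interface: given a question and a partial list of steps,
# return the final answer the discriminator reaches on its own.
Discriminator = Callable[[str, List[str]], str]


def mutually_consistent(
    question: str,
    candidates: List[Trajectory],
    discriminator: Discriminator,
    prefix_fraction: float = 0.5,  # assumed split point, not from the source
) -> List[Trajectory]:
    """Keep trajectories whose answer the discriminator also reaches."""
    kept = []
    for traj in candidates:
        # Show the discriminator only the first part of the trajectory.
        cut = max(1, int(len(traj.steps) * prefix_fraction))
        completed = discriminator(question, traj.steps[:cut])
        if completed.strip() == traj.answer.strip():
            kept.append(traj)
    return kept


def toy_discriminator(question: str, partial_steps: List[str]) -> str:
    # Stand-in for a second SLM: always answers "8".
    return "8"


candidates = [
    Trajectory(steps=["3 + 5 = 8"], answer="8"),
    Trajectory(steps=["3 * 5 = 15"], answer="15"),
]
agreed = mutually_consistent("What is 3 + 5?", candidates, toy_discriminator)
```

In this toy run only the first candidate survives, because it is the only one whose answer matches what the stand-in discriminator returns.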
- Jone Steinhoff (s243867)
- Lukas Rasocha (s233498)
- Panagiota Emmanouilidi (s223531)
- Petr B. Nylander (s240466)
- Robert Spralja (s243658)
Prof. Ole Winther, DTU Compute, Technical University of Denmark
This project focuses on evaluating the robustness of the rStar framework, a reasoning system for small language models (SLMs). The evaluation uses variations of the GSM8K dataset, highlighting the framework’s strengths and limitations in handling diverse input modifications.
For more information please refer to our report.
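A robustness comparison of this kind can be summarized as accuracy per dataset variant and the drop relative to the unmodified data. The sketch below is only an illustration: the function names, the variant label "reworded", and the prediction/gold pairs are made-up placeholders, not results or code from this project.

```python
from typing import Dict, List, Tuple

# Each entry is a (predicted answer, gold answer) pair.
Pairs = List[Tuple[str, str]]


def accuracy(pairs: Pairs) -> float:
    """Fraction of pairs whose predicted answer exactly matches the gold one."""
    if not pairs:
        return 0.0
    correct = sum(1 for pred, gold in pairs if pred.strip() == gold.strip())
    return correct / len(pairs)


def robustness_report(
    results: Dict[str, Pairs], baseline: str = "original"
) -> Dict[str, Dict[str, float]]:
    """Accuracy per variant and its drop relative to the baseline variant."""
    base = accuracy(results[baseline])
    report = {}
    for name, pairs in results.items():
        acc = accuracy(pairs)
        report[name] = {"accuracy": acc, "drop": base - acc}
    return report


# Placeholder data for illustration only (not measured results).
toy_results = {
    "original": [("18", "18"), ("3", "3"), ("70", "70"), ("5", "6")],
    "reworded": [("18", "18"), ("4", "3"), ("70", "70"), ("5", "6")],
}
report = robustness_report(toy_results)
```

A positive `drop` means the framework answered fewer questions correctly on the modified variant than on the baseline.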
Prerequisites
- Python 3.10 or later
- CUDA-enabled GPU (e.g., NVIDIA Tesla A100)
- Libraries listed in requirements.txt
- Clone the repository:
    git clone https://github.com/lukyrasocha/rStar.git
    cd rStar
- Install dependencies:
    pip install -r requirements.txt
Run and inspect the main.ipynb notebook to see how results can be reproduced.