This is the official repository for two papers:
- original paper at EMNLP 2023: Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems
- follow-up paper at EMNLP 2024: Self-training Language Models for Arithmetic Reasoning
You can access the datasets and the trained models on HuggingFace:
This repo contains:
- dataset builders
- training, self-training, inference, and evaluation scripts
- inference wrappers for models that can call a function during inference
First, clone the repo. Then run:
conda create -n gadgets python=3.10 && conda activate gadgets
pip install poetry
poetry install

This installs all dependencies in the exact same versions used by the authors of the repo. In case you encounter any issues on your hardware (e.g., with the CUDA version, platform, etc.), you can resolve the dependencies yourself:
# with plain pip:
pip install -e .[dev]
# OR with poetry:
poetry lock && poetry install

The inference script examples/predict_calc.py can load a dataset, generate predictions, and save them to a local jsonl file, such as predictions.jsonl. This can be used for two things:
- evaluating the accuracy of a trained model - you can compute the metrics (and confidence intervals) using python examples/test_calc.py predictions.jsonl
- generating predictions for offline self-training data (experiment 3.1 in our Self-training paper)
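If you want to inspect a predictions file before evaluating or reusing it, every line of the jsonl is one JSON record. A minimal sketch for loading it (the exact keys in each record are produced by examples/predict_calc.py, so check them rather than assuming a schema):

import json

# predictions.jsonl holds one JSON record per line; the available keys depend on
# examples/predict_calc.py, so print the first record to see what is there.
with open("predictions.jsonl") as f:
    predictions = [json.loads(line) for line in f]

print(len(predictions))
print(predictions[0])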
The training script for supervised learning is examples/train_calc.py; run it with --help to learn about the parameters. This script has been used to train the Calcformer models on HuggingFace.
Note that this script can also be used for offline self-training: it can train a model on solutions generated by the model itself. This was used for the SFT offline self-training in experiment 3.1 of the Self-training paper. For the preference optimization runs in the same experiment, there is a separate training script, examples/train_calc_dpo.py.
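As a rough illustration of how offline self-training data can be prepared, the sketch below keeps only self-generated solutions whose final answer matches the reference before they are passed to the training script. The keys result and correct_result are hypothetical placeholders, not the actual schema of the predictions file:

import json

# Hypothetical filtering step for offline self-training (experiment 3.1):
# keep only self-generated solutions whose final answer matches the reference.
# The keys "result" and "correct_result" are illustrative placeholders.
with open("predictions.jsonl") as f_in, open("selftrain_data.jsonl", "w") as f_out:
    for line in f_in:
        record = json.loads(line)
        if record.get("result") == record.get("correct_result"):
            f_out.write(json.dumps(record) + "\n")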
Online self-training (with either supervised loss or preference optimization) can be run with python examples/selftrain_calc.py. For example, our online self-training run with KTO and beta 0.1 (experiment 3.2 in the Self-training paper) can be reproduced with:
python examples/selftrain_calc.py --mode dpo --dpo-loss-type kto_pair --prefs-beta 0.1 --batch-size 1 --grad-accum 32

There are plenty of parameters to play around with; run with --help to find out more.
We patch the generate() method so that it can call a given set of tools (functions) during generation. You will need to wrap a model of your choice and make sure that the tokenizer can encode the HTML-like tags used for calling the tools.
To run our pretrained models, use the following:
from transformers import T5ForConditionalGeneration, T5Tokenizer
from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator
GadgetAssistedT5 = gadget_assisted_model(T5ForConditionalGeneration)
model = GadgetAssistedT5.from_pretrained("MU-NLPC/calcformer-flan-xl")
tokenizer = T5Tokenizer.from_pretrained("MU-NLPC/calcformer-flan-xl")
# register the tools the model is allowed to call during generation
model.prepare_for_generate(tokenizer,
                           enabled_gadgets=[Calculator()],
                           default_max_tokens=512)
query = """
The profit from a business transaction is shared among 2 business partners,
Mike and Johnson in the ratio 2:5 respectively.
If Johnson got $2500, how much will Mike have
after spending some of his share on a shirt that costs $200?
"""
inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
# This returns:
# 'According to the ratio, Mike got 2/5*$2500 = $<gadget id="calculator">2/5*2500</gadget><output>1_000</output> 1000
# Mike will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800 after buying a shirt.
# Final result is<result>800</result></s>'
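To extract the final answer from the generated string, you can parse the <result> tag shown in the output above. A minimal sketch using a plain regular expression (the repository may expose its own helper for this, so treat the snippet as an illustration):

import re

decoded = tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
# the final answer is wrapped in a <result>...</result> tag, as in the output above
match = re.search(r"<result>(.*?)</result>", decoded)
final_result = match.group(1) if match else None
print(final_result)  # '800'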
If you use a decoder-only model, pass the architecture parameter into model.generate as follows:

output_ids = model.generate(**inputs, architecture='decoder-only')

If you find the Calc-X collection or Calcformers useful in your research, please cite the Calc-X and Calcformers paper as follows:
@inproceedings{kadlcik-etal-2023-calcx,
title = "Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems",
author = "Marek Kadlčík and Michal Štefánik and Ondřej Sotolář and Vlastimil Martinek",
booktitle = "Proceedings of the The 2023 Conference on Empirical Methods in Natural Language Processing: Main track",
month = dec,
year = "2023",
address = "Singapore, Singapore",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2305.15017",
}

If you use the self-training methods, please cite our Self-training paper:
@inproceedings{kadlcik-etal-2024-selftraining,
title = "Self-training Language Models for Arithmetic Reasoning",
author = "Marek Kadlčík and Michal Štefánik",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
month = dec,
year = "2024",
address = "Miami, Florida",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2407.08400",
}