Calc-X, Calcformers, and Self-training

This is the official repository for two papers:

  • Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems (EMNLP 2023)
  • Self-training Language Models for Arithmetic Reasoning (Findings of EMNLP 2024)

The datasets and trained models are available on HuggingFace under the MU-NLPC namespace.

This repo contains:

  • dataset builders
  • training, self-training, inference, and evaluation scripts
  • inference wrappers for models that can call a function during inference

Creating the environment

First, clone the repo. Then run:

conda create -n gadgets python=3.10 && conda activate gadgets
pip install poetry
poetry install 

This installs all dependencies in the exact versions used by the authors of the repo. If you encounter any issues on your hardware (e.g., with the CUDA version or platform), you can resolve the dependencies yourself:

# with plain pip:
pip install -e .[dev]
# OR with poetry:
poetry lock && poetry install
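
After installation, a quick sanity check is to import the gadgets package; the module paths here are the same ones used in the usage example further down:

# Sanity check: these are the imports used in the usage example below.
from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator

print("gadgets imported successfully")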

Inference and evaluation script

The inference script examples/predict_calc.py can load a dataset, generate predictions, and save them to a local jsonl file, such as predictions.jsonl. This can be used for two things:

  1. evaluating the accuracy of a trained model - you can compute the metrics (and confidence intervals) using python examples/test_calc.py predictions.jsonl
  2. generating predictions for offline self-training data (experiment 3.1 in our Self-training paper)

Training scripts

The training script for supervised learning is examples/train_calc.py; run it with --help to see the available parameters. This script was used to train the Calcformer models on HuggingFace.

Note that this script can also be used for offline self-training: it can train a model on solutions generated by the model itself. This was used for SFT offline self-training in experiment 3.1 of the Self-training paper. For the preference optimization runs in the same experiment, there is a separate training script, examples/train_calc_dpo.py.
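
As a concrete sketch of the offline loop: generate predictions with examples/predict_calc.py, keep the self-generated solutions you want to train on (for example, only those whose final result matches the reference answer), and pass the filtered file to examples/train_calc.py. The snippet below illustrates the filtering step; the record keys "result" and "true_result" are hypothetical placeholders, so check the actual schema of the file written by examples/predict_calc.py:

import json

# Filter self-generated solutions, e.g. keeping only those whose final
# result matches the reference answer.
# NOTE: "result" and "true_result" are illustrative placeholder keys;
# inspect predictions.jsonl for the real field names.
with open("predictions.jsonl") as f:
    records = [json.loads(line) for line in f]

correct = [r for r in records if r["result"] == r["true_result"]]

with open("selftrain_data.jsonl", "w") as f:
    for r in correct:
        f.write(json.dumps(r) + "\n")

print(f"kept {len(correct)} of {len(records)} self-generated solutions")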

Online self-training script

Online self-training (with either a supervised loss or preference optimization) can be run with python examples/selftrain_calc.py. For example, our online self-training run with KTO and β = 0.1 (experiment 3.2 in the Self-training paper) can be reproduced with:

python examples/selftrain_calc.py --mode dpo --dpo-loss-type kto_pair --prefs-beta 0.1 --batch-size 1 --grad-accum 32

There are plenty of parameters to play around with; run the script with --help to see them all.

Using trained models in code

We patch the generate() method so that a given set of tools (functions) can be called during generation. You will need to wrap a model of your choice and make sure that the tokenizer can encode the HTML-like tags used for calling the tools.

To run our pretrained models, use the following:

from transformers import T5ForConditionalGeneration, T5Tokenizer

from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator

GadgetAssistedT5 = gadget_assisted_model(T5ForConditionalGeneration)

model = GadgetAssistedT5.from_pretrained("MU-NLPC/calcformer-flan-xl")
tokenizer = T5Tokenizer.from_pretrained("MU-NLPC/calcformer-flan-xl")

model.prepare_for_generate(tokenizer, 
                           enabled_gadgets=[Calculator()], 
                           default_max_tokens=512)
query = """
    The profit from a business transaction is shared among 2 business partners, 
    Mike and Johnson in the ratio 2:5 respectively. 
    If Johnson got $2500, how much will Mike have 
    after spending some of his share on a shirt that costs $200?
"""

inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)

# This returns:
# 'According to the ratio, Mike got 2/5*$2500 = $<gadget id="calculator">2/5*2500</gadget><output>1_000</output> 1000 
#  Mike will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800 after buying a shirt. 
#  Final result is<result>800</result></s>'
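
Since the final answer is wrapped in <result> tags (see the decoded output above), you can extract just the number with a simple regular expression; this is a minimal sketch, and the repo may provide its own parsing utilities:

import re

decoded = tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)

# The model wraps its final answer in <result>...</result> tags.
match = re.search(r"<result>(.*?)</result>", decoded)
print(match.group(1) if match else None)  # e.g. "800"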

If you use a decoder-only model, pass the architecture parameter into model.generate as follows:

output_ids = model.generate(**inputs, architecture='decoder-only')

Cite

If you find the Calc-X collection or Calcformers useful in your research, please cite the Calc-X and Calcformers paper as follows:

@inproceedings{kadlcik-etal-2023-calcx,
    title = "Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems",
    author = "Marek Kadlčík and Michal Štefánik and Ondřej Sotolář and Vlastimil Martinek",
    booktitle = "Proceedings of the The 2023 Conference on Empirical Methods in Natural Language Processing: Main track",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2305.15017",
}

If you build on the self-training work, please cite our Self-training paper:

@inproceedings{kadlcik-etal-2024-selftraining,
    title = "Self-training Language Models for Arithmetic Reasoning",
    author = "Marek Kadlčík and Michal Štefánik",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = dec,
    year = "2024",
    address = "Miami, Florida",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2407.08400",
}
