Calc-X, Calcformers, and Self-training

This is the official repository for two papers:

  • Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems (EMNLP 2023)
  • Self-training Language Models for Arithmetic Reasoning (Findings of EMNLP 2024)

The datasets and trained models are available on HuggingFace under the MU-NLPC namespace.

This repo contains:

  • dataset builders
  • training, self-training, inference, and evaluation scripts
  • inference wrappers for models that can call a function during inference

Creating the environment

First, clone the repo. Then run:

conda create -n gadgets python=3.10 && conda activate gadgets
pip install poetry
poetry install 

This installs all dependencies in the exact versions used by the authors of the repo. If you encounter any issues on your hardware (e.g., with the CUDA version or platform), you can resolve the dependencies yourself:

# with plain pip:
pip install -e .[dev]
# OR with poetry:
poetry lock && poetry install
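
After installation, a quick sanity check is to import the gadgets package; the module paths here are the same ones used in the usage example further down:

# Sanity check: these are the imports used in the usage example below.
from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator

print("gadgets imported successfully")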

Inference and evaluation script

The inference script examples/predict_calc.py can load a dataset, generate predictions, and save them to a local jsonl file, such as predictions.jsonl. This can be used for two things:

  1. evaluating the accuracy of a trained model - you can compute the metrics (and confidence intervals) using python examples/test_calc.py predictions.jsonl
  2. generating predictions for offline self-training data (experiment 3.1 in our Self-training paper)

Training scripts

The training script for supervised learning is examples/train_calc.py; run it with --help to see the available parameters. This script was used to train the Calcformer models on HuggingFace.

Note that this script can also be used for offline self-training: it can train a model on solutions generated by the model itself. This was used for SFT offline self-training in experiment 3.1 of the Self-training paper. For the preference optimization runs in the same experiment, there is a separate training script, examples/train_calc_dpo.py.
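
As a concrete sketch of the offline loop: generate predictions with examples/predict_calc.py, keep the self-generated solutions you want to train on (for example, only those whose final result matches the reference answer), and pass the filtered file to examples/train_calc.py. The snippet below illustrates the filtering step; the record keys "result" and "true_result" are hypothetical placeholders, so check the actual schema of the file written by examples/predict_calc.py:

import json

# Filter self-generated solutions, e.g. keeping only those whose final
# result matches the reference answer.
# NOTE: "result" and "true_result" are illustrative placeholder keys;
# inspect predictions.jsonl for the real field names.
with open("predictions.jsonl") as f:
    records = [json.loads(line) for line in f]

correct = [r for r in records if r["result"] == r["true_result"]]

with open("selftrain_data.jsonl", "w") as f:
    for r in correct:
        f.write(json.dumps(r) + "\n")

print(f"kept {len(correct)} of {len(records)} self-generated solutions")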

Online self-training script

Online self-training (with either a supervised loss or preference optimization) can be run with python examples/selftrain_calc.py. For example, our online self-training run with KTO and β = 0.1 (experiment 3.2 in the Self-training paper) can be reproduced with:

python examples/selftrain_calc.py --mode dpo --dpo-loss-type kto_pair --prefs-beta 0.1 --batch-size 1 --grad-accum 32

There are plenty of parameters to play around with; run the script with --help to see them all.

Using trained models in code

We patch the generate() method so that a given set of tools (functions) can be called during generation. You will need to wrap a model of your choice and make sure that the tokenizer can encode the HTML-like tags used for calling the tools.

To run our pretrained models, use the following:

from transformers import T5ForConditionalGeneration, T5Tokenizer

from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator

GadgetAssistedT5 = gadget_assisted_model(T5ForConditionalGeneration)

model = GadgetAssistedT5.from_pretrained("MU-NLPC/calcformer-flan-xl")
tokenizer = T5Tokenizer.from_pretrained("MU-NLPC/calcformer-flan-xl")

model.prepare_for_generate(tokenizer, 
                           enabled_gadgets=[Calculator()], 
                           default_max_tokens=512)
query = """
    The profit from a business transaction is shared among 2 business partners, 
    Mike and Johnson in the ratio 2:5 respectively. 
    If Johnson got $2500, how much will Mike have 
    after spending some of his share on a shirt that costs $200?
"""

inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)

# This returns:
# 'According to the ratio, Mike got 2/5*$2500 = $<gadget id="calculator">2/5*2500</gadget><output>1_000</output> 1000 
#  Mike will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800 after buying a shirt. 
#  Final result is<result>800</result></s>'
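
Since the final answer is wrapped in <result> tags (see the decoded output above), you can extract just the number with a simple regular expression; this is a minimal sketch, and the repo may provide its own parsing utilities:

import re

decoded = tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)

# The model wraps its final answer in <result>...</result> tags.
match = re.search(r"<result>(.*?)</result>", decoded)
print(match.group(1) if match else None)  # e.g. "800"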

If you use a decoder-only model, pass the architecture parameter into model.generate as follows:

output_ids = model.generate(**inputs, architecture='decoder-only')

Cite

If you find the Calc-X collection or Calcformers useful in your research, please cite the Calc-X and Calcformers paper as follows:

@inproceedings{kadlcik-etal-2023-calcx,
    title = "Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems",
    author = "Marek Kadlčík and Michal Štefánik and Ondřej Sotolář and Vlastimil Martinek",
    booktitle = "Proceedings of the The 2023 Conference on Empirical Methods in Natural Language Processing: Main track",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2305.15017",
}

If you build on the self-training work, please cite our Self-training paper:

@inproceedings{kadlcik-etal-2024-selftraining,
    title = "Self-training Language Models for Arithmetic Reasoning",
    author = "Marek Kadlčík and Michal Štefánik",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = dec,
    year = "2024",
    address = "Miami, Florida",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2407.08400",
}
