This repository contains the materials associated with the paper:
Alan Ramponi, Gaudenzia Genoni, and Sara Tonelli. 2025. ARG2ST at CQs-Gen 2025: Critical Questions Generation through LLMs and Usefulness-based Selection. In Proceedings of the 12th Workshop on Argument Mining (ArgMining 2025), Vienna, Austria. Association for Computational Linguistics. [cite] [paper]
Clone this repository to a path of your choice:
git clone https://github.com/dhfbk/cqs-gen.git
Create an environment with your preferred package manager. We used Python 3.9 and the dependencies listed in requirements.txt. If you use conda, you can run the following commands from the root of the project:
conda create --name cqs-gen python=3.9 # create the environment
conda activate cqs-gen # activate the environment
pip install --user -r requirements.txt # install the required packages
We use the CQs-Gen dataset obtained from the CQs-Gen shared task repository. Data is in the data/ folder:
- `validation_all.json`: the development data split (matching the original `validation.json` file).
- `validation.json`: the development data split (with the interventions used in few-shot prompts removed, for a fair comparison with the zero-shot setting).
- `test.json`: the test data split.
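A quick way to inspect a data split (a minimal sketch; it only assumes the files are standard JSON and does not rely on any particular field names):

```python
import json

# Load the development split and report its size and top-level fields
with open("data/validation.json", encoding="utf-8") as f:
    data = json.load(f)

entries = list(data.values()) if isinstance(data, dict) else data
print(f"{len(entries)} interventions")
print("Fields of the first entry:", list(entries[0].keys()))
```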
In the generation phase, an LLM is prompted to produce a raw output containing $N candidate CQs (i.e., either 3 or 5) for a given argumentative text. First, define the parameters in pred_eval_$N.sh (i.e., the model(s), the prompt(s), the zero-/few-shot setting(s), and the seed(s)), then run the following:
sh pred_eval_$N.sh
where $N represents the number of CQs to generate (i.e., either 3 or 5). The outputs will be created in the results/ folder. Specifically, the following files (with prefix corresponding to the defined parameters) will be created:
- `*.log` file: the LLM's raw output (with logs and the associated prompt).
- `*.json` file: the postprocessed output in .json format (i.e., the predicted CQs extracted from the raw output and associated with the relevant interventions).
- `*_results-similarity-06.json` file: the postprocessed output in .json format, with labels determined using the official shared task evaluation script.
- `*_results-similarity-06.txt` file: the quantitative results in terms of the overall punctuation score as well as the label and punctuation distributions.
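As a rough illustration of what the postprocessing behind the `*.json` files does, the sketch below extracts numbered questions from a raw completion; the actual extraction logic (patterns, edge cases) lives in the repository's scripts and may differ:

```python
import re

# Toy raw output; real completions come from the prompted LLM
raw_output = """Here are the critical questions:
1. What evidence supports the speaker's main premise?
2. Are there alternative explanations for the observed outcome?
3. Does the conclusion still follow if the premise is only partially true?"""

# Grab each numbered line as one candidate CQ
candidate_cqs = re.findall(r"^\s*\d+[.)]\s*(.+?)\s*$", raw_output, flags=re.MULTILINE)
print(candidate_cqs)
```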
The CQ selection phase leverages a pretrained model fine-tuned on a dataset of Useful and Not useful CQs (i.e., unhelpful and invalid merged together; see src/machamp/data/ for the data flavors used for training and the paper for details on how we assemble them). The fine-tuned model is a binary classifier that provides a confidence score for each predicted label. We take the confidence score for the Useful label, rank the candidate CQs by decreasing "usefulness", and then select the top-k (k=3) CQs as the final output.
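For illustration, a minimal sketch of this usefulness-based ranking and top-k selection, assuming each candidate CQ already has a classifier confidence for the Useful label (the actual implementation is in src/filtering.py and the prediction scripts):

```python
def select_top_k(candidates, k=3):
    """Rank (cq_text, useful_confidence) pairs by decreasing confidence and keep the top k."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [cq for cq, _ in ranked[:k]]

# Hypothetical candidates generated for one intervention
candidates = [
    ("Is the source of this claim reliable?", 0.91),
    ("Could there be exceptions to this generalization?", 0.63),
    ("What evidence supports the main premise?", 0.87),
    ("Is this always true?", 0.48),
    ("Why?", 0.22),
]
print(select_top_k(candidates))  # the three most "useful" CQs, highest first
```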
To fine-tune the model, first define the parameters in src/scripts/train.sh (i.e., the model(s) and the data setting(s)), then run the following:
sh src/scripts/train.sh
The fine-tuned model will be created at logs/$MODEL_NAME/$DATETIME/model.pt, where $MODEL_NAME is a string corresponding to the defined parameters and $DATETIME is the datetime of the training run.
To predict the usefulness of candidate CQs using the fine-tuned model, first convert the .json file $JSON_FILE obtained in the Generation of critical questions step to .tsv:
python src/scripts/json-to-tsv.py --input_filepath $JSON_FILE
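For reference, the conversion essentially flattens the predicted CQs so that the classifier can score each one individually. The sketch below conveys the idea only: the exact column layout and field names (e.g., a per-intervention `cqs` list) are assumptions here and are defined for real by src/scripts/json-to-tsv.py:

```python
import csv
import json

# Illustrative flattening: one candidate CQ per .tsv row.
# Field names ("cqs", "cq") are assumptions; see src/scripts/json-to-tsv.py
# for the actual format expected by the classifier.
json_file = "results/example_predictions.json"  # hypothetical path
with open(json_file, encoding="utf-8") as f:
    predictions = json.load(f)

with open(json_file.replace(".json", ".tsv"), "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out, delimiter="\t")
    for intervention_id, entry in predictions.items():
        for cq in entry["cqs"]:
            writer.writerow([intervention_id, cq["cq"]])
```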
Then define the parameters in src/scripts/predict.py (i.e., the filepath of the resulting .tsv file(s), the model name(s) and datetime(s) to be used) and run the following:
python src/scripts/predict.py
You will find the predictions in logs/$MODEL_NAME/$DATETIME/CQfilter.out.
Now run the selector of the top-k (k=3) CQs:
python src/filtering.py --input_filepath $JSON_FILE --strategy model
Finally, run the evaluation script:
sh eval_3.sh
The outputs will be created in the results/ folder.
If you use or build on top of this work, please cite our paper as follows:
@inproceedings{ramponi-etal-2025-arg2st,
title = "ARG2ST at CQs-Gen 2025: Critical Questions Generation through LLMs and Usefulness-based Selection",
author = "Ramponi, Alan and
Genoni, Gaudenzia and
Tonelli, Sara",
booktitle = "Proceedings of the 12th Workshop on Argument Mining (ArgMining 2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics"
}