This repository contains the code for a semester project on Multilingual Counterspeech Generation, conducted as part of a university course and inspired by the Shared Task on Multilingual Counterspeech Generation (MCG-COLING 2025).
The project investigates how large language models fine-tuned with parameter-efficient methods perform across high- and low-resource languages, with a focus on multilingual generalization and training dynamics.
- Task: Generate respectful and factual counter-narratives in response to hate speech
- Languages: English, Italian, Spanish, Basque
- Model: LLaMA-2-7B
- Fine-tuning: LoRA + 4-bit quantization (NF4)
- Evaluation: BLEU, ROUGE-L, BERTScore
- Main training and evaluation script: `Counterspeech.py`
- Knowledge selection with GPT-4o: `kwnoledge_selection_gpt4o.py` (a sketch of this step follows the list)
- Datasets used for the knowledge selection step: `train_selected.csv`, `validation_selected.csv`
- Project documentation: README.md
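The knowledge selection step asks GPT-4o to identify which background sentences best support a counter-narrative. Below is a minimal sketch of what such a call could look like with the OpenAI Python SDK; the prompt wording, the `pick_knowledge` helper, and the reply format are illustrative assumptions, not the exact logic in `kwnoledge_selection_gpt4o.py`.

```python
# Hypothetical sketch of the GPT-4o knowledge selection call.
# Prompt, helper name, and reply format are assumptions, not the exact
# logic used in kwnoledge_selection_gpt4o.py.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def pick_knowledge(hate_speech: str, knowledge: list[str]) -> str:
    """Ask GPT-4o which background sentences best support a counter-narrative."""
    numbered = "\n".join(f"{i + 1}. {k}" for i, k in enumerate(knowledge))
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # deterministic selection
        messages=[
            {"role": "system",
             "content": "Select the background sentences most useful for writing "
                        "a factual counter-narrative. Answer with their numbers only."},
            {"role": "user",
             "content": f"Hate speech: {hate_speech}\n\nBackground sentences:\n{numbered}"},
        ],
    )
    return response.choices[0].message.content
```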
This project uses the dataset released for the MCG-COLING Shared Task, consisting of:
- 596 Hate Speech – Counter Narrative pairs per language
- 5 background knowledge sentences per instance
- Splits:
  - Train: 396
  - Development: 100
  - Test: 100
Dataset source:
- Hugging Face: `LanD-FBK/ML_MTCONAN_KN`
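A minimal sketch of loading the dataset with the Hugging Face `datasets` library; the per-language configuration name ("EN") is an assumption, so check the dataset card for the exact identifiers.

```python
from datasets import load_dataset

# "EN" as the configuration name is an assumption; the dataset card for
# LanD-FBK/ML_MTCONAN_KN lists the exact per-language configs.
dataset = load_dataset("LanD-FBK/ML_MTCONAN_KN", "EN")

# Expected split sizes per the shared task: 396 train, 100 dev, 100 test.
print(dataset)
print(dataset["train"][0])
```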
- Base model: `meta-llama/Llama-2-7b-hf`
- Objective: Causal Language Modeling
- LoRA rank: 8
- LoRA alpha: 64
- Dropout: 0.05
- Quantization: 4-bit NF4 via `bitsandbytes`
- Frameworks: Hugging Face Transformers, PEFT
The model is fine-tuned independently for each language with identical hyperparameters to enable a fair cross-lingual comparison.
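A sketch of how this setup can be assembled with Transformers and PEFT; the hyperparameters match the list above, while `target_modules` and the compute dtype are assumptions not stated in this README.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization as described above; compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Cast norms / enable gradient checkpointing hooks for k-bit training.
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters from the list above; target_modules is an assumption.
lora_config = LoraConfig(
    r=8,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```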
Model performance is evaluated using the following automatic metrics:
- BLEU
- ROUGE-L
- BERTScore
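These metrics can be computed with the `evaluate` library; the predictions, references, and the `lang` code below are illustrative placeholders.

```python
import evaluate

# Illustrative placeholders; in the project these come from model generations
# and the gold counter-narratives of the test split.
predictions = ["Refugees contribute to the economy and pay taxes."]
references = ["Studies show refugees contribute positively to host economies."]

bleu = evaluate.load("bleu").compute(
    predictions=predictions, references=[[r] for r in references]
)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"  # adjust per language
)

print(f"BLEU: {bleu['bleu']:.3f}")
print(f"ROUGE-L: {rouge['rougeL']:.3f}")
print(f"BERTScore F1: {sum(bertscore['f1']) / len(bertscore['f1']):.3f}")
```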
Training dynamics are monitored using TensorBoard to analyze convergence behavior across languages.
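Enabling TensorBoard logging is a one-line setting in the Hugging Face Trainer configuration; the directory layout and training values below are illustrative, not the project's actual settings.

```python
from transformers import TrainingArguments

# Illustrative values; only report_to / logging_dir are needed for TensorBoard.
training_args = TrainingArguments(
    output_dir="outputs/en",        # assumed layout: one run directory per language
    logging_dir="runs/en",          # inspect with: tensorboard --logdir runs
    report_to="tensorboard",
    logging_steps=10,
    num_train_epochs=3,             # placeholder, not the project's actual value
    per_device_train_batch_size=4,  # placeholder
)
```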
Install the dependencies:

```bash
pip install torch transformers datasets peft bitsandbytes accelerate evaluate bert_score rouge_score
```

Then open and run the Python file: `Counterspeech.py`

A GPU with sufficient VRAM is strongly recommended.