g8a9/mgente-gap

Code associated with the paper "Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE".

Getting Started

To replicate our experiments, we recommend working in a fresh, isolated Python environment. Once the environment is created, run:

pip install -r requirements.txt

Codebase Organization and Use

The codebase lets you run the four main experimental components of the paper. Each script needs minimal changes to adapt to your setup, such as setting the correct input and output directories.

Important

We used a SLURM-based HPC cluster to run our experiments. Some bash scripts and parts of the codebase organization assume a similar setup and need minor changes to run on a standard workstation. If anything is unclear, please open an issue on this repository.

1. Translating mGeNTE

Use the script bash/translate_runs.sh to translate mGeNTE across all models and languages with the correct configurations. Input parameters for each run are in the file config/translate_runs.sh. The script launches one translation run per SLURM job using job arrays.
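
The one-run-per-array-task pattern can be sketched as follows. This is a minimal illustration in Python, not the repository's bash script: the model and language names, and the helpers `all_runs` and `select_run`, are hypothetical placeholders. The only SLURM-specific piece is the `SLURM_ARRAY_TASK_ID` environment variable, which SLURM sets inside each task of a job array.

```python
import os
from itertools import product

# Hypothetical grid: these names are placeholders, not the models or
# languages used in the paper.
MODELS = ["model-a", "model-b"]
LANGUAGES = ["lang-1", "lang-2", "lang-3"]


def all_runs(models, languages):
    """Enumerate every (model, language) pair in a fixed, reproducible order."""
    return list(product(models, languages))


def select_run(task_id, models=MODELS, languages=LANGUAGES):
    """Return the (model, language) pair handled by one array task."""
    runs = all_runs(models, languages)
    if not 0 <= task_id < len(runs):
        raise IndexError(f"task id {task_id} outside 0..{len(runs) - 1}")
    return runs[task_id]


if __name__ == "__main__":
    # SLURM sets SLURM_ARRAY_TASK_ID inside each array task. Defaulting to 0
    # lets the same script run on a workstation without SLURM.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    model, language = select_run(task_id)
    print(f"task {task_id}: translate with {model} into {language}")
```

With a grid of six runs, such a script would be submitted with an array range of 0-5 (for example `sbatch --array=0-5`). On a workstation, a plain loop that sets `SLURM_ARRAY_TASK_ID` to each index in turn would cover the same runs sequentially.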

2. Gender Neutral Evaluation of Translations

Once translations are generated, you can assign neutrality labels using the code in src/gnt_eval. Please refer to the README.md in that folder for details.

3. Computing Attributions using AttnLRP

Use the script bash/attribute_attnlrp.sh to compute fine-grained token attributions.

Tip

We have released all the attributions we computed, so you don't have to. You can find them at this link.

4. Analyzing the Attribution Scores

Use the Jupyter Notebook notebooks/analize_attnlrp.ipynb to analyze, aggregate, and postprocess the raw attribution scores computed in the previous step. You may want to run this notebook to compute an intermediate representation with statistics on which part of the context was most frequently used for each translation example.
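
To make the idea of "which part of the context was most used" concrete, here is a small sketch of one possible aggregation. It is not the notebook's code. It assumes each example provides one attribution score per input token and a label saying which context part each token belongs to. The part names (`instruction`, `demonstrations`, `source`), the helper names, and the choice to sum absolute scores are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def part_totals(scores, parts):
    """Sum absolute attribution per context part for one example."""
    if len(scores) != len(parts):
        raise ValueError("scores and parts must have the same length")
    totals = defaultdict(float)
    for score, part in zip(scores, parts):
        totals[part] += abs(score)
    return dict(totals)


def top_part(scores, parts):
    """Return the context part with the largest total absolute attribution."""
    totals = part_totals(scores, parts)
    return max(totals, key=totals.get)


def top_part_counts(examples):
    """Count how often each part is the top-attributed one across examples."""
    return Counter(top_part(scores, parts) for scores, parts in examples)


# Toy data with hypothetical part labels; the scores are made up.
examples = [
    ([0.1, -0.7, 0.2], ["instruction", "source", "source"]),
    ([0.5, 0.1, 0.1], ["instruction", "source", "source"]),
    ([0.0, 0.3, 0.4], ["instruction", "demonstrations", "source"]),
]
print(top_part_counts(examples))
```

Summing absolute values treats positive and negative attributions as equally "used"; summing signed scores or normalizing by part length are equally plausible choices and would give different statistics.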

Citation

If you use any of the materials related to the paper, please cite:

@misc{savoldi2025mindinclusivitygapmultilingual,
      title={Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE}, 
      author={Beatrice Savoldi and Giuseppe Attanasio and Eleonora Cupin and Eleni Gkovedarou and Janiça Hackenbuchner and Anne Lauscher and Matteo Negri and Andrea Piergentili and Manjinder Thind and Luisa Bentivogli},
      year={2025},
      eprint={2501.09409},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.09409}, 
}

