ACL Anthology

Code for the paper: Instituto de Telecomunicações at IWSLT 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning.

Note

We are working on releasing an improved training and inference codebase. For now, this repository contains only the model implementation, training code, and configuration files for our IWSLT 2025 submission.

Project Structure

The list below describes the main folders and scripts in the repository. We are not releasing the scripts used to download and prepare the datasets locally; feel free to reach out if you want to replicate our exact setup.

  • bash/: contains a bash script to schedule a training run using SLURM.
  • config/: the bash runner requires two yaml configuration files, one controlling distributed training via Hugging Face's accelerate and one setting the training hyperparameters. Both files can be found here.
  • src/: contains the training utilities and scripts, as well as the modeling code (folder: speechlm). The model is implemented using Hugging Face transformers.
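The workflow above (a SLURM batch script that launches training through accelerate with the two yaml configs) can be sketched as follows. This is an illustrative sketch only: the file names `accelerate.yaml`, `training.yaml`, and `src/train.py`, as well as the SBATCH resource values, are hypothetical placeholders, not the repository's actual paths.

```shell
# Hypothetical sketch of the SLURM submission script.
# Writes a batch file that pairs the two configs described above:
#   - config/accelerate.yaml  (distributed training settings; assumed name)
#   - config/training.yaml    (training hyperparameters; assumed name)
cat > train.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=speechlm-train
#SBATCH --gres=gpu:4
#SBATCH --time=48:00:00

# Launch the training entry point (src/train.py is an assumed name)
# under accelerate's distributed runner.
accelerate launch \
    --config_file config/accelerate.yaml \
    src/train.py --config config/training.yaml
EOF

echo "wrote train.sbatch"
```

The script would then be submitted with `sbatch train.sbatch`; accelerate reads the distributed setup (number of processes, mixed precision, etc.) from its yaml file, while the training script consumes the second config.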

Point of Contact

For inquiries, feel free to open an issue on this repository.

Citation

@inproceedings{attanasio-etal-2025-instituto,
    title = "Instituto de Telecomunica{\c{c}}{\~o}es at {IWSLT} 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning",
    author = "Attanasio, Giuseppe  and
      Sannigrahi, Sonal  and
      Peters, Ben  and
      Filipe Torres Martins, Andr{\'e}",
    editor = "Salesky, Elizabeth  and
      Federico, Marcello  and
      Anastasopoulos, Antonis",
    booktitle = "Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria (in-person and online)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.iwslt-1.36/",
    doi = "10.18653/v1/2025.iwslt-1.36",
    pages = "347--353",
    ISBN = "979-8-89176-272-5"
}

Acknowledgments

This work was supported by the Portuguese Recovery and Resilience Plan through project C645008882-00000055 (Center for Responsible AI), by EU’s Horizon Europe Research and Innovation Actions (UTTER, contract 101070631), by the project DECOLLAGE (ERC-2022-CoG 101088763), and by FCT/MECI through national funds and EU funds under UID/50008: Instituto de Telecomunicações.
