Vicomtech/adaptia-mt

ADAPTIA-MT

ADAPTIA-MT is a suite of resources for the adaptation and evaluation of Machine Translation (MT) systems in industrial domains. The suite includes specialised terminology and parallel validation corpora for Basque-Spanish translation, manually crafted and validated for four sectors: automotive, energy, railways and machine tools.

The corpus was created as part of the ADAPT-IA project (KK-2023/00035), funded by the Department of Economic Development and Competitiveness of the Basque Government (Spri Group) under the Elkartek I programme (2023-2024).

Format and usage

The suite includes two main Basque-Spanish datasets, in two separate folders:

  • ADAPTIA-MT-TERM: four termbases in TBX format, one for each industrial domain.
  • ADAPTIA-MT-TEST: professionally translated text from the selected industrial domains, aligned at both document and sentence levels. At the sentence level, the corpus is provided both with and without term annotation.
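As an illustration of how the termbases might be consumed, the following is a minimal sketch that extracts the terms of each entry from a TBX file using only the Python standard library. It rests on assumptions not confirmed by this README: that entries use the common TBX element names (`termEntry` or `conceptEntry`, with `langSet` or `langSec` children carrying an `xml:lang` attribute), and that languages are tagged with codes such as `eu` and `es`. The function name `read_tbx_terms` and the file path in the usage note are hypothetical; check them against the actual files in ADAPTIA-MT-TERM.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def local(tag):
    """Strip any '{namespace}' prefix so TBX dialects with or without namespaces match."""
    return tag.rsplit("}", 1)[-1]


def read_tbx_terms(source):
    """Yield one dict per TBX entry, mapping language code -> list of terms.

    `source` is a file path or a binary file object.
    Assumption: entries are <termEntry> (TBX-Basic style) or <conceptEntry>
    (TBX v3 style), with language sections named <langSet> or <langSec>.
    """
    tree = ET.parse(source)
    for entry in tree.iter():
        if local(entry.tag) not in ("termEntry", "conceptEntry"):
            continue
        terms = {}
        for lang_section in entry.iter():
            if local(lang_section.tag) not in ("langSet", "langSec"):
                continue
            # "und" (undetermined) is used when no xml:lang attribute is present.
            lang = lang_section.get(XML_LANG, "und")
            for node in lang_section.iter():
                if local(node.tag) == "term" and node.text:
                    terms.setdefault(lang, []).append(node.text.strip())
        yield terms
```

A call such as `list(read_tbx_terms("ADAPTIA-MT-TERM/railways.tbx"))` (hypothetical path) would then return a list of dictionaries, one per entry, which can be used for instance to check whether the expected Spanish term appears in an MT output for a Basque source sentence containing the corresponding term.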

Authors

The following researchers were involved in the ADAPTIA-MT dataset creation process:

  • Thierry Etchegoyhen (Vicomtech)
  • Harritxu Gete (Vicomtech)
  • Begoña Arrate (UZEI)
  • Joxean Zapirain (UZEI)
  • Victor Ruiz (Vicomtech)

Contact

tetchegoyhen@vicomtech.org

License

ADAPTIA-MT is distributed under the following license:

CC BY-NC-ND 4.0

Other relevant information

If you use this dataset in your work, please cite the following paper (to appear):

@inproceedings{etchegoyhen-et-al2025adaptiamt,
    title = "Machine Translation in Industrial Domains: Resources and Evaluations",
    author = "Etchegoyhen, Thierry and Gete, Harritxu and Arrate, Begoña and Zapirain, Joxean and Ruiz, Victor",
    booktitle = "Proceedings of SEPLN 2025",
    year = "2025",
    address = "Zaragoza, Spain",
}
