LLM Translation Evaluation

This repository contains a benchmarking script that evaluates machine translation models by BLEU score. The script uses the sacrebleu library to download test datasets, generates translations with the models you choose, and calculates BLEU scores so you can compare the models' performance on the translation task.

All you need are model deployments available through the Gemini, Azure OpenAI, and/or OpenAI APIs.
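
The scoring and comparison step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual code: `sacrebleu_score` and `rank_models` are hypothetical names chosen for this example. The only library call used is `sacrebleu.corpus_bleu`, which takes a list of hypotheses and a list of reference streams.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def sacrebleu_score(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Corpus-level BLEU via sacrebleu (imported lazily; requires `pip install sacrebleu`)."""
    import sacrebleu

    # corpus_bleu expects a list of reference streams; here there is a
    # single stream holding one reference translation per sentence.
    return sacrebleu.corpus_bleu(list(hypotheses), [list(references)]).score


def rank_models(
    translations: Dict[str, List[str]],
    references: Sequence[str],
    score_fn: Callable[[Sequence[str], Sequence[str]], float] = sacrebleu_score,
) -> List[Tuple[str, float]]:
    """Score each model's translations against the references, best first."""
    scores = {}
    for model_name, hypotheses in translations.items():
        # Every model must translate exactly the same set of source lines.
        if len(hypotheses) != len(references):
            raise ValueError(f"{model_name}: expected {len(references)} translations")
        scores[model_name] = score_fn(hypotheses, references)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_models(translations, references)` with a dictionary that maps each model name to its list of translations returns `(model, score)` pairs sorted from highest to lowest. The `score_fn` parameter is a design choice for this sketch: it keeps the ranking logic independent of the metric, so another scorer can be swapped in.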

How to Use

Clone the repository:

git clone https://github.com/diondrapeck/machine-translation-eval.git
cd machine-translation-eval

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate  # Unix (bash/zsh)
venv\Scripts\Activate.ps1 # Windows PowerShell

Install the required packages:
```
pip install -r requirements.txt
```
Set up your environment variables by creating a .env file in the root directory of the project and adding your API keys. You only need to supply variables for the service providers your models are deployed on:
```
OPENAI_API_KEY = <>
AZURE_OPENAI_ENDPOINT = <>
AZURE_OPENAI_API_KEY = <>
OPENAI_API_VERSION = <>
GEMINI_API_KEY = <>
```
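
Before running a benchmark, it can help to check which providers are fully configured. The sketch below is a hypothetical helper, not part of the repository: the function name `configured_providers` and the grouping of variables by provider in `REQUIRED_VARS` are assumptions based on the variable names listed above. It also assumes the `.env` values have already been loaded into the process environment (for example with python-dotenv's `load_dotenv`, which may or may not be what the script uses).

```python
import os
from typing import Dict, List, Mapping, Optional

# Assumed grouping of the .env variables by provider (not confirmed by the repository).
REQUIRED_VARS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "azure_openai": ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY", "OPENAI_API_VERSION"],
    "gemini": ["GEMINI_API_KEY"],
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the providers whose required variables are all set and non-blank."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name, "").strip() for name in names)
    ]
```

With only `OPENAI_API_KEY` set, `configured_providers()` returns `["openai"]`; a provider with any missing or blank variable is left out.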
Update candidate_models.json with the model deployments you want to evaluate on the translation task.

Run the benchmarking script with the desired arguments:

python model-benchmark.py --source_language en --target_language fr --chunk_size 100

Arguments

--target_language: Two-character ISO code of the language to translate into.
--source_language: Two-character ISO code of the language to translate from. Defaults to "en" (English).
--chunk_size: Number of lines to sample from the dataset for the translation task. Defaults to 500.
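
For reference, the arguments above could be parsed with `argparse` along these lines. This is a sketch, not the contents of model-benchmark.py: the flag spellings follow the example command, treating the target language as required is an assumption (no default is documented), and `parse_args` and `language_pair` are hypothetical helper names.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the benchmark's command-line arguments (argv=None reads sys.argv)."""
    parser = argparse.ArgumentParser(description="Benchmark translation models with BLEU.")
    # Assumption: no default is documented, so the target language is required here.
    parser.add_argument("--target_language", required=True,
                        help="Two-character ISO code of the language to translate into.")
    parser.add_argument("--source_language", default="en",
                        help="Two-character ISO code of the language to translate from.")
    parser.add_argument("--chunk_size", type=int, default=500,
                        help="Number of dataset lines to use for the translation task.")
    return parser.parse_args(argv)


def language_pair(args: argparse.Namespace) -> str:
    """Build a source-target pair such as 'en-fr', the form sacrebleu uses for test sets."""
    return f"{args.source_language}-{args.target_language}"
```

For example, `parse_args(["--target_language", "fr"])` yields `source_language="en"` and `chunk_size=500`, and `language_pair` then returns `"en-fr"`.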
