Aya Myanmar Toolkit


This toolkit is developed for evaluating language models and generating synthetic data for the Burmese language. The repository contains a data generation pipeline and evaluation scripts for benchmarking various language models on Burmese language tasks.

🎯 Benchmarks

Primary Benchmarks

Benchmark Name                   Questions   Types
Myanmar G12L Benchmark           962         multiple-choice, true-false, short-answer, fill-in-the-blank, long-answer
MMLU Lite                        600         multiple-choice
Driving and Riding Theory Test   200         multiple-choice
Belebele                         900         multiple-choice

📊 Model Performance

Coming soon

🗃️ Training Datasets

Dataset Name               Rows    Tokens
Wikipedia                  109k    24B
Myanmar Instructions       67k     -
OSCAR-2019-Burmese-fix     140k    -
Myanmar Dictionary         -       220k
Public School Text Books   -       680k

Additional Datasets

Dataset Name            Status
OSCAR-2301              Coming Soon
myParaphrase            Coming Soon
myanmar_news            Coming Soon
FLORES-200              Coming Soon
myPOS                   Coming Soon
BurmeseProverbDataSet   Coming Soon
Wattpad                 Coming Soon

🚀 Getting Started

Prerequisites

  • Python 3.9 or higher
  • uv

Installation

# Clone the repository
git clone https://github.com/Rickaym/aya-my-tk.git
cd aya-my-tk

# Install dependencies
uv sync

Usage

To run an evaluation with your own inference backend, use the following code:

from ayamytk.test.bench import evals
from ayamytk.test.bench.sampler import CustomSampler

def chat(messages):
    # Add your inference code here
    ...

evals.run(samplers={"model": CustomSampler(chat=chat)}, evals="mg12l")
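
The chat callback above is left for you to fill in. As one illustration only, a minimal sketch of such a callback backed by a locally hosted Hugging Face model is shown below; the model name, the assumption that messages arrives as OpenAI-style role/content dicts, and the assumption that the callback returns the reply text are illustrative rather than documented behaviour of the toolkit.

# A hedged sketch of a chat callback, assuming OpenAI-style message dicts in
# and a plain reply string out (not confirmed by the toolkit's docs).
from transformers import pipeline

# Placeholder model; substitute any instruction-tuned chat model you have access to.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def chat(messages):
    # Recent transformers text-generation pipelines accept chat-style message lists
    # and return the full conversation with the assistant's reply appended last.
    outputs = generator(messages, max_new_tokens=512)
    return outputs[0]["generated_text"][-1]["content"]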

You can also run evaluations against models supported by Cohere, OpenAI, and OpenRouter:

from ayamytk.test.bench import evals
from ayamytk.test.bench.sampler import OpenRouterSampler, ChatCompletionSampler

# Assuming you have "OPENROUTER_API_KEY" in your environment variables
evals.run(samplers={"model": OpenRouterSampler(model="deepseek/deepseek-chat")}, evals="mg12l")

evals.run(samplers={"model": ChatCompletionSampler(model="gpt-4o")}, evals="mmlu_lite")
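
If the API keys are not already exported in your shell, you can set them from Python before calling evals.run. This is a minimal standard-library sketch; OPENROUTER_API_KEY comes from the comment above, while OPENAI_API_KEY is an assumption based on the usual OpenAI client convention for ChatCompletionSampler and is not confirmed by the toolkit's documentation.

import os

# Hedged sketch: set credentials only if they are not already present.
# OPENROUTER_API_KEY is documented above; OPENAI_API_KEY is assumed from the
# standard OpenAI client convention rather than the toolkit's own docs.
os.environ.setdefault("OPENROUTER_API_KEY", "<your OpenRouter key>")
os.environ.setdefault("OPENAI_API_KEY", "<your OpenAI key>")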

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Contact

For any questions or suggestions, please open an issue in the repository.


This project started as part of the Expedition Aya 2025 initiative.
