This toolkit supports evaluating language models and generating synthetic data for the Burmese language. The repository contains a data generation pipeline and evaluation scripts for various language models on Burmese language tasks.
| Benchmark Name | Questions | Types |
|---|---|---|
| Myanmar G12L Benchmark | 962 | multiple-choice, true-false, short-answer, fill-in-the-blank, long-answer |
| MMLU Lite | 600 | multiple-choice |
| Driving and Riding Theory Test | 200 | multiple-choice |
| Belebele | 900 | multiple-choice |
Coming soon
| Dataset Name | Rows | Tokens |
|---|---|---|
| Wikipedia | 109k | 24B |
| Myanmar Instructions 67k | 67k | - |
| OSCAR-2019-Burmese-fix | 140k | - |
| Myanmar Dictionary | - | 220k |
| Public School Text Books | - | 680k |
| Benchmark Name | Status |
|---|---|
| OSCAR-2301 | Coming Soon |
| myParaphrase | Coming Soon |
| myanmar_news | Coming Soon |
| FLORES-200 | Coming Soon |
| myPOS | Coming Soon |
| BurmeseProverbDataSet | Coming Soon |
| Wattpad | Coming Soon |
- Python 3.9 or higher
- uv
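
The prerequisites above can be checked before installing. The following is a minimal sketch and not part of the toolkit: the helper name `check_prerequisites` is hypothetical, and it assumes only that `uv` is expected to be on your `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 9), tool="uv"):
    """Return a list of problems; an empty list means all checks passed.

    Hypothetical helper for illustration, not provided by the toolkit.
    """
    problems = []
    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]} or higher is required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which(tool) is None:
        problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Running the script prints one line per unmet prerequisite and prints nothing when both checks pass.
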
```bash
# Clone the repository
git clone https://github.com/Rickaym/aya-my-tk.git
cd aya-my-tk

# Install dependencies
uv sync
```

For running the evaluation, you can use the following code:
```python
from ayamytk.test.bench import evals
from ayamytk.test.bench.sampler import CustomSampler


def chat(messages):
    # Add your inference code here
    ...


evals.run(samplers={"model": CustomSampler(chat=chat)}, evals="mg12l")
```

You can also run the evaluation for Cohere, OpenAI, and OpenRouter supported models:
```python
from ayamytk.test.bench import evals
from ayamytk.test.bench.sampler import OpenRouterSampler, ChatCompletionSampler

# Assuming you have "OPENROUTER_API_KEY" in your environment variables
evals.run(samplers={"model": OpenRouterSampler(model="deepseek/deepseek-chat")}, evals="mg12l")
evals.run(samplers={"model": ChatCompletionSampler(model="gpt-4o")}, evals="mmlu_lite")
```

This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For any questions or suggestions, please open an issue in the repository.
This project started as part of the Expedition Aya 2025 initiative.