
Add a notebook with finetuning example #70

Merged
avidale merged 3 commits into main from finetuning-classifier-example
Jul 29, 2025

Conversation

@avidale
Contributor

avidale commented Jul 28, 2025

Why?

In #59, we were asked how SONAR models can be fine-tuned. This PR adds a notebook showing how a SONAR text encoder can be fine-tuned for a classification task.

How?

  • Added a notebook that trains a toxicity classifier (inspired by MuTox)
  • Used an unofficial HF port of the SONAR encoder, which removes the dependency on fairseq2
  • Implemented a custom SonarForSequenceClassification class that mimics the similar BERT-based models in the Transformers package
  • Used the HF Trainer for fine-tuning
  • Trained two versions, one with the encoder frozen and one with it unfrozen (one could also unfreeze only the top layers)
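
The pattern in the list above can be sketched roughly as follows. This is not the notebook's code: `ToyEncoder` is a hypothetical stand-in for the SONAR text encoder (its mean pooling and layer layout are assumptions made only for illustration), and `SentenceClassifier` is an illustrative name, not the `SonarForSequenceClassification` class itself. The sketch only shows a linear head on top of a sentence embedding, with the encoder frozen, fully unfrozen, or partially unfrozen.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a sentence encoder: token ids -> one vector."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(2)])

    def forward(self, input_ids, attention_mask):
        hidden = self.embed(input_ids)
        for layer in self.layers:
            hidden = torch.tanh(layer(hidden))
        # Mean-pool over non-padding tokens (an assumption of this sketch).
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)


class SentenceClassifier(nn.Module):
    """Linear classification head on top of a sentence encoder."""

    def __init__(self, encoder, dim, num_labels, freeze_encoder=True):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(dim, num_labels)
        # Frozen: only the head is trained. Unfrozen: everything is trained.
        for p in self.encoder.parameters():
            p.requires_grad = not freeze_encoder

    def forward(self, input_ids, attention_mask, labels=None):
        logits = self.head(self.encoder(input_ids, attention_mask))
        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(logits, labels)
        # Returning the loss alongside the logits follows the convention of
        # the Transformers *ForSequenceClassification models.
        return {"loss": loss, "logits": logits}


def trainable_params(model):
    """Count the parameters that an optimizer would actually update."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


torch.manual_seed(0)
input_ids = torch.randint(0, 100, (4, 6))             # 4 sentences, 6 tokens
attention_mask = torch.ones(4, 6, dtype=torch.long)   # no padding in this toy batch
labels = torch.tensor([0, 1, 0, 1])

frozen = SentenceClassifier(ToyEncoder(), dim=16, num_labels=2, freeze_encoder=True)
unfrozen = SentenceClassifier(ToyEncoder(), dim=16, num_labels=2, freeze_encoder=False)

# Partial unfreezing: start frozen, then re-enable only the top encoder layer.
partial = SentenceClassifier(ToyEncoder(), dim=16, num_labels=2, freeze_encoder=True)
for p in partial.encoder.layers[-1].parameters():
    p.requires_grad = True

out = frozen(input_ids, attention_mask, labels)
out["loss"].backward()  # gradients reach the head only

print(trainable_params(frozen), trainable_params(partial), trainable_params(unfrozen))
```

With the encoder frozen, the optimizer only sees the head's parameters, so training is cheap but the embeddings stay fixed. Unfreezing lets the encoder adapt to the task at a higher compute cost, and partial unfreezing sits in between.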

Test plan

The notebook is self-explanatory and does not need a test.

@facebook-github-bot added the CLA Signed label Jul 28, 2025
@avidale mentioned this pull request Jul 28, 2025
@avidale merged commit 8da24c2 into main Jul 29, 2025
5 checks passed
@avidale deleted the finetuning-classifier-example branch July 29, 2025 15:28

Labels

CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

2 participants