Create documentation for the HF pipelines.  #41

@avidale

Description

We have a recently created huggingface_pipelines directory with some nice code, but no obvious examples of how to use it.

One could create a documentation page that explains the purpose of the pipelines and illustrates, with code, how they can be applied.

An example task would be to use the FLORES dataset (https://huggingface.co/datasets/facebook/flores) to compare the quality of translation from various source languages into a single target language (e.g. English or Spanish).

Motivation for the task

A typical way to evaluate SONAR models for a particular language is to encode a dataset of sentences and then decode it back into the same language (reconstruction) or into another language (translation). The generated texts are then compared with the reference texts using numeric scores such as BLEU (from the sacrebleu package).
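For intuition about what such a score measures, here is a minimal pure-Python sketch of a BLEU-like metric (clipped n-gram precision combined with a brevity penalty) for a single sentence pair. It is only an illustration; the actual documentation should use the sacrebleu package, which handles tokenization, smoothing, and corpus-level aggregation properly.

```python
from collections import Counter
import math

def ngram_precision(hyp, ref, n):
    """Fraction of hypothesis n-grams that also appear in the reference (clipped counts)."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    matched = sum(min(count, ref_ngrams[g]) for g, count in hyp_ngrams.items())
    total = sum(hyp_ngrams.values())
    return matched / total if total else 0.0

def toy_bleu(hypothesis, reference, max_n=4):
    """Geometric mean of 1..max_n n-gram precisions, times a brevity penalty.

    A simplified single-sentence illustration only; use sacrebleu for real evaluation.
    """
    hyp, ref = hypothesis.split(), reference.split()
    precisions = [ngram_precision(hyp, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this toy version
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return 100 * bp * geo_mean

print(toy_bleu("the quick brown fox jumps", "the quick brown fox jumps"))  # perfect match: 100.0
```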

We want to use this task as an opportunity to learn more about the pipelines, which are a kind of glue connecting the models to the data (e.g. by batching the data before feeding it to the models).
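The batching mentioned above is the simplest example of that glue. A minimal sketch of such a helper (the encoder.predict call is hypothetical, standing in for whatever model the pipeline wraps):

```python
from itertools import islice

def batched(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    it = iter(items)
    while batch := list(islice(it, batch_size)):
        yield batch

# Feeding sentences to an encoder 32 at a time might look like:
# for batch in batched(sentences, 32):
#     embeddings = encoder.predict(batch)  # hypothetical model call
```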

How to approach

All or most of the code elements probably already exist somewhere in the repo; the goal is to put them together with the new Hugging Face pipeline, using segmentation, encoding, decoding, and BLEU computation.
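The shape of that composition can be sketched generically: each stage maps a list of items to a list of items, and the pipeline is just their chain. The stage names below (segment, lowercase) are hypothetical stand-ins, not the actual classes under huggingface_pipelines/.

```python
from functools import reduce

def compose(*stages):
    """Chain pipeline stages left to right; each stage maps a list to a list."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)

# Hypothetical stages mirroring the steps in this issue (segmentation, then a
# trivial transformation standing in for encode/decode):
segment = lambda texts: [s for t in texts for s in t.split(". ")]
lowercase = lambda texts: [t.lower() for t in texts]

pipeline = compose(segment, lowercase)
print(pipeline(["Hello World. GOOD Morning"]))  # ['hello world', 'good morning']
```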

A good entry point might be the tests (e.g. https://github.com/facebookresearch/SONAR/blob/main/tests/unit_tests/huggingface_pipelines/text.py), which illustrate some of the potential use cases of the HF pipeline.

Metadata

Assignees: No one assigned

Labels: documentation (Improvements or additions to documentation), good first issue (Good for newcomers)