- [ ] Create canonical datasets in Farsi (with a similar complexity level and overall format/category to the Turkish version)
- [ ] Generate Farsi-specific perturbations for the canonical examples
- [ ] Add them to the dataset sheet
- [ ] Create an HF dataset for the Farsi tokenizer robustness dataset and push it to the R3 HF space (the associated collection); see the sketch below
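
For the last item, a minimal sketch of building and pushing the dataset with the Hugging Face `datasets` library. The column names, example rows, perturbation labels, and the repo id `R3-org/farsi-tokenizer-robustness` are hypothetical placeholders, not taken from the task list.

```python
# Minimal sketch: build the Farsi tokenizer-robustness dataset and push it to the Hub.
# Assumptions: column names, example rows, and the repo id below are placeholders;
# in practice the records would be loaded from the dataset sheet export (CSV/JSONL).
from datasets import Dataset, DatasetDict

records = {
    "canonical": ["متن نمونه برای آزمون"],        # canonical Farsi example
    "perturbed": ["مت ن نمونه برای آزمون"],       # hypothetical whitespace perturbation
    "perturbation_type": ["whitespace_insertion"],
    "category": ["spacing"],
}

dataset = DatasetDict({"test": Dataset.from_dict(records)})

# Push to the R3 HF space (requires `huggingface-cli login` or an HF token in the env).
dataset.push_to_hub("R3-org/farsi-tokenizer-robustness", private=True)
```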