With French, Italian and Portuguese language support now merged and more languages coming, we need to think about how i18n applies to the tool registry.
The UI and system prompt already get translated. I think, at a minimum, the tool manifest (metadata) should be translated so the tool detail modals in the UI are localized.
My question is about tool definitions (name, description, parameter descriptions). Do they stay in English or get translated? If they stay in English, the model is code-switching between the user's language and English tool context every turn.
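To make the two layers concrete, here is a rough sketch of what a registry entry could look like if we only translate the UI-facing manifest and keep the model-facing definition in English. The field names (`definition`, `manifest`, `i18n`) and locale keys are illustrative, not the current registry schema:

```python
# Hypothetical registry entry layout -- field names are illustrative,
# not the current CAAL registry schema.
weather_tool = {
    # Model-facing definition: stays in English, sent to the model on every turn.
    "definition": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "location": {"type": "string", "description": "City name, e.g. 'Lisbon'"},
        },
    },
    # UI-facing manifest: translated per locale, only used for tool detail modals.
    "manifest": {
        "i18n": {
            "en": {"title": "Weather", "summary": "Current conditions for any city."},
            "pt": {"title": "Tempo", "summary": "Condições atuais para qualquer cidade."},
            "fr": {"title": "Météo", "summary": "Conditions actuelles pour n'importe quelle ville."},
            "it": {"title": "Meteo", "summary": "Condizioni attuali per qualsiasi città."},
        },
    },
}
```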
What I've tested so far: I ran a weather tool (English definition) with Portuguese prompts and it worked. The model routed correctly, extracted parameters, and responded in Portuguese. So at the execution level it seems fine, at least for simple cases. Maybe this breaks down with more complex tools. More testing is needed, for example with the Twitter and YouTube tools, which are more complex and require tool chaining. I could test this with Claude translating prompts for me to feed to CAAL, but if a native speaker of any of these languages could test, that would be better.
Open questions that I’ve been asking myself:
Tool prompts sent to the model - The system prompt is in Portuguese, but tool definitions are in English. Does this affect routing accuracy on smaller models? At 8B, every bit of clarity matters. A frontier model handles code-switching easily; a small model might not.
How to implement - If we do translate, we could likely use Claude to auto-translate the tool registry in bulk, then translate on each new tool submission. When new languages are added, we would have to run the bulk pass again. (A rough sketch of what that could look like is below this list.)
Fine-tuning caal-ministral - The model would likely have to be fine-tuned in different languages for best results. Ministral is already multilingual, but the fine-tuning is only in English. Should we fine-tune a single multilingual variant or create fine-tuned models per language?
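On the "how to implement" question, here is a minimal sketch of what the bulk pass could look like, assuming descriptions live in a simple dict and we call Claude through the Anthropic Python SDK. The registry shape, function names, and model id are placeholders, not existing CAAL code:

```python
# Minimal sketch of a bulk translation pass over tool descriptions.
# Assumes ANTHROPIC_API_KEY is set; registry shape and names are hypothetical.
import anthropic

client = anthropic.Anthropic()

def translate_description(text: str, target_lang: str) -> str:
    """Ask Claude to translate a single tool or parameter description."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # illustrative model id
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": (
                f"Translate this tool description to {target_lang}. "
                f"Keep parameter names and technical terms unchanged.\n\n{text}"
            ),
        }],
    )
    return response.content[0].text.strip()

def bulk_translate(registry: dict, target_langs: list[str]) -> dict:
    """Produce translated descriptions for every tool in the registry."""
    translations = {}
    for tool_name, entry in registry.items():
        translations[tool_name] = {
            lang: translate_description(entry["description"], lang)
            for lang in target_langs
        }
    return translations
```

The same helper could run once per new tool submission and once per newly added language, which matches the "bulk again when new languages are added" concern above.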
Would love to hear from anyone testing CAAL in non-English languages. Where does it work? Where does it break down?