feat(gemma): Add TranslateGemma support and reorganize Gemma module structure #3325
Adds support for Google's TranslateGemma translation models (55 languages) and consolidates the Gemma model family into a unified module structure.
Reorganization
gemma.rs → gemma/gemma1.rs
Consolidated gemma2.rs, gemma3.rs, quantized_gemma3.rs under gemma/
Added gemma/mod.rs with re-exports for backward compatibility
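As a rough illustration of how re-exports keep old import paths working after such a move, here is a minimal sketch. Inline modules stand in for the files gemma/mod.rs and gemma/gemma1.rs, and the `Model` type with its `name` method is a hypothetical placeholder, not the actual contents of the module.

```rust
// Inline modules stand in for gemma/mod.rs and gemma/gemma1.rs.
// `Model` and `name` are hypothetical placeholders for illustration.
mod models {
    pub mod gemma {
        // Stand-in for the code that used to live in gemma.rs.
        pub mod gemma1 {
            pub struct Model;

            impl Model {
                pub fn name(&self) -> &'static str {
                    "gemma1"
                }
            }
        }

        // Re-export so the path used before the reorganization still resolves.
        pub use self::gemma1::Model;
    }
}

fn main() {
    // Path as callers wrote it before the move.
    let old = models::gemma::Model;
    // New, fully qualified path.
    let new = models::gemma::gemma1::Model;
    println!("{} {}", old.name(), new.name());
}
```

Both paths name the same type, so existing callers compile unchanged while new code can use the more specific path.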
TranslateGemma Support
Added gemma/translate_gemma.rs with prompt formatting utilities and ISO 639-1 language codes
Added examples/translate-gemma.rs supporting both full precision and quantized inference
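To give a feel for what language-code handling in front of a prompt formatter might look like, here is a small sketch. It is not the code in gemma/translate_gemma.rs: the function names, the five-language table, and above all the prompt wording are assumptions made for illustration. The real TranslateGemma template is not shown in this description.

```rust
/// A few ISO 639-1 codes with English names. Illustrative subset only;
/// the description above mentions 55 supported languages.
const LANGUAGES: &[(&str, &str)] = &[
    ("de", "German"),
    ("en", "English"),
    ("es", "Spanish"),
    ("fr", "French"),
    ("ja", "Japanese"),
];

/// Look up the English name for a code, ignoring case and surrounding whitespace.
fn language_name(code: &str) -> Option<&'static str> {
    let code = code.trim().to_ascii_lowercase();
    LANGUAGES
        .iter()
        .find(|(c, _)| *c == code.as_str())
        .map(|(_, name)| *name)
}

/// Placeholder prompt builder. The wording is hypothetical and is NOT the
/// template TranslateGemma expects; it only shows where validation fits.
fn build_prompt(src: &str, tgt: &str, text: &str) -> Result<String, String> {
    let src_name =
        language_name(src).ok_or_else(|| format!("unknown source language: {src}"))?;
    let tgt_name =
        language_name(tgt).ok_or_else(|| format!("unknown target language: {tgt}"))?;
    Ok(format!("Translate from {src_name} to {tgt_name}:\n{text}"))
}

fn main() {
    match build_prompt("en", "fr", "Good morning") {
        Ok(prompt) => println!("{prompt}"),
        Err(err) => eprintln!("{err}"),
    }
}
```

Rejecting unknown codes before the prompt is built keeps typos from reaching the model as free text.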
Bug Fixes
gemma3.rs: Make KV tensors contiguous before cache append (fixes "slice-set only supports contiguous tensors" error with certain GQA ratios)
quantized_gemma3.rs: Added clear_kv_cache() method for multi-turn inference
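For readers unfamiliar with the contiguity requirement behind the gemma3.rs fix, the following is a dependency-free sketch of the idea. It does not use candle: `View2` and `append_rows` are hypothetical stand-ins for a strided tensor and a cache append that copies a flat buffer. In candle itself the equivalent step is presumably a call to `Tensor::contiguous()` on the key and value tensors before they are written to the cache.

```rust
/// Minimal 2-D strided view over a flat buffer (a stand-in for a tensor).
#[derive(Debug)]
struct View2 {
    data: Vec<f32>,
    shape: [usize; 2],
    strides: [usize; 2],
}

impl View2 {
    fn new(data: Vec<f32>, rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols);
        Self { data, shape: [rows, cols], strides: [cols, 1] }
    }

    /// Transpose by swapping shape and strides; the buffer layout is unchanged
    /// (the clone only keeps ownership simple in this sketch).
    fn t(&self) -> Self {
        Self {
            data: self.data.clone(),
            shape: [self.shape[1], self.shape[0]],
            strides: [self.strides[1], self.strides[0]],
        }
    }

    /// Row-major contiguous means strides == [cols, 1].
    fn is_contiguous(&self) -> bool {
        self.strides == [self.shape[1], 1]
    }

    /// Copy elements into row-major order so the buffer matches the logical layout.
    fn contiguous(&self) -> Self {
        let [rows, cols] = self.shape;
        let mut data = Vec::with_capacity(rows * cols);
        for r in 0..rows {
            for c in 0..cols {
                data.push(self.data[r * self.strides[0] + c * self.strides[1]]);
            }
        }
        Self::new(data, rows, cols)
    }
}

/// Append to a cache with a flat copy, which is only valid for contiguous input.
fn append_rows(cache: &mut Vec<f32>, src: &View2) -> Result<(), String> {
    if !src.is_contiguous() {
        return Err("append only supports contiguous input".to_string());
    }
    cache.extend_from_slice(&src.data);
    Ok(())
}

fn main() {
    // A transposed view is a typical source of non-contiguous strides.
    let k = View2::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3).t();
    let mut cache = Vec::new();
    assert!(append_rows(&mut cache, &k).is_err());
    append_rows(&mut cache, &k.contiguous()).unwrap();
    println!("{cache:?}");
}
```

The flat copy is cheap but silently assumes the buffer order equals the logical order. Materializing a contiguous copy first makes that assumption hold.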
Usage Notes
Full-precision models are downloaded automatically from HuggingFace
Quantized inference requires a local GGUF file passed via --model-path (Google does not publish official GGUF conversions; community conversions are available on HuggingFace)
Known Issue
Investigation shows that the two Gemma 3 implementations use different activation functions: gemma3.rs uses GELU while quantized_gemma3.rs uses SiLU. This discrepancy is a gemma3.rs issue and is not specific to TranslateGemma.
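To show that the two activations are not interchangeable, here is a small numeric comparison. The tanh approximation of GELU is an assumption, since this description does not say which GELU variant gemma3.rs uses. The sketch makes no claim about which activation is the correct one for the model.

```rust
/// SiLU (swish): x * sigmoid(x).
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// GELU, tanh approximation. Which GELU variant gemma3.rs uses is an assumption here.
fn gelu_tanh(x: f32) -> f32 {
    let c = (2.0 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

fn main() {
    // Print both activations side by side for a few inputs.
    for x in [-2.0f32, -1.0, -0.5, 0.5, 1.0, 2.0] {
        println!("x={x:>5.2} gelu={:>8.5} silu={:>8.5}", gelu_tanh(x), silu(x));
    }
}
```

The two functions agree at zero and for large positive inputs but differ noticeably around |x| near 1, so the same weights can produce different outputs depending on which activation is applied.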