The official repository of the paper, "Reasoning on a Budget: How Teacher Signals Shape Efficiency Frontiers for Small Language Models"
```bash
# Interactive search
trovus search

# Example searches:
# - "microsoft deberta"
# - "qwen 270m base"
# - "phi mini instruct"
```

```bash
# Download with research mode (recommended)
trovus download microsoft/deberta-v3-small --output-dir ./models --research-mode
# Download minimal files only
trovus download microsoft/deberta-v3-small --minimal --output-dir ./models

# Custom file patterns
trovus download microsoft/phi-3-mini --include "*.safetensors" "*.json" --exclude "*.bin"
```

```bash
# Parse a model card for download information
trovus parse-card ./model_cards/microsoft_deberta-v3-small_card.md

# The search interface can save model cards automatically
```

```bash
# List downloaded models
trovus list
# Show cache information
trovus cache-info
# Get detailed model info
trovus info microsoft/deberta-v3-small
# Remove models
trovus remove microsoft/deberta-v3-small
```

Commands:

- `trovus search` - Interactive model search with fuzzy matching
- `trovus parse-card <file>` - Extract information from model cards
- `trovus download <model>` - Download models with flexible options (combined example after this list):
  - `--output-dir` - Custom download directory
  - `--research-mode` - Optimized for research (all weights + configs, excludes specialized formats)
  - `--minimal` - Essential files only (configs + model weights: safetensors/bin/h5/model)
  - `--include` - File patterns to include
  - `--exclude` - File patterns to exclude
  - `--force` - Force re-download
- `trovus list` - List cached models with sizes and dates
- `trovus info <model>` - Detailed information about a specific model
- `trovus cache-info` - Overall cache statistics
- `trovus remove <model>` - Remove models from cache
- `trovus evaluate <model> --method <sft|cot-d|rl>` - Run teacher-signal evaluation flows (fuller example after this list):
  - SFT (implemented): launches supervised fine-tuning with LoRA/TRL on a registered dataset
  - CoT-D / RL (stubs): record the config and prepare output directories for upcoming pipelines
  - Key flags:
    - `--dataset` (default `gsm8k`)
    - `--epochs`, `--learning-rate`, `--per-device-train-batch-size`, `--gradient-accumulation-steps`
    - `--use-4bit` for 4-bit quantization; `--lora-rank`, `--target-modules`
    - `--output-dir` for run artifacts; `--cache-dir` for HF cache overrides
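For instance, combining the download flags above (an illustrative invocation using a model already referenced in this README):

```bash
# Re-fetch only the tokenizer files, overwriting any cached copies
trovus download microsoft/phi-3-mini --include "tokenizer*" --force
```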
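And a fuller `trovus evaluate` call exercising the key flags (the hyperparameter values are placeholders for illustration, not recommended settings):

```bash
# Illustrative SFT run; all numeric values are arbitrary placeholders
trovus evaluate Qwen/Qwen3-0.6B --method sft --dataset gsm8k \
  --epochs 3 --learning-rate 2e-4 \
  --per-device-train-batch-size 4 --gradient-accumulation-steps 8 \
  --use-4bit --lora-rank 16 \
  --output-dir ./runs/sft-qwen3-0.6b
```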
Research mode (`--research-mode`) downloads all model weights and configurations while excluding specialized formats:

- ✅ Includes: `*.safetensors`, `*.bin`, `*.h5`, `*.model`, `*.json`, `*.txt`, `*.py`, `*.md`
- ❌ Excludes: `*.msgpack`, `*.onnx`, `*.tflite`, `*.gguf`, framework-specific duplicates
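In terms of the pattern flags, `--research-mode` should behave roughly like the explicit command below (an approximation; the exact patterns applied internally may differ):

```bash
trovus download microsoft/deberta-v3-small \
  --include "*.safetensors" "*.bin" "*.h5" "*.model" "*.json" "*.txt" "*.py" "*.md" \
  --exclude "*.msgpack" "*.onnx" "*.tflite" "*.gguf"
```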
Minimal mode (`--minimal`) downloads only the essential files needed to use the model:

- ✅ Includes: config files + all model weight formats (`*.safetensors`, `*.bin`, `*.h5`, `*.model`)
- ❌ Excludes: less common formats (`*.msgpack`, `*.onnx`, `*.tflite`)
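Likewise, `--minimal` roughly corresponds to an explicit include list (illustrative only; `config.json` and `tokenizer*` stand in for whichever config files the tool actually selects):

```bash
trovus download Qwen/Qwen3-0.6B \
  --include "config.json" "tokenizer*" "*.safetensors" "*.bin" "*.h5" "*.model"
```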
```bash
# Search and download workflow
trovus search
# Type: "microsoft deberta"
# Save model card when prompted
trovus parse-card ./model_cards/microsoft_deberta-v3-small_card.md
trovus download microsoft/deberta-v3-small --output-dir ./models --research-mode
# Quick download for inference
trovus download Qwen/Qwen3-0.6B --minimal --output-dir ./models
# Fine-tune a locally cached model with SFT (LoRA defaults)
trovus evaluate Qwen/Qwen3-0.6B --method sft --dataset gsm8k --epochs 10
# Custom download with specific files
trovus download microsoft/phi-3-mini \
--include "*.safetensors" "config.json" "tokenizer*" \
--output-dir ./models
# Check what you have downloaded
trovus list
trovus cache-info
```

To-do:
- Implement the search retriever + model card downloader.
- Convert `python -m trovus` into a universal `trovus` command.
- Add the ability to download model weights.
- Prepare the dataset for question-answer pair generation.
- Finalize the fine-tuning techniques and publish the dataset to the HF Hub.
- Implement the fine-tuning metrics that will form the foundation for the efficiency frontiers.
- Save the efficiency frontiers in a reproducible, visualizable format (one possible shape is sketched below).
- Begin by fine-tuning the smallest model with the easiest technique.
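For the frontier-format item, one possible shape purely as a sketch (nothing below exists in the repo yet; every path and key is a hypothetical placeholder): one JSON line per frontier point keeps runs diff-able, reproducible, and trivial to plot.

```bash
# Hypothetical layout -- paths, file names, and keys are placeholders, not implemented.
# Each run could append one JSON object per evaluated checkpoint to
# runs/<model>/<method>/frontier.jsonl, with keys along the lines of:
#   {"model": "...", "method": "sft", "budget_tokens": 0, "score": 0.0}
jq -s 'sort_by(.budget_tokens)' runs/*/sft/frontier.jsonl > sft_frontier.json  # collect points for plotting
```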