Description
Problem
We need a reliable way to auto-fetch tier-sliced MatFormer checkpoints (especially from HF repos) and to fail cleanly when slices are missing. The current load logic only checks local "-tierN" folders; hub downloads always pull the universal checkpoint.
Proposed solution
- Introduce `matformer_manifest.json` (schema v1) with `common_files` and per-tier `files` (relative to the manifest) plus optional metadata (tier, base/intermediate sizes).
- Update `scripts/export_matformer_tiers.py` to write the manifest alongside the universal checkpoint, listing tier file paths + common files (tokenizer, config, etc.).
- Add a hub download helper that can fetch only a specific file list (used for manifest + tier files).
- In client init, when `--matformer-load-strategy auto|sliced` and tier > 0, attempt to use the manifest to fetch a tier slice; if it is missing or incomplete, fail cleanly in `sliced` and fall back to universal in `auto`.
- Ensure config validation uses `matformer_tier` + `matformer_base_intermediate_size` from sliced checkpoints.
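A minimal sketch of what the manifest and the tier file selection could look like, to make the proposal concrete. Only `matformer_manifest.json`, `common_files`, and per-tier `files` come from this issue; the `schema_version` and `tiers` keys, the metadata key names, and all function names below are assumptions for illustration, not a settled schema.

```python
import json
from pathlib import Path

SCHEMA_VERSION = 1  # "schema v1" from the proposal; the key name is an assumption
MANIFEST_NAME = "matformer_manifest.json"


def build_manifest(common_files, tiers):
    """Build the manifest dict.

    common_files: paths shared by every tier (tokenizer, config, ...).
    tiers: mapping of tier number -> dict with "files" and optional metadata.
    All paths are relative to the manifest's own location.
    """
    return {
        "schema_version": SCHEMA_VERSION,
        "common_files": list(common_files),
        # Hypothetical layout: one entry per tier, sorted by tier number.
        "tiers": [{"tier": tier, **entry} for tier, entry in sorted(tiers.items())],
    }


def write_manifest(checkpoint_dir, manifest):
    """Write the manifest alongside the universal checkpoint."""
    path = Path(checkpoint_dir) / MANIFEST_NAME
    path.write_text(json.dumps(manifest, indent=2) + "\n")
    return path


def files_for_tier(manifest, tier):
    """Return common files plus tier files, or None if the tier is not listed."""
    for entry in manifest["tiers"]:
        if entry["tier"] == tier:
            return manifest["common_files"] + entry["files"]
    return None
```

The list returned by `files_for_tier` is what the hub download helper would be asked to fetch, so the manifest itself plus that list is the complete download set for a tier.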
Error cases
- Missing manifest when `sliced` is requested → error with clear remediation.
- Manifest present but tier missing → error (sliced) or fallback (auto).
- Manifest lists files missing from repo/local path → error with missing file list.
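The error cases above can be sketched as a single resolution function. This is an illustration under stated assumptions, not the client's actual logic (the client is Rust): `TierLoadError`, `resolve_tier_files`, the `tiers` manifest key, and the error wording are hypothetical, and `available` stands in for a listing of the repo or local path. Raising on missing files in both modes is one reading of the last bullet; falling back in `auto` would also be consistent with the proposal.

```python
class TierLoadError(Exception):
    """Raised when a tier slice cannot be resolved cleanly."""


def resolve_tier_files(manifest, tier, strategy, available):
    """Return the file list for a tier slice, or None to use the universal checkpoint.

    manifest: parsed manifest dict, or None if no manifest was found.
    strategy: value of --matformer-load-strategy ("auto" or "sliced").
    available: set of relative paths present in the repo or local directory.
    """
    # Slices only apply to tier > 0 with a slice-aware strategy.
    if tier <= 0 or strategy not in ("auto", "sliced"):
        return None

    if manifest is None:
        if strategy == "sliced":
            raise TierLoadError(
                "matformer_manifest.json not found; re-export with "
                "scripts/export_matformer_tiers.py or use "
                "--matformer-load-strategy auto"
            )
        return None  # auto: fall back to the universal checkpoint

    entry = next(
        (t for t in manifest.get("tiers", []) if t.get("tier") == tier), None
    )
    if entry is None:
        if strategy == "sliced":
            raise TierLoadError(f"manifest has no entry for tier {tier}")
        return None  # auto: fall back to the universal checkpoint

    wanted = list(manifest.get("common_files", [])) + list(entry["files"])
    missing = [f for f in wanted if f not in available]
    if missing:
        # Assumption: a manifest that lists absent files is an error in both modes.
        raise TierLoadError(f"manifest lists files that are missing: {missing}")
    return wanted
```

Returning `None` for the fallback keeps the caller's universal-checkpoint path unchanged, and every failure carries a message that names the missing piece.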
Tests
- Unit tests for manifest parsing and tier file selection.
- Tests for missing-tier + sliced behavior.
- Run `cargo test -p psyche-client` (and `psyche-modeling` if needed).
Notes
- HF repo layout will allow tier files in subdirs; manifest paths are relative to the manifest.
- Keep backward compatibility with local `-tierN` directories when manifest is absent.
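One way the backward-compatibility rule could look for local paths, as a rough sketch. The function name, the return shape, and the assumption that the legacy directory is a sibling named `<checkpoint>-tier<N>` are all hypothetical; the issue only says that local `-tierN` directories exist today.

```python
from pathlib import Path

MANIFEST_NAME = "matformer_manifest.json"


def choose_local_source(checkpoint_dir, tier):
    """Decide where to load from for a local checkpoint path.

    Returns a (kind, path) pair where kind is "manifest", "legacy_dir",
    or "universal".
    """
    root = Path(checkpoint_dir)
    manifest = root / MANIFEST_NAME
    if manifest.is_file():
        # A manifest, when present, takes precedence over the legacy layout.
        return ("manifest", manifest)

    # Assumed legacy layout: a sibling directory named "<checkpoint>-tier<N>".
    legacy = root.parent / f"{root.name}-tier{tier}"
    if tier > 0 and legacy.is_dir():
        return ("legacy_dir", legacy)

    return ("universal", root)
```

Checking the manifest first means existing setups keep working unchanged until a checkpoint is re-exported with a manifest.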