Open
Description
Problem
`--matformer-load-strategy auto` only checks for a local `-tier{N}` path. If the checkpoint is remote (HF), it downloads the universal repo and never attempts a tiered repo. Small-tier nodes can OOM unless operators manually provide tiered checkpoints.
Refs:
- `shared/client/src/state/init.rs` (local-only tier path detection)
Expected
Auto strategy should prefer tiered checkpoints when available (local or remote), or offer an explicit override to point to tiered repos.
Possible Approach
- Add `--matformer-tier-repo` or a template (e.g., `{repo}-tier{tier}`) and try that before universal.
- For auto, attempt remote `repo_id-tier{tier}` if the local tier dir is missing.
- Optional fallback: download universal and slice locally (with a clear warning), if enabled.
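A rough sketch of the resolution order for the auto strategy, to make the proposal concrete. Every name here (`CheckpointSource`, `expand_tier_template`, `resolve_auto`, the `allow_universal_fallback` switch) is hypothetical and not taken from the existing code in `init.rs`. The existence checks are passed in as closures so the ordering can be tested without touching the filesystem or the network.

```rust
/// Where the checkpoint would be loaded from (hypothetical type, illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CheckpointSource {
    /// A local `-tier{N}` path that already exists.
    LocalTier(String),
    /// A remote tiered repo, e.g. `repo_id-tier{tier}`.
    RemoteTier(String),
    /// The universal repo, to be sliced locally down to `tier`.
    UniversalSliced { repo: String, tier: u8 },
}

/// Expand a template such as `{repo}-tier{tier}` into a concrete name.
pub fn expand_tier_template(template: &str, repo: &str, tier: u8) -> String {
    template
        .replace("{repo}", repo)
        .replace("{tier}", &tier.to_string())
}

/// Resolve the source in the proposed order:
/// local tier path, then remote tiered repo, then (only if enabled) universal.
pub fn resolve_auto<L, R>(
    repo_id: &str,
    tier: u8,
    tier_template: &str,
    allow_universal_fallback: bool,
    local_exists: L,
    remote_exists: R,
) -> Result<CheckpointSource, String>
where
    L: Fn(&str) -> bool,
    R: Fn(&str) -> bool,
{
    let tiered = expand_tier_template(tier_template, repo_id, tier);

    if local_exists(&tiered) {
        return Ok(CheckpointSource::LocalTier(tiered));
    }
    if remote_exists(&tiered) {
        return Ok(CheckpointSource::RemoteTier(tiered));
    }
    if allow_universal_fallback {
        // Make the expensive path visible to operators.
        eprintln!(
            "warning: no tiered checkpoint found at `{tiered}`; \
             downloading universal `{repo_id}` and slicing to tier {tier}"
        );
        return Ok(CheckpointSource::UniversalSliced {
            repo: repo_id.to_string(),
            tier,
        });
    }
    Err(format!(
        "no tiered checkpoint found at `{tiered}` and universal fallback is disabled"
    ))
}
```

In real use, `local_exists` would likely wrap a path check and `remote_exists` a Hub lookup. Returning an error when the fallback is disabled keeps small-tier nodes from silently downloading the universal checkpoint and running out of memory.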
Acceptance Criteria
- Tiered checkpoints are automatically preferred when available.
- Clear behavior when tiered repos are missing (fallback or error).
- Documentation update for recommended tiered repo layout.