
Add MatFormer manifest for tier-sliced checkpoints with auto-fetch and strict validation #10

@plugyawn


Problem

We need a reliable way to auto-fetch tier-sliced MatFormer checkpoints (especially from Hugging Face repos) and to fail cleanly when slices are missing. The current load logic only checks local "-tierN" folders; hub downloads always pull the universal checkpoint.

Proposed solution

  • Introduce matformer_manifest.json (schema v1) with common_files and per-tier files (relative to the manifest) plus optional metadata (tier, base/intermediate sizes).
  • Update scripts/export_matformer_tiers.py to write the manifest alongside the universal checkpoint, listing tier file paths + common files (tokenizer, config, etc.).
  • Add a hub download helper that can fetch only a specific file list (used for manifest + tier files).
  • In client init, when --matformer-load-strategy is auto or sliced and tier > 0, attempt to use the manifest to fetch a tier slice; if the manifest or the tier is missing or incomplete, fail cleanly in sliced and fall back to the universal checkpoint in auto.
  • Ensure config validation uses matformer_tier + matformer_base_intermediate_size from sliced checkpoints.
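
To make the auto vs. sliced behavior concrete, here is a minimal std-only sketch of the decision described above. All type, field, and function names (MatformerManifest, TierEntry, LoadStrategy, plan_load, and so on) are hypothetical and not taken from the psyche codebase; the real manifest would presumably be deserialized from matformer_manifest.json, which this sketch leaves out.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-tier entry: files that make up one tier slice,
/// given relative to the manifest.
#[derive(Debug, Clone, PartialEq)]
pub struct TierEntry {
    pub files: Vec<String>,
}

/// Hypothetical in-memory form of matformer_manifest.json (schema v1).
#[derive(Debug, Clone, PartialEq)]
pub struct MatformerManifest {
    pub schema_version: u32,
    /// Files shared by every tier (tokenizer, config, etc.).
    pub common_files: Vec<String>,
    pub tiers: BTreeMap<u8, TierEntry>,
}

/// The two strategies this issue is concerned with.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LoadStrategy {
    Auto,
    Sliced,
}

#[derive(Debug, Clone, PartialEq)]
pub enum LoadPlan {
    /// Download or load the universal checkpoint.
    Universal,
    /// Fetch exactly these files (common files first, then tier files).
    Slice { files: Vec<String> },
}

#[derive(Debug, Clone, PartialEq)]
pub enum PlanError {
    ManifestMissing { tier: u8 },
    TierMissing { tier: u8 },
}

/// Decide what to fetch. `sliced` is strict and errors out;
/// `auto` falls back to the universal checkpoint.
pub fn plan_load(
    strategy: LoadStrategy,
    tier: u8,
    manifest: Option<&MatformerManifest>,
) -> Result<LoadPlan, PlanError> {
    // Tier 0 is the full model, so no slice is needed.
    if tier == 0 {
        return Ok(LoadPlan::Universal);
    }
    let strict = strategy == LoadStrategy::Sliced;
    let Some(m) = manifest else {
        return if strict {
            Err(PlanError::ManifestMissing { tier })
        } else {
            Ok(LoadPlan::Universal)
        };
    };
    match m.tiers.get(&tier) {
        Some(entry) => {
            let mut files = m.common_files.clone();
            files.extend(entry.files.iter().cloned());
            Ok(LoadPlan::Slice { files })
        }
        None if strict => Err(PlanError::TierMissing { tier }),
        None => Ok(LoadPlan::Universal),
    }
}
```

Keeping the decision in a pure function like this (no I/O) would make the auto and sliced paths easy to unit test independently of the hub download helper.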

Error cases

  • Missing manifest when sliced is requested → error with clear remediation.
  • Manifest present but tier missing → error (sliced) or fallback (auto).
  • Manifest lists files missing from repo/local path → error with missing file list.
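
The third error case could be sketched as follows. This is an illustration only: MissingFiles and check_listed_files are hypothetical names, and the set of available paths is assumed to come from a repo file listing or a local directory walk performed elsewhere.

```rust
use std::collections::BTreeSet;
use std::fmt;

/// Hypothetical error carrying every manifest-listed file that was not found.
#[derive(Debug, PartialEq)]
pub struct MissingFiles {
    pub tier: u8,
    pub missing: Vec<String>,
}

impl fmt::Display for MissingFiles {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "manifest lists {} file(s) for tier {} that were not found: {}",
            self.missing.len(),
            self.tier,
            self.missing.join(", ")
        )
    }
}

impl std::error::Error for MissingFiles {}

/// Compare the files the manifest lists against the files actually present
/// (in the repo or under the local path) and report all missing ones at once.
pub fn check_listed_files(
    tier: u8,
    listed: &[String],
    available: &BTreeSet<String>,
) -> Result<(), MissingFiles> {
    let missing: Vec<String> = listed
        .iter()
        .filter(|p| !available.contains(p.as_str()))
        .cloned()
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(MissingFiles { tier, missing })
    }
}
```

Collecting every missing path before returning, rather than failing on the first one, is what lets the error message include the full missing file list.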

Tests

  • Unit tests for manifest parsing and tier file selection.
  • Tests for missing-tier + sliced behavior.
  • Run cargo test -p psyche-client (and psyche-modeling if needed).

Notes

  • HF repo layout will allow tier files in subdirs; manifest paths are relative to the manifest.
  • Keep backward compatibility with local -tierN directories when manifest is absent.
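
A rough sketch of the two path rules in these notes, using only std::path. Both function names are hypothetical. Rejecting absolute paths and ".." components is an extra safety assumption not stated in this issue, and the "<checkpoint>-tierN" sibling-directory naming is inferred from the "-tierN" folders mentioned in the problem statement.

```rust
use std::ffi::OsString;
use std::path::{Component, Path, PathBuf};

/// Resolve a manifest entry relative to the directory containing the manifest.
/// Returns None for empty, absolute, or parent-escaping paths (assumed policy).
pub fn resolve_manifest_path(manifest: &Path, rel: &str) -> Option<PathBuf> {
    let rel = Path::new(rel);
    let safe = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if !safe || rel.as_os_str().is_empty() {
        return None;
    }
    // Subdirectories such as "tier1/model.safetensors" are allowed.
    Some(manifest.parent().unwrap_or(Path::new("")).join(rel))
}

/// Legacy fallback when no manifest exists: a sibling directory named
/// "<checkpoint>-tierN" next to the universal checkpoint (assumed layout).
pub fn legacy_tier_dir(universal: &Path, tier: u8) -> PathBuf {
    let mut name: OsString = universal
        .file_name()
        .map(|n| n.to_os_string())
        .unwrap_or_default();
    name.push(format!("-tier{tier}"));
    universal.with_file_name(name)
}
```

A loader would try the manifest first and only consult legacy_tier_dir when the manifest is absent, which keeps existing local setups working unchanged.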
