Releases: Course-Correct-Labs/featurematch
FeatureMatch v0.1.0 — Initial Release 🚀
FeatureMatch quantifies correspondence between features learned by two models (e.g., two SAEs) using cosine similarity. Perfect for reproducibility checks and hyperparameter comparison.
✨ Features
- Cosine similarity matrix with memory-efficient blocking
- Top-k matches per feature with scores
- Summary statistics: mean_best, median_best, pct_above_threshold
- Centering to remove baseline activations (focus on relative patterns)
- Simple heatmap visualization
- Strict deterministic tests (rtol=1e-4, atol=1e-6)
- Python 3.9+ support with CI on 3.9, 3.10, 3.11
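The feature list above can be illustrated with a minimal sketch of blocked, centered cosine similarity with top-k matching and summary statistics. This is not FeatureMatch's own implementation: it uses NumPy rather than torch, and the names `_unit_columns`, `topk_matches_blocked`, `summary_stats`, and `block_size` are hypothetical. It also assumes that features are columns, that scores count as "above threshold" when they are `>=` the threshold, and that `pct_above_threshold` is reported on a 0 to 100 scale.

```python
import numpy as np


def _unit_columns(Z, center, eps):
    """Optionally center each feature column, then scale it to unit length."""
    Z = np.asarray(Z, dtype=np.float64)
    if center:
        # Remove each feature's mean activation over samples.
        Z = Z - Z.mean(axis=0, keepdims=True)
    norms = np.linalg.norm(Z, axis=0, keepdims=True)
    # eps guards against division by zero for constant (dead) features.
    return Z / np.maximum(norms, eps)


def topk_matches_blocked(Z_a, Z_b, topk=5, block_size=256, center=True, eps=1e-12):
    """Top-k cosine matches in Z_b for every feature (column) of Z_a.

    Only a (block_size, n_features_b) slice of the similarity matrix is
    held in memory at a time; the full matrix is never materialized.
    """
    A = _unit_columns(Z_a, center, eps)
    B = _unit_columns(Z_b, center, eps)
    n_a = A.shape[1]
    k = min(topk, B.shape[1])
    idx = np.empty((n_a, k), dtype=np.int64)
    scores = np.empty((n_a, k))
    for start in range(0, n_a, block_size):
        stop = min(start + block_size, n_a)
        sim = A[:, start:stop].T @ B              # (block, n_features_b)
        order = np.argsort(-sim, axis=1)[:, :k]   # best match first
        idx[start:stop] = order
        scores[start:stop] = np.take_along_axis(sim, order, axis=1)
    return idx, scores


def summary_stats(best_scores, threshold=0.8):
    """Summary of the best-match score per feature (percent scale assumed)."""
    best_scores = np.asarray(best_scores)
    return {
        "mean_best": float(np.mean(best_scores)),
        "median_best": float(np.median(best_scores)),
        "pct_above_threshold": float(np.mean(best_scores >= threshold) * 100.0),
    }
```

Passing `scores[:, 0]` (the best match per feature) to `summary_stats` yields the three statistics named above. The block size trades peak memory against the number of matrix multiplications.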
📦 Installation
pip install "git+https://github.com/Course-Correct-Labs/featurematch.git"
🚀 Usage (5 lines)
import torch
from featurematch.featurematch import align_features
Z_a, Z_b = torch.randn(200, 64), torch.randn(200, 64) # same layer & dataset
res = align_features(Z_a, Z_b, topk=5, threshold=0.8)
print(res.stats)  # {'mean_best': ..., 'median_best': ..., 'pct_above_threshold': ...}
📊 Interpretation Guide
- mean_best ≥ 0.85: strong reproducibility (dictionaries mostly aligned)
- 0.70 ≤ mean_best < 0.85: partial alignment, as expected when seeds or hyperparameters differ
- mean_best < 0.70: low alignment; the models likely learned different dictionaries
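As a rough illustration, the bands above can be turned into a small helper for logs or CI checks. The function name `interpret_mean_best` is hypothetical and not part of FeatureMatch; the cut-offs are the ones listed in this guide, with the boundary values 0.70 and 0.85 assigned to the higher band.

```python
def interpret_mean_best(mean_best: float) -> str:
    """Map a mean_best score to the interpretation bands in this guide."""
    if mean_best >= 0.85:
        return "strong reproducibility"
    if mean_best >= 0.70:
        return "partial alignment"
    return "low alignment"


# Example: label a score taken from res.stats["mean_best"].
print(interpret_mean_best(0.78))  # partial alignment
```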
📖 Demo
See notebooks/featurematch_demo.ipynb for examples with:
- Permutation recovery (perfect alignment: mean=1.0, 100% above threshold)
- Random baseline (low alignment: mean≈0.16, 0% above threshold)
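The two demo scenarios can be approximated without the notebook. The sketch below is an assumption-laden stand-in, not the notebook's code: it uses NumPy only, a hypothetical helper `best_match`, and arbitrarily chosen sizes (200 samples, 64 features) and seed, so the random-baseline value it prints need not equal the notebook's figure.

```python
import numpy as np


def best_match(Z_a, Z_b):
    """Best centered-cosine match in Z_b for each feature (column) of Z_a."""
    A = Z_a - Z_a.mean(axis=0)
    B = Z_b - Z_b.mean(axis=0)
    A = A / np.linalg.norm(A, axis=0)
    B = B / np.linalg.norm(B, axis=0)
    sim = A.T @ B  # (n_features_a, n_features_b)
    return sim.argmax(axis=1), sim.max(axis=1)


rng = np.random.default_rng(0)
Z_a = rng.standard_normal((200, 64))

# Permutation recovery: Z_perm holds the same features in shuffled column order.
perm = rng.permutation(64)
Z_perm = Z_a[:, perm]
idx_perm, best_perm = best_match(Z_a, Z_perm)
# Feature i of Z_a should match the column j of Z_perm with perm[j] == i.
recovered = np.array_equal(perm[idx_perm], np.arange(64))

# Random baseline: an independent draw shares no features with Z_a.
Z_rand = rng.standard_normal((200, 64))
_, best_rand = best_match(Z_a, Z_rand)

print("permutation recovered:", recovered)
print("permutation mean_best:", round(float(best_perm.mean()), 4))
print("random mean_best:", round(float(best_rand.mean()), 4))
```

With a threshold of 0.8, every permuted feature clears it and no random feature should, which mirrors the 100% and 0% figures quoted above.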
🗺️ Roadmap
v0.1 is intentionally minimal. Future versions (demand-driven):
- v0.2: Jaccard for binarized codes, permutation baseline
- v0.3: Hungarian matching, (optional) global CKA
📄 License
MIT © Course Correct Labs