Releases: Course-Correct-Labs/featurematch

FeatureMatch v0.1.0

21 Oct 22:34

FeatureMatch v0.1.0 — Initial Release 🚀

FeatureMatch quantifies how well the features learned by two models (e.g., two SAEs) correspond, using pairwise cosine similarity between feature activations. It is intended for reproducibility checks and hyperparameter comparisons.

✨ Features

  • Cosine similarity matrix with memory-efficient blocking
  • Top-k matches per feature with scores
  • Summary statistics: mean_best, median_best, pct_above_threshold
  • Centering to remove baseline activations (focus on relative patterns)
  • Simple heatmap visualization
  • Strict deterministic tests (rtol=1e-4, atol=1e-6)
  • Python 3.9+ support with CI on 3.9, 3.10, 3.11
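
The list above can be read as a small pipeline: center, normalize, compare in blocks, then summarize the best match per feature. Below is a dependency-free sketch of that idea, not the library's implementation. The helper names (center_and_normalize, best_match_stats), the block size, the ">=" threshold test, and reporting pct_above_threshold as a percentage are all assumptions made for illustration; the library itself works on torch tensors.

```python
import math
from statistics import mean, median

def center_and_normalize(columns):
    """Subtract each feature's mean over samples, then scale to unit length."""
    out = []
    for col in columns:
        mu = sum(col) / len(col)
        centered = [x - mu for x in col]
        # Guard against constant features, whose centered norm is zero.
        norm = math.sqrt(sum(x * x for x in centered)) or 1.0
        out.append([x / norm for x in centered])
    return out

def best_match_stats(feats_a, feats_b, threshold=0.8, block=2):
    """feats_a / feats_b: one list of activations per feature (hypothetical layout)."""
    A = center_and_normalize(feats_a)
    B = center_and_normalize(feats_b)
    best = []
    # Score `block` features of A at a time, so only a block x len(B)
    # slice of the similarity matrix is held in memory at once.
    for start in range(0, len(A), block):
        block_scores = [
            [sum(x * y for x, y in zip(a, b)) for b in B]
            for a in A[start:start + block]
        ]
        best.extend(max(row) for row in block_scores)
    return {
        "mean_best": mean(best),
        "median_best": median(best),
        # Assumed to be a percentage (0-100) of features at or above threshold.
        "pct_above_threshold": 100.0 * sum(s >= threshold for s in best) / len(best),
    }

# Permutation check: B holds the same features as A in a different order,
# so every feature in A should find a near-perfect match.
feats_a = [[1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0], [0.0, 5.0, 1.0, 2.0]]
feats_b = [feats_a[2], feats_a[0], feats_a[1]]
stats = best_match_stats(feats_a, feats_b)
```

Because the features are centered before normalization, adding a constant offset to any feature leaves its scores unchanged, which is what "focus on relative patterns" refers to.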

📦 Installation

pip install "git+https://github.com/Course-Correct-Labs/featurematch.git"

🚀 Usage (5 lines)

import torch
from featurematch.featurematch import align_features

Z_a, Z_b = torch.randn(200, 64), torch.randn(200, 64)   # same layer & dataset
res = align_features(Z_a, Z_b, topk=5, threshold=0.8)
print(res.stats)  # {'mean_best': ..., 'median_best': ..., 'pct_above_threshold': ...}

📊 Interpretation Guide

  • mean_best ≥ 0.85: strong reproducibility (dictionaries mostly aligned)
  • 0.70 ≤ mean_best < 0.85: partial alignment; typical when seeds or hyperparameters differ
  • mean_best < 0.70: low alignment; the models likely learned different dictionaries
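
For scripted checks, the bands above could be wrapped in a small helper. This is a sketch only: interpret_mean_best is a hypothetical name and is not part of the FeatureMatch API, and the cutoffs are simply the ones listed in this guide.

```python
def interpret_mean_best(mean_best: float) -> str:
    """Map a mean_best score to the band names used in the guide above."""
    if mean_best >= 0.85:
        return "strong reproducibility"
    if mean_best >= 0.70:
        return "partial alignment"
    return "low alignment"
```

With the usage example earlier in these notes, a call might look like interpret_mean_best(res.stats["mean_best"]).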

📖 Demo

See notebooks/featurematch_demo.ipynb for examples with:

  • Permutation recovery (perfect alignment: mean=1.0, 100% above threshold)
  • Random baseline (low alignment: mean≈0.16, 0% above threshold)

🗺️ Roadmap

v0.1 is intentionally minimal. Planned additions, prioritized by demand:

  • v0.2: Jaccard for binarized codes, permutation baseline
  • v0.3: Hungarian matching, (optional) global CKA

📄 License

MIT © Course Correct Labs