This repo is a spec-first XAI stack:
- Lean: defines what it means for a representation to “explain” a concept, and how those claims compose.
- Python: trains models and produces empirical witnesses (linear probe fits + error metrics) that correspond to those Lean specs.
Most practical explainability (probes, linear separability, concept vectors) is fundamentally a statement about a representation:
A concept `c(x)` is explainable from a representation `rep(x)` if there exists a (linear) probe `w` such that `w(rep(x))` approximates `c(x)` with bounded error.
- In Lean, this is encoded as `ProbeSpecRep` / `ProbeSpecRepOn`.
- In Python, we fit a linear probe and compute an empirical δ̂.
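A minimal Lean sketch of what such a spec can look like; the name and signature here are illustrative, not necessarily the repo's exact `ProbeSpecRep`:

```lean
import Mathlib

-- Illustrative sketch only; the repo's ProbeSpecRep / ProbeSpecRepOn may differ
-- in naming and signature. "Concept c is explainable from rep by probe w with
-- tolerance δ" as a Prop:
def ProbeSpecRepSketch {X Z : Type*} (rep : X → Z) (c : X → ℝ)
    (w : Z → ℝ) (δ : ℝ) : Prop :=
  ∀ x : X, |w (rep x) - c x| ≤ δ
```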
- `VAE.lean`: minimal categorical shell + concrete specialization + probe spec.
- `VAEDerivations.lean`: monotonicity, transport, on-distribution specs, etc.
- `XAIRepresentation.lean`: the bridge that subsumes AutoEncoders (`Y = X`), Neural Nets (arbitrary `Y`), and explainability specs living on `rep : X → Z`.
- `XAIExplainability.lean`: aliases for `ProbeSpecRep` / `ProbeSpecRepOn`.
- `XAIComposition.lean`: aliases for `precompose` / `pullback_latent`.
- `XAIFactorization.lean`: bridge from any end-to-end `f : X → Y` to a chosen `rep` / `head` factorization (see the sketch below).
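As referenced above, a minimal Lean sketch of what a `rep` / `head` factorization claim can look like; the name and statement are illustrative, not the repo's exact definition:

```lean
-- Illustrative sketch only; the repo's factorization bridge may be stated
-- differently. "The end-to-end map f factors as head ∘ rep":
def FactorsThroughSketch {X Y Z : Type} (f : X → Y)
    (rep : X → Z) (head : Z → Y) : Prop :=
  ∀ x : X, f x = head (rep x)
```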
- `autoencoder.py`
  - `encode(x)` is the representation `rep : X → Z`
  - `decode(z)` is the decoder `dec : Z → X`
  - `forward(x)` is the reconstruction `recon(x) = dec(rep(x))`
- `neural_network.py`
  - `represent(x)` is `rep : X → Z`
  - `head(z)` is `head : Z → Y`
  - `forward(x) = head(represent(x))`
- `xai_probe_utils.py` (see the sketch after this list)
  - `fit_linear_probe_ridge(z, c)` fits a linear probe for concept prediction
  - `probe_metrics(...)` reports MAE, RMSE, R², and `delta_max = max_i |c_i - w(z_i)|` (the empirical sup error δ̂)
- `xai_demo.py`: trains the AE + RepNet on a synthetic dataset and probes a ground-truth concept from the learned representations.
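For concreteness, here is a self-contained sketch of what a ridge-probe fit and its metrics can look like; the actual `fit_linear_probe_ridge` / `probe_metrics` in `xai_probe_utils.py` may differ in signature and implementation details:

```python
import numpy as np

# Illustrative standalone sketch; not necessarily the repo's implementation.

def fit_linear_probe_ridge(z, c, lam=1e-3):
    """Closed-form ridge regression of concept c onto representation z.

    z : (n, d) array of representations rep(x_i)
    c : (n,)   array of concept values c(x_i)
    Returns (w, b) such that c_i ≈ z_i @ w + b.
    """
    z_aug = np.hstack([z, np.ones((z.shape[0], 1))])          # append bias column
    A = z_aug.T @ z_aug + lam * np.eye(z_aug.shape[1])        # regularized normal equations
    wb = np.linalg.solve(A, z_aug.T @ c)
    return wb[:-1], wb[-1]

def probe_metrics(z, c, w, b):
    """MAE, RMSE, R², and the empirical sup error delta_max = max_i |c_i - w(z_i)|."""
    pred = z @ w + b
    err = c - pred
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": float(1.0 - np.sum(err ** 2) / np.sum((c - c.mean()) ** 2)),
        "delta_max": float(np.max(np.abs(err))),               # empirical δ̂
    }

# Toy usage: a concept that is (noisily) linear in the representation.
rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
c = z @ rng.normal(size=8) + 0.01 * rng.normal(size=256)
w, b = fit_linear_probe_ridge(z, c)
print(probe_metrics(z, c, w, b))
```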
Run the demo:

```bash
python xai_demo.py
```