Discussion: Make psyphy feel more scipy like? Why we have a Posterior module (and how it differs from typical ML APIs) #21

@hmd101
TL;DR: The Posterior module exists because it’s scientifically correct and flexible. But we may want to add a light fit/predict shorthand so psyphy feels more like ML libraries, without losing its Bayesian clarity.

Background

In psyphy, inference routines (MAPOptimizer, LaplaceOptimizer, MCMC, etc.) return Posterior objects (MAPPosterior, LaplacePosterior, …).
These posterior classes provide methods like:

  • predict_prob(X)
  • predict_thresholds(level=...)
  • sample(n=...)
  • (and more model-specific summaries)

This design makes psyphy feel different from scikit-learn, PyTorch, or SciPy, where users typically expect:

model.fit(data)
preds = model.predict(X)

Why a dedicated Posterior abstraction?

  • Inference flexibility
    Different inference methods produce very different outputs:

    • MAP → point estimate
    • Laplace → Gaussian approximation
    • MCMC → samples

    Having dedicated posterior classes keeps these semantics clear, while still offering a unified API (predict_prob, predict_thresholds).
  • Extensibility
    Trial placement strategies and experiment sessions can interact with any posterior type without needing to know how it was fit.

  • Scientific clarity:
    In Bayesian modeling, the posterior is conceptually distinct from the model.
    Keeping it explicit emphasizes that distinction and avoids overloading model.predict with too many meanings.

  • Lastly, established Bayesian optimization libraries such as BoTorch also expose a dedicated Posterior abstraction rather than folding prediction into the model
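To make the flexibility argument concrete, here is a minimal sketch of what a shared posterior interface could look like. The class and method names are illustrative stand-ins, not psyphy's actual implementation: the point is that a MAP point estimate and MCMC samples can expose the same predict_prob, so downstream code never needs to know which inference routine ran.

```python
from dataclasses import dataclass

import numpy as np


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class Posterior:
    """Common interface every inference routine returns (illustrative)."""

    def predict_prob(self, X):
        raise NotImplementedError


@dataclass
class MAPPosterior(Posterior):
    theta: np.ndarray  # single point estimate

    def predict_prob(self, X):
        # predict from the one MAP parameter vector
        return _sigmoid(X @ self.theta)


@dataclass
class MCMCPosterior(Posterior):
    samples: np.ndarray  # (n_draws, n_params) posterior draws

    def predict_prob(self, X):
        # average predictions over posterior draws
        return _sigmoid(X @ self.samples.T).mean(axis=1)
```

A trial-placement strategy can then accept any `Posterior` and call `predict_prob` polymorphically, which is the extensibility benefit described above.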

Current psyphy style:

from psyphy import WPPM, Prior, OddityTask, Noise, MAPOptimizer

# Explicit model + inference setup
model = WPPM(prior=Prior.default(2), task=OddityTask(), noise=Noise.default())
opt = MAPOptimizer(model)
posterior = opt.fit(data)

# Use posterior directly
probs = posterior.predict_prob(X)
thresholds = posterior.predict_thresholds(0.75)

Why this might feel unfamiliar

  • In ML frameworks (scikit-learn, PyTorch), users rarely interact with “posterior” objects
  • Instead, the model itself stores fitted parameters and exposes predict
  • As a result, psyphy can feel verbose or “non-standard” to ML users

Potential ways forward

I think we don’t want to lose the scientific clarity and extensibility of the Posterior module — but we can make psyphy friendlier by layering a lightweight ML-like API on top:

Option 1: ML-style shorthand

model = WPPM.default(input_dim=2)
posterior = model.fit(data, method="MAP")   # internally returns Posterior
y_pred = model.predict(X)                   # calls posterior.predict_prob
thresholds = model.predict_thresholds(0.75)

  • keeps Posterior under the hood
  • adds a familiar fit/predict entry point for users coming from scikit-learn/PyTorch/SciPy
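A minimal sketch of how the Option 1 shorthand could delegate internally. Everything here is hypothetical (the stub posterior and `_run_inference` stand in for MAPOptimizer/LaplaceOptimizer/MCMC dispatch); it just shows that the wrapper adds no new semantics, only a thinner entry point over the existing Posterior object.

```python
class _StubPosterior:
    """Stand-in for a fitted posterior; real code would return
    MAPPosterior, LaplacePosterior, etc. depending on `method`."""

    def predict_prob(self, X):
        return [0.5 for _ in X]


def _run_inference(data, method):
    # placeholder for dispatching to the chosen inference routine
    return _StubPosterior()


class WPPM:
    def __init__(self):
        self._posterior = None

    def fit(self, data, method="MAP"):
        # fit stores the posterior AND returns it, so the explicit
        # Bayesian object stays available to users who want it
        self._posterior = _run_inference(data, method)
        return self._posterior

    def predict(self, X):
        # thin delegation: predict is just posterior.predict_prob
        if self._posterior is None:
            raise RuntimeError("call fit() before predict()")
        return self._posterior.predict_prob(X)
```

Returning the posterior from `fit` (rather than `self`, as scikit-learn does) is one way to keep the dual API honest: the shorthand and the explicit object are the same thing.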

Option 2: Results objects

Introduce FitResult that bundles model, posterior, parameters, and goodness-of-fit:

result = pp.fit_wppm(data)
print(result.params)
posterior = result.posterior
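A possible shape for the bundle, sketched as a dataclass. The field names (`params`, `log_likelihood`) and the `summary` helper are illustrative assumptions, not a proposed final API; the idea is simply that one returned object carries the model, the posterior, and fit diagnostics together.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FitResult:
    """Bundle returned by a hypothetical pp.fit_wppm (illustrative)."""

    model: Any          # the fitted WPPM
    posterior: Any      # MAPPosterior / LaplacePosterior / ...
    params: dict        # point estimates or posterior summaries
    log_likelihood: float  # one possible goodness-of-fit measure

    def summary(self) -> str:
        # quick human-readable report for interactive use
        return f"FitResult(params={self.params}, logL={self.log_likelihood:.3f})"
```

This mirrors the results-object pattern in scipy.optimize (`OptimizeResult`) and statsmodels, which may be the most familiar idiom for the SciPy-leaning audience this discussion targets.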

Option 3: Do nothing (status quo)

Keep the Posterior module as the primary interface, and focus on documentation to explain why this is different from ML libraries.

Open questions

  • Should we prioritize familiarity (ML-style API) or clarity (explicit Posterior objects)?
  • Would a dual API (shorthand + explicit) be worth maintaining?
