
SDT Bayesian Models

This repository contains code for data simulation and analysis for three signal detection theory (SDT) models, assessed following a Bayesian workflow.

These resources were of great help:

1) https://github.com/junpenglao/Bayesian-Cognitive-Modeling-in-Pymc3

2) https://mvuorre.github.io/posts/2017-10-09-bayesian-estimation-of-signal-detection-theory-models/#evsdt-for-multiple-participants

Generative Model

The model is inspired by the approach of Lee and Wagenmakers (2013). Two functions are relevant here: the Gaussian cumulative distribution function, referred to as Φ, and the error function erf on which it relies:

erf(z) = (2/√π) ∫₀ᶻ exp(-t²) dt

Φ(x) = 0.5(1 + erf(x/√2))
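As a quick sanity check (using only NumPy and SciPy), Φ built from erf as above matches the standard normal CDF:

```python
import numpy as np
from scipy import special, stats

def phi(x):
    # Gaussian CDF built from the error function, as defined above
    return 0.5 * (1.0 + special.erf(x / np.sqrt(2.0)))

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
assert np.allclose(phi(x), stats.norm.cdf(x))
```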

With these, a Bayesian SDT model can be built as follows:

d ~ Normal(0, 1)

c ~ Normal(0, 1)

H = Φ(d/2 - c)

F = Φ(-d/2 - c)

yH ~ Binomial(H, s)

yF ~ Binomial(F, n)

We develop a generative model that simulates data with correlated d' and c parameters, i.e. an association between sensitivity and bias (Lynn & Barrett, 2014). In Python, the generative model looks like this:
```python
import numpy as np
from scipy import stats

cdf = stats.norm.cdf #same as Φ

np.random.seed(33)

g = 2 #number of groups (conditions)
p = 100 #number of participants

# simulate experiment where sensitivity (d') is correlated with bias (c)
# as d' increases c decreases
rho_high = -0.05 #correlation for high sensitivity condition
d_std = 0.5 #d' standard deviation
c_std = 0.5 #c standard deviation
mean = [2, 0.1] #d' mean (2) and c mean (0.1), i.e. high sensitivity and low bias
cov = [[d_std**2, rho_high * d_std * c_std],
       [rho_high * d_std * c_std, c_std**2]] #covariance with correlation
d_high, c_high = np.random.multivariate_normal(mean, cov, size=p).T #generate correlated variables via an mv normal
correlation_high = np.corrcoef(d_high, c_high)[0, 1]

rho_low = -0.6 #correlation for low sensitivity condition
d_std = 0.5
c_std = 0.5
mean = [1, 0.5] #d' mean (1) and c mean (0.5), i.e. low sensitivity and higher bias
cov = [[d_std**2, rho_low * d_std * c_std],
       [rho_low * d_std * c_std, c_std**2]]
d_low, c_low = np.random.multivariate_normal(mean, cov, size=p).T
correlation_low = np.corrcoef(d_low, c_low)[0, 1]

sig = np.array([np.repeat(25, p), np.repeat(25, p)]) #fixed number of signal trials (25)
noi = np.array([np.repeat(75, p), np.repeat(75, p)]) #fixed number of noise trials (75)

d_prime = np.array([d_high, d_low])
c_bias = np.array([c_high, c_low])

hits = np.random.binomial(sig, cdf(0.5*d_prime - c_bias)) #derive hits from d' and c
fas = np.random.binomial(noi, cdf(-0.5*d_prime - c_bias)) #derive false alarms from d' and c
```
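As a sanity check on this kind of simulation, d' and c can be recovered from simulated counts via the inverse of Φ, using the standard SDT identities d' = Φ⁻¹(H) - Φ⁻¹(F) and c = -(Φ⁻¹(H) + Φ⁻¹(F))/2. A self-contained sketch (illustrative values; trial counts are made large so the point estimates are stable):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(33)
cdf, ppf = stats.norm.cdf, stats.norm.ppf

d_true, c_true = 1.5, 0.2
n_sig, n_noi = 100_000, 100_000  # many trials, so rates are near their expectations

# simulate counts exactly as in the generative model above
hits = rng.binomial(n_sig, cdf(0.5 * d_true - c_true))
fas = rng.binomial(n_noi, cdf(-0.5 * d_true - c_true))

zH, zF = ppf(hits / n_sig), ppf(fas / n_noi)
d_hat = zH - zF              # d' = Φ⁻¹(H) - Φ⁻¹(F)
c_hat = -0.5 * (zH + zF)     # c  = -(Φ⁻¹(H) + Φ⁻¹(F)) / 2
```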

The image below summarises the simulations as receiver operating characteristic (ROC) curves with areas under the curve (AUC), and as density plots of the signal (hit) and noise (false-alarm) distributions.
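Under the equal-variance Gaussian SDT model used here, the ROC area has a closed form, AUC = Φ(d'/√2), which a numerical sweep of the criterion confirms (NumPy/SciPy sketch, not code from this repository):

```python
import numpy as np
from scipy import stats

def sdt_auc(d_prime):
    # Closed-form ROC area for equal-variance Gaussian SDT: AUC = Φ(d'/√2)
    return stats.norm.cdf(d_prime / np.sqrt(2.0))

# Numerical check: trace the ROC by sweeping the criterion c from high to low
d_p = 1.0
c_grid = np.linspace(6, -6, 4001)       # descending c → ascending H and F
H = stats.norm.cdf(d_p / 2 - c_grid)    # hit rate at each criterion
F = stats.norm.cdf(-d_p / 2 - c_grid)   # false-alarm rate at each criterion
auc_numeric = np.sum(0.5 * (H[1:] + H[:-1]) * np.diff(F))  # trapezoid rule
```

d' = 0 gives chance performance (AUC = 0.5); larger d' pushes the ROC toward the upper-left corner.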

Base Model

We first implement a Base Model with "fixed" (non-hierarchical) parameters across groups and participants, where group 1 is the high-sensitivity condition and group 2 the low-sensitivity condition.

d_g,p ~ Normal(0, 1)

c_g,p ~ Normal(0, 1)

H_g,p = Φ(d_g,p/2 - c_g,p)

F_g,p = Φ(-d_g,p/2 - c_g,p)

yH_g,p ~ Binomial(H_g,p, s)

yF_g,p ~ Binomial(F_g,p, n)

Where g indexes groups (1...G, G = 2), p indexes participants (1...P, P = 100), s = hits + misses (signal trials), and n = false alarms (FA) + correct rejections (CR) (noise trials). Observations yH are total hits and observations yF are total FAs, simulated from the generative model presented above.
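To make the Base Model concrete, the posterior for a single (g, p) cell can be approximated on a grid using only SciPy (the counts below are hypothetical, and the repository itself fits the models with MCMC rather than a grid):

```python
import numpy as np
from scipy import stats

# Hypothetical observed counts for one participant in one group
y_h, s = 18, 25   # hits out of s = hits + misses signal trials
y_f, n = 20, 75   # false alarms out of n = FAs + CRs noise trials

# Grid over (d, c) under the Normal(0, 1) priors of the Base Model
d_grid = np.linspace(-4, 4, 201)
c_grid = np.linspace(-4, 4, 201)
d, c = np.meshgrid(d_grid, c_grid, indexing="ij")

H = stats.norm.cdf(d / 2 - c)    # hit probability
F = stats.norm.cdf(-d / 2 - c)   # false-alarm probability

log_post = (
    stats.norm.logpdf(d) + stats.norm.logpdf(c)   # priors
    + stats.binom.logpmf(y_h, s, H)               # hit likelihood
    + stats.binom.logpmf(y_f, n, F)               # false-alarm likelihood
)
post = np.exp(log_post - log_post.max())
post /= post.sum()

d_mean = (post * d).sum()  # posterior mean of d'
c_mean = (post * c).sum()  # posterior mean of c
```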

Model 1

We extend the Base Model via a non-centred parametrisation using Gaussian and half-Gaussian distributions.

d_l ~ Normal(0, 1)

d_z;g,p ~ Normal(0, 1)

d_s ~ HalfNormal(1)

d_g,p = d_l + d_z;g,p d_s

c_l ~ Normal(0, 1)

c_z;g,p ~ Normal(0, 1)

c_s ~ HalfNormal(1)

c_g,p = c_l + c_z;g,p c_s

H_g,p = Φ(d_g,p/2 - c_g,p)

F_g,p = Φ(-d_g,p/2 - c_g,p)

yH_g,p ~ Binomial(H_g,p, s)

yF_g,p ~ Binomial(F_g,p, n)

Besides the reparametrised d and c, all other elements of the model remain as in the Base Model.
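The point of the non-centred parametrisation can be shown in a few lines: for a fixed location and scale, shifting and scaling a standard normal yields exactly the centred distribution, so the model is unchanged while the posterior geometry (the "funnel" between location/scale and group-level effects) becomes easier for the sampler. An illustrative NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(33)

# For fixed location d_l and scale d_s, the non-centred construction
# d = d_l + d_z * d_s with d_z ~ Normal(0, 1) is distributionally
# identical to the centred d ~ Normal(d_l, d_s)
d_l, d_s = 1.0, 0.5
d_z = rng.standard_normal(200_000)
d_noncentred = d_l + d_z * d_s
```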

Model 2 (LKJ Model)

Finally, we build a model with LKJ correlation priors (see Lewandowski et al., 2009) to assess the association between sensitivity and bias.

ρ1, ρ2 ~ LKJCorr(2, 2)

σ_d1, σ_c1, σ_d2, σ_c2 ~ HalfNormal(1)

μ_d1, μ_c1, μ_d2, μ_c2 ~ Normal(0, 1)

Σ1 = [[σ_d1², σ_d1 σ_c1 ρ1], [σ_d1 σ_c1 ρ1, σ_c1²]]

Σ2 = [[σ_d2², σ_d2 σ_c2 ρ2], [σ_d2 σ_c2 ρ2, σ_c2²]]

(d1, c1) ~ MvNormal([μ_d1, μ_c1], Σ1)

(d2, c2) ~ MvNormal([μ_d2, μ_c2], Σ2)

d = [d1, d2]

c = [c1, c2]

H_g,p = Φ(d_g,p/2 - c_g,p)

F_g,p = Φ(-d_g,p/2 - c_g,p)

yH_g,p ~ Binomial(H_g,p, s)

yF_g,p ~ Binomial(F_g,p, n)

Where LKJCorr provides priors for the correlations in the covariance matrices Σ1 and Σ2, and the d and c parameters are now multivariate Gaussian with Gaussian means and the aforementioned Σ covariances.
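The construction of Σ1 and Σ2 can be sketched directly: assemble the covariance from the standard deviations and the correlation, and draws from the resulting multivariate normal reproduce that correlation (illustrative values below, not posterior draws):

```python
import numpy as np

rng = np.random.default_rng(33)

# Covariance built from standard deviations and a correlation, as in Σ1/Σ2
sigma_d, sigma_c, rho = 0.5, 0.5, -0.6
cov = np.array([
    [sigma_d**2, rho * sigma_d * sigma_c],
    [rho * sigma_d * sigma_c, sigma_c**2],
])

# Correlated (d, c) draws, as in the MvNormal parameters of Model 2
draws = rng.multivariate_normal([1.0, 0.5], cov, size=100_000)
emp_rho = np.corrcoef(draws[:, 0], draws[:, 1])[0, 1]
```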

Results

We run prior predictive checks (see the prior_predictions folder), which show that the priors have good coverage (see image below). We do not calibrate the priors, as we want to test the fixed priors of the Base Model and see how well the priors of Models 1 and 2 can adapt to the "true" values of the simulated d' and c.

After this, we sampled all models via Markov chain Monte Carlo (MCMC) using the No-U-Turn Sampler (NUTS), with 1000 tuning steps, 1000 samples, and 4 chains. See the model_convergence folder for full details (r-hats, ESS, etc.); the image below provides a general summary of posteriors and rank plots, indicating good convergence.

Next, we performed a precision analysis to check whether precision is adequate at 100 participants per group. All models reached a reasonable precision of around 0.1.

After this, we assessed posteriors via scatter plots and the squared Hellinger distance H² (see Pardo, 2018), to quantify how much the posterior distributions of d' and c differ from their observed (simulated) counterparts. The image below summarises these results.
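For reference, between two univariate Gaussians the squared Hellinger distance has a closed form; a minimal sketch (the repository may compute H² from posterior samples rather than fitted Gaussians):

```python
import numpy as np

def hellinger2_gauss(mu1, s1, mu2, s2):
    # Squared Hellinger distance between N(mu1, s1²) and N(mu2, s2²):
    # H² = 1 - sqrt(2 s1 s2 / (s1² + s2²)) · exp(-(mu1 - mu2)² / (4 (s1² + s2²)))
    coef = np.sqrt(2.0 * s1 * s2 / (s1**2 + s2**2))
    return 1.0 - coef * np.exp(-((mu1 - mu2) ** 2) / (4.0 * (s1**2 + s2**2)))

# H² lies in [0, 1]: 0 for identical distributions, growing with the mean gap
assert np.isclose(hellinger2_gauss(0, 1, 0, 1), 0.0)
```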

ROC curve plots with AUC measures also indicate better approximations by Models 1 and 2.

Additionally, we performed posterior predictive checks, which indicate similar results. In the images below, SDI refers to the Sørensen–Dice index (see Costa, 2021), which measures similarity between variables.
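For reference, the Sørensen–Dice index of two binary vectors is 2|A ∩ B| / (|A| + |B|); a minimal sketch (the repository's SDI computation, following Costa (2021), may differ in detail):

```python
import numpy as np

def sorensen_dice(a, b):
    # Sørensen–Dice index for two binary arrays: 2 |A ∩ B| / (|A| + |B|)
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

# One shared positive out of two positives in each vector → SDI = 0.5
assert sorensen_dice([1, 1, 0, 0], [1, 0, 1, 0]) == 0.5
```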

Finally, we conducted a model comparison after applying re-LOO to each model (see Vehtari et al., 2017; see also: https://python.arviz.org/en/stable/user_guide/pymc_refitting.html). Comparisons indicate higher expected log predictive densities for Model 2.

Conclusion

Results indicate that the varying-effects (multilevel) Model 1 improves on the Base Model but shows minor convergence issues. Model 2, with LKJ priors, not only provides the same multilevel parametric form as Model 1 (though centred), but also includes correlation priors capable of capturing the association between the d' and c parameters, and it converges better. Further, Model 2, which is essentially an extension of the previous models (it only changes the parametrisation and contains no new variables), approximates the simulated measures as well as Model 1 and shows better predictive accuracy.

References

Costa, L. J. (2021). Further Generalizations of the Jaccard Index. https://doi.org/10.48550/arxiv.2110.09619

Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian Cognitive Modeling: A Practical Course. Cambridge University Press.

Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9), 1989–2001. https://doi.org/10.1016/j.jmva.2009.04.008

Lynn, S. K., & Barrett, L. F. (2014). “Utilizing” Signal Detection Theory. Psychological Science, 25(9), 1663–1673. https://doi.org/10.1177/0956797614541991

Pardo, L. (2018). Statistical Inference Based on Divergence Measures. CRC Press.

Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
