Group means for transformed parameters #156
@youngahn For transformed parameters (e.g., learning rates), our Stan models look something like this:
parameters {
  // group-level pars
  real mu_A_pr;
  real sigma_A;
  // person-level (non-centered)
  vector[N] A_pr;
}
transformed parameters {
  // person-level transformed
  vector[N] A;
  for (i in 1:N) {
    A[i] = Phi_approx(mu_A_pr + sigma_A * A_pr[i]);
  }
}
Then, we later compute the group-level mean for the given parameter like so:
generated quantities {
  real mu_A;
  mu_A = Phi_approx(mu_A_pr);
  ...
}
This is not quite correct if we want mu_A to be the mean of the person-level parameters after the transformation. A quick simulation shows the problem:
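mu_A_pr = -1
sigma_A = 1
# non-centered: A_pr is standard normal, as in the Stan model
A_pr = rnorm(100, 0, 1)
A = pnorm(mu_A_pr + sigma_A * A_pr)
# what the Stan model currently reports
mu_A = pnorm(mu_A_pr)
# mean of the person-level parameters after transformation
actual_mu_A = mean(A)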
Above, mu_A will consistently mis-estimate actual_mu_A, because it captures the group-level median rather than the mean. Phi_approx is monotonic, so it maps the median of the untransformed distribution to the median of the transformed one, but it is nonlinear, so it does not do the same for the mean. The reason is outlined in #117. The fix is rather simple, although it requires some post-hoc simulation. Instead of taking mu_A = Phi_approx(mu_A_pr) in the Stan model, we could do something like the following externally:
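library(foreach)
# `fit` is assumed to be the stanfit object for the model above
pars = rstan::extract(fit)
n_MCMC_samples = length(pars$mu_A_pr)
mu_A = foreach(i=1:n_MCMC_samples, .combine="c") %do% {
  mu_A_pr = pars$mu_A_pr[i]
  sigma_A = pars$sigma_A[i]
  # simulate 10000 people from the group-level distribution, then transform
  A_approx = pnorm(rnorm(10000, mu_A_pr, sigma_A))
  mean(A_approx)
}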
In the above, for each MCMC iteration we simulate 10000 "people/examples" from the group-level distribution, apply the transformation, and then compute the mean of the simulated examples. 10000 is arbitrary, but higher values produce less approximation error, so it is a reasonable default. The resulting mu_A is the posterior distribution of the group-level mean of A on the (0,1) scale.
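As a possible sanity check on the simulation: for the exact normal CDF (pnorm in R), the transformed mean has a closed form. The derivation below is a sketch, not something taken from #117. It writes \mu and \sigma as shorthand for mu_A_pr and sigma_A, lets X be a person-level value on the unconstrained scale, and introduces an auxiliary standard normal W, independent of X, purely for the argument.

```latex
\begin{aligned}
X &\sim \mathcal{N}(\mu, \sigma^2), \qquad W \sim \mathcal{N}(0, 1) \ \text{independent of } X \\
\mathbb{E}[\Phi(X)] &= \Pr(W \le X) = \Pr(W - X \le 0),
  \qquad W - X \sim \mathcal{N}(-\mu,\, 1 + \sigma^2) \\
&= \Phi\!\left(\frac{\mu}{\sqrt{1 + \sigma^2}}\right) \\
\operatorname{median}[\Phi(X)] &= \Phi(\operatorname{median}[X]) = \Phi(\mu)
\end{aligned}
```

With \mu = -1 and \sigma = 1, this gives a median of \Phi(-1) \approx 0.159 and a mean of \Phi(-1/\sqrt{2}) \approx 0.240. Since Phi_approx only approximates \Phi, the closed form would hold only approximately for the Stan model itself, so the simulation remains the more general route.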
NOTE: This is only relevant for transformed parameters, e.g., if we use Phi_approx or similar for bounded parameters. For parameters that are unbounded and not transformed, the above steps are redundant (mu_A and actual_mu_A would be the same if we did not need Phi_approx/pnorm).
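A minimal sketch of this point, assuming illustrative values. The names mu_pr, sigma, n_sim, and the seed are made up for the example and are not taken from the package. It draws a large number of simulated people and compares the untransformed and pnorm-transformed cases:

```r
set.seed(1)      # arbitrary seed, only for reproducibility

mu_pr <- -1      # illustrative group-level location (unconstrained scale)
sigma <- 1       # illustrative group-level SD
n_sim <- 1e6     # number of simulated "people"

# person-level values on the unconstrained scale
x <- rnorm(n_sim, mu_pr, sigma)

# Untransformed parameter: the mean of the draws agrees with mu_pr
mean_identity <- mean(x)

# Transformed parameter: pnorm(mu_pr) agrees with the median of the
# transformed draws, not with their mean
mean_probit   <- mean(pnorm(x))
median_probit <- median(pnorm(x))

print(c(mu_pr = mu_pr, mean_identity = mean_identity))
print(c(pnorm_mu_pr   = pnorm(mu_pr),
        median_probit = median_probit,
        mean_probit   = mean_probit))
```

In the first printout the two numbers should agree up to simulation noise. In the second, pnorm(mu_pr) should sit next to the median while the mean is visibly different.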
NOTE 2: This is all related to #117, although the solution was not spelled out in that issue. It is not really a huge bug, but still one that should be addressed. Any comparisons that people have made between the group-level parameters are still valid, although the estimand has unintentionally been the median rather than the mean 🤓