Group means for transformed parameters

@youngahn For transformed parameters (e.g., learning rates), our stan models look something like this: 

```{stan}
parameters {
  // group-level pars
  real mu_A_pr;
  real sigma_A;
  
  // person-level (non-centered)
  vector[N] A_pr;
}

transformed parameters {
  // person-level transformed
  vector[N] A;
  
  for (i in 1:N) {
    A[i] = Phi_approx(mu_A_pr + sigma_A * A_pr[i]);
  }
 }
```

Then, we later compute the group-level mean for the given parameter like so: 

```{stan}
generated quantities {
  real  mu_A;
  mu_A = Phi_approx(mu_A_pr);
  ...
}
```

This is actually not fully correct assuming that we want `mu_A` to indicate the mean across person-level parameters after undergoing the transformation. This can easily be observed with a simulation: 

```{r}
mu_A_pr = -1
sigma_A = 1

A_pr = rnorm(100, mu_A_pr, sigma_A)

A = pnorm(mu_A_pr + sigma_A * A_pr)

mu_A = pnorm(mu_A_pr)
actual_mu_A = mean(A)
```

Above, `mu_A` will consistently mis-estimate `actual_mu_A`, as it is capturing the group-level median rather than mean. The reason is outlined in https://github.com/CCS-Lab/hBayesDM/issues/117. The fix is actually rather simple, although it requires some post-hoc simulation. Instead of simply taking `mu_A = Phi_approx(mu_A_pr)` in the Stan model, we could do something like the following externally: 

```{r}
pars = <extract parameters from Stan>

mu_A = foreach(i=1:n_MCMC_samples, .combine="c") %do% {
  mu_A_pr = pars$mu_A_pr[i] 
  sigma_A = pars$sigma_A[i] 
  A_approx = pnorm(rnorm(10000, mu_A_pr, sigma_A)); 
  
  mean(A_approx)
}
```

In the above, we simulate `10000` "people/examples" from each MCMC iteration in the group-level distribution, do the transformation, and then compute the mean of the simulated examples. `10000` is arbitrary, but higher values will produce less approximation error so it it a reasonable default. The resulting `mu_A` will now actually be the posterior distribution on the group-level `A` parameter on the `(0,1)` scale. 

NOTE: this is only relevant for the transformed parameters. e.g., if we use `Phi_approx` or similar for bounded parameters. In the case of parameters that are not transformed/are unbounded, the above steps are redundant (`mu_A` and `actual_mu_A` would be the same if we did not need `Phi_approx/pnorm`)

NOTE 2: This is all related to https://github.com/CCS-Lab/hBayesDM/issues/117, although the solution was not spelled out in that issue. It is not really a huge bug, but still one that should be addressed. Any comparisons that people have made between the group-level parameters are still valid, although the estimand has unintentionally been the median rather than the mean 🤓  

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Group means for transformed parameters #156

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Group means for transformed parameters #156

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions