Standard errors accommodate covariance adjustment models fit using lme4 (#192)

## Goal Formulation
We want our standard error estimation routine to accommodate mixed-effects covariance adjustment models fit using `lme4`. To make this possible we need:
1. To be able to create `SandwichLayer` objects from `lme4` fitted model objects (the `merMod` class)
2. To make sure our `estfun.teeMod` method works with the resulting objects

### Making a `SandwichLayer` object
We first need to be able to make a `PreSandwichLayer` object. This requires an S3 method for the S3 generic `.make_PreSandwichLayer()`. The `PreSandwichLayer` S4 class inherits from numeric vectors and has additional slots for a `fitted_covariance_model` and a `prediction_gradient`. The `fitted_covariance_model` slot is simply the fitted covariance adjustment model object. The numeric vector acting as the base object is a vector of predictions from the model. Existing methods for `.make_PreSandwichLayer()` use the same routine for generating predictions (see the [default method](https://github.com/benbhansen-stats/propertee/blob/7ccf9d6e428b2347adf64081a70ca578b034eedf/R/SandwichLayer.R#L137) as an example):
1. (Using our current parlance of $\mathcal{Q}$ for the assignment sample) Build a dataframe `mf` with the covariates associated with $\beta$ for units of observation in $\mathcal{Q}$
2. Build a model matrix `X` from `mf` using the model's formula
3. Generate model predictions from `X` and the model's coefficients
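
As a rough illustration of that three-step routine, here is a minimal sketch on an ordinary `lm` fit. The toy data, `fit`, and `newdata` (standing in for the units in $\mathcal{Q}$) are made up for the example; this is not the package's actual method.

```r
# Toy data; `fit` stands in for a fitted covariance adjustment model
set.seed(1)
dat <- data.frame(y = rnorm(20), x1 = rnorm(20),
                  g = factor(rep(c("a", "b"), 10)))
fit <- stats::lm(y ~ x1 + g, data = dat)
newdata <- dat[1:5, ]  # hypothetical stand-in for the units in Q

# 1. Model frame holding the covariates for the units in Q
tt <- stats::delete.response(stats::terms(fit))
mf <- stats::model.frame(tt, data = newdata, na.action = stats::na.pass,
                         xlev = fit$xlevels)

# 2. Model matrix built from the model frame
X <- stats::model.matrix(tt, data = mf, contrasts.arg = fit$contrasts)

# 3. Predictions from X and the model's coefficients
preds <- drop(X %*% stats::coef(fit))
```

For a linear model these predictions should agree with `predict(fit, newdata)`, and `X` is what would go into the `prediction_gradient` slot.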

We've been taking this approach because `X` is part or all of what we store in the `prediction_gradient` slot. The `prediction_gradient` is the gradient of the estimating equations for the treatment effect with respect to the parameters estimated in the covariance adjustment models. For linear covariance adjustment models, `X` is the gradient with respect to the fixed effects. The gradient with respect to the covariance parameters is 0.

We need to change the code for generating `X` for `merMod` objects relative to the other S3 methods we've written, because we need to generate the argument to `xlev` ourselves rather than rely on it being stored on the fitted model object. It turns out we can extract the levels from a `terms` object and a model frame using `.getXlevels()`, a utility exported from `stats` (so `stats::` is enough and `:::` isn't needed). We would get `xlev` by running `xlev <- stats::.getXlevels(terms(model), stats::model.frame(model))`, and passing it to the `xlev` argument of the `stats::model.frame()` call used for creating `mf`.
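
To see why `xlev` matters, here is a small sketch using an `lm` fit as a stand-in (the data and object names are invented for illustration). The new data only observes one of the three factor levels, yet the model matrix should still get a column for every non-reference level.

```r
set.seed(2)
dat <- data.frame(y = rnorm(12), g = factor(rep(c("a", "b", "c"), 4)))
fit <- stats::lm(y ~ g, data = dat)

# Recover the factor levels from the terms and the fitting-time model frame
xlev <- stats::.getXlevels(stats::terms(fit), stats::model.frame(fit))

# New data that only observes level "b"
newdata <- data.frame(g = factor(c("b", "b")))

tt <- stats::delete.response(stats::terms(fit))
mf <- stats::model.frame(tt, data = newdata, na.action = stats::na.pass,
                         xlev = xlev)
X <- stats::model.matrix(tt, data = mf)

# Columns line up with the fitted coefficients despite the missing levels
colnames(X)
```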

We need to do something similar for `contrasts`. We can get `contrasts.arg` by running `contrasts.arg <- attr(model.matrix(model), "contrasts")`, then pass that to the `contrasts.arg` argument of the `stats::model.matrix()` call for creating `X`.
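
A quick sketch of the contrasts step, again with an `lm` fit as a stand-in and made-up data. The model is fit with sum-to-zero contrasts so that forgetting `contrasts.arg` visibly changes the columns of the rebuilt model matrix.

```r
set.seed(3)
dat <- data.frame(y = rnorm(12), g = factor(rep(c("a", "b", "c"), 4)))
# Fit with sum-to-zero contrasts rather than the default treatment contrasts
fit <- stats::lm(y ~ g, data = dat, contrasts = list(g = "contr.sum"))

# Contrasts recorded on the fitting-time model matrix
contrasts.arg <- attr(stats::model.matrix(fit), "contrasts")

tt <- stats::delete.response(stats::terms(fit))
xlev <- stats::.getXlevels(stats::terms(fit), stats::model.frame(fit))
mf <- stats::model.frame(tt, data = dat[1:3, , drop = FALSE], xlev = xlev)

# With contrasts.arg the coding matches the fit; without it, the session default is used
X_new <- stats::model.matrix(tt, data = mf, contrasts.arg = contrasts.arg)
X_default <- stats::model.matrix(tt, data = mf)
```

Only `X_new` has columns that match the coefficients of `fit`, which is what the predictions and the gradient need.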

When it comes to the predictions, though, we want to include the random effects as well, so we don't want to use just `X`. It also ends up being a little more work to get a matrix for making random effects predictions from a `merMod` object, because its `model.frame()` and `model.matrix()` methods just return the model frame and model matrix used for model fitting. In light of this, I propose getting the predictions directly with `linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE)`.
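
To illustrate what `allow.new.levels = TRUE` does, here is a small sketch using the `sleepstudy` data that ships with `lme4`. The subject label `"not_in_fit"` is made up; the expectation is that a grouping level unseen at fitting time gets the population-level (fixed-effects-only) prediction, while a known level also gets its random effect.

```r
library(lme4)

fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# One subject seen at fitting time, one that was not
newdata <- data.frame(Days = c(3, 3),
                      Subject = factor(c("308", "not_in_fit")))

# Predictions including random effects where they are available
linpred <- predict(fit, newdata, type = "response", allow.new.levels = TRUE)

# Fixed-effects-only predictions for comparison
fixed_only <- predict(fit, newdata, re.form = NA)
```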

All together, the `.make_PreSandwichLayer.lmerMod` method could look like this:
```r
.make_PreSandwichLayer.lmerMod <- function(model, newdata = NULL, ...) {
  tt <- tryCatch(
    stats::terms(model),
    error = function(e) stop("`model` must have `terms` method", call. = FALSE)
  )

  if (is.null(newdata)) newdata <- .get_data_from_model("cov_adj", formula(model))

  # Recover factor levels and contrasts from the fitted model
  xlev <- stats::.getXlevels(tt, stats::model.frame(model))
  contrasts.arg <- attr(stats::model.matrix(model), "contrasts")

  tt <- stats::delete.response(tt)
  mf <- stats::model.frame(tt, data = newdata, na.action = stats::na.pass, xlev = xlev, ...)
  X <- stats::model.matrix(tt, data = mf, contrasts.arg = contrasts.arg, ...)
  # Gradient with respect to the covariance parameters is 0
  dtheta <- matrix(0, nrow = nrow(X), ncol = length(model@theta) + 1) # + 1 is for the residual variance

  # Predictions include the random effects
  linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE)

  return(new("PreSandwichLayer",
             linpred,
             fitted_covariance_model = model,
             prediction_gradient = cbind(X, dtheta)))
}
```

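If we want to write a `.make_PreSandwichLayer` method for `glmerMod` objects, we need `linpred` to be on the link scale (so `type = "link"` rather than `type = "response"` in the `predict()` call, otherwise the inverse link would be applied twice), and we need to change the `return()` call to: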
```r
  return(new("PreSandwichLayer",
             stats::family(model)$linkinv(linpred),
             fitted_covariance_model = model,
             prediction_gradient = stats::family(model)$mu.eta(linpred) * X))
```

Now, we need `as.SandwichLayer()` to be able to convert a `PreSandwichLayer` to a `SandwichLayer`. It appears that we need one change to make this work: `merMod` objects don't have a `$call` element but rather an `@call` slot. We could write a function `.get_call()` that properly retrieves the `call` from an S3 or S4 object to patch this up. EDIT: try using `getCall()` for this.
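
A small sketch of why `getCall()` looks promising here: the default method in `stats` retrieves a `call` list element from S3 objects and a `call` slot from S4 objects. The `ToyS4Fit` class below is a hypothetical stand-in for an S4 fitted model class, not anything from `lme4`; whether `getCall()` behaves as desired on real `merMod` objects still needs to be checked.

```r
library(methods)

# S3 case: `lm` stores its call as a list element
fit <- stats::lm(dist ~ speed, data = datasets::cars)

# S4 case: hypothetical class that stores its call in a slot
setClass("ToyS4Fit", representation(call = "call"))
s4_fit <- new("ToyS4Fit", call = quote(toy_fitter(y ~ x, data = d)))

# `$call` is not available on the S4 object...
dollar_fails <- inherits(try(s4_fit$call, silent = TRUE), "try-error")

# ...but getCall() handles both objects
s3_call <- stats::getCall(fit)
s4_call <- stats::getCall(s4_fit)
```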

Finally, we need `cov_adj()` to work. It has the same `$call` vs. `@call` issue as `as.SandwichLayer()`, so we'll need `.get_call()` or another solution there. There's one other small issue as far as I can tell: there's a `model.frame()` call used as a fallback when no `newdata` argument is specified or the $\mathcal{Q}$ dataframe can't be found in an `lmitt()` call up the stack. I think it should take the same `xlev` argument described above. That code will work on other fitted model objects that have terms, so I think it should be straightforward to integrate [here](https://github.com/benbhansen-stats/propertee/blob/c43c1d6a252792862b4e2eeb6907a72cba6aa778/R/cov_adj.R#L58):
```r
xlev <- stats::.getXlevels(terms(model), stats::model.frame(model))
stats::model.frame(form, data, na.action = stats::na.pass, xlev = xlev)
```

I've read through the code but haven't tested these changes. As far as I can tell, this is all that needs to change (with the exception of technical details related to making the package CRAN-acceptable).

### Making sure `estfun.teeMod` works
The `merDeriv` package provides `bread.merMod` and `estfun.merMod` methods, so we don't have to write those; we'll assume the user already has them loaded.

The only change I think we need to make relates to the `.get_a11_inverse()` call. The `bread.merMod` method from `merDeriv` takes a `full` argument to indicate whether to include columns for the parameters of the covariance matrices. We'll want `full = TRUE` when we call `bread()` on a `merMod` object, so we'll need:
1. `.get_a11_inverse()` to take `...` args that are passed on to its internal `bread()` call
2. The `.get_a11_inverse()` call in `estfun.teeMod()` to have `full = TRUE`
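
The forwarding pattern can be sketched with toy stand-ins. Everything below (`toy_bread`, `toy_get_a11_inverse`, the `toy_lm` and `toy_mer` classes) is hypothetical and only mimics the shape of the proposal. The point is that a method without a `full` argument silently absorbs it through `...`, so always passing `full = TRUE` need not break other model classes, provided their methods accept `...`.

```r
# Toy generic standing in for a bread()-like generic (hypothetical)
toy_bread <- function(x, ...) UseMethod("toy_bread")

# Method that knows nothing about `full`; extra arguments fall into `...`
toy_bread.default <- function(x, ...) "fixed effects only"

# Method that understands `full`, like the merDeriv method described above
toy_bread.toy_mer <- function(x, full = FALSE, ...) {
  if (full) "fixed effects + covariance parameters" else "fixed effects only"
}

# Hypothetical analogue of .get_a11_inverse(): forwards `...` to the generic
toy_get_a11_inverse <- function(model, ...) toy_bread(model, ...)

plain <- structure(list(), class = "toy_lm")
mixed <- structure(list(), class = "toy_mer")

res_plain <- toy_get_a11_inverse(plain, full = TRUE)
res_mixed <- toy_get_a11_inverse(mixed, full = TRUE)
```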

With this change, I think `estfun.teeMod` and `vcov_tee()` will run as desired.
