Standard errors accommodate covariance adjustment models fit using lme4 #192

@jwasserman2

Goal Formulation

We want our standard error estimation routine to accommodate mixed-effects covariance adjustment models fit using lme4. To make this possible we need:

  1. To be able to create SandwichLayer objects from lme4 fitted model objects (the merMod class)
  2. To make sure our estfun.teeMod method works with the resulting objects

Making a SandwichLayer object

We first need to be able to make a PreSandwichLayer object. This requires an S3 method for the S3 generic .make_PreSandwichLayer(). The PreSandwichLayer S4 class inherits from numeric vectors and has additional slots for a fitted_covariance_model and a prediction_gradient. The fitted_covariance_model slot is simply the fitted covariance adjustment model object. The numeric vector acting as the base object is a vector of predictions from the model. Existing methods for .make_PreSandwichLayer() use the same routine for generating predictions (see the default method as an example):

  1. (Using our current parlance of $\mathcal{Q}$ for the assignment sample) Build a dataframe mf with the covariates associated with $\beta$ for units of observation in $\mathcal{Q}$
  2. Build a model matrix X from mf using the model's formula
  3. Generate model predictions from X and the model's coefficients
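
The three steps above can be sketched for an ordinary `lm` fit. This is only an illustration on the built-in `mtcars` data, not the package's default method; the split of `mtcars` into a fitting sample and a stand-in for $\mathcal{Q}$ (`q_data`) is an assumption made for the example.

```r
# Fit a covariance adjustment model on one sample (rows 1-20 of mtcars).
fit <- stats::lm(mpg ~ wt + factor(cyl), data = mtcars[1:20, ])

# Stand-in for the Q dataframe: the remaining rows (hypothetical split).
q_data <- mtcars[21:32, ]

# Step 1: model frame holding only the covariates (response dropped).
tt <- stats::delete.response(stats::terms(fit))
mf <- stats::model.frame(tt, data = q_data, na.action = stats::na.pass,
                         xlev = fit$xlevels)

# Step 2: model matrix built from the model frame.
X <- stats::model.matrix(tt, data = mf, contrasts.arg = fit$contrasts)

# Step 3: predictions from X and the model's coefficients.
preds <- drop(X %*% stats::coef(fit))
```

For an `lm` fit, `preds` should agree with `predict(fit, newdata = q_data)`, while `X` is kept around for the prediction_gradient slot.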

We've been taking this approach because X is part or all of what we store in the prediction_gradient slot. The prediction_gradient is the gradient of the estimating equations for the treatment effect with respect to the parameters estimated in the covariance adjustment models. For linear covariance adjustment models, X is the gradient with respect to the fixed effects. The gradient with respect to the covariance parameters is 0.

For merMod objects, the code that generates X has to differ from the other S3 methods we've written: we need to build the xlev argument ourselves rather than rely on it being stored on the fitted model object. We can recover the factor levels from the model's terms and model frame using .getXlevels(), which the stats package exports. We would run xlev <- stats::.getXlevels(terms(model), stats::model.frame(model)) and pass the result to the xlev argument of the stats::model.frame() call that creates mf.
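
A small sketch of why xlev matters, using an `lm` fit on toy data as a stand-in for a merMod object (the data frame and variable names are made up for the example). `.getXlevels()` is called through `stats::` here, since stats exports it. When the new data covers only some of the factor levels, the model matrix loses columns unless xlev is supplied.

```r
d <- data.frame(y = c(1, 2, 3, 4, 5, 6),
                g = factor(c("a", "b", "c", "a", "b", "c")))
fit <- stats::lm(y ~ g, data = d)

# Recover the factor levels from the terms and the fitting model frame,
# without relying on fit$xlevels.
xlev <- stats::.getXlevels(stats::terms(fit), stats::model.frame(fit))

# New data stored as character and covering only two of the three levels.
newdata <- data.frame(g = c("a", "c"), stringsAsFactors = FALSE)
tt <- stats::delete.response(stats::terms(fit))

mf_with <- stats::model.frame(tt, data = newdata, xlev = xlev)
X_with <- stats::model.matrix(tt, data = mf_with)        # columns: (Intercept), gb, gc

mf_without <- stats::model.frame(tt, data = newdata)
X_without <- stats::model.matrix(tt, data = mf_without)  # columns: (Intercept), gc
```

Only `X_with` lines up column-for-column with the fitted coefficients.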

We need to do something similar for contrasts. We can get contrasts.arg by running contrasts.arg <- attr(model.matrix(model), "contrasts"), then pass that to the contrasts.arg argument of the stats::model.matrix() call for creating X.
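
The same idea for contrasts, again sketched with an `lm` fit on toy data rather than a merMod object. The model is fit with sum-to-zero contrasts so that the effect of passing contrasts.arg is visible; the data and object names are assumptions for the example.

```r
d <- data.frame(y = c(1, 3, 2, 5, 4, 6),
                g = factor(c("a", "b", "c", "a", "b", "c")))
fit <- stats::lm(y ~ g, data = d, contrasts = list(g = "contr.sum"))

# Contrasts used at fit time, read off the fitting model matrix.
contrasts.arg <- attr(stats::model.matrix(fit), "contrasts")

tt <- stats::delete.response(stats::terms(fit))
mf <- stats::model.frame(tt, data = d)

# With contrasts.arg the coding matches the fit; without it, model.matrix()
# uses the session's default contrasts (treatment contrasts unless changed).
X_same <- stats::model.matrix(tt, data = mf, contrasts.arg = contrasts.arg)
X_default <- stats::model.matrix(tt, data = mf)
```

Here `X_same` reproduces the fitting design matrix, while `X_default` has differently coded columns that would not match the fitted coefficients.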

When it comes to the predictions, though, we want to include the random effects as well, so we don't want to use just X. It also ends up being a little more work to get a matrix for making random effects predictions from a merMod object because its model.frame() and model.matrix() methods just pull the model frame and model matrix used for model fitting. In light of this, I propose just getting the predictions by linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE).
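
A short sketch of what that `predict()` call returns, using the `sleepstudy` data that ships with lme4. The subject ID "999" is a made-up level that was not seen at fit time.

```r
library(lme4)

# Random intercept for each Subject.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# One subject seen at fit time and one unseen subject (hypothetical ID "999").
newdata <- data.frame(Days = c(3, 3), Subject = factor(c("308", "999")))

# Uses the estimated random intercept for "308"; for the unseen subject
# the prediction falls back to the population-level (fixed-effects) value.
linpred <- predict(fit, newdata = newdata, type = "response",
                   allow.new.levels = TRUE)

# Fixed-effects-only predictions for comparison.
fixed_only <- predict(fit, newdata = newdata, re.form = NA)
```

The two vectors should differ for subject "308" and coincide for the unseen subject, which is the behavior we want when $\mathcal{Q}$ contains groups that were absent from the covariance adjustment sample.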

All together, the .make_PreSandwichLayer.lmerMod method could look like this:

.make_PreSandwichLayer.lmerMod <- function(model, newdata = NULL, ...) {
  # Fixed-effects terms of the fitted model
  model_terms <- tryCatch(
    stats::terms(model),
    error = function(e) stop("`model` must have `terms` method", call. = FALSE)
  )

  if (is.null(newdata)) newdata <- .get_data_from_model("cov_adj", formula(model))

  # Recover factor levels and contrasts from the fitting model frame and model matrix
  xlev <- stats::.getXlevels(model_terms, stats::model.frame(model))
  contrasts.arg <- attr(stats::model.matrix(model), "contrasts")

  # Fixed-effects design matrix for newdata
  tt <- stats::delete.response(model_terms)
  mf <- stats::model.frame(tt, data = newdata, na.action = na.pass, xlev = xlev, ...)
  X <- stats::model.matrix(tt, data = mf, contrasts.arg = contrasts.arg, ...)
  # Zero gradient with respect to the covariance parameters
  dtheta <- matrix(0, nrow = nrow(X), ncol = length(model@theta) + 1) # + 1 is for the residual variance

  # Predictions include the random effects
  linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE)

  return(new("PreSandwichLayer",
             linpred,
             fitted_covariance_model = model,
             prediction_gradient = cbind(X, dtheta)))
}

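If we want to write a .make_PreSandwichLayer method for glmerMod objects, linpred needs to be on the link scale (type = "link" in the predict() call, since linkinv() and mu.eta() both take the linear predictor), and we change the return() call to: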

  return(new("PreSandwichLayer",
             stats::family(model)$linkinv(linpred),
             fitted_covariance_model = model,
             prediction_gradient = stats::family(model)$mu.eta(linpred) * X))
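
The `mu.eta(linpred) * X` expression is the chain rule applied to the inverse link. A minimal check with an ordinary `glm` on `mtcars` (a stand-in, not a glmerMod fit), assuming the linear predictor `eta` is on the link scale; the finite-difference step `h` is an arbitrary choice for the example.

```r
fit <- stats::glm(am ~ wt, data = mtcars, family = stats::binomial())
fam <- stats::family(fit)
X <- stats::model.matrix(fit)
beta <- stats::coef(fit)

eta <- drop(X %*% beta)       # linear predictor (link scale)
mu <- fam$linkinv(eta)        # predictions on the response scale
grad <- fam$mu.eta(eta) * X   # d mu_i / d beta_j = mu.eta(eta_i) * X[i, j]

# Finite-difference approximation of the column for the first coefficient.
h <- 1e-6
beta_h <- beta
beta_h[1] <- beta_h[1] + h
grad_fd <- (fam$linkinv(drop(X %*% beta_h)) - mu) / h
```

The first column of `grad` should match `grad_fd` up to finite-difference error.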

Now, we need as.SandwichLayer() to be able to convert a PreSandwichLayer to a SandwichLayer. It appears that we need one change to make this work: merMod objects don't have a $call element but rather an @call slot. We could write a function .get_call() that properly retrieves the call from an S3 or S4 object to patch this up. EDIT: try using getCall() for this.
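
A quick sketch of why `getCall()` is a candidate here: its default method retrieves a `call` component from an S3 list or from an S4 slot. The `ToyS4Fit` class below is hypothetical, defined only to stand in for a merMod object without needing lme4.

```r
# S3 fit: the call is stored in the `$call` list element.
fit_s3 <- stats::lm(mpg ~ wt, data = mtcars)

# Toy S4 class with a `call` slot (hypothetical, for illustration only).
methods::setClass("ToyS4Fit", representation(call = "call"))
fit_s4 <- methods::new("ToyS4Fit", call = quote(toy_fit(y ~ x, data = d)))

call_s3 <- stats::getCall(fit_s3)  # same object as fit_s3$call
call_s4 <- stats::getCall(fit_s4)  # same object as fit_s4@call
```

If this holds for merMod objects as well, a separate .get_call() helper may not be needed.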

Finally, we need cov_adj() to work. It has the same $call versus @call issue as as.SandwichLayer(), so it needs the same fix (.get_call() or getCall()). As far as I can tell, there's one other small issue: a model.frame() call is used as a fallback when no newdata argument is specified or the $\mathcal{Q}$ dataframe can't be found in an lmitt() call up the stack. I think it should get the same xlev argument described above. That code works on any fitted model object that has terms, so it should be straightforward to integrate here:

xlev <- stats::.getXlevels(terms(model), stats::model.frame(model))
stats::model.frame(form, data, na.action = na.pass, xlev = xlev)

Based on reading, but not testing, the code, I think this is all that needs to change (aside from technical details related to making the package CRAN-acceptable).

Making sure estfun.teeMod works

The merDeriv package provides bread.merMod and estfun.merMod methods, so we don't have to write those; we'll assume the user already has them loaded.

The only change I think we need to make relates to the .get_a11_inverse() call. The bread.merMod method from merDeriv takes a full argument that indicates whether to include columns for the parameters of the covariance matrices. We'll want full = TRUE when we call bread() on a merMod object, so we'll need:

  1. .get_a11_inverse() to take ... args that are passed on to its internal bread() call
  2. The .get_a11_inverse() call in estfun.teeMod() to have full = TRUE

With this change, I think estfun.teeMod and vcov_tee() will run as desired.
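
The argument forwarding in the two items above could look roughly like the sketch below. Everything here is a toy: `toy_bread()` stands in for `bread()`, the `toy_mer` class stands in for merMod, and `get_a11_inverse_sketch()` is a hypothetical simplification, not the package's actual .get_a11_inverse().

```r
# Toy generic standing in for bread() (hypothetical, for illustration).
toy_bread <- function(x, ...) UseMethod("toy_bread")

# Toy method with a `full` switch, mimicking the argument described above.
toy_bread.toy_mer <- function(x, full = FALSE, ...) {
  p <- if (full) x$n_fixed + x$n_varpar else x$n_fixed
  diag(p)
}

# Sketch of a wrapper that forwards `...` to its internal bread() call.
get_a11_inverse_sketch <- function(model, ...) {
  toy_bread(model, ...)
}

toy_model <- structure(list(n_fixed = 2, n_varpar = 2), class = "toy_mer")
a11_fixed <- get_a11_inverse_sketch(toy_model)               # fixed effects only
a11_full <- get_a11_inverse_sketch(toy_model, full = TRUE)   # plus covariance parameters
```

Forwarding through `...` keeps the wrapper unchanged for model classes whose bread() methods have no full argument, as long as those methods ignore or accept extra arguments.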
