Goal Formulation
We want our standard error estimation routine to accommodate mixed-effects covariance adjustment models fit using lme4. To make this possible we need:
- To be able to create
SandwichLayer objects from lme4 fitted model objects (the merMod class)
- To make sure our
estfun.teeMod method works with the resulting objects
Making a SandwichLayer object
We first need to be able to make a PreSandwichLayer object. This requires an S3 method for the S3 generic .make_PreSandwichLayer(). The PreSandwichLayer S4 class inherits from numeric vectors and has additional slots for a fitted_covariance_model and a prediction_gradient. The fitted_covariance_model slot is simply the fitted covariance adjustment model object. The numeric vector acting as the base object is a vector of predictions from the model. Existing methods for .make_PreSandwichLayer() use the same routine for generating predictions (see the default method as an example):
- (Using our current parlance of $\mathcal{Q}$ for the assignment sample) Build a dataframe
mf with the covariates associated with $\beta$ for units of observation in $\mathcal{Q}$
- Build a model matrix
X from mf using the model's formula
- Generate model predictions from
X and the model's coefficients
We've been taking this approach because X is part or all of what we store in the prediction_gradient slot. The prediction_gradient is the gradient of the estimating equations for the treatment effect with respect to the parameters estimated in the covariance adjustment models. For linear covariance adjustment models, X is the gradient with respect to the fixed effects. The gradient with respect to the covariance parameters is 0.
We need to change the code for generating X for merMod objects from other S3 methods we've written because we need to generate the argument to xlev ourselves rather than rely on it being an attribute of the fitted model object. It turns out we can extract the levels from a terms object using base R's internal function .getXlevels(). We would get xlev by running xlev <- stats:::.getXlevels(terms(model), stats::model.frame(model)), and passing xlev to the xlev argument of the stats::model.frame() call used for creating mf.
We need to do something similar for contrasts. We can get contrasts.arg by running contrasts.arg <- attr(model.matrix(model), "contrasts"), then pass that to the contrasts.arg argument of the stats::model.matrix() call for creating X.
When it comes to the predictions, though, we want to include the random effects as well, so we don't want to use just X. It also ends up being a little more work to get a matrix for making random effects predictions from a merMod object because its model.frame() and model.matrix() methods just pull the model frame and model matrix used for model fitting. In light of this, I propose just getting the predictions by linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE).
All together, the .make_PreSandwichLayer.lmerMod method could look like this:
.make_PreSandwichLayer.lmerMod <- function(model, newdata = NULL, ...) {
model_terms <- tryCatch(
tt <- terms(model),
error = function(e) stop("`model` must have `terms` method", call. = FALSE)
)
if (is.null(newdata)) newdata <- .get_data_from_model("cov_adj", formula(model))
xlev <- stats:::.getXlevels(tt, stats::model.frame(model))
contrasts.arg <- attr(stats::model.matrix(lmermod), "contrasts")
tt <- stats::delete.response(terms(model))
mf <- stats::model.frame(tt, data = newdata, na.action = na.pass, xlev = xlev, ...)
X <- stats::model.matrix(tt, data = mf, contrasts.arg = contrasts.arg, ...)
dtheta <- matrix(0, nrow = nrow(X), ncol = length(model@theta) + 1) # + 1 is for the residual variance
linpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE)
return(new("PreSandwichLayer",
linpred,
fitted_covariance_model = model,
prediction_gradient = cbind(X, dtheta))
}
If we want to write a .make_PreSandwichLayer method for glmerMod objects, we just need to change the return() call to:
return(new("PreSandwichLayer",
stats::family(model)$linkinv(linpred),
fitted_covariance_model = model,
prediction_gradient = stats::family(model)$mu.eta(linpred) * X))
Now, we need as.SandwichLayer() to be able to convert a PreSandwichLayer to a SandwichLayer. It appears that we need one change to make this work: merMod objects don't have a $call element but rather an @call slot. We could write a function .get_call() that properly retrieves the call from an S3 or S4 object to patch this up. EDIT: try using getCall() for this.
Finally, we need cov_adj() to work. It has the same $call and @call issue as as.SandwichLayer(), so we'll need to use the .get_call() or other solution there. There's one other small issue as far as I can tell: there's a model.frame() call used as a fallback when no newdata argument is specified or the $\mathcal{Q}$ dataframe can't be found in an lmitt() call up the stack. I think it should have the same xlev argument written above. That code will work on other fitted model objects that have terms, so I think it should be straightforward to integrate here:
xlev <- stats:::.getXlevels(terms(model), stats::model.frame(model))
stats::model.frame(form, data, na.action = na.pass, xlev = xlev)
Based on my investigating but not testing out the code, this is all I think needs to be changed (with the exception of technical details related to making the package CRAN-acceptable).
Making sure estfun.teeMod works
The merDeriv package provides bread.merMod and estfun.merMod methods, so we don't have to write those; we'll assume the user already has them loaded.
The only change I think we need to make relates to the .get_a11_inverse() call. The bread.MerMod method from merDeriv takes a full argument to indicate whether to include columns for the parameters of the covariance matrices. We'll want full = TRUE when we call bread() on a merMod object, so we'll need:
.get_a11_inverse() to take ... args that are passed on to its internal bread() call
- The
.get_a11_inverse() call in estfun.teeMod() to have full = TRUE
With this change, I think estfun.teeMod and vcov_tee() will run as desired
Goal Formulation
We want our standard error estimation routine to accommodate mixed-effects covariance adjustment models fit using
lme4. To make this possible we need:SandwichLayerobjects fromlme4fitted model objects (themerModclass)estfun.teeModmethod works with the resulting objectsMaking a
SandwichLayerobjectWe first need to be able to make a
PreSandwichLayerobject. This requires an S3 method for the S3 generic.make_PreSandwichLayer(). ThePreSandwichLayerS4 class inherits from numeric vectors and has additional slots for afitted_covariance_modeland aprediction_gradient. Thefitted_covariance_modelslot is simply the fitted covariance adjustment model object. The numeric vector acting as the base object is a vector of predictions from the model. Existing methods for.make_PreSandwichLayer()use the same routine for generating predictions (see the default method as an example):mfwith the covariates associated withXfrommfusing the model's formulaXand the model's coefficientsWe've been taking this approach because
Xis part or all of what we store in theprediction_gradientslot. Theprediction_gradientis the gradient of the estimating equations for the treatment effect with respect to the parameters estimated in the covariance adjustment models. For linear covariance adjustment models,Xis the gradient with respect to the fixed effects. The gradient with respect to the covariance parameters is 0.We need to change the code for generating
XformerModobjects from other S3 methods we've written because we need to generate the argument toxlevourselves rather than rely on it being an attribute of the fitted model object. It turns out we can extract the levels from atermsobject using base R's internal function.getXlevels(). We would getxlevby runningxlev <- stats:::.getXlevels(terms(model), stats::model.frame(model)), and passingxlevto thexlevargument of thestats::model.frame()call used for creatingmf.We need to do something similar for
contrasts. We can getcontrasts.argby runningcontrasts.arg <- attr(model.matrix(model), "contrasts"), then pass that to thecontrasts.argargument of thestats::model.matrix()call for creatingX.When it comes to the predictions, though, we want to include the random effects as well, so we don't want to use just
X. It also ends up being a little more work to get a matrix for making random effects predictions from amerModobject because itsmodel.frame()andmodel.matrix()methods just pull the model frame and model matrix used for model fitting. In light of this, I propose just getting the predictions bylinpred <- predict(model, newdata, type = "response", allow.new.levels = TRUE).All together, the
.make_PreSandwichLayer.lmerModmethod could look like this:If we want to write a
.make_PreSandwichLayermethod forglmerModobjects, we just need to change thereturn()call to:Now, we need
as.SandwichLayer()to be able to convert aPreSandwichLayerto aSandwichLayer. It appears that we need one change to make this work:merModobjects don't have a$callelement but rather an@callslot. We could write a function.get_call()that properly retrieves thecallfrom an S3 or S4 object to patch this up. EDIT: try usinggetCall()for this.Finally, we need$\mathcal{Q}$ dataframe can't be found in an
cov_adj()to work. It has the same$calland@callissue asas.SandwichLayer(), so we'll need to use the.get_call()or other solution there. There's one other small issue as far as I can tell: there's amodel.frame()call used as a fallback when nonewdataargument is specified or thelmitt()call up the stack. I think it should have the samexlevargument written above. That code will work on other fitted model objects that have terms, so I think it should be straightforward to integrate here:Based on my investigating but not testing out the code, this is all I think needs to be changed (with the exception of technical details related to making the package CRAN-acceptable).
Making sure
estfun.teeModworksThe
merDerivpackage providesbread.merModandestfun.merModmethods, so we don't have to write those; we'll assume the user already has them loaded.The only change I think we need to make relates to the
.get_a11_inverse()call. Thebread.MerModmethod frommerDerivtakes afullargument to indicate whether to include columns for the parameters of the covariance matrices. We'll wantfull = TRUEwhen we callbread()on amerModobject, so we'll need:.get_a11_inverse()to take...args that are passed on to its internalbread()call.get_a11_inverse()call inestfun.teeMod()to havefull = TRUEWith this change, I think
estfun.teeModandvcov_tee()will run as desired