One of the ways `lmitt()` diverges from `lm()` is that the environment associated with a fitted model's formula (accessed via `environment(formula(model_object))` or `attr(model_object$terms, ".Environment")`) is not the environment in which the model object was created. Instead, it's a unique environment that's specially curated and pared down to include only the objects needed for inference. Methods for standard errors then search for these objects in that environment.
Though this is neat (in the sense of organization) and safe (in the sense that changes to dataframes in the global environment don't affect standard error calculations), it can be memory intensive when many fitted models are desired, especially ones that also include covariance adjustment through the `offset` argument. My own experience estimating 8 covariate-adjusted treatment effects with `propertee` was fairly laborious as a result. Note that `lm()` faces the same issue if it is called within a custom model-fitting function (and fares even worse than `lmitt()`, in fact, since it retains all objects existing in the function environment).
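To see the `lm()` version of this issue concretely, here's a minimal base-R sketch: a model fit inside a function keeps the whole function environment alive through its formula, including large objects that have nothing to do with the model (`big` below is a hypothetical stand-in for such an object).

```r
fit_in_function <- function() {
  big <- rnorm(1e7)  # large object unrelated to the model (~80 MB)
  d <- data.frame(x = 1:10, y = rnorm(10))
  lm(y ~ x, data = d)
}

m <- fit_in_function()

# The formula's environment is the function's evaluation environment,
# so `big` stays reachable (and in memory) as long as `m` exists:
exists("big", envir = environment(formula(m)))  # TRUE
```

Saving or copying many such fits multiplies this overhead, which is the same mechanism that makes many curated `lmitt()` environments expensive.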
A way to mitigate this issue is to create environments with `data` and specification objects that apply to multiple `teeMod`s; in the ideal case, one stores these objects in the global environment. A vignette demonstrating how to do this would be beneficial to users. Topics include:
- Ensuring all columns needed for fitting different models are stored in `data`
- Suggesting the use of the `subset` argument over passing a subsetted dataframe to the `data` argument
- Using calls to `ate()`/`ett()`/... in the `weights` argument and `cov_adj()` in the `offset` argument rather than passing already created weight and prediction vectors
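The three bullets above might be sketched in the vignette along these lines. This is an assumption about what the eventual example would look like, not tested code: the dataframe `school_df`, its columns (`y`, `x1`, `x2`, `z`, `cid`, `cohort`), and the covariance-adjustment model `camod` are all hypothetical, and the exact specification-constructor call should match whatever the package's documentation recommends.

```r
library(propertee)

## Keep shared objects at the top level (global environment) so each
## teeMod's curated formula environment can reference them rather than
## carrying around its own copies.
spec  <- rct_spec(z ~ cluster(cid), data = school_df)  # hypothetical spec
camod <- lm(y ~ x1 + x2, data = school_df)             # covariance model

m1 <- lmitt(y ~ 1, specification = spec, data = school_df,
            subset  = cohort == 1,   # prefer subset= over pre-subsetting data
            weights = ate(),         # call ate() rather than passing a vector
            offset  = cov_adj(camod))  # call cov_adj() rather than predictions
```

Fitting `m2`, `m3`, ... the same way against the same `spec`, `school_df`, and `camod` lets the models share those objects instead of each one capturing its own environment full of intermediates.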
@kkbrum add suggestions here based on your experience as well!