-
Notifications
You must be signed in to change notification settings - Fork 71
Suggested error catch and message #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
And a suggested error message for invalid coefficients in the covmodels formulae: devtools::load_all()
#> ℹ Loading gfoRmula
library('Hmisc')
#>
#> Attaching package: 'Hmisc'
#>
#> The following objects are masked from 'package:base':
#>
#> format.pval, units
id <- 'id'
time_name <- 't0'
covnames <- c('L1', 'L2', 'A')
outcome_name <- 'Y'
outcome_type <- 'continuous_eof'
covtypes <- c('categorical', 'normal', 'binary')
histories <- c(lagged)
histvars <- list(c('A', 'L1', 'L2'))
covparams <- list(covmodels = c(L1 ~ lag1_A + lag1_L1 + L3 + t0 +
rcspline.eval(lag1_L2, knots = c(-1, 0, 1)),
L2 ~ lag1_A + L1 + stats::poly(L3, 2) + lag1_L1 + lag1_L2 + L3 + t0,
A ~ lag1_A + L1 + L2 + lag1_L1 + lag1_L2 + L3 + t0))
ymodel <- Y ~ A + L1 + L2 + lag1_A + lag1_L1 + lag1_L2 + L3
intvars <- list('A', 'A')
interventions <- list(list(c(static, rep(0, 7))),
list(c(static, rep(1, 7))))
int_descript <- c('Never treat', 'Always treat')
nsimul <- 10000
gform_cont_eof <- gformula(obs_data = continuous_eofdata,
id = id, time_name = time_name,
covnames = covnames, outcome_name = outcome_name,
outcome_type = outcome_type, covtypes = covtypes,
covparams = covparams, ymodel = ymodel,
intvars = intvars, interventions = interventions,
int_descript = int_descript,
histories = histories, histvars = histvars,
basecovs = c("L3"), nsimul = nsimul, seed = 1234)
#> Error in simulate(fitcov = fitcov, fitY = fitY, fitD = NA, yrestrictions = yrestrictions, : `NA` coefficients produced in covariate prediction model. This may be due to (multi)collinearity in `covmodels` predictor variables, or including `time_name` variable in a 'continuous_eof' model.
#> Variables returning `NA` coefficients:
#> L2
#> - L3 |
|
HI, I will check, but part of the problem might be that in R, glm does not check for the full rank of the design matrix being used in the supplied model. When including time in an eof type outcome model, this variable will be a multiple of the column for the intercept and the design matrix will not have full rank. When include cumsum of A and all lags of A, the cumsum is a linear function of the lags, so again the design matrix will not have full rank. I am not sure about having a check for including variables that are not included in the basecov or covnames vectors. For the various models being used we might be able to include a check for looking at the rank of the design matrix for each model being used. This might help, but might not be able to capture all problems. Would it be able to capture the error for L3 being in a model/data and not being included in the covnames or basecovs list. Thanks again for your suggested help. |
|
Thanks for the response Roger. This suggestion of course comes with my full admission of my very basic understanding of what should be happening within the model and how to fix it, so your expert oversight on all of this is much appreciated! To clarify, with the time problem above for example, is the problem caused by the user's incorrect specification of the time variable in the model, rather than an underlying bug? My proposed message here is to tell people like me to not be silly and put such things in there (if they're not meant to be), but if it's hinting at a deeper underlying problem then that may need a different approach. The simulated 'L3' problem above (again one that I've tripped over in my stupidity) is created by having both 'L3' and 'stats::poly(L3, 2)' in the 'L2' covmodel. R (sensibly) sees that 'L3' and 'L3^1' are identical and returns 'NA' for one of them (which works fine in a normal R context). So it hasn't caught an error before this (as L3 is in In both cases I've tried a simple check of all coefficients of the models in each case, on the basis that one NA makes the prediction parts crash with a cryptic message. The user, I imagine, should be told to take the redundant variables out. This might be the wrong place to check though, or the wrong message/solution. Do edit/reject/request edits to suggested code as helpful. I really appreciate this package btw, and do please excuse any confusions and other questions that come your way as I get my head around it a bit more! |
|
Hi Andrew, Thanks for posting this as well! To your point, I think it'd be good if we put each instance of model fitting (ie, each covariate model, outcome model) in a try-catch block and also check for issues such as NA coefficients, etc. Otherwise, as you describe, cryptic error messages can appear downstream in the analyses. I'll try to make these updates within the next week or two Best, |
Hello, I came across the same error message as in #20:
This was being produced when
NAcoefficients were being returned bystats::glminpred_fun_Y, which then breaksstats::predictin 'simulate.R'. This only happened when:ymodelin a "continuous_eof" model. I think I follow that this is because the model is being fitted at a single time point, therefore there's no variation in time across observations.~ A + stats::poly(A, 2)), as R assignedNAto one of the duplicates (AorA^1)So, my fault all round and I just need to understand the model better!
My suggestion here is to have a clearer message for tracking where this error is being produced and what might be causing it. A potential solution would be adding such an error checker to give clear suggestions. A rough skeleton could be:
"Error: NA coefficients produced in prediction model. This may be due to (multi)collinearity in ymodel predictor variables, or including time_name variable in a 'continuous_eof' model."That's just me hazarding a guess though. There may be clearer messages it'd be helpful to give, or even a more generic message merely that the model didn't provide a fit.
Example run with new error (produced by including
t0inymodel):