Skip to content

PseudoR2 McFadden: wrong values for (successes/failures) binomial GLMs #127

@MarcRieraDominguez

Description

@MarcRieraDominguez

Hi! Congratluations on the great package!
I have come across an unexpected behaviour of the PseudoR2 function: it returns a wrong value of McFadden's Pseudo-R2 for binomial GLMs that model successes out of failures. It appears to work fine with binomial GLMs that model binary responses. I provide a reprex with overdispersed binomial GLM, it shows the same warning as my own data which are not overdispersed:
In res["Tjur"] <- unname(diff(tapply(y.hat.resp, y, mean, na.rm = TRUE))) : number of items to replace is not a multiple of replacement length
Previous reported issues with PseudoR2 did not deal with binomial GLMs explicitly, as far as I can tell: #19
Thank you for your time!!

library(DescTools)
library(DHARMa)
#> Warning: package 'DHARMa' was built under R version 4.2.3
#> This is DHARMa 0.4.6. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
library(faraway)
#> Warning: package 'faraway' was built under R version 4.2.2
library(performance)
#> Warning: package 'performance' was built under R version 4.2.3


# Binomial GLM: k success out of K trials
data(troutegg, package="faraway") # Survival of trout eggs
m1 <- glm(cbind(survive,total-survive) ~ location+period, family=binomial, troutegg)
summary(m1)
#> 
#> Call:
#> glm(formula = cbind(survive, total - survive) ~ location + period, 
#>     family = binomial, data = troutegg)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -4.8305  -0.3650  -0.0303   0.6191   3.2434  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   4.6358     0.2813  16.479  < 2e-16 ***
#> location2    -0.4168     0.2461  -1.694   0.0903 .  
#> location3    -1.2421     0.2194  -5.660 1.51e-08 ***
#> location4    -0.9509     0.2288  -4.157 3.23e-05 ***
#> location5    -4.6138     0.2502 -18.439  < 2e-16 ***
#> period7      -2.1702     0.2384  -9.103  < 2e-16 ***
#> period8      -2.3256     0.2429  -9.573  < 2e-16 ***
#> period11     -2.4500     0.2341 -10.466  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 1021.469  on 19  degrees of freedom
#> Residual deviance:   64.495  on 12  degrees of freedom
#> AIC: 157.03
#> 
#> Number of Fisher Scoring iterations: 5

# Overdispersed binomial (depending on who you ask)
DHARMa::testDispersion(m1, alternative = "greater")

#> 
#>  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
#>  simulated
#> 
#> data:  simulationOutput
#> dispersion = 1.0411, p-value = 0.144
#> alternative hypothesis: greater
performance::check_overdispersion(m1)
#> # Overdispersion test
#> 
#>        dispersion ratio =   5.330
#>   Pearson's Chi-Squared =  63.964
#>                 p-value = < 0.001
#> Overdispersion detected.

PseudoR2(m1, which = "McFadden")
#> Warning in res["Tjur"] <- unname(diff(tapply(y.hat.resp, y, mean, na.rm =
#> TRUE))): número de items para para sustituir no es un múltiplo de la longitud
#> del reemplazo
#>  McFadden 
#> 0.9989086
1 - logLik(m1)/logLik(update(m1, . ~ 1))
#> 'log Lik.' 0.8715584 (df=8)


# Binomial GLM: 0 vs 1 (dying vs surviving the Titanic's maiden voyage)
m2 <- glm(Survived ~ ., data=Untable(Titanic), family=binomial)
summary(m2)
#> 
#> Call:
#> glm(formula = Survived ~ ., family = binomial, data = Untable(Titanic))
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -2.0812  -0.7149  -0.6656   0.6858   2.1278  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   0.6853     0.2730   2.510   0.0121 *  
#> Class2nd     -1.0181     0.1960  -5.194 2.05e-07 ***
#> Class3rd     -1.7778     0.1716 -10.362  < 2e-16 ***
#> ClassCrew    -0.8577     0.1573  -5.451 5.00e-08 ***
#> SexFemale     2.4201     0.1404  17.236  < 2e-16 ***
#> AgeAdult     -1.0615     0.2440  -4.350 1.36e-05 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 2769.5  on 2200  degrees of freedom
#> Residual deviance: 2210.1  on 2195  degrees of freedom
#> AIC: 2222.1
#> 
#> Number of Fisher Scoring iterations: 4


PseudoR2(m2, which = "McFadden")
#>  McFadden 
#> 0.2019875
1 - logLik(m2)/logLik(update(m2, . ~ 1))
#> 'log Lik.' 0.2019875 (df=6)


sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22000)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8   
#> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
#> [5] LC_TIME=Spanish_Spain.utf8    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] performance_0.10.5 faraway_1.0.8      DHARMa_0.4.6       DescTools_0.99.45 
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.10       nloptr_2.0.1      cellranger_1.1.0  compiler_4.2.0   
#>  [5] highr_0.9         class_7.3-20      tools_4.2.0       lme4_1.1-29      
#>  [9] boot_1.3-28       digest_0.6.29     nlme_3.1-157      evaluate_0.15    
#> [13] lifecycle_1.0.3   rootSolve_1.8.2.3 lattice_0.20-45   rlang_1.1.1      
#> [17] reprex_2.0.2      Matrix_1.4-1      cli_3.6.1         rstudioapi_0.13  
#> [21] yaml_2.3.5        mvtnorm_1.1-3     expm_0.999-6      xfun_0.40        
#> [25] fastmap_1.1.0     e1071_1.7-9       httr_1.4.3        withr_2.5.0      
#> [29] stringr_1.5.0     knitr_1.39        fs_1.5.2          vctrs_0.6.3      
#> [33] gld_2.6.4         grid_4.2.0        glue_1.6.2        data.table_1.14.2
#> [37] R6_2.5.1          readxl_1.4.0      lmom_2.8          rmarkdown_2.14   
#> [41] minqa_1.2.4       magrittr_2.0.3    splines_4.2.0     htmltools_0.5.6  
#> [45] MASS_7.3-57       insight_0.19.5    Exact_3.1         stringi_1.7.6    
#> [49] proxy_0.4-26

Created on 2023-10-04 with reprex v2.0.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions