When computing predictions for a two-class case there seems to be a mistake.
Here is a reproducible example:
library(polyreg)
library(MLmetrics)
data(kyphosis, package = "rpart")
kyphosis$y <- ifelse(kyphosis$Kyphosis == "absent", 1, 0)
kyphosis$Kyphosis <- NULL
mod <- glm(y ~ ., data = kyphosis, family = binomial())
mod
# Coefficients:
# (Intercept) Age Number Start
# 2.03693 -0.01093 -0.41060 0.20651
#
# Degrees of Freedom: 80 Total (i.e. Null); 77 Residual
# Null Deviance: 83.23
# Residual Deviance: 61.38 AIC: 69.38
table(ifelse(predict(mod, type = "response") > 0.5, 1, 0), kyphosis$y)
# 0 1
# 0 7 3
# 1 10 61
Accuracy(ifelse(predict(mod, type = "response") > 0.5, 1, 0), kyphosis$y)
# 0.8395062
table(ifelse(predict(mod) > 0.5, 1, 0), kyphosis$y)
# 0 1
# 0 10 8
# 1 7 56
Accuracy(ifelse(predict(mod) > 0.5, 1, 0), kyphosis$y)
# 0.8148148
data(kyphosis, package = "rpart")
kyphosis <- kyphosis[,c(2:4,1)]
kyphosis$Kyphosis <- as.character(kyphosis$Kyphosis)
pf <- polyFit(kyphosis, deg = 1, use = "glm")
pf$fit
# Coefficients:
# (Intercept) V1 V2 V3
# 2.03693 -0.01093 -0.41060 0.20651
#
# Degrees of Freedom: 80 Total (i.e. Null); 77 Residual
# Null Deviance: 83.23
# Residual Deviance: 61.38 AIC: 69.38
OK, the same model is fitted, but computing predictions:
table(predict(pf, kyphosis), kyphosis$Kyphosis)
# absent present
# absent 56 7
# present 8 10
Accuracy(predict(pf, kyphosis), kyphosis$Kyphosis)
# 0.8148148
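seems to be wrong: the accuracy (0.8148148) matches the glm result thresholded at 0.5 on the link scale, not the probability-scale result (0.8395062). Looking at the code, you can see: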
# glm case
if (is.null(object$glmMethod)) { # only two classes
pre <- predict(object$fit, plm.newdata)
pred <- ifelse(pre > 0.5, object$classes[1], object$classes[2])
}
IMHO the prediction returned is on the link scale (see help(predict.glm)), but it should be on the probability scale, i.e. type = "response". Alternatively, if it stays on the link scale, the threshold should be pre > 0. However, I prefer to reason in terms of the probability scale.
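A minimal sketch of what the suggested fix might look like, using the plain glm fit from the example above. The helper predict_two_class() is hypothetical and only mimics the quoted two-class branch; it is not the package's actual code, and the class order c("absent", "present") is an assumption based on the fitted model treating "absent" as 1.

```r
# Sketch only: predict_two_class() is a hypothetical stand-in for the
# two-class glm branch quoted above, not polyreg's actual function.
data(kyphosis, package = "rpart")
kyphosis$y <- ifelse(kyphosis$Kyphosis == "absent", 1, 0)
kyphosis$Kyphosis <- NULL
mod <- glm(y ~ ., data = kyphosis, family = binomial())

predict_two_class <- function(fit, newdata, classes) {
  # type = "response" returns fitted probabilities rather than the
  # linear predictor (log-odds), so a 0.5 cutoff is meaningful
  prob <- predict(fit, newdata, type = "response")
  ifelse(prob > 0.5, classes[1], classes[2])
}

# Assumed class order: classes[1] is the level coded as 1 ("absent")
pred  <- predict_two_class(mod, kyphosis, c("absent", "present"))
truth <- ifelse(kyphosis$y == 1, "absent", "present")
mean(pred == truth)  # the issue reports 0.8395062 for this threshold

# Same decision rule on the link scale: cut at 0, since plogis(0) = 0.5
link <- predict(mod, kyphosis)
all((link > 0) == (plogis(link) > 0.5))
```

Either form gives the same classification; the probability-scale version keeps the 0.5 cutoff readable, which is the preference stated above.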