Description
Issue submitted via e-mail:
I may have found an edge case in which boot.pval returns a p-value that is inconsistent with the confidence interval. Here is a minimal reproducible example, along with a guess as to the cause.
library(boot)
library(boot.pval)

test_df <- data.frame("x" = c(rep(-1, 3),
                              rep(0, 14),
                              rep(1, 3)))
statistic <- function(data, indices) mean(data$x[indices])
set.seed(24601)
boot_res <- boot(test_df, statistic, 1000)
table(boot_res$t)
 -0.4 -0.35  -0.3 -0.25  -0.2 -0.15  -0.1 -0.05     0  0.05   0.1  0.15   0.2
    1     5     5    19    46    66   109   166   169   155    96    78    60
 0.25   0.3  0.35   0.5
   16     6     2     1
boot.ci(boot_res, type = "perc")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates
CALL :
boot.ci(boot.out = boot_res, type = "perc")
Intervals :
Level Percentile
95% (-0.2500, 0.2487 )
Calculations and Intervals on Original Scale
boot.pval(boot_res, type = "perc")
[1] 0.001
The confidence interval contains zero, but the p-value comes back as highly significant.
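As a rough cross-check of the claim above, the bootstrap distribution can be rebuilt from the table() counts and a simple two-sided percentile p-value computed by hand. This is only a sketch: the formula 2 * min(P(t <= 0), P(t >= 0)), capped at 1, is a common convention and an assumption here, not the method boot.pval uses, and the names vals, counts, t_star and p_simple are made up for illustration.

```r
# Bootstrap replicate values and their frequencies, copied from table(boot_res$t)
vals <- c(-0.4, -0.35, -0.3, -0.25, -0.2, -0.15, -0.1, -0.05, 0,
          0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.5)
counts <- c(1, 5, 5, 19, 46, 66, 109, 166, 169,
            155, 96, 78, 60, 16, 6, 2, 1)

# Expand the counts back into the 1000 replicates
t_star <- rep(vals, times = counts)

theta_null <- 0

# Share of replicates on each side of the null value (ties count on both sides)
p_lower <- mean(t_star <= theta_null)
p_upper <- mean(t_star >= theta_null)

# Assumed convention: double the smaller tail and cap at 1
p_simple <- min(1, 2 * min(p_lower, p_upper))

print(c(lower = p_lower, upper = p_upper, p = p_simple))
```

Under this convention the result is capped at 1, which agrees with the interval containing zero rather than with the reported 0.001.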
pval_precision <- NULL
type <- "perc"
theta_null <- 0
if (is.null(pval_precision)) {
  pval_precision <- 1 / boot_res$R
}
alpha_seq <- seq(1e-16, 1 - 1e-16, pval_precision)
ci <- boot::boot.ci(boot_res, conf = 1 - alpha_seq, type = type)
Warning in norm.inter(t, alpha): extreme order statistics used as endpoints
bounds <- switch(
  type,
  norm = ci$normal[, 2:3],
  basic = ci$basic[, 4:5],
  stud = ci$student[, 4:5],
  perc = ci$percent[, 4:5],
  bca = ci$bca[, 4:5]
)
alpha <- alpha_seq[which.min(theta_null >= bounds[, 1] &
                               theta_null <= bounds[, 2])]
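The problem is which.min(theta_null >= bounds[, 1] & theta_null <= bounds[, 2]). I believe the intention is to find the first FALSE value, i.e. the smallest alpha whose interval excludes theta_null. In this case there are no FALSE values: zero lies inside every interval. Because which.min() returns the index of the first minimum and ignores NAs, it falls back to the first non-NA element, so the code picks up the first non-NA value in alpha_seq and wrongly reports high significance.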
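The behaviour described above can be reproduced without any bootstrap data. The sketch below first shows what which.min() does with a logical vector that has no FALSE entries, then outlines one possible guard. The helper pval_from_inside() is hypothetical and not part of boot.pval, and returning 1 when no interval excludes the null value is an assumed design choice, not the package's actual fix.

```r
# Logical vector with no FALSE entries, as in the reported edge case
inside <- c(NA, TRUE, TRUE, TRUE)

# which.min() treats TRUE as 1, drops NAs, and returns the first minimum,
# so it points at the first non-NA element rather than at a FALSE
which.min(inside)

# Hypothetical guard: look for an explicit FALSE and fall back when none exists
pval_from_inside <- function(inside, alpha_seq) {
  idx <- which(!inside)        # which() drops NA entries
  if (length(idx) == 0) {
    return(1)                  # assumption: null never rejected, so report p = 1
  }
  alpha_seq[idx[1]]            # smallest alpha whose interval excludes the null
}

# Null value excluded from the third interval onwards
pval_from_inside(c(TRUE, TRUE, FALSE, FALSE), c(0.1, 0.2, 0.3, 0.4))

# Null value inside every interval
pval_from_inside(inside, c(0.1, 0.2, 0.3, 0.4))
```

Using which(!inside) instead of which.min() makes the "no FALSE found" case explicit, so it can be handled separately instead of silently mapping to the first element of alpha_seq.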