Description
Hi @AndriSignorell. Thank you for the excellent DescTools package.
The SD() function throws errors and shows some unexpected (to me) behavior when the weights contain NA values or when sum(weights) != length(x). It is entirely possible that my expectations are incorrect, but I did not find anything in the documentation to confirm either way.
Here is a reproducible example:
packageVersion("DescTools")
#> [1] '0.99.61.14'
v <- c(1, NA, 2, NA, 3, NA)
w1 <- rep(1, 6)
w2 <- c(1, NA, 1, NA, 1, NA)
# as expected
DescTools::SD(v)
#> [1] NA
# as expected
DescTools::SD(v, na.rm = TRUE)
#> [1] 1
# as expected
DescTools::SD(v, weights = w1, na.rm = TRUE)
#> [1] 1
# expected: NA; got error
DescTools::SD(v, weights = w1, na.rm = FALSE)
#> Error in `z$weights`:
#> ! $ operator is invalid for atomic vectors
# as expected
DescTools::SD(v, weights = w2, na.rm = TRUE)
#> [1] 1
# expected: NA; got error
DescTools::SD(v, weights = w2, na.rm = FALSE)
#> Error in `z$weights`:
#> ! $ operator is invalid for atomic vectors
## fractional weights <1 (all equal to 0.3333...)
# expected: 1; got Inf
DescTools::SD(v[!is.na(v)], weights = w1[!is.na(v)] / sum(!is.na(v)))
#> [1] Inf
## fractional weights >1 (all equal to 1.3333...)
# expected: 1; got 0.942809
DescTools::SD(v[!is.na(v)], weights = (w1[!is.na(v)] / sum(!is.na(v))) + 1)
#> [1] 0.942809
# test with all non-NA values but some NA weights
# not sure what the behavior should be here; it doesn't appear to be covered in the docs
v3 <- c(1, 2, 3, 4, 5)
w3 <- c(1, NA, 1, 1, 1)
DescTools::SD(v3, weights = w3, na.rm = TRUE)
#> Error in `z$weights`:
#> ! $ operator is invalid for atomic vectors
# works as expected with zero weights for NA values
w4 <- c(1, 0, 1, 0, 1, 0)
DescTools::SD(v, weights = w4, na.rm = TRUE)
#> [1] 1
# compare with base R weighted.mean (just to see how na.rm behaves)
weighted.mean(v, w1, na.rm = TRUE)
#> [1] 2
weighted.mean(v, w2, na.rm = TRUE)
#> [1] 2
weighted.mean(v3, w3, na.rm = TRUE)
#> [1] NA
I am not sure what should happen in DescTools::SD in the following cases:
- NA values are present in weights
- Non-NA, non-zero weights correspond to NA values in x
- Weights sum to a value other than length(x)
At a minimum, I think the error "Error in z$weights: ! $ operator is invalid for atomic vectors" should be replaced by a message stating that NA values in x or in the weights are the problem. More likely, though, these calls should run without error and return NA.
I have not studied the code in detail, and I know there are different ways to interpret weights. Still, I would expect weights that sum to a value other than length(x) to be normalized, similar to how base R's weighted.mean() divides by sum(w). And if all weights are equal for the non-NA values, I would expect the same result as the unweighted SD.
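To make that expectation concrete, here is a minimal base R sketch of the behavior I had in mind. It is not DescTools code: `wsd_sketch` is a hypothetical helper, and the choices it makes (returning NA when na.rm = FALSE, dropping any pair where either x or the weight is NA, rescaling the weights to sum to the number of retained observations) are my assumptions about one reasonable interpretation, not a claim about how SD() is or should be implemented. The last lines are an arithmetic check showing that the Inf and 0.942809 results above are what you get from a denominator of sum(weights) - 1; I have not confirmed that this is what the package actually computes.

```r
# Hypothetical helper (not part of DescTools): weighted SD with the weights
# treated as relative weights and rescaled to sum to the number of
# retained observations.
wsd_sketch <- function(x, w = rep(1, length(x)), na.rm = FALSE) {
  stopifnot(length(x) == length(w))

  # A pair is incomplete if either the value or its weight is NA.
  incomplete <- is.na(x) | is.na(w)
  if (any(incomplete)) {
    if (!na.rm) return(NA_real_)   # assumption: mirror sd() and return NA
    x <- x[!incomplete]
    w <- w[!incomplete]
  }

  n <- length(x)
  if (n < 2) return(NA_real_)

  w <- w * n / sum(w)              # rescale so that sum(w) == n
  m <- sum(w * x) / n              # weighted mean
  sqrt(sum(w * (x - m)^2) / (n - 1))
}

v  <- c(1, NA, 2, NA, 3, NA)
w2 <- c(1, NA, 1, NA, 1, NA)

wsd_sketch(v, w2, na.rm = TRUE)          # equal weights: same as sd(c(1, 2, 3))
wsd_sketch(v, w2, na.rm = FALSE)         # NA instead of an error
wsd_sketch(c(1, 2, 3), rep(1 / 3, 3))    # same as sd(c(1, 2, 3))
wsd_sketch(c(1, 2, 3), rep(4 / 3, 3))    # same as sd(c(1, 2, 3))

# Arithmetic check: a denominator of sum(w) - 1 without rescaling gives
# sqrt((8/3) / 3) for weights of 4/3, which matches the 0.942809 above.
x <- c(1, 2, 3)
w <- rep(4 / 3, 3)
sqrt(sum(w * (x - 2)^2) / (sum(w) - 1))
```

With rescaling, any constant weight vector gives the unweighted result, which is the property I was expecting from the fractional-weight calls.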
Thank you in advance for your consideration and for any explanation you can provide. If you would like me to submit a PR to correct the functions or the documentation, I would be happy to look into it.