Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -19,4 +19,5 @@
^CODE_OF_CONDUCT\.md$
^src/.*\.o$
^dev\.R$
^paper$
^paper$
^\.cursor$
File renamed without changes.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,3 +16,5 @@ docs
*.so
src/tidyhydro.dll
tidyhydro.md

.cursor/
8 changes: 5 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,18 +1,19 @@
Package: tidyhydro
Type: Package
Title: Tidy Metrics for Assessing Hydrological Models Performance
Version: 0.1.2
Version: 0.1.2.9000
Authors@R:
person(given = "Anatoly", family = "Tsyplenkov", email = "atsyplenkov@fastmail.com", role = c("cre", "aut", "cph"), comment = c(ORCID = "0000-0003-4144-8402"))
Maintainer: Anatoly Tsyplenkov <atsyplenkov@fastmail.com>
Description: Provides tidy tools to measure the characteristics of hydrological time series and to assess the performance of hydrological models. Includes compatibility with the 'yardstick' package for model performance evaluation using commonly used metrics such as the Nash–Sutcliffe Efficiency (NSE), Kling–Gupta Efficiency (KGE), percent bias (pBIAS) and etc. Additionally provides a set of measures to calculate the descriptive statistics of a single dataset in accordance with Helsel et al. (2020). Helsel DR, Hirsch RM, Ryberg KR, Archfield SA, Gilroy EJ. Statistical methods in water resources. Reston, VA: 2020. <https://doi.org/10.3133/tm4A3>.
License: MIT + file LICENSE
Depends: R (>= 4.1.0)
Imports:
cli,
checkmate (>= 2.3.1),
Rcpp (>= 1.0.12),
rlang (>= 1.1.0),
yardstick (>= 1.3.1),
checkmate (>= 2.3.1)
LinkingTo: Rcpp
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
Expand All @@ -21,7 +22,8 @@ Language: en-US
Suggests:
hydroGOF,
testthat (>= 3.0.0),
quickcheck (>= 0.1.3)
quickcheck (>= 0.1.3),
dplyr
Config/testthat/edition: 3
URL: https://github.com/atsyplenkov/tidyhydro, https://atsyplenkov.github.io/tidyhydro/
BugReports: https://github.com/atsyplenkov/tidyhydro/issues
Expand Down
3 changes: 3 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ importFrom(Rcpp, evalCpp)

S3method(print, measure)
S3method(format, measure)
S3method(print, measure_set)
S3method(format, measure_set)

# general functions
export(nse)
Expand All @@ -19,6 +21,7 @@ export(press)
export(sfe)
export(cv)
export(gm)
export(measure_set)

# data.frame methods
S3method(nse, data.frame)
Expand Down
5 changes: 5 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# tidyhydro (development version)

## New features
- Added `measure_set` function to combine different descriptive statistics, of class `measure`

# tidyhydro 0.1.2

## New features
Expand Down
276 changes: 263 additions & 13 deletions R/aaa-new.R
Original file line number Diff line number Diff line change
@@ -1,25 +1,22 @@
# TODO:
# - Add hyperlink to `measure_set`
# - Add tests

# Modified after https://github.com/tidymodels/yardstick/blob/main/R/aaa-new.R

#' Construct a new measure function
#' @keywords summary_stats
#'
#' @description
#' These functions provide convenient wrappers to create the three types of
#' measure functions in `tidyhydro`: measures of central tendency, variability
#' and symmetry. They add a measure-specific class to `fn` and
#' mimic a behaviour of [metric_set][yardstick::metric_set]. These features
#' are used by measure_set.
#' descriptive statistics functions in `tidyhydro`: measures of central
#' tendency, variability and symmetry. They add a descriptive
#' statistics-specific class to `fn` and mimic a behaviour of
#' [metrics][yardstick::metrics] from `yardstick`, while are not
#' directly compatible with [metric_set][yardstick::metric_set].
#'
#' See [Custom performance
#' metrics](https://www.tidymodels.org/learn/develop/metrics/) for more
#' information about creating custom metrics.
#' In order to create a measure set, one can use [measure_set].
#'
#' @param fn A function. The measure function to attach a measure-specific class
#'
#' @seealso [measure_set]
#'
#' @name new-measure
NULL

Expand Down Expand Up @@ -70,8 +67,261 @@ format.measure <- function(x, ...) {
"tendency_measure" = "Measure of Central Tendency",
"var_measure" = "Measure of Variability",
"sym_measure" = "Measure of Distribution Symmetry",
"measure"
"measure" = "Measure"
)

paste("A", measure_type)
}

# Measure set ------------------------------------------------------------

#' Combine multiple measures into a single function
#' @keywords summary_stats
#' @family descriptive statistics
#'
#' @description
#' This function proposes a convenient wrapper to create a measure set,
#' mimicking a behaviour of [metric_set][yardstick::metric_set].
#'
#' @param ... The bare names of the functions to be included in the measure set.
#'
#' @details
#' All functions must be valid measure functions, i.e. they must be of
#' class `tendency_measure`, `var_measure` or `sym_measure`. Or created with
#' [new_tendency_measure], [new_var_measure] or [new_sym_measure].
#'
#' Alike with [metric_set], where it is not allowed to mix different metric
#' classes, it is allowed to mix different measure classes in [measure_set].
#' For example, [gm()] can be used with [cv()] because they
#' are valid measure functions even though first one is of class
#' `tendency_measure` and the second one is of class `var_measure`.
#'
#' @examples
#' \dontrun{
#' library(tidyhydro)
#'
#' # Multiple descriptive statistics
#' multi_measure <- measure_set(gm, cv)
#'
#' # The returned function has arguments:
#' # fn(data, truth, na_rm = TRUE, ...)
#' multi_measure(avacha, obs)
#'
#' avacha |>
#' group_by(month = format(date, "%b")) |>
#' multi_measure(obs)
#' }
#'
#' @export
measure_set <- function(...) {
quo_fns <- rlang::enquos(...)
validate_not_empty(quo_fns)

# Get values and check that they are fns
fns <- lapply(quo_fns, rlang::eval_tidy)
validate_inputs_are_functions(fns)

# Add on names, and then check that
# all fns are of the same function class
names(fns) <- vapply(quo_fns, get_quo_label, character(1))
validate_function_class(fns)

fn_cls <- class(fns[[1]])[[1]]

# All measure functions have the same signature
if (
fn_cls %in% c("tendency_measure", "var_measure", "sym_measure", "measure")
) {
make_measure_function(fns)
} else {
cli::cli_abort(
"{.fn validate_function_class} should have errored on unknown classes.",
.internal = TRUE
)
}
}

#' @export
print.measure_set <- function(x, ...) {
cat(format(x), sep = "\n")
invisible(x)
}

#' @export
format.measure_set <- function(x, ...) {
measures <- attr(x, "measures")
measure_names <- names(measures)

cli::cli_format_method({
cli::cli_text("A measure set, consisting of:")

for (i in seq_along(measures)) {
measure_format <- format(measures[[i]])
cli::cli_text("- {.fun {measure_names[i]}}: {measure_format}")
}
})
}

validate_not_empty <- function(x, call = rlang::caller_env()) {
if (rlang::is_empty(x)) {
cli::cli_abort(
"At least 1 function must be supplied to {.code ...}.",
call = call
)
}
}

validate_inputs_are_functions <- function(fns, call = rlang::caller_env()) {
is_fun_vec <- vapply(fns, rlang::is_function, logical(1))
all_fns <- all(is_fun_vec)

if (!all_fns) {
not_fn <- which(!is_fun_vec)
cli::cli_abort(
"All inputs to {.fn measure_set} must be functions.",
"These inputs are not: {not_fn}.",
call = call
)
}
}

# Validate that all metric functions inherit from valid function classes or
# combinations of classes
validate_function_class <- function(fns) {
fn_cls <- vapply(fns, function(fn) class(fn)[1], character(1))
fn_cls_unique <- unique(fn_cls)
n_unique <- length(fn_cls_unique)

if (n_unique == 0L) {
return(invisible(fns))
}

valid_cls <- c(
"tendency_measure",
"var_measure",
"sym_measure",
"measure"
)

# Allow mixing of different measure types
if (all(fn_cls_unique %in% valid_cls)) {
return(invisible(fns))
}

# Error handling for invalid classes
fn_bad_names <- lapply(fn_cls_unique, function(x) {
names(fns)[fn_cls == x]
})

# clean up for nicer printing
fn_cls_unique <- gsub("_measure", "", fn_cls_unique)
fn_cls_unique <- gsub("function", "other", fn_cls_unique)

fn_cls_other <- fn_cls_unique == "other"

if (any(fn_cls_other)) {
fn_cls_other_loc <- which(fn_cls_other)
fn_other_names <- fn_bad_names[[fn_cls_other_loc]]
fns_other <- fns[fn_other_names]

env_names_other <- vapply(
fns_other,
function(fn) rlang::env_name(rlang::fn_env(fn)),
character(1)
)

fn_bad_names[[fn_cls_other_loc]] <- paste0(
fn_other_names,
" <",
env_names_other,
">"
)
}

fn_pastable <- mapply(
FUN = function(fn_type, fn_names) {
fn_names <- paste0(fn_names, collapse = ", ")
paste0("- ", fn_type, " (", fn_names, ")")
},
fn_type = fn_cls_unique,
fn_names = fn_bad_names,
USE.NAMES = FALSE
)

cli::cli_abort(
c(
"x" = "The combination of measure functions must be valid measure types.",
"i" = "The following measure function types are being mixed:",
fn_pastable
),
call = rlang::call2("measure_set")
)
}

make_measure_function <- function(fns) {
measure_function <- function(data, truth, na_rm = TRUE, ...) {
# Construct common argument set for each measure call
call_args <- rlang::quos(
data = data,
truth = !!rlang::enquo(truth),
na_rm = na_rm,
... = ...
)

# Construct calls from the functions + arguments
calls <- lapply(fns, rlang::call2, !!!call_args)

# For measures, we don't need call_remove_static_arguments since
# measures don't typically have tweaked/static arguments

# Evaluate
measure_list <- mapply(
FUN = eval_safely,
calls,
names(calls),
SIMPLIFY = FALSE,
USE.NAMES = FALSE
)

dplyr::bind_rows(measure_list)
}

class(measure_function) <- c("measure_set", class(measure_function))
attr(measure_function, "measures") <- fns
measure_function
}

eval_safely <- function(expr, expr_nm, data = NULL, env = rlang::caller_env()) {
tryCatch(
expr = {
rlang::eval_tidy(expr, data = data, env = env)
},
error = function(cnd) {
cli::cli_abort(
"Failed to compute {.fn {expr_nm}}.",
parent = cnd,
call = rlang::call2("measure_set")
)
}
)
}

get_quo_label <- function(quo) {
out <- rlang::as_label(quo)

if (length(out) != 1L) {
cli::cli_abort(
"{.code as_label(quo)} resulted in a character vector of length >1.",
.internal = TRUE
)
}

is_namespaced <- grepl("::", out, fixed = TRUE)

if (is_namespaced) {
split <- strsplit(out, "::", fixed = TRUE)[[1]]
out <- split[[2]]
}

cat(paste("A", measure_type))
out
}
10 changes: 9 additions & 1 deletion README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -64,14 +64,22 @@ hydro_metrics(avacha, obs, sim, performance = TRUE)
## Descriptive statistics
In addition to `metric`, inherited from `yardstick`, the `tidyhydro` introduces the `measure` objects. It aims to calculate descriptive statistics of a single dataset, such as `cv()` — coefficient of variation (a measure of variability) or `gm()` — geometric mean (a measure of central tendency):

```{r measureset}
```{r measures}
# Coefficient of Variation
cv(avacha, obs)

# Geometric mean
gm_vec(avacha$obs)
```

Similarly to `metric_set`, one can create a `measure_set` and estimate desired descriptive statistics at once:

```{r measureset}
ms <- measure_set(cv, gm)

ms(avacha, obs)
```

## Installation

You can install the development version of `tidyhydro` from [GitHub](https://github.com/atsyplenkov/tidyhydro) with:
Expand Down
Loading
Loading