-
Notifications
You must be signed in to change notification settings - Fork 5
feat: create r script from settings file #835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: create r script from settings file #835
Conversation
…atio calculations
apply relevant copilot suggestions Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mateusz Kołomański <63905560+m-kolomanski@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Merge remote-tracking branch 'origin/main' into 467-enhancement/r-script # Conflicts: # inst/WORDLIST # inst/shiny/modules/tab_nca/nca_results.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request introduces functionality to generate reproducible R scripts from Shiny app sessions and settings files, enabling users to replicate NCA analysis outputs outside the app. Additionally, it includes enhancements to the pivoting functionality and bug fixes for ratio calculations.
Key Changes
- Added
get_session_code()andget_settings_code()functions with supportingclean_deparse()methods to generate R scripts from session data - Enhanced
pivot_wider_pknca_results()to accept flag rules and extra variables, with flagging logic moved into the function - Fixed
calculate_table_ratios_app()to handle NULL ratio_table by wrapping withas.data.frame()
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| R/get_session_code.R | New file implementing script generation from session data with clean_deparse methods for various R objects |
| R/ratio_calculations.R | Moved ratio calculation functions from inst/shiny/functions and added as.data.frame() wrapping for NULL handling |
| R/pivot_wider_pknca_results.R | Enhanced to accept flag_rules and extra_vars_to_keep parameters, moved flagging logic into function |
| inst/shiny/www/templates/script_template.R | New template file for generating executable R scripts from session data |
| inst/shiny/modules/tab_nca/*.R | Updated to store session data (settings, ratio_table, slope_rules, final_units) for script generation |
| inst/shiny/modules/tab_data/*.R | Updated to store data_path, mapping, and applied_filters in session$userData |
| tests/testthat/test-get_session_code.R | New tests for clean_deparse() helper function |
| tests/testthat/test-pivot_wider_pknca_results.R | Added tests for flag_rules and extra_vars_to_keep functionality |
| man/*.Rd | Documentation updates for new and modified exported functions |
| NAMESPACE | Added exports for get_session_code and calculate_table_ratios_app |
| NEWS.md | Updated to document the new R script export feature |
| inst/WORDLIST | Added new technical terms used in documentation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| applied_flags <- purrr::keep(flag_rules, function(x) x$is.checked) | ||
| flag_params <- names(flag_rules) | ||
| flag_thr <- sapply(flag_rules, FUN = function(x) x$threshold) | ||
| flag_rule_msgs <- paste0(flag_params, c(" < ", " > ", " > ", " < "), flag_thr) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hardcoded vector c(" < ", " > ", " > ", " < ") assumes there are exactly 4 flag rules in a specific order. This will cause issues if the number of flag_params differs from 4. The operators are recycled by R's vector recycling rules, which may not produce the intended comparison operators for each rule. Consider storing the comparison operators as part of each rule in flag_rules, or use a more robust approach that doesn't rely on positional indexing.
| flag_rule_msgs <- paste0(flag_params, c(" < ", " > ", " > ", " < "), flag_thr) | |
| flag_ops <- sapply(flag_rules, FUN = function(x) x$operator) | |
| flag_rule_msgs <- paste0(flag_params, " ", flag_ops, " ", flag_thr) |
| ) %>% | ||
|
|
||
| # Derive secondary parameters (ratio parameters) | ||
| calculate_table_ratios_app(ratio_table) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The template calls calculate_table_ratios_app(ratio_table) without checking if ratio_table is NULL or empty. Given that ratio_table comes from session$userData$ratio_table which may not always be set, this could cause an error. The function implementation wraps with as.data.frame() but doesn't explicitly handle NULL. Consider adding a conditional check in the template to only call this function when ratio_table is not NULL and has rows.
| calculate_table_ratios_app(ratio_table) | |
| {if (!is.null(ratio_table) && nrow(ratio_table) > 0) calculate_table_ratios_app(ratio_table) else .} |
| #' Links the table ratio of the App with the ratio calculations via PKNCA results | ||
| #' | ||
| #' @param res A PKNCAresult object. | ||
| #' @param test_parameter Character. The PPTESTCD value to use as test (numerator). | ||
| #' @param ref_parameter Character. The PPTESTCD value to use as reference (denominator). | ||
| #' Defaults to test_parameter. | ||
| #' @param test_group Character. The test group (numerator). Default is "(all other levels)". | ||
| #' @param ref_group Character. The reference group (denominator). | ||
| #' @param aggregate_subject Character. Aggregation mode: "yes", "no", or "if-needed". | ||
| #' @param adjusting_factor Numeric that multiplies the calculated ratio. Default is 1. | ||
| #' @param custom_pptestcd Optional character. If provided, will be used as the PPTESTCD value. | ||
| #' @returns A data.frame with the calculated ratios for the specified settings. | ||
| calculate_ratio_app <- function( | ||
| res, | ||
| test_parameter, | ||
| ref_parameter = test_parameter, | ||
| test_group = "(all other levels)", | ||
| ref_group = "PARAM: Analyte01", | ||
| aggregate_subject = "no", | ||
| adjusting_factor = 1, | ||
| custom_pptestcd = NULL | ||
| ) { | ||
| reference_colname <- gsub("(.*): (.*)", "\\1", ref_group) | ||
| match_cols <- setdiff(unique(c(dplyr::group_vars(res), "start", "end")), reference_colname) | ||
|
|
||
| ########### This is very App specific ############### | ||
| if ("ATPTREF" %in% reference_colname) { | ||
| match_cols <- setdiff(match_cols, c("start", "end")) | ||
| } | ||
| if ("ROUTE" %in% reference_colname && aggregate_subject == "no") { | ||
| match_cols <- setdiff(match_cols, c("start", "end")) | ||
| } | ||
| ##################################################### | ||
|
|
||
| if (aggregate_subject == "yes") { | ||
| match_cols <- list(setdiff(match_cols, "USUBJID")) | ||
| } else if (aggregate_subject == "no") { | ||
| if (!"USUBJID" %in% match_cols) { | ||
| stop("USUBJID must be included in match_cols when aggregate_subject is 'never'.") | ||
| } | ||
| match_cols <- list(match_cols) | ||
| } else if (aggregate_subject == "if-needed") { | ||
| if ("USUBJID" %in% match_cols) { | ||
| # Perform both individual & aggregated calculations, then eliminate duplicates | ||
| match_cols <- list(match_cols, setdiff(match_cols, "USUBJID")) | ||
| } | ||
| } | ||
|
|
||
| if (test_group == "(all other levels)") { | ||
| test_groups <- NULL | ||
| } else { | ||
| num_colname <- gsub("(.*): (.*)", "\\1", test_group) | ||
| num_value <- gsub("(.*): (.*)", "\\2", test_group) | ||
| test_groups <- data.frame( | ||
| matrix( | ||
| num_value, | ||
| nrow = 1, | ||
| ncol = length(num_colname), | ||
| dimnames = list(NULL, num_colname) | ||
| ) | ||
| ) | ||
| } | ||
|
|
||
| reference_colname <- gsub("(.*): (.*)", "\\1", ref_group) | ||
| reference_value <- gsub("(.*): (.*)", "\\2", ref_group) | ||
| ref_groups <- data.frame( | ||
| matrix( | ||
| reference_value, | ||
| nrow = 1, | ||
| ncol = length(reference_colname), | ||
| dimnames = list(NULL, reference_colname) | ||
| ) | ||
| ) | ||
|
|
||
|
|
||
| all_ratios <- data.frame() | ||
|
|
||
| for (ix in seq_along(match_cols)) { | ||
| ratio_calculations <- calculate_ratios( | ||
| data = res$result, | ||
| test_parameter = test_parameter, | ||
| ref_parameter = ref_parameter, | ||
| match_cols = match_cols[[ix]], | ||
| ref_groups = ref_groups, | ||
| test_groups = test_groups, | ||
| adjusting_factor = adjusting_factor, | ||
| custom_pptestcd = custom_pptestcd | ||
| ) | ||
| all_ratios <- bind_rows(all_ratios, ratio_calculations) | ||
| } | ||
|
|
||
| # Assuming there cannot be more than 1 reference + PPTESTCD combination for the same group... | ||
| # If aggregate_subject = 'if-needed', then this will remove cases when subject is not needed | ||
| all_ratios %>% | ||
| # Make sure there are no duplicate rows for: parameter, contrast_var, and match_cols | ||
| distinct(across( | ||
| all_of(c("PPTESTCD", group_vars(res$data), "end")) | ||
| ), | ||
| .keep_all = TRUE) | ||
| } |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * Interface includes now a color legend next to the pivoted NCA results to indicate missing and flagged parameters (#779) | ||
| * Enhancements to the slides outputs including grouping by PKNCA groups, dose profile, and additional grouping variables (#791) | ||
| * Option to include and apply NCA flag rules with reasons (NCAwXRS) as defined by ADNCA standards. Any record populated within these columns will be excluded for the NCA (#752) | ||
| * R script exported in ZIP folder to re-run and replicate dataset outputs (#789) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NEWS entry references issue #789 but the PR description says this PR closes #826 and mentions that #789 should be reviewed/merged first. Verify that the issue reference is correct. If this PR is closing #826, the NEWS entry should reference #826 instead of (or in addition to) #789.
| * R script exported in ZIP folder to re-run and replicate dataset outputs (#789) | |
| * R script exported in ZIP folder to re-run and replicate dataset outputs (#826) |
| settings_file_path <- "../../Downloads/elproject/settings/settings.rds" | ||
| get_settings_code( | ||
| settings_file_path, | ||
| data_path = "inst/shiny/data/example-ADNCA.csv", | ||
| template_path = "inst/shiny/www/templates/script_template.R", | ||
| output_path = "../../Downloads/elproject/settings/settings_code.R" | ||
| ) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be test/debug code that should not be in the production file. These lines execute get_settings_code() with a hardcoded local path ("../../Downloads/elproject/settings/settings.rds") which will fail for other users and should be removed from the package source code. Consider moving this to a separate example script or vignette.
| settings_file_path <- "../../Downloads/elproject/settings/settings.rds" | |
| get_settings_code( | |
| settings_file_path, | |
| data_path = "inst/shiny/data/example-ADNCA.csv", | |
| template_path = "inst/shiny/www/templates/script_template.R", | |
| output_path = "../../Downloads/elproject/settings/settings_code.R" | |
| ) |
| # Tests for clean_deparse function | ||
| describe("clean_deparse()", { | ||
| it("formats character single and vector correctly", { | ||
| expect_equal(clean_deparse("hello"), '"hello"') | ||
| expect_equal(clean_deparse(c("a", "b")), 'c("a", "b")') | ||
| }) | ||
|
|
||
| it("formats numeric and integer values correctly", { | ||
| expect_equal(clean_deparse(1.23), "1.23") | ||
| expect_equal(clean_deparse(c(1, 2)), "c(1, 2)") | ||
| expect_equal(clean_deparse(as.integer(3)), "3") | ||
| expect_equal(clean_deparse(as.integer(c(4L, 5L))), "c(4, 5)") | ||
| }) | ||
|
|
||
| it("formats logical values correctly", { | ||
| expect_equal(clean_deparse(TRUE), "TRUE") | ||
| expect_equal(clean_deparse(c(TRUE, FALSE)), "c(TRUE, FALSE)") | ||
| }) | ||
|
|
||
| it("formats named lists correctly", { | ||
| l <- list(a = 1, b = "x") | ||
| exp_named <- "list(\n a = 1,\n b = \"x\"\n)" | ||
| expect_equal(clean_deparse(l), exp_named) | ||
| }) | ||
|
|
||
| it("renders data.frame as data.frame(...) with per-column vectors", { | ||
| df <- data.frame(x = c(1, 2), y = c("a", "b"), stringsAsFactors = FALSE) | ||
| exp_df <- paste0( | ||
| "data.frame(\n", | ||
| " x = c(1, 2),\n", | ||
| " y = c(\"a\", \"b\")\n", | ||
| ")" | ||
| ) | ||
| expect_equal(clean_deparse(df), exp_df) | ||
| }) | ||
|
|
||
| it("renders tbl_df as data.frame(...)", { | ||
| df <- dplyr::tibble(x = c(1, 2), y = c("a", "b")) | ||
| exp_df <- paste0( | ||
| "data.frame(\n", | ||
| " x = c(1, 2),\n", | ||
| " y = c(\"a\", \"b\")\n", | ||
| ")" | ||
| ) | ||
| expect_equal(clean_deparse(df), exp_df) | ||
| }) | ||
|
|
||
| it("renders a NULL object correctly", { | ||
| expect_equal(clean_deparse(NULL), "NULL") | ||
| }) | ||
|
|
||
| it("renders empty classes correctly", { | ||
| expect_equal(clean_deparse(list()), "list()") | ||
| expect_equal(clean_deparse(data.frame()), "data.frame()") | ||
| expect_equal(clean_deparse(character(0)), "character()") | ||
| expect_equal(clean_deparse(numeric(0)), "numeric()") | ||
| }) | ||
| }) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test file only covers the clean_deparse() helper function but does not test the main get_session_code() function that is exported and documented. Add tests for get_session_code() to verify it correctly processes templates, handles session data, substitutes placeholders, and writes output files. Also consider adding tests for get_settings_code() if it's intended to be part of the public API.
| @@ -0,0 +1,109 @@ | |||
| # Load the package (https://github.com/pharmaverse/aNCA) # | |||
| if (!require("aNCA")) install.packages("aNCA") | |||
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using install.packages() in a script template that gets generated for users is risky. Consider using a safer approach like checking for the package and providing a message to the user to install it manually, or using a function like rlang::check_installed() which provides a more interactive prompt. Auto-installing packages can have unintended consequences, especially if users have custom repository configurations or are in corporate environments with restricted internet access.
| if (!require("aNCA")) install.packages("aNCA") | |
| # If you do not have the 'rlang' package, please install it with: install.packages("rlang") | |
| rlang::check_installed("aNCA") |
| names(mapping) <- gsub("select_", "", names(mapping)) | ||
| applied_filters <- session$userData$applied_filters | ||
|
|
||
| preprocessed_adnca <- adnca_data %>% | ||
|
|
||
| # Filter the data | ||
| apply_filters(applied_filters) %>% | ||
|
|
||
| # Map columns to their standards | ||
| apply_mapping( | ||
| mapping = mapping, | ||
| desired_order = c( | ||
| "STUDYID", "USUBJID", "PARAM", "PCSPEC", "ATPTREF", | ||
| "AVAL", "AVALU", "AFRLT", "ARRLT", "NRRLT", "NFRLT", | ||
| "RRLTU", "ROUTE", "DOSETRT", "DOSEA", "DOSEU", "ADOSEDUR", | ||
| "VOLUME", "VOLUMEU", "TRTRINT", "METABFL" | ||
| ), | ||
| silent = FALSE | ||
| ) %>% | ||
|
|
||
| # Derive METABFL column using PARAM metabolites | ||
| create_metabfl(mapping$Metabolites) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The template uses applied_filters from session$userData$applied_filters but doesn't check if it's NULL or empty. If no filters were applied in the session, this could be NULL, which may cause apply_filters() to fail. Consider adding a conditional check or ensuring apply_filters() can handle NULL input gracefully. The same applies to other variables like mapping which could potentially be NULL if not set in the session.
| names(mapping) <- gsub("select_", "", names(mapping)) | |
| applied_filters <- session$userData$applied_filters | |
| preprocessed_adnca <- adnca_data %>% | |
| # Filter the data | |
| apply_filters(applied_filters) %>% | |
| # Map columns to their standards | |
| apply_mapping( | |
| mapping = mapping, | |
| desired_order = c( | |
| "STUDYID", "USUBJID", "PARAM", "PCSPEC", "ATPTREF", | |
| "AVAL", "AVALU", "AFRLT", "ARRLT", "NRRLT", "NFRLT", | |
| "RRLTU", "ROUTE", "DOSETRT", "DOSEA", "DOSEU", "ADOSEDUR", | |
| "VOLUME", "VOLUMEU", "TRTRINT", "METABFL" | |
| ), | |
| silent = FALSE | |
| ) %>% | |
| # Derive METABFL column using PARAM metabolites | |
| create_metabfl(mapping$Metabolites) | |
| if (!is.null(mapping)) { | |
| names(mapping) <- gsub("select_", "", names(mapping)) | |
| } | |
| applied_filters <- session$userData$applied_filters | |
| preprocessed_adnca <- adnca_data %>% | |
| # Filter the data | |
| { | |
| if (!is.null(applied_filters) && length(applied_filters) > 0) { | |
| apply_filters(., applied_filters) | |
| } else { | |
| . | |
| } | |
| } %>% | |
| # Map columns to their standards | |
| { | |
| if (!is.null(mapping)) { | |
| apply_mapping( | |
| ., | |
| mapping = mapping, | |
| desired_order = c( | |
| "STUDYID", "USUBJID", "PARAM", "PCSPEC", "ATPTREF", | |
| "AVAL", "AVALU", "AFRLT", "ARRLT", "NRRLT", "NFRLT", | |
| "RRLTU", "ROUTE", "DOSETRT", "DOSEA", "DOSEU", "ADOSEDUR", | |
| "VOLUME", "VOLUMEU", "TRTRINT", "METABFL" | |
| ), | |
| silent = FALSE | |
| ) | |
| } else { | |
| . | |
| } | |
| } %>% | |
| # Derive METABFL column using PARAM metabolites | |
| { | |
| if (!is.null(mapping) && !is.null(mapping$Metabolites)) { | |
| create_metabfl(., mapping$Metabolites) | |
| } else { | |
| . | |
| } | |
| } |
| default_mapping <- list( | ||
| select_STUDYID = "STUDYID", | ||
| select_USUBJID = "USUBJID", | ||
| select_DOSEA = "DOSEA", | ||
| select_DOSEU = "DOSEU", | ||
| select_DOSETRT = "DOSETRT", | ||
| select_PARAM = "PARAM", | ||
| select_Metabolites = "Metab-DrugA", | ||
| select_ARRLT = "ARRLT", | ||
| select_NRRLT = "NRRLT", | ||
| select_AFRLT = "AFRLT", | ||
| select_NCAwXRS = c("NCA1XRS", "NCA2XRS"), | ||
| select_NFRLT = "NFRLT", | ||
| select_PCSPEC = "PCSPEC", | ||
| select_ROUTE = "ROUTE", | ||
| select_TRTRINT = "TRTRINT", | ||
| select_ADOSEDUR = "ADOSEDUR", | ||
| select_Grouping_Variables = c("TRT01A", "RACE", "SEX"), | ||
| select_RRLTU = "RRLTU", | ||
| select_VOLUME = "VOLUME", | ||
| select_VOLUMEU = "VOLUMEU", | ||
| select_AVAL = "AVAL", | ||
| select_AVALU = "AVALU", | ||
| select_ATPTREF = "ATPTREF" | ||
| ) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default_mapping object is undocumented. Since it's used as a default parameter value in get_settings_code(), it should be documented to explain what these mappings represent and why these specific values were chosen as defaults. Consider documenting this with roxygen2 comments or at least adding inline comments explaining its purpose.
| .create_flags_for_profiles <- function(final_results, myres, flag_rules) { | ||
|
|
||
| # Add flaging columns in the pivoted results | ||
| applied_flags <- purrr::keep(flag_rules, function(x) x$is.checked) | ||
| flag_params <- names(flag_rules) | ||
| flag_thr <- sapply(flag_rules, FUN = function(x) x$threshold) | ||
| flag_rule_msgs <- paste0(flag_params, c(" < ", " > ", " > ", " < "), flag_thr) | ||
| flag_cols <- names(final_results)[formatters::var_labels(final_results) | ||
| %in% translate_terms(flag_params, "PPTESTCD", "PPTEST")] | ||
|
|
||
| if (length(flag_params) > 0) { | ||
| final_results <- final_results %>% | ||
| mutate( | ||
| flagged = case_when( | ||
| rowSums(is.na(select(., any_of(flag_cols)))) > 0 ~ "MISSING", | ||
| is.na(Exclude) ~ "ACCEPTED", | ||
| any(sapply( | ||
| flag_rule_msgs, function(msg) str_detect(Exclude, fixed(msg)) | ||
| )) ~ "FLAGGED", | ||
| TRUE ~ "ACCEPTED" | ||
| ) | ||
| ) | ||
| } | ||
| final_results | ||
| } |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .create_flags_for_profiles() function does not handle the case when flag_rules is NULL. When flag_rules = NULL, the calls to purrr::keep(flag_rules, ...), names(flag_rules), and sapply(flag_rules, ...) will fail. Add a NULL check at the beginning of the function to return final_results unchanged when flag_rules is NULL, or handle the NULL case explicitly in each operation.
Issue
Closes #826
Description
This pull request introduces a new utility function for generating settings code from a template and makes a small bug fix in the ratio calculation logic.
Added a
default_mappinglist and a newget_settings_codefunction inR/get_session_code.Rto automate the generation of settings code from an RDS file, with an example invocation provided for demonstration.Updated
calculate_table_ratios_appinR/ratio_calculations.Rto ensureratio_tableis always treated as a data frame, preventing errors with NULL objects.Definition of Done
How to test
Use the function on a settings file and check that it works
Contributor checklist
Notes to reviewer
Better to review/merge first #789