Description
Describe the bug
The function getDrugIngredientCodes fails when the drug [¹⁸F]AlF-NOTA-FAPI-04 (concept_id = 1253507) is among the concepts in the cdm. The function tidyWords, called within filterIngredientConcepts, throws an error because it uses the base function trimws, which (via sub()) requires its input strings to be valid UTF-8. This happens regardless of whether options(encoding = "latin1") was set before executing the function.
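A minimal base R sketch of the failure described above, without a CDM. The byte sequence used for `bad` is a hypothetical stand-in (a latin1 superscript byte wrongly marked as UTF-8); the actual bytes stored for concept_id 1253507 are not known from this report, and latin1 as the source encoding is an assumption.

```r
# Hypothetical stand-in for the offending concept_name: the byte 0xB9 is
# "superscript one" in latin1 but is not a valid UTF-8 sequence on its own.
bad <- rawToChar(as.raw(c(0x5b, 0xb9, 0x46, 0x5d)))
Encoding(bad) <- "UTF-8"  # wrongly declare the bytes to be UTF-8

# validUTF8() reports whether each string is a valid UTF-8 byte sequence.
is_valid <- validUTF8(bad)

# trimws() calls sub(..., perl = TRUE), which rejects invalid UTF-8 input.
res <- tryCatch(trimws(bad), error = function(e) e)

# Re-encoding from the (assumed) source encoding gives a string that
# trimws() accepts.
fixed <- iconv(bad, from = "latin1", to = "UTF-8")
trimmed <- trimws(fixed)
```

If the real data behaves like this stand-in, `res` holds the error condition and `fixed` passes through trimws unchanged.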
To Reproduce
Steps to reproduce the behavior:
- Create a CDM object that includes the concept_id and concept_name above
- Run
getDrugIngredientCodes(
  cdm = cdm,
  name = c("aspirin", "diclofenac"),
  nameStyle = "{concept_name}"
)
- See the error. The call fails even when the affected concept is not one of the ingredients you are looking for.
Expected behavior
Expected the function to return a codelist.
Screenshots
No screenshots, but here is the output of rlang::last_trace():
<error/rlang_error>
Error in `dplyr::filter()`:
ℹ In argument: `tidyWords(.data$concept_name) %in% tidyWords(.env$name)`.
Caused by error in `sub()`:
! input string 29815 is invalid UTF-8
---
Backtrace:
▆
1. ├─CodelistGenerator::getDrugIngredientCodes(...)
2. │ └─CodelistGenerator:::filterIngredientConcepts(...)
3. │ ├─dplyr::filter(...)
4. │ └─dplyr:::filter.data.frame(ingredientConcepts, tidyWords(.data$concept_name) %in% tidyWords(.env$name))
5. │ └─dplyr:::filter_rows(.data, dots, by)
6. │ └─dplyr:::filter_eval(...)
7. │ ├─base::withCallingHandlers(...)
8. │ └─mask$eval_all_filter(dots, env_filter)
9. │ └─dplyr (local) eval()
10. ├─tidyWords(.data$concept_name) %in% tidyWords(.env$name)
11. ├─CodelistGenerator:::tidyWords(.data$concept_name)
12. │ └─base::trimws(words)
13. │ ├─base (local) mysub(...)
14. │ │ └─base::sub(re, "", x, perl = TRUE)
15. │ │ └─base::is.factor(x)
16. │ └─base (local) mysub(paste0("^", whitespace, "+"), x)
17. │ └─base::sub(re, "", x, perl = TRUE)
18. └─base::.handleSimpleError(...)
19. └─dplyr (local) h(simpleError(msg, call))
20. └─rlang::abort(message, class = error_class, parent = parent, call = error_call)
Desktop:
- OS: Ubuntu 22
- Browser: Chrome
Additional context
Here's the Athena link. Both the code and the drug name are supposedly standard. I'm guessing that if there's one of these now, there will probably be more.
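To check whether more such names exist, one option is to scan concept_name with validUTF8() before calling the package. This is only a sketch: the `concept` data frame, its ids, and the `bad` byte sequence are hypothetical stand-ins for a locally collected concept table, and replacing invalid bytes is a workaround, not the package's own fix.

```r
# Hypothetical stand-in for a concept table collected into memory.
bad <- rawToChar(as.raw(c(0x5b, 0xb9, 0x46, 0x5d)))
Encoding(bad) <- "UTF-8"  # invalid bytes wrongly marked as UTF-8

concept <- data.frame(
  concept_id = c(1L, 2L),  # hypothetical ids
  concept_name = c("aspirin", "placeholder"),
  stringsAsFactors = FALSE
)
concept$concept_name[2] <- bad

# Rows whose names are not valid UTF-8 and would make trimws() fail.
invalid_rows <- which(!validUTF8(concept$concept_name))
invalid_ids <- concept$concept_id[invalid_rows]

# Workaround: replace each invalid byte with its hex code, e.g. "<b9>".
concept$concept_name <- iconv(
  concept$concept_name, from = "UTF-8", to = "UTF-8", sub = "byte"
)
all_valid <- all(validUTF8(concept$concept_name))
```

Here `invalid_ids` lists the concepts to report or repair, and valid names such as "aspirin" are left untouched by the iconv() call.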