Capr Issue #99: Potential improvements to Capr#105
Capr Issue #99: Potential improvements to Capr#105guuswilmink wants to merge 155 commits intoOHDSI:developfrom
Conversation
Capr v2.0.6 release candidate.
update to v2.0.7 CirceR as remotes
Develop v2.0.8
… to countOccurences
copy files from PHEMS repo && updates to code
gitignore .Rprofile
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2 to 4.1.7. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](actions/download-artifact@v2...v4.1.7) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
mdlavallee92
left a comment
There was a problem hiding this comment.
Hi @guuswilmink! Many many many apologies for being entirely too late on this review. This is a really nice add, thank you! Would you be interested in other maintenance?
My main two hold ups are the on the signatures of the functions and consolidating export files.
- Change them to camelCase as that is part of the HADES style gudie...ie cdmDatabaseSchema, vocabDatabaseSchema. Make v conceptIds. v is way to generic.
- consolidate the isStandard functions to 1 file and dont export the links function. That is a stable internal. See example of the same idea here.
I need a bit more examples of how to use the isStandard functions. I believe their utility but not totally seeing its use in the day to day of the package. Do you mind doing a vignette or pointing me to some examples of how you are using this? I love the countOccurrences function. Think that is a handy tool!
| #' ) | ||
| #' | ||
| #' @export | ||
| countOccurrences <- function(v, tables, links, db_connection, cdm_schema, vocab_schema, save_path = NULL) { |
There was a problem hiding this comment.
Some notes on the signature...
v is to generic of a signature...just use conceptIds as the input name if that is what you want the user to input.
signatures should be in camelCase ie conceptIds, dbConnection, cdmDatabaseSchema. this is part of the HADES style guide
Does the links table ever change? If not just make this a non-exported function called within the function.
| for (table in tables) { | ||
| concept_id_field <- links[[table]][1] | ||
|
|
||
| # Combined SQL query for direct and descendant counts |
There was a problem hiding this comment.
prefer to have the sql sourced as an inst file instead of within the code. it ends up being a bit neater. No change required
| #' db_connection = db_connection, cdm_schema = "public", vocab_schema = "vocabulary" | ||
| #' ) | ||
| #' | ||
| #' @export |
There was a problem hiding this comment.
Nice I like this function a lot!
| #' @importFrom readr read_csv write_csv | ||
| #' @importFrom dplyr mutate across filter select inner_join | ||
| #' @importFrom DatabaseConnector connect disconnect querySql | ||
| #' @export |
There was a problem hiding this comment.
Same thing here change to camelCase.
| } | ||
|
|
||
| for (table_path in tables) { | ||
| table_name <- basename(table_path) |
There was a problem hiding this comment.
No change required. Suggestion is to leverage internal function. it makes the code easier to read and reference. For example, the for loop in L56 seems to be a constant pattern. Functionalize this as like prepAndReadTable function that you leverage here.
| #' @importFrom readr read_csv write_csv | ||
| #' @importFrom dplyr mutate filter rename bind_rows | ||
| #' @importFrom DatabaseConnector querySql | ||
| #' @importFrom SqlRender render translate |
There was a problem hiding this comment.
For all of these isStandard examples, show me a couple of use cases. I believe you that they are needed but I havent encountered them often
|
|
||
| for (table in tables) { | ||
| # Read concept table from SQL database | ||
| concept_table_query <- SqlRender::render(sprintf( |
There was a problem hiding this comment.
just curious...why sprintf? I usually use glue. Despite adding a dependency i think its a real good package for interpreting string literals
| #' print(concept_field) | ||
| #' | ||
| #' @export | ||
| links <- list( |
There was a problem hiding this comment.
I think this should be a non-exported function. It seems stable since its pegged to the CDM schematic.
| @@ -0,0 +1,127 @@ | |||
| source("R/isStandard.R") | |||
There was a problem hiding this comment.
consolidate into 1 isStandard file all of these functions and test. ow looks good
| person("Martin", "Lavallee", , "martin.lavallee@boehringer-ingelheim.com", role = c("aut", "cre")), | ||
| person("Adam", "Black", , "black@ohdsi.org", role = c("aut")) | ||
| person("Adam", "Black", , "black@ohdsi.org", role = c("aut")), | ||
| person("Guus", "Wilmink", , "guus@thehyve.nl", role = c("aut")) |
There was a problem hiding this comment.
would you like to do more maintenance on Capr? As you can see the package is escaping me given how slow I was to respond to this PR
…ub/workflows/actions/download-artifact-4.1.7 Bump actions/download-artifact from 2 to 4.1.7 in /.github/workflows
This PR is linked to #99.
Suggested functions:
isStandard.R: Given a path to CSV files with at least fields "sourceCode" and "conceptId", along with a database connection with a CDM and vocabulary schema, return a tibble of non-standard concepts. Allows for export of results for all concepts included (including standard). This function can help in quickly identifying non-standard concepts from tables of concept ids.
isStandardDB.R: Checks whether concepts that exist in a database are standard/non-standard and returns the non-standard concepts. The full table of standard and non-standard concepts can be saved as well. This is probably the most useful out of the 3 isStandard functions.
isStandardCS.R: Similar return as isStandard.R and isStandardDB, but performs the check given a Capr ConceptSet class object rather than CSV files. Does not capture source value (as these are not relevant for concept sets).
countOccurrences.R: Given a vector of concept IDs and a connection to a CDM instance, count the number of occurrences of: 1) persons with concept 'x'; 2) records with concept 'x'; 3) persons with concept 'x' or descendants of 'x'; 4) records with concept 'x' or descendants of 'x'
To demonstrate and further clarify this function I have attached a PDF with example usage. Is this already possible within Capr?
@mdlavallee92 I have incorporated your feedback provided in issue #99; please let me know if this is to your expectations.