1 change: 1 addition & 0 deletions NAMESPACE
@@ -16,6 +16,7 @@ export(create_table)
export(db_create)
export(db_get_query)
export(db_update)
export(db_upgrade)
export(get_appdata)
export(get_base_value)
export(get_db_connection)
113 changes: 111 additions & 2 deletions R/fct_SQLite.R
@@ -158,7 +158,7 @@ db_add_primary_key <- function(con, name, value, keys = NULL) {

#' Add Logging Table
#'
#' Both creates the logging table and the trigger to update it for
#' Creates the logging table and the triggers on updates and deletions for
#' all_review_data.
#'
#' @param con A DBI Connection to the SQLite DB
@@ -218,6 +218,78 @@ db_add_log <- function(con, key_cols = c("id", key_columns)) {
"END"
))
DBI::dbClearResult(rs)
create_delete_log_trigger(con)
}

#' Create Delete Logging Trigger
#'
#' Logs record deletions on all_review_data into all_review_data_log.
#'
#' @param con A DBI Connection to the SQLite DB
#'
#' @keywords internal
create_delete_log_trigger <- function(con) {
rs <- DBI::dbSendStatement(con, paste(
"CREATE TRIGGER all_review_data_delete_log_trigger",
"AFTER DELETE ON all_review_data FOR EACH ROW",
"BEGIN",
"INSERT INTO all_review_data_log (",
"review_id, edit_date_time, reviewed, comment, reviewer, timestamp, status, dml_type",
")",
"VALUES(",
"OLD.id,",
"OLD.edit_date_time,",
"OLD.reviewed,",
"OLD.comment,",
"OLD.reviewer,",
"OLD.timestamp,",
"OLD.status,",
"'DELETE'",
");",
"END"
))
DBI::dbClearResult(rs)
}

#' Upgrade app database
#'
#' Helper function to upgrade the database when migrating to a newer version of
#' ClinSight that impacts the database.
#'
#' @param db_path Character vector. Path to the database
#'
#' @return A character vector indicating the version the DB was upgraded to. Run
#' for side effects on the database.
#'
#' @export
db_upgrade <- function(db_path){
stopifnot(file.exists(db_path))

current_db_version <- db_get_version(db_path)
if (identical(current_db_version, db_version))
return(paste("Upgraded to version", db_version))

switch(
current_db_version,
"1.1" = db_temp_connect(db_path, {
if (!identical(db_upgrade_menu("1.2"), 1L))
return(paste("Abandoning DB upgrading. DB version is", current_db_version))
create_delete_log_trigger(con)
DBI::dbWriteTable(con, "db_version", data.frame(version = "1.2"), overwrite = TRUE)
}),
stop("No upgrade available for version ", current_db_version)
)
db_upgrade(db_path)
Comment on lines 265 to 282
Collaborator Author:

I like this approach to the database upgrading issue. By using recursion, we can essentially just add a step to the upgrade process. Once the database is fully upgraded to the version expected by the installed ClinSight, it returns a success message.

I also added a version check in db_update(). Don't know if there are other places versioning should be checked.

Collaborator:

I am fine with this approach. I guess this needs to be rewritten to a loop if there are multiple database versions but that can be done once needed.

You could also think of asking user confirmation of the update if one is found, something like this: utils::menu(c("Yes", "No"), title = "Update available. Do you really want to proceed?")

Version is also checked in run_app()

Collaborator Author:

I guess this needs to be rewritten to a loop if there are multiple database versions but that can be done once needed.

Actually this approach doesn't need a loop because it is using recursion (it calls itself at the end: db_upgrade(db_path)). Say the release after this is version 1.3. We would simply add "1.2" to the switch() and the function would iterate based on the version. This allows us to focus on the changes needed to go from version 1.2 to 1.3 and ignore 1.1 to 1.3.
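For illustration, the switch() in db_upgrade() after a hypothetical future 1.3 release could look like this (the "1.2" case and the apply_1_3_schema_changes() helper are assumptions for the sketch, not part of this PR):

```r
# Hypothetical sketch: each case performs exactly one version step; the
# trailing recursive call walks the chain until the DB matches the version
# that the installed ClinSight expects.
switch(
  current_db_version,
  "1.1" = db_temp_connect(db_path, {
    create_delete_log_trigger(con)
    DBI::dbWriteTable(con, "db_version", data.frame(version = "1.2"), overwrite = TRUE)
  }),
  "1.2" = db_temp_connect(db_path, {
    apply_1_3_schema_changes(con)  # hypothetical 1.2 -> 1.3 migration helper
    DBI::dbWriteTable(con, "db_version", data.frame(version = "1.3"), overwrite = TRUE)
  }),
  stop("No upgrade available for version ", current_db_version)
)
db_upgrade(db_path)  # recurse until the base case returns
```

A 1.1 database would then pass through the "1.1" and "1.2" cases in successive recursive calls, so each release only has to define its own one-step migration.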

Collaborator Author:

You could also think of asking user confirmation of the update if one is found, something like this: utils::menu(c("Yes", "No"), title = "Update available. Do you really want to proceed?")

Sure. I think that makes sense.

Collaborator Author:

Version is also checked in run_app()

Yes, I added checks to db_update() because it is exported and can be run outside of run_app().

}

db_upgrade_menu <- function(upgrade_version) {
utils::menu(
c("Yes", "No"),
title = sprintf(
"Attempting to update to v%s. Do you want to proceed?",
upgrade_version
)
)
}

#' Update app database
@@ -242,6 +314,7 @@ db_update <- function(
edit_time_var = "edit_date_time"
){
stopifnot(file.exists(db_path))
stopifnot(identical(db_version, db_get_version(db_path)))
con <- get_db_connection(db_path)
data_synch_time <- attr(data, "synch_time") %||% ""

@@ -260,6 +333,14 @@
# Continue in the case data_synch_time is missing and if data_synch_time is
# more recent than db_synch_time
review_data <- DBI::dbGetQuery(con, "SELECT * FROM all_review_data")
cat("Check for deleted rows\n")
deleted_review_data <- delete_review_data(
review_df = review_data,
latest_review_data = data,
key_cols = key_cols
)
cat("logging deleted review data to database...\n")
db_delete(con, deleted_review_data)
Comment on lines +336 to +343
Collaborator:

Do we need to hard-force a DB update/ this DB version or can we run this part on the condition that the DB is at least version 1.2?

Collaborator Author:

I think things will get messy fast if we allow desynching between the DB version and the ClinSight version. Also, we already have a hard version check in run_app().

Collaborator:

Desynching should be removed as soon as it becomes messy, of course. A desynch of no more than a minor version shouldn't be a problem, I think? We can remove the desynch allowance at the next DB update.

It would make it easier to update ClinSight in running studies. With that desynch allowed, the dev version is still a drop-in replacement for the current v0.3.0 version, with no breaking changes.
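A minor-version-tolerant check could be sketched along these lines (db_version_compatible() is a hypothetical helper, assuming "major.minor" version strings like "1.1"; it is not in this PR):

```r
# Sketch: tolerate a minor-version desynch, hard-stop only on a major mismatch.
db_version_compatible <- function(db_ver, app_ver) {
  major <- function(x) strsplit(x, ".", fixed = TRUE)[[1]][1]
  identical(major(db_ver), major(app_ver))
}

db_version_compatible("1.1", "1.2") # TRUE: minor desynch tolerated
db_version_compatible("1.2", "2.0") # FALSE: major mismatch
```

db_update() could then call this instead of the strict identical() check if a looser policy is wanted.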

Comment on lines +337 to +343
Collaborator:

Is this safe to run automatically with each update?

What if this goes wrong and the wrong lines are selected for deletion? Can this be restored somehow easily? Or is that difficult?

For example: what if you use an API to pull data, the data pull malfunctions and returns either zero rows or not enough rows, and you run db_update() — do you lose all your review data then?

Collaborator Author:

The deleted rows are located in the logging table. The situation you outlined above seems outside of this process though. I would hope that a check would be in place to verify data before pushing to ClinSight. And even if a push happened, I would hope someone would be creating DB backups for rollback if needed.

Collaborator Author (jthompson-arcus, Jan 14, 2026):

More specifically on the restoration side, the old record is stored in the logging table. Those records could be retrieved with their review status if needed.

Above statement is inaccurate. The key columns are not being stored in the logging table, only the ID. Their values are truly lost in this case.

Collaborator:

The db_update() function is meant to run automatically, and its behaviour should not be destructive, I think. At the very least, any potentially destructive behaviour should be optional.
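Making the deletion opt-in could be as simple as guarding it behind a new argument (the delete_missing flag below is an assumption for the sketch, not something this PR implements):

```r
# Sketch: inside db_update(), run the destructive step only when asked to.
# A `delete_missing = FALSE` argument would keep the default non-destructive.
if (delete_missing) {
  deleted_review_data <- delete_review_data(
    review_df = review_data,
    latest_review_data = data,
    key_cols = key_cols
  )
  db_delete(con, deleted_review_data)
}
```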

cat("Start adding new rows to database\n")
updated_review_data <- update_review_data(
review_df = review_data,
@@ -279,6 +360,34 @@ db_update <- function(
cat("Finished updating review data\n")
}

#' DELETE FROM all_review_data
#'
#' Performs a DELETE FROM on all_review_data based on a data set containing the
#' records to delete.
#'
#' @param con A DBI Connection to the SQLite DB
#' @param data A data frame containing the records to DELETE FROM
#' all_review_data
#' @param key_cols A character vector specifying which columns define a unique
#' index for a row. Defaults to "id"
#'
#' @return Invisibly returns TRUE. Is run for its side effects on the DB.
#'
#' @keywords internal
db_delete <- function(con, data, key_cols = "id") {
dplyr::copy_to(con, data, "row_deletes")
rs <- DBI::dbSendStatement(con, paste(
"DELETE FROM",
"all_review_data",
"WHERE EXISTS (",
"SELECT 1",
"FROM row_deletes",
"WHERE", paste(sprintf("row_deletes.%1$s = all_review_data.%1$s", key_cols), collapse = " AND "),
")"
))
DBI::dbClearResult(rs)
}

#' UPSERT to all_review_data
#'
#' Performs an UPSERT on all_review_data. New records will be appended to the
@@ -290,7 +399,7 @@ db_update <- function(
#' @param key_cols A character vector specifying which columns define a unique
#' index for a row. Defaults to `ClinSight` [key_columns()].
#'
#' @return invisibly returns TRUE. Is run for it's side effects on the DB.
#' @return invisibly returns TRUE. Is run for its side effects on the DB.
#'
#' @keywords internal
db_upsert <- function(con, data, key_cols = key_columns) {
48 changes: 34 additions & 14 deletions R/mod_review_data_fct_helpers.R
@@ -39,6 +39,40 @@ get_review_data <- function(
all_review_data
}

#' Delete review data
#'
#' Identifies the records no longer in the dataset to be removed from
#' all_review_data.
#'
#' @param review_df Data frame containing old review data that need to be
#' updated.
#' @param latest_review_data Data frame containing latest review data.
#' @param key_cols A character vector containing the common key variables.
#' Defaults to `ClinSight` [key_columns()].
#'
#' @return A data frame containing only the rows to remove from the review data.
#'
#' @keywords internal
delete_review_data <- function(
Collaborator:

Function name is a bit confusing because it is not deleting review data but rather returning the rows to be deleted.

Collaborator Author:

I agree. I was mimicking update_review_data() below, which also doesn't do what its name says and simply returns the data to update.

review_df,
latest_review_data,
key_cols = key_columns
){
stopifnot(is.data.frame(latest_review_data), nrow(latest_review_data) > 0 )
stopifnot(is.data.frame(review_df), nrow(review_df) > 0 )
stopifnot(is.character(key_cols))

deleted_data <- dplyr::anti_join(review_df, latest_review_data, by = key_cols)
n_deleted <- nrow(deleted_data)
if(n_deleted != 0){
warning(n_deleted, " items were not found in the updated dataset")
cat("Missing items in the dataset:\n")
print(deleted_data)
}

deleted_data
}

#' Update review data
#'
#'
@@ -67,20 +101,6 @@ update_review_data <- function(
stopifnot(is.character(key_cols))
stopifnot(is.character(edit_time_var))

deleted_data <- dplyr::anti_join(review_df, latest_review_data, by = key_cols)
n_deleted <- nrow(deleted_data)
if(n_deleted != 0){
warning(n_deleted, " items were not found in the updated dataset")
cat("Missing items in the dataset:\n")
print(deleted_data)
}

# If data is reviewed and later in an update deleted, then it is okay if the
# deletion does not show up in the updated data set, and the deleted data point shows
# up as reviewed. One reason why this is okay is because missing data is checked
# by the data managers.
# a more practical reason is that missing data does not show up in the application.

# when do we have the same edit_date_time but different timestamps?
# For example:
# - we have a new data point. It is not yet reviewed (status new or updated).
Binary file modified R/sysdata.rda
2 changes: 1 addition & 1 deletion data-raw/internal_data.R
@@ -58,7 +58,7 @@ required_meta_cols <- c(
"item_type"
)

db_version <- "1.1"
db_version <- "1.2"

# Used in get_form_level_data(). Set a default if ClinSight needs the columns
# to function properly.
1 change: 1 addition & 0 deletions inst/_pkgdown.yml
@@ -76,6 +76,7 @@ reference:
- contents:
- get_db_connection
- db_create
- db_upgrade
- db_update
- db_get_review
- db_get_query
15 changes: 15 additions & 0 deletions man/create_delete_log_trigger.Rd


2 changes: 1 addition & 1 deletion man/db_add_log.Rd


25 changes: 25 additions & 0 deletions man/db_delete.Rd


19 changes: 19 additions & 0 deletions man/db_upgrade.Rd


2 changes: 1 addition & 1 deletion man/db_upsert.Rd


25 changes: 25 additions & 0 deletions man/delete_review_data.Rd


Binary file modified tests/testthat/fixtures/testdb.sqlite
Collaborator Author:

Was updated by running db_upgrade("tests/testthat/fixtures/testdb.sqlite")
