Conversation
```r
db_upgrade <- function(db_path){
  stopifnot(file.exists(db_path))

  current_db_version <- db_get_version(db_path)
  if (identical(current_db_version, db_version))
    return(paste("Upgraded to version", db_version))

  switch(
    current_db_version,
    "1.1" = db_temp_connect(db_path, {
      create_delete_log_trigger(con)
      DBI::dbWriteTable(con, "db_version", data.frame(version = "1.2"), overwrite = TRUE)
    }),
    stop("No upgrade available for version ", current_db_version)
  )
  db_upgrade(db_path)
}
```
I like this approach to the database upgrading issue. By using recursion, we can essentially just add a step to the upgrade process. Once the database is fully upgraded to the version expected by ClinSight itself, the function returns a success message.
I also added a version check in db_update(). I'm not sure whether there are other places where versioning should be checked.
I am fine with this approach. I guess this needs to be rewritten to a loop if there are multiple database versions, but that can be done once needed.
You could also consider asking the user to confirm the update when one is found, something like this: utils::menu(c("Yes", "No"), title = "Update available. Do you really want to proceed?")
Version is also checked in run_app()
I guess this needs to be rewritten to a loop if there are multiple database versions but that can be done once needed.
Actually this approach doesn't need a loop because it uses recursion (it calls itself at the end: db_upgrade(db_path)). Say the release after this is version 1.3. We would simply add "1.2" to the switch() and the function would iterate based on the version. This lets us focus on the changes needed to go from version 1.2 to 1.3 and ignore 1.1 to 1.3.
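To make the idea concrete, here is a sketch of what the switch() could look like after a hypothetical 1.3 release. The body of the "1.2" case is invented for illustration; only the "1.1" case exists in this PR.

```r
# Hypothetical sketch: a future 1.2 -> 1.3 migration only requires adding
# one case to the switch(); the tail recursion chains the upgrades.
switch(
  current_db_version,
  "1.1" = db_temp_connect(db_path, {
    create_delete_log_trigger(con)
    DBI::dbWriteTable(con, "db_version", data.frame(version = "1.2"), overwrite = TRUE)
  }),
  "1.2" = db_temp_connect(db_path, {
    # ... schema changes needed for 1.3 would go here ...
    DBI::dbWriteTable(con, "db_version", data.frame(version = "1.3"), overwrite = TRUE)
  }),
  stop("No upgrade available for version ", current_db_version)
)
db_upgrade(db_path) # recurse until current_db_version matches db_version
```

A 1.1 database would then pass through the "1.1" case, recurse, hit the "1.2" case, recurse again, and return once the versions match.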
You could also consider asking the user to confirm the update when one is found, something like this:
utils::menu(c("Yes", "No"), title = "Update available. Do you really want to proceed?")
Sure. I think that makes sense.
Version is also checked in run_app()
Yes, I added checks to db_update() because it is exported and can be run outside of run_app().
Was updated by running db_upgrade("tests/testthat/fixtures/testdb.sqlite")
@LDSamson I need to add some more to this PR (tests and internal docs), but I wanted to get your initial impression of this approach.
LDSamson left a comment
Few initial comments, see below.
```r
#' @return A data frame containing only the rows to remove from the review data.
#'
#' @keywords internal
delete_review_data <- function(
```
The function name is a bit confusing because it does not delete review data but rather returns the rows to be deleted.
I agree. I was mimicking update_review_data() below, which also doesn't do what its name suggests and simply returns the data to update.
```r
cat("Check for deleted rows\n")
deleted_review_data <- delete_review_data(
  review_df = review_data,
  latest_review_data = data,
  key_cols = key_cols
)
cat("logging deleted review data to database...\n")
db_delete(con, deleted_review_data)
```
Do we need to hard-force a DB update to this DB version, or can we run this part on the condition that the DB is at least version 1.2?
I think things will get messy fast if we allow desynching between the DB version and the ClinSight version. Also, we already have a hard version check in run_app().
Desyncing should be removed as soon as it becomes messy, of course. A desync of no more than a minor version wouldn't be a problem, I think? We can remove the desync at the next DB update.
It would be easier to update ClinSight in running studies. With that desync, the dev version is still a drop-in replacement for the current v0.3.0 version, with no breaking changes.
```r
deleted_review_data <- delete_review_data(
  review_df = review_data,
  latest_review_data = data,
  key_cols = key_cols
)
cat("logging deleted review data to database...\n")
db_delete(con, deleted_review_data)
```
Is this safe to run automatically with each update?
What if this goes wrong and the wrong lines are selected for deletion? Can this be restored easily, or is that difficult?
For example: what if you use an API to pull data and this data pull malfunctions, returning either zero rows or too few rows, and you then run db_update()? Do you lose all your review data?
The deleted rows are located in the logging table. The situation you outlined above seems outside of this process though. I would hope that a check would be in place to verify data before pushing to ClinSight. And even if a push happened, I would hope someone would be creating DB backups for rollback if needed.
More specifically on the restoration side, the old record is stored in the logging table. Those records could be retrieved with their review status if needed.
The above statement is inaccurate. The key columns are not stored in the logging table, only the ID. Their values are truly lost in this case.
The db_update() function is meant to run automatically, and its behaviour should not be destructive, I think. At the very least, any potentially destructive behaviour should be optional.
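One way to make the destructive step opt-in is sketched below. This is hypothetical: the dry_run argument, the function signature, and the messaging are invented for illustration and are not part of the PR.

```r
# Hypothetical sketch: gate the deletion behind an opt-in dry_run argument.
# The signature and dry_run behaviour are assumptions, not the PR's code.
db_update <- function(con, data, review_data, key_cols, dry_run = TRUE) {
  deleted_review_data <- delete_review_data(
    review_df = review_data,
    latest_review_data = data,
    key_cols = key_cols
  )
  if (dry_run) {
    message(nrow(deleted_review_data), " row(s) would be deleted; ",
            "rerun with dry_run = FALSE to apply.")
    return(invisible(deleted_review_data))
  }
  db_delete(con, deleted_review_data)
}
```

By default nothing is deleted; the caller sees what would be removed and must explicitly opt in to the destructive path.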
@LDSamson on further reflection I think I understand the concern. If a record gets removed and then later reappears, the review data should treat these as the same record and not two different ones. Is this a good summary?
Yes, that would be ideal indeed. It should not be a destructive action to the DB.
Note, opened #270 so we have an issue to track this work.
Few Notes
- The workflow here is: run db_upgrade(db_path), and then db_update() should work as expected.
- The function db_update() won't run if the database versions of the SQLite file and the R package don't align.
- New databases will be initialized at the correct version.
Summary of Changes
If a record/item/row is no longer in the merged data set, it will be deleted from all_review_data. A trigger has been added to log these deleted records to the all_review_data_log table with a DML type of 'DELETE'.
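For reference, the delete-logging trigger could look roughly like the sketch below. This is not the actual create_delete_log_trigger() implementation from the PR: the logged column names (besides the ID mentioned in the discussion) and the wrapper function name are assumptions.

```r
# Sketch only: column names beyond the ID are assumptions, not copied
# from the PR's actual create_delete_log_trigger() implementation.
create_delete_log_trigger_sketch <- function(con) {
  DBI::dbExecute(con, "
    CREATE TRIGGER IF NOT EXISTS all_review_data_delete_log
    AFTER DELETE ON all_review_data
    BEGIN
      INSERT INTO all_review_data_log (id, edit_date_time, dml_type)
      VALUES (OLD.id, datetime('now'), 'DELETE');
    END;")
}
```

An AFTER DELETE trigger of this shape fires once per deleted row, so db_delete() needs no extra logging code; SQLite records each removal in all_review_data_log automatically.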