Cloud storage rollback in case of a metadata save failure doesn't work in the cloud #73

@gordonfarrell

Describe the bug
In both GCP and AWS we have now seen an eCR saved successfully as a FHIR bundle to the S3 bucket/blob storage, after which the next step tries to save metadata to the DB and fails. The rollback code that should then delete the associated bundle from storage fails to remove it, leaving us with a saved bundle but no associated metadata.

We have confirmed that it does TRY to delete the stored bundle but it fails for unclear reasons. Possible theory: race condition where it's trying to delete the bundle before it's "settled" properly in storage.
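
If the race-condition theory holds, one possible mitigation is to retry the delete with a backoff and confirm the object is actually gone. The following is a minimal sketch only: `delete_with_retry`, the `delete` and `exists` callables, and the timing values are hypothetical names and not the viewer's actual storage API. It also prints the error from each failed attempt, which may help explain why the delete currently fails.

```python
import time
from typing import Callable


def delete_with_retry(
    delete: Callable[[str], None],
    exists: Callable[[str], bool],
    key: str,
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to delete `key`, backing off between attempts.

    Returns True once the object is confirmed gone, False if it is still
    present after all attempts. All names here are hypothetical.
    """
    for attempt in range(attempts):
        try:
            delete(key)
        except Exception as err:  # assumption: a real client raises narrower errors
            print(f"delete attempt {attempt + 1} for {key} failed: {err}")
        # Verify instead of trusting the delete call's return.
        if not exists(key):
            return True
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return False
```

A `False` return would be the signal to log the orphaned bundle key so it can be cleaned up later.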

Impact
An eCR in this state will never appear in the eCR library, though it can still be opened with a direct link. We could end up with a pile of phantom eCRs that exist only in storage and are invisible to app users.

To Reproduce
Steps to reproduce the behavior:

  1. Create a situation where we know metadata saving will fail. An easy one might be to edit your local DB to change a column name on ecr_data (a column name mismatch is one way we saw it while testing in the cloud)
  2. Process an eCR
  3. Observe viewer logs, storage, and DB state to confirm that the bundle is still in storage with no matching metadata row
  4. It's possible that this issue only exists in the cloud. If you can't reproduce locally, try with a deployment.
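
Step 1 can be illustrated with a small stand-in. This sketch uses an in-memory SQLite database rather than the viewer's real DB, and the column names `ecr_id` and `patient_name` are assumptions made for illustration. Only the `ecr_data` table name comes from this issue. It shows how renaming a column makes a metadata insert fail (requires SQLite 3.25 or newer for `RENAME COLUMN`).

```python
import sqlite3

# Stand-in for the real database; the columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ecr_data (ecr_id TEXT PRIMARY KEY, patient_name TEXT)")

# Step 1: break the schema so the application's insert no longer matches.
conn.execute("ALTER TABLE ecr_data RENAME COLUMN patient_name TO patient_name_renamed")


def save_metadata(conn: sqlite3.Connection, ecr_id: str, patient_name: str) -> None:
    """Insert using the old column name, as unmodified app code would."""
    conn.execute(
        "INSERT INTO ecr_data (ecr_id, patient_name) VALUES (?, ?)",
        (ecr_id, patient_name),
    )


try:
    save_metadata(conn, "ecr-1", "Test Patient")
    metadata_saved = True
except sqlite3.OperationalError as err:
    # The insert fails because the column it names no longer exists.
    metadata_saved = False
    print(f"metadata save failed as intended: {err}")
```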

Expected behavior
If an eCR successfully saves a bundle to storage but fails to save metadata to the DB, the saved bundle should be automatically deleted from storage.
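
The expected behavior could be sketched roughly as follows. This is not the viewer's actual code: `save_ecr`, `InMemoryStorage`, and `RollbackFailed` are hypothetical names, and the in-memory storage stands in for S3 or blob storage. The sketch verifies the delete and raises a distinct error when rollback fails, so an orphaned bundle does not go unnoticed.

```python
from typing import Any, Callable, Dict


class RollbackFailed(Exception):
    """Raised when the bundle could not be removed after a metadata failure."""


class InMemoryStorage:
    """Hypothetical stand-in for an S3 bucket or blob container."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body

    def delete(self, key: str) -> None:
        self.objects.pop(key, None)

    def exists(self, key: str) -> bool:
        return key in self.objects


def save_ecr(
    storage: InMemoryStorage,
    save_metadata: Callable[[str, Dict[str, Any]], None],
    ecr_id: str,
    bundle: bytes,
    metadata: Dict[str, Any],
) -> None:
    """Save the bundle, then the metadata; delete the bundle if metadata fails."""
    storage.put(ecr_id, bundle)
    try:
        save_metadata(ecr_id, metadata)
    except Exception as metadata_err:
        try:
            storage.delete(ecr_id)
        except Exception as delete_err:
            raise RollbackFailed(
                f"could not delete bundle {ecr_id}: {delete_err}"
            ) from metadata_err
        # Confirm the bundle is really gone before reporting a clean rollback.
        if storage.exists(ecr_id):
            raise RollbackFailed(
                f"bundle {ecr_id} still present after delete"
            ) from metadata_err
        raise  # rollback succeeded; surface the original metadata error
```

Callers would then see either the original metadata error (clean rollback) or `RollbackFailed` (a phantom bundle that needs attention).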

Pair with Gordon if you need more context or help with repro

Labels: bug (Something isn't working)
