Identification of Patterns in Stroke Care Transitions Using OHDSI PharMetrics+ Data
Stroke survivors receive rehabilitation services (physical therapy, occupational therapy, and/or speech-language therapy) in different locations and at different rates in the period following their stroke. Differences between care locations have been identified as a potential avenue for understanding the divergence in post-acute stroke treatment outcomes.
Our hypothesis is that OHDSI data can be used to trace the various paths stroke survivors take through post-incident care, so that the efficacy and efficiency of the different care transitions can be evaluated. We built a proof-of-concept process on a limited cohort (stroke patients with aphasia, with patient IDs attached to their post-acute speech therapy treatments and treatment locations) to show that it is possible to trace the different paths through post-stroke treatment. To replicate our process, follow the steps below. Note that for true reproducibility you will need access to the AWS environment of the Northeastern OHDSI Center.
To access the OHDSI database, a new user must first read the OHDSI Lab User Guide and complete several steps (1-12). Our tutorial briefly outlines these first steps but primarily serves as supplemental instructions for Step 13 of the OHDSI User Guide: setting up a workspace environment with Python instead of R.
These steps assume you have access to OHDSI's Amazon Workspace. This repo is a quick tutorial on using Python with the OHDSI database; follow each step in order. Steps include:
- AWS Setup
- Miniconda
- Python environment
- Git
- Make
- Config of Redshift credentials
- Redshift database connection
- pandas to read, process, and write tables
Refer to config_template.ini and how_to_use_templates.md for additional help.
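As a rough illustration of the last three steps, here is a minimal sketch of reading Redshift credentials from a config file and pulling a query result into pandas. The section and key names are assumptions modeled on config_template.ini; adjust them to match your actual file.

```python
# Minimal sketch, assuming a config.ini with a [redshift] section containing
# host, port, dbname, user, and password (names modeled on config_template.ini).
import configparser

import pandas as pd
from sqlalchemy import create_engine

config = configparser.ConfigParser()
config.read("config.ini")
db = config["redshift"]  # assumed section name

# Redshift speaks the PostgreSQL wire protocol, so psycopg2 works as the driver.
engine = create_engine(
    f"postgresql+psycopg2://{db['user']}:{db['password']}"
    f"@{db['host']}:{db['port']}/{db['dbname']}"
)

# Sanity check: count the rows in the person table of the source OMOP schema.
print(pd.read_sql("SELECT COUNT(*) FROM omop_cdm_53_pmtx_202203.person", engine))
```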
We will create various intermediate tables in your own schema. We do this because running analyses directly against the 'omop_cdm_53_pmtx_202203' schema, which holds the original data, is extremely slow: those tables contain billions of rows. We will use the stroke cohort definition created by Casey Tilton as an index table to filter the relevant (stroke-diagnosis) person_IDs from the omop schema into a table written to your work schema. As a result, you will be working with tables of at most about a million rows. Run the following command in your Anaconda PowerShell Prompt, making sure you are in the directory where you cloned this repo:
make create_tables
This command runs 11 .py files in the correct order; each .py file creates one intermediate table. Note that some intermediate tables require other intermediate tables to exist first, so the order in which these .py files run matters. Now open DBeaver and check your work schema. You should see the following tables in your schema:
Note that your Object IDs won't match these exactly. What matters is that the Row Count Estimates are the same and that none of the tables are empty. You can check whether a table is empty by double-clicking the table name and checking the 'Data' tab.
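If you prefer to verify from Python instead of DBeaver, a quick row-count loop works; the table names below are a placeholder subset, not the full list of 11.

```python
# Quick non-emptiness check; replace the list with your actual table names
# and 'work_schema' with your own schema.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@host:5439/dbname")

tables = ["stroke_cohort", "stroke_ancestor"]  # placeholder subset of the 11 tables
for t in tables:
    n = pd.read_sql(f"SELECT COUNT(*) AS n FROM work_schema.{t}", engine)["n"].iloc[0]
    print(f"{t}: {n} rows")
```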
If you see any errors, refer to the 'Makefile' and try running each .py file individually. For example, if you got an error message while the 'stroke_ancestor' table was being created, try running the following command:
make stroke_ancestor
This runs only the 'stroke_ancestor.py' file, and you can debug by opening 'stroke_ancestor.py' and reading through the code.
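For orientation while debugging, each table-creation script follows roughly the same shape: join the cohort index table to a large OMOP table and materialize the much smaller result in your work schema. The sketch below illustrates that pattern; the table, schema, and column names are placeholders rather than the repo's actual ones.

```python
# Hypothetical sketch of the table-creation pattern; all names are placeholders.
from sqlalchemy import create_engine, text

# Replace the placeholder credentials with values from your config.ini.
engine = create_engine("postgresql+psycopg2://user:password@host:5439/dbname")

CREATE_SQL = """
CREATE TABLE work_schema.stroke_condition AS
SELECT co.*
FROM omop_cdm_53_pmtx_202203.condition_occurrence AS co
JOIN work_schema.stroke_cohort AS sc
  ON co.person_id = sc.person_id;
"""

# begin() opens a transaction that commits on success and rolls back on error.
with engine.begin() as conn:
    conn.execute(text(CREATE_SQL))
```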
The following make commands can be run in your Anaconda PowerShell Prompt to reproduce the figures in this repo:
make plot_stroke_desc_concept
make plot_has_aphasia
make plot_stroke_type_aphasia_TRUE
make plot_stroke_type_aphasia_FALSE
make plot_first_discharge
make plot_speech_therapy_aphasia
'make plot_first_discharge' runs three .py files, while each of the other commands runs a single .py file.
Confirm you produce the same plots as we did in our EDA.
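Each plotting command boils down to a query followed by a matplotlib figure. Below is a minimal sketch of that pattern, assuming a hypothetical aphasia-flag table in your work schema; the query, column names, and output filename are placeholders, not the repo's actual code.

```python
# Minimal sketch of the query-then-plot pattern; names are placeholders.
import matplotlib.pyplot as plt
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@host:5439/dbname")

counts = pd.read_sql(
    """
    SELECT has_aphasia, COUNT(*) AS n
    FROM work_schema.stroke_has_aphasia
    GROUP BY has_aphasia
    """,
    engine,
)

counts.plot.bar(x="has_aphasia", y="n", legend=False)
plt.ylabel("Number of patients")
plt.title("Stroke patients with and without aphasia")
plt.tight_layout()
plt.savefig("plot_has_aphasia.png")
```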
The actual analysis of stroke patients' discharge paths is done with the following command:
make analysis_visit_oc_5_discharge
This analysis is intentionally simple and is meant to provide a frame for what to work on next regarding the discharge paths of stroke patients. Anyone who wants to do further analysis of discharge paths can use this result as a starting point.
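One plausible shape for such an analysis: order each patient's visits chronologically and count the distinct sequences of discharge destinations. The sketch below is a hypothetical illustration, not the repo's actual query; see the analysis_visit_oc_5_discharge target in the Makefile for the real code.

```python
# Hypothetical discharge-path sketch; table and column names are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@host:5439/dbname")

visits = pd.read_sql(
    "SELECT person_id, visit_start_date, discharged_to FROM work_schema.stroke_visits",
    engine,
    parse_dates=["visit_start_date"],
)

# Build one 'path' string per patient, e.g. 'Inpatient -> SNF -> Home'.
paths = (
    visits.sort_values(["person_id", "visit_start_date"])
    .groupby("person_id")["discharged_to"]
    .agg(lambda s: " -> ".join(s.astype(str)))
)

# Most common discharge paths across the cohort.
print(paths.value_counts().head(10))
```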
- Did your project objectives change based on what you learned from the data or stakeholder?
We treated this project as an Agile endeavor; a waterfall approach would likely have led to failure at worst and extreme frustration at best. Our objectives changed both as we explored the data and as we met with the stakeholder, Rob Cavanaugh, each week to share our progress. Rob provided insight into potential ways to access the data we needed for the analysis of care pathways. He provided stroke and speech-language codes and suggestions for how to determine whether a visit could fall under PT vs. OT. Our project team held regular stand-up meetings to discuss blockers, report on progress, and plan next steps.
- Were data-access or data-processing challenges harder than you anticipated?
As noted in our docs/README.md, there was a learning curve with the complexity of the OHDSI database, since the data is drawn from real-world interactions and the schema connections can be hard to navigate without medical knowledge. A medical condition can have various codes associated with it, and these codes can change throughout a patient's care timeline. Gaps in data availability and comprehensiveness were an additional challenge. As our team tested tables and fields in queries, we often ran into dead ends because of missing data and had to search for alternate paths to acquire the necessary data points. The database's use of the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) to standardize healthcare data was a blessing: not every relational database has clean, consistent linking variables without tidying first.
The next step would be to create cohorts for all potential paths, e.g., aphasia versus no aphasia diagnosis, different types of treatment, and different locations. We would recommend:
- Identifying the best markers/concept_IDs for physical and occupational therapy
- Matching those therapies to patient_IDs from the master stroke incidence file
- Creating a table with location by visit_start_date, visit_end_date, and discharge_to location for each patient_ID (note that, by definition of the initial cohort, the first location is always an emergency room and/or inpatient hospital stay)
- Appending location by date to each therapy for each patient_ID
Analysis can be performed on the resulting table (see the sketch after this list) to find:
- Frequency of therapies, overall and by location
- Duration (in days) of therapies, overall and by location
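A minimal pandas sketch of those two computations, assuming the recommended table has been built with hypothetical columns patient_id, therapy_type, location, visit_start_date, and visit_end_date:

```python
# Minimal sketch; the table and column names are assumptions, not existing repo objects.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@host:5439/dbname")

therapies = pd.read_sql(
    "SELECT * FROM work_schema.therapy_by_location",  # hypothetical table
    engine,
    parse_dates=["visit_start_date", "visit_end_date"],
)

# Frequency of therapies, overall and by location.
freq_overall = therapies["therapy_type"].value_counts()
freq_by_location = therapies.groupby(["location", "therapy_type"]).size()

# Duration (in days) of therapies, overall and by location.
therapies["duration_days"] = (
    therapies["visit_end_date"] - therapies["visit_start_date"]
).dt.days
dur_overall = therapies.groupby("therapy_type")["duration_days"].mean()
dur_by_location = therapies.groupby(["location", "therapy_type"])["duration_days"].mean()

print(freq_overall, freq_by_location.head(), dur_overall, dur_by_location.head(), sep="\n\n")
```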
Given that Northeastern University's OHDSI database is incomplete, there is additional data that would help simplify the process, though it is possible to complete the analysis without it:
- Comprehensive provider_specialty data
- Care_site_type and concept_id in the care_site table
OHDSI
- OHDSI Northeastern
- OHDSI @ Northeastern | Sharepoint
- OHDSI User Guide
- The Book of OHDSI
- OHDSI Lab Login
- Athena
OMOP
