- Create and start (or use an existing) standard interactive cluster, no Photon (Optional). Install the `pyyaml` and `colorama` libraries from PyPI
- Create (or use an existing) 2X-Small Serverless warehouse, 1 Min / 1 Max, Preview channel (Optional)
- Workspace -> Home -> Create -> Git folder
- Git repository URL: https://github.com/ysmx-github/phs_pilot.git -> Create Git Folder
- Open the SQL notebook `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/depl/schema_sql` and connect to Serverless
- Run Cell 1
- Fill the widgets with the catalog name and target folder name (`dbr_ssa_clinical` and `dbr_ddl_clinical` are used in this example), then Run all
- Open the volume: Catalog explorer -> dbr_ssa_clinical -> clinical_raw -> clinical_data_volume
- Download `data.zip` and `emr_ddl_clinical.zip` from the shared folder and unzip them. Manually upload the `data` and `emr_ddl_clinical` folders to the volume
- Open the notebook `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/depl/wf1_create`, connect to the cluster or Serverless, and Run all
- Open the notebook `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/depl/wf2_create`, connect to the cluster or Serverless, and Run all
- Open the notebook `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/depl/wf3_create`, connect to the cluster or Serverless, and Run all
- Open the YAML file `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/wf_common/config.yaml` and edit the `db_catalog` and volume parameters as needed
- Open Workflows
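Since `pyyaml` is installed on the cluster, the `config.yaml` edit described above can also be scripted. A minimal sketch, assuming the file has top-level `db_catalog` and `volume` keys (check the actual file for the real key names before relying on this):

```python
# Sketch only: the key names and values below are assumptions; open
# config.yaml to confirm the real structure before scripting the edit.
import yaml

# Stand-in for the contents of
# /Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/wf_common/config.yaml
example = """
db_catalog: dbr_ssa_clinical
volume: clinical_data_volume
"""

cfg = yaml.safe_load(example)
cfg["db_catalog"] = "my_catalog"   # target catalog for the workflows
cfg["volume"] = "my_volume"        # target volume for the raw files
print(yaml.safe_dump(cfg, sort_keys=False))
```

Editing the file through the workspace UI, as in the step above, works just as well; the snippet only illustrates the shape of the change.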
- Run the `phs_wf1` workflow, review the workflow and results
- Run the `phs_wf2` workflow, review the workflow and results
- Run the `phs_wf3` workflow, review the workflow and results
- Open `/Workspace/Users/firstname.lastname@databricks.com/phs_pilot/src/wf3/wf3_dlt_test.sql`
- Select all
- Copy
- Open SQL Editor -> New query
- Paste
- Select the catalog `dbr_ssa_clinical` and the schema `clinical_bronze`
- Run the CDC tests on the `wf3_dlt` pipeline
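The earlier download-and-unzip step (`data.zip` and `emr_ddl_clinical.zip`) can be done with Python's standard `zipfile` module before uploading the extracted folders through the Catalog Explorer; a small sketch:

```python
# Unzip the downloaded archives locally; the extracted `data` and
# `emr_ddl_clinical` folders are then uploaded to the volume manually.
import zipfile

def unzip(archive: str, dest: str) -> list[str]:
    """Extract `archive` into `dest` and return the extracted member names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()

# Example (run in the folder where the downloads live):
# for name in ("data.zip", "emr_ddl_clinical.zip"):
#     unzip(name, ".")
```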