feat: Style Guide & Scripts for Extracting Files from Google Drive#19

Closed
avi9664 wants to merge 84 commits into main from avi-external-etl

avi9664 (Collaborator) commented Jan 16, 2026

📄 Description

  • Extracting from GDrive - Added script templates and example Jupyter notebooks, which use infrastructure datasets, to the utils folder.
    • For the examples to work, make sure this folder is shared with your service account.
    • Right now, the extraction scripts only work with .csv, .zip, and .geojson files. To support other file types, edit gdrive_to_pandas.py; you only need to know the file's MIME type.
    • Error handling is incomplete, and I still need to write test cases.
  • Style Guide - Added STYLE_GUIDE.md under src/ca_biositing/pipeline/docs. Feel free to restructure and edit it as you see fit.
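
To illustrate the MIME-type note in the extraction bullet above, here is a minimal sketch of how a per-type dispatch could look. It is not the code in gdrive_to_pandas.py: the names `PARSERS`, `parse_download`, and the `parse_*` helpers are hypothetical, and it uses standard-library parsers in place of pandas so it stays self-contained. The MIME strings are the standard ones for each format; the type Google Drive actually reports for a given file (especially .geojson) may differ, so check the file's metadata before relying on them.

```python
import csv
import io
import json
import zipfile


def parse_csv(data: bytes) -> list[dict]:
    """Parse CSV bytes into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(data.decode("utf-8"))))


def parse_geojson(data: bytes) -> list[dict]:
    """Return the properties of each feature in a GeoJSON FeatureCollection."""
    collection = json.loads(data.decode("utf-8"))
    return [feature["properties"] for feature in collection.get("features", [])]


def parse_zip(data: bytes) -> list[dict]:
    """Parse the first .csv member found in a zip archive (an assumption:
    real archives may hold shapefiles or several tables)."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".csv"):
                return parse_csv(archive.read(name))
    raise ValueError("zip archive contains no .csv member")


# Hypothetical registry: supporting a new file type means adding one entry.
PARSERS = {
    "text/csv": parse_csv,
    "application/zip": parse_zip,
    "application/geo+json": parse_geojson,
}


def parse_download(mime_type: str, data: bytes) -> list[dict]:
    """Route downloaded bytes to the parser registered for their MIME type."""
    try:
        parser = PARSERS[mime_type]
    except KeyError:
        raise ValueError(f"unsupported MIME type: {mime_type}") from None
    return parser(data)
```

Under this design, supporting another format only requires writing a parser and registering it under its MIME type, for example `PARSERS["application/json"] = my_json_parser`, where `my_json_parser` is whatever function you write for that format.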

✅ Checklist

  • I ran pre-commit run --all-files and all checks pass
  • Tests added/updated where needed
  • Docs added/updated if applicable
  • I have linked the issue this PR closes (if any)

💡 Type of change

| Type | Checked? |
| --- | --- |
| 🐞 Bug fix | [ ] |
| ✨ New feature | [x] |
| 📝 Documentation | [x] |
| ♻️ Refactor | [ ] |
| 🛠️ Build/CI | [ ] |
| Other (explain) | [ ] |

🧪 How to test

  • The example notebook is utils/external_etl_notebook.ipynb. Try running it to check that it works on your local machine.
  • Optionally, you can also create a new extract script using any of the templates and run it in external_etl_notebook.ipynb.

📝 Notes to reviewers

  • I haven't changed the transform and load functions in external_etl_notebook.ipynb yet; they're all commented out in the last cell.

petercarbsmith and others added 30 commits December 7, 2025 19:53
…nges in the database. think there was an issue with the LinkML --> alembic
mglbleta and others added 25 commits December 22, 2025 10:29
…nges in the database. think there was an issue with the LinkML --> alembic
…iositing.yaml to include the infrastructure models
…ade new ETL notebook and modified some of the gsheet extraction notebook
@avi9664 avi9664 closed this Apr 3, 2026
3 participants