From 39bd4edc681ba5508357bffe74f262106eeb8c80 Mon Sep 17 00:00:00 2001 From: mglbleta Date: Thu, 12 Mar 2026 17:48:17 -0700 Subject: [PATCH 1/4] minor edits to documentation in ca-biositing subheader of the mkdocs. unsure what the project_filebrowser.db file in my staged changes is but it will not allow me to not commit. feel free to review and discard edit. --- anaconda_projects/db/project_filebrowser.db | Bin 0 -> 32768 bytes docs/CODE_OF_CONDUCT.md | 2 +- docs/CONTRIBUTING.md | 2 +- docs/ERD_VIEW.md | 2 +- docs/README.md | 2 +- docs/architecture.md | 33 ++++++++++---------- docs/datamodels/README.md | 2 +- docs/deployment/README.md | 2 +- docs/notebook_setup.md | 2 +- docs/pipeline/README.md | 2 +- docs/resources/README.md | 2 +- docs/webservice/README.md | 2 +- pixi.lock | 21 ++----------- 13 files changed, 30 insertions(+), 44 deletions(-) create mode 100644 anaconda_projects/db/project_filebrowser.db diff --git a/anaconda_projects/db/project_filebrowser.db b/anaconda_projects/db/project_filebrowser.db new file mode 100644 index 0000000000000000000000000000000000000000..3fa3a4a02c0753b66dd6cf65930b505be899ade0 GIT binary patch literal 32768 [binary payload omitted]
+ - **Resource**: Core biomass resource definitions - **ResourceClass**, **ResourceSubclass**: Hierarchical resource classification - **ResourceAvailability**: Seasonal and quantitative availability data @@ -509,6 +506,10 @@ Environments: ## Deployment & Operations + + ### Container Orchestration - **Development**: Docker Compose for local services diff --git a/docs/datamodels/README.md b/docs/datamodels/README.md index b62101ad..f3c7aa23 120000 --- a/docs/datamodels/README.md +++ b/docs/datamodels/README.md @@ -1 +1 @@ -../../src/ca_biositing/datamodels/README.md \ No newline at end of file +../../src/ca_biositing/datamodels/README.md diff --git a/docs/deployment/README.md b/docs/deployment/README.md index d5c46ad4..5159bcfe 120000 --- a/docs/deployment/README.md +++ b/docs/deployment/README.md @@ -1 +1 @@ -../../deployment/README.md \ No newline at end of file +../../deployment/README.md diff --git a/docs/notebook_setup.md b/docs/notebook_setup.md index d65d6b2d..ab2f8936 100644 --- a/docs/notebook_setup.md +++ b/docs/notebook_setup.md @@ -1,6 +1,6 @@ # Notebook Setup Guide for **ca-biositing** -**Purpose** -- Set up Jupyter notebooks with correct imports for the PEP 420 +**Purpose**: Set up Jupyter notebooks with correct imports for the PEP 420 namespace packages used in this repository.
--- diff --git a/docs/pipeline/README.md b/docs/pipeline/README.md index 82a8c831..544d50b1 120000 --- a/docs/pipeline/README.md +++ b/docs/pipeline/README.md @@ -1 +1 @@ -../../src/ca_biositing/pipeline/README.md \ No newline at end of file +../../src/ca_biositing/pipeline/README.md diff --git a/docs/resources/README.md b/docs/resources/README.md index 9f21ec35..ea34ca59 120000 --- a/docs/resources/README.md +++ b/docs/resources/README.md @@ -1 +1 @@ -../../resources/README.md \ No newline at end of file +../../resources/README.md diff --git a/docs/webservice/README.md b/docs/webservice/README.md index 749a635e..24ea2883 120000 --- a/docs/webservice/README.md +++ b/docs/webservice/README.md @@ -1 +1 @@ -../../src/ca_biositing/webservice/README.md \ No newline at end of file +../../src/ca_biositing/webservice/README.md diff --git a/pixi.lock b/pixi.lock index cff1b773..eb952c35 100644 --- a/pixi.lock +++ b/pixi.lock @@ -5,8 +5,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -1709,8 +1707,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -3071,8 +3067,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -3466,8 +3460,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: 
linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -4860,8 +4852,6 @@ environments: frontend: channels: - url: https://conda.anaconda.org/conda-forge/ - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -4917,8 +4907,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -7217,8 +7205,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -8654,8 +8640,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -10084,8 +10068,6 @@ environments: - url: https://conda.anaconda.org/conda-forge/ indexes: - https://pypi.org/simple - options: - pypi-prerelease-mode: if-necessary-or-explicit packages: linux-64: - conda: https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 @@ -14518,6 +14500,7 @@ packages: - sqlalchemy>=2.0.0 - sqlmodel>=0.0.19,<0.1 requires_python: '>=3.12' + editable: true - pypi: ./src/ca_biositing/pipeline name: ca-biositing-pipeline version: 0.1.0 @@ -14535,6 +14518,7 @@ packages: - pyogrio - python-dotenv>=1.0.1,<2 requires_python: '>=3.12' + editable: true - pypi: ./src/ca_biositing/webservice name: ca-biositing-webservice version: 0.1.0 @@ -14548,6 +14532,7 @@ packages: - 
python-multipart>=0.0.9 - uvicorn>=0.30.0,<1 requires_python: '>=3.12' + editable: true - conda: https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.10.5-hbd8a1cb_0.conda sha256: 3b5ad78b8bb61b6cdc0978a6a99f8dfb2cc789a451378d054698441005ecbdb6 md5: f9e5fbc24009179e8b0409624691758a From 75d5ce5d6f6f29e04214cc62cd8db6a3e2fc6432 Mon Sep 17 00:00:00 2001 From: mglbleta Date: Sat, 21 Mar 2026 11:26:09 -0700 Subject: [PATCH 2/4] fix(docs): normalize symlink targets for cross-platform checkouts Rewrite docs symlink entries without trailing newline payloads so Windows checkout/restore operations do not produce unreadable links. No content changes to target markdown files; only symlink metadata normalization. --- docs/CODE_OF_CONDUCT.md | 2 +- docs/CONTRIBUTING.md | 2 +- docs/ERD_VIEW.md | 2 +- docs/README.md | 2 +- docs/datamodels/README.md | 2 +- docs/deployment/README.md | 2 +- docs/pipeline/README.md | 2 +- docs/resources/README.md | 2 +- docs/webservice/README.md | 2 +- 9 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/CODE_OF_CONDUCT.md b/docs/CODE_OF_CONDUCT.md index b3f4ebcb..0400d574 120000 --- a/docs/CODE_OF_CONDUCT.md +++ b/docs/CODE_OF_CONDUCT.md @@ -1 +1 @@ -../CODE_OF_CONDUCT.md +../CODE_OF_CONDUCT.md \ No newline at end of file diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md index 5c10c35f..44fcc634 120000 --- a/docs/CONTRIBUTING.md +++ b/docs/CONTRIBUTING.md @@ -1 +1 @@ -../CONTRIBUTING.md +../CONTRIBUTING.md \ No newline at end of file diff --git a/docs/ERD_VIEW.md b/docs/ERD_VIEW.md index 2d9c4791..1f7c4c28 120000 --- a/docs/ERD_VIEW.md +++ b/docs/ERD_VIEW.md @@ -1 +1 @@ -../ERD_VIEW.md +../ERD_VIEW.md \ No newline at end of file diff --git a/docs/README.md b/docs/README.md index 94389aee..32d46ee8 120000 --- a/docs/README.md +++ b/docs/README.md @@ -1 +1 @@ -../README.md +../README.md \ No newline at end of file diff --git a/docs/datamodels/README.md b/docs/datamodels/README.md index f3c7aa23..b62101ad 120000 --- 
a/docs/datamodels/README.md +++ b/docs/datamodels/README.md @@ -1 +1 @@ -../../src/ca_biositing/datamodels/README.md +../../src/ca_biositing/datamodels/README.md \ No newline at end of file diff --git a/docs/deployment/README.md b/docs/deployment/README.md index 5159bcfe..d5c46ad4 120000 --- a/docs/deployment/README.md +++ b/docs/deployment/README.md @@ -1 +1 @@ -../../deployment/README.md +../../deployment/README.md \ No newline at end of file diff --git a/docs/pipeline/README.md b/docs/pipeline/README.md index 544d50b1..82a8c831 120000 --- a/docs/pipeline/README.md +++ b/docs/pipeline/README.md @@ -1 +1 @@ -../../src/ca_biositing/pipeline/README.md +../../src/ca_biositing/pipeline/README.md \ No newline at end of file diff --git a/docs/resources/README.md b/docs/resources/README.md index ea34ca59..9f21ec35 120000 --- a/docs/resources/README.md +++ b/docs/resources/README.md @@ -1 +1 @@ -../../resources/README.md +../../resources/README.md \ No newline at end of file diff --git a/docs/webservice/README.md b/docs/webservice/README.md index 24ea2883..749a635e 120000 --- a/docs/webservice/README.md +++ b/docs/webservice/README.md @@ -1 +1 @@ -../../src/ca_biositing/webservice/README.md +../../src/ca_biositing/webservice/README.md \ No newline at end of file From aff8568eb25f212be5551556da6d72521bb6cdf4 Mon Sep 17 00:00:00 2001 From: mglbleta Date: Sat, 21 Mar 2026 16:23:20 -0700 Subject: [PATCH 3/4] docs: update docs links and workflow guidance - remove ERD_VIEW docs page references/files - update mkdocs repo URL to sustainability-software-lab - refresh pipeline and datamodels readmes for current structure - apply contributing/deployment workflow doc edits --- CONTRIBUTING.md | 4 +- ERD_VIEW.md | 61 --------------------------- docs/ERD_VIEW.md | 1 - docs/pipeline/ALEMBIC_WORKFLOW.md | 8 ++-- docs/pipeline/ETL_WORKFLOW.md | 33 +++++++++------ mkdocs.yml | 5 +-- src/ca_biositing/datamodels/README.md | 24 +++++------ src/ca_biositing/pipeline/README.md | 2 +- 8 files 
changed, 42 insertions(+), 96 deletions(-) delete mode 100644 ERD_VIEW.md delete mode 120000 docs/ERD_VIEW.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 4f210ccd..3a17e040 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -39,8 +39,8 @@ Your contributions make this project better—thank you for your support! 🚀 1. Set up your development environment with `pixi install`. 2. Install pre-commit hooks with `pixi run pre-commit-install`. 3. Create a feature branch. -4. Make your changes and ensure tests and pre-commit checks pass. . Submit a - pull request. +4. Make your changes and ensure tests and pre-commit checks pass. Submit a pull + request. ### Configuring Pre-commit diff --git a/ERD_VIEW.md b/ERD_VIEW.md deleted file mode 100644 index 05b1930f..00000000 --- a/ERD_VIEW.md +++ /dev/null @@ -1,61 +0,0 @@ -# ERD for USDA Census and Survey Data - -To view this diagram, open the Command Palette (`Cmd+Shift+P` on Mac or -`Ctrl+Shift+P` on Windows/Linux) and run **"Markdown: Open Preview to the -Side"**.
- -```mermaid -erDiagram -CensusRecord { - integer year - CropEnum crop - VariableEnum variable - UnitEnum unit - float value - BearingStatusEnum bearing_status - string class_desc - string domain_desc - string source - string notes -} -Geography { - string state_name - string state_fips - string county_name - string county_fips - string geoid - string region_name - string agg_level_desc -} -SurveyRecord { - string period_desc - string freq_desc - string program_desc - integer year - CropEnum crop - VariableEnum variable - UnitEnum unit - float value - BearingStatusEnum bearing_status - string class_desc - string domain_desc - string source - string notes -} -UsdaRecord { - integer year - CropEnum crop - VariableEnum variable - UnitEnum unit - float value - BearingStatusEnum bearing_status - string class_desc - string domain_desc - string source - string notes -} - -CensusRecord ||--|o Geography : "geography" -SurveyRecord ||--|o Geography : "geography" -UsdaRecord ||--|o Geography : "geography" -``` diff --git a/docs/ERD_VIEW.md b/docs/ERD_VIEW.md deleted file mode 120000 index 1f7c4c28..00000000 --- a/docs/ERD_VIEW.md +++ /dev/null @@ -1 +0,0 @@ -../ERD_VIEW.md \ No newline at end of file diff --git a/docs/pipeline/ALEMBIC_WORKFLOW.md b/docs/pipeline/ALEMBIC_WORKFLOW.md index ff048753..dd290239 100644 --- a/docs/pipeline/ALEMBIC_WORKFLOW.md +++ b/docs/pipeline/ALEMBIC_WORKFLOW.md @@ -12,9 +12,11 @@ systematic and version-controlled way. allows you to modify your database schema (e.g., add a new table or column) and keep a versioned history of those changes. - **Why use it?** It prevents you from having to manually write SQL - `ALTER TABLE` statements. It automatically compares your SQLModel classes to - the current state of the database and generates the necessary migration - scripts. + `ALTER TABLE` statements which are not tracked in version control. Alembic + generates SQL code from the Python SQLModel schema to prevent manual errors. 
+ It also automatically compares your SQLModel classes to the current state of + the database and generates the necessary migration scripts. This reduces + database drift. --- diff --git a/docs/pipeline/ETL_WORKFLOW.md b/docs/pipeline/ETL_WORKFLOW.md index b04f34e8..b6e15b63 100644 --- a/docs/pipeline/ETL_WORKFLOW.md +++ b/docs/pipeline/ETL_WORKFLOW.md @@ -17,8 +17,9 @@ and loads it into the PostgreSQL database. - `load`: Functions to insert the transformed data into the database using SQLAlchemy. -- **Hierarchical Pipelines:** Individual pipelines are nested within - subdirectories reflecting the data they handle (e.g., `products`, `biomass`). +- **Hierarchical Pipelines:** Transform and load logic are organized into + subdirectories reflecting the data they handle (e.g., `products`, `usda`, + `analysis`). --- @@ -32,19 +33,25 @@ The ETL system runs in a containerized Prefect environment. pixi run start-services ``` -**Step 2: Deploy Flows** +**Step 2: Apply Datamodel** + +```bash +pixi run migrate +``` + +**Step 3: Deploy Flows** ```bash pixi run deploy ``` -**Step 3: Run the Master Pipeline** +**Step 4: Run the Master Pipeline** ```bash pixi run run-etl ``` -**Step 4: Monitor** Access the Prefect UI at +**Step 5: Monitor** Access the Prefect UI at [http://localhost:4200](http://localhost:4200). --- @@ -52,21 +59,23 @@ pixi run run-etl ### How to Add a New ETL Flow **Step 1: Create the Task Files** Create the three Python files for your -extract, transform, and load logic in the appropriate subdirectories under -`src/ca_biositing/pipeline/ca_biositing/pipeline/etl/`. Decorate each function -with `@task`. +extract, transform, and load logic under +`src/ca_biositing/pipeline/ca_biositing/pipeline/etl/`. Extract tasks go +directly in `extract/`; transform and load tasks go in appropriately named +subdirectories (e.g., `transform/products/`, `load/products/`). Decorate each +function with `@task`. 
**Step 2: Create the Pipeline Flow** Create a new file in `src/ca_biositing/pipeline/ca_biositing/pipeline/flows/` to define the flow. ```python from prefect import flow -from ca_biositing.pipeline.etl.extract.samples.new_type import extract -from ca_biositing.pipeline.etl.transform.samples.new_type import transform -from ca_biositing.pipeline.etl.load.samples.new_type import load +from ca_biositing.pipeline.etl.extract.my_source import extract +from ca_biositing.pipeline.etl.transform.products.my_product import transform +from ca_biositing.pipeline.etl.load.products.my_product import load @flow -def new_type_flow(): +def my_product_flow(): raw_data = extract() transformed_data = transform(raw_data) load(transformed_data) diff --git a/mkdocs.yml b/mkdocs.yml index c96fcb90..03db49b3 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,5 +1,5 @@ site_name: CA-BioSiting -repo_url: https://github.com/uw-ssec/ca-biositing +repo_url: https://github.com/sustainability-software-lab/ca-biositing theme: name: material @@ -51,11 +51,11 @@ nav: - Notebook Setup: notebook_setup.md - Pipeline: - Overview: pipeline/README.md + - GCP Setup: pipeline/GCP_SETUP.md - ETL Workflow: pipeline/ETL_WORKFLOW.md - Alembic Workflow: pipeline/ALEMBIC_WORKFLOW.md - Docker Workflow: pipeline/DOCKER_WORKFLOW.md - Prefect Workflow: pipeline/PREFECT_WORKFLOW.md - - GCP Setup: pipeline/GCP_SETUP.md - USDA ETL Guide: pipeline/USDA/USDA_ETL_GUIDE.md - Datamodels: - Overview: datamodels/README.md @@ -71,4 +71,3 @@ nav: - Deployment: deployment/README.md - Contributing: CONTRIBUTING.md - Code of Conduct: CODE_OF_CONDUCT.md - - ERD View: ERD_VIEW.md diff --git a/src/ca_biositing/datamodels/README.md b/src/ca_biositing/datamodels/README.md index b5480e8d..532114b5 100644 --- a/src/ca_biositing/datamodels/README.md +++ b/src/ca_biositing/datamodels/README.md @@ -9,11 +9,12 @@ etc.). 
The `ca_biositing.datamodels` package provides: -- **Hand-Written SQLModel Classes**: 91 models organized across 15 domain +- **Hand-Written SQLModel Classes**: Models organized across domain subdirectories, combining SQLAlchemy ORM and Pydantic validation in a single class hierarchy. -- **Materialized Views**: 7 analytical views defined as SQLAlchemy Core - `select()` expressions, managed via Alembic migrations. +- **Materialized Views**: Analytical views defined as SQLAlchemy Core `select()` + expressions (plus one SQL-based aggregate view), managed via Alembic + migrations. - **Database Configuration**: SQLModel-based engine and session management with Docker-aware URL adjustment. - **Model Configuration**: Shared configuration for model behavior using @@ -79,12 +80,13 @@ src/ca_biositing/datamodels/ │ ├── __init__.py # Package initialization and version │ ├── config.py # Model configuration (Pydantic Settings) │ ├── database.py # SQLModel engine and session management -│ ├── views.py # Materialized view definitions (7 views) +│ ├── views.py # Materialized view definitions │ ├── models/ # Hand-written SQLModel classes -│ │ ├── __init__.py # Central re-export of all 91 models +│ │ ├── __init__.py # Central re-export of models │ │ ├── base.py # Base classes (BaseEntity, LookupBase, etc.)
│ │ ├── aim1_records/ # Aim 1 analytical records │ │ ├── aim2_records/ # Aim 2 processing records +│ │ ├── auth/ # API users and authentication models │ │ ├── core/ # ETL lineage and run tracking │ │ ├── data_sources_metadata/ # Data source and dataset metadata │ │ ├── experiment_equipment/ # Experiments and equipment @@ -93,7 +95,6 @@ src/ca_biositing/datamodels/ │ │ ├── general_analysis/ # Observations and analysis types │ │ ├── infrastructure/ # Infrastructure facility records │ │ ├── methods_parameters_units/ # Methods, parameters, units -│ │ ├── misc/ # Additional infrastructure models │ │ ├── people/ # Contacts and providers │ │ ├── places/ # Location and address models │ │ ├── resource_information/ # Resources, availability, strains @@ -102,8 +103,6 @@ src/ca_biositing/datamodels/ ├── tests/ │ ├── __init__.py │ ├── conftest.py # Pytest fixtures and configuration -│ ├── test_biomass.py # Tests for biomass models -│ ├── test_geographic_locations.py # Tests for location models │ ├── test_package.py # Tests for package metadata │ └── README.md # Test documentation ├── LICENSE # BSD License @@ -208,7 +207,7 @@ pixi run pytest src/ca_biositing/datamodels -v ### Run specific test files ```bash -pixi run pytest src/ca_biositing/datamodels/tests/test_biomass.py -v +pixi run pytest src/ca_biositing/datamodels/tests/test_package.py -v ``` ### Run with coverage @@ -221,18 +220,17 @@ See `tests/README.md` for detailed information about the test suite. ## Model Categories -The models are organized into 15 domain subdirectories under `models/`: +The models are organized into domain subdirectories under `models/`: ### Core and Infrastructure - **`base.py`**: Base classes shared across all models (`BaseEntity`, `LookupBase`, `Aim1RecordBase`, `Aim2RecordBase`).
+- **`auth/`**: API authentication models (`ApiUser`). - **`core/`**: ETL run tracking and lineage (`EtlRun`, `EntityLineage`, `LineageGroup`). - **`infrastructure/`**: Infrastructure facility records (biodiesel plants, landfills, ethanol biorefineries, etc.). -- **`misc/`**: Additional infrastructure models (MSW digesters, SAF plants, - wastewater treatment). - **`places/`**: Location and address models (`Place`, `LocationAddress`, `LocationResolution`). - **`people/`**: Contact and provider information (`Contact`, `Provider`). @@ -312,7 +310,7 @@ pixi run pre-commit run --files src/ca_biositing/datamodels/**/* - **Version**: 0.1.0 - **Python**: >= 3.12 - **License**: BSD License -- **Repository**: +- **Repository**: ## Contributing diff --git a/src/ca_biositing/pipeline/README.md b/src/ca_biositing/pipeline/README.md index 56b8bc31..444ac111 100644 --- a/src/ca_biositing/pipeline/README.md +++ b/src/ca_biositing/pipeline/README.md @@ -326,7 +326,7 @@ following the project conventions. - **Version**: 0.1.0 - **Python**: >= 3.12 - **License**: BSD License -- **Repository**: +- **Repository**: ## Contributing From ca5ea30a8b2300ced7305945cdcf44decb0513a5 Mon Sep 17 00:00:00 2001 From: mglbleta Date: Tue, 31 Mar 2026 13:18:53 -0700 Subject: [PATCH 4/4] Adding back an improved overview section to architecture.md The previous overview was redundant with the documentation overview page and contained misleading language. This new version keeps the benefits of an overview paragraph while removing the misleading language.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --- docs/architecture.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/architecture.md b/docs/architecture.md index 95c5bafb..cd802e57 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,5 +1,9 @@ # CA-Biositing Architecture Documentation +## Overview + +The CA-Biositing system ingests agricultural and geospatial data from multiple external sources to support biomass siting analysis and related decision-making workflows. This architecture document describes how data flows through ETL pipelines, is validated and stored in relational and geospatial databases, and is orchestrated using workflow tooling. The diagram below provides a high-level view of the core services, data stores, and integrations that make up the platform. + ## System Architecture Diagram ```mermaid