
Fix partitioning regression #49

Merged
victorlujanearthranger merged 3 commits into main from fix-partitioning-regression
Mar 30, 2026

@chrisdoehring
Contributor

This pull request modernizes the project by migrating from Poetry to PEP 621 (pyproject.toml) metadata and Hatch for builds, introduces a GitHub Actions CI workflow, and makes the core connector base class easier to test by allowing dependency injection. It also adds a comprehensive CLAUDE.md guide for contributors using Claude Code.

Build system and dependency management modernization:

  • Migrated from Poetry to PEP 621-compliant metadata in pyproject.toml, using Hatch as the build backend and updating dependency and dev dependency declarations accordingly. This simplifies dependency management and aligns with modern Python packaging standards.

Continuous integration improvements:

  • Added a GitHub Actions workflow (.github/workflows/ci.yml) that installs dependencies with the uv package manager and runs the test suite on CI.

Developer experience and documentation:

  • Added CLAUDE.md, a detailed guide for using Claude Code with this repository, covering project architecture, build/dev instructions, and connector patterns.

Codebase maintainability and testability:

  • Updated AbstractConnector in cdip_connector/core/connector_base.py to accept an optional PortalApi instance, enabling easier dependency injection for testing and subclassing.
  • Ensured that when creating concurrent connector tasks, the same PortalApi instance is reused, improving consistency and testability.
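
A minimal sketch of the dependency-injection pattern described above, for readers who want to see its shape. The `PortalApi` and `AbstractConnector` classes here are simplified stand-ins, not the real implementations in cdip_connector, and `FakePortal` is a hypothetical test double.

```python
from typing import Optional


class PortalApi:
    """Stand-in for the real PortalApi client (no network calls here)."""


class AbstractConnector:
    def __init__(self, portal: Optional[PortalApi] = None):
        # Use the injected client if one was passed, otherwise build a default.
        self.portal = portal if portal is not None else PortalApi()


class FakePortal(PortalApi):
    """Hypothetical test double that a unit test could inject."""


fake = FakePortal()
connector = AbstractConnector(portal=fake)   # test path: injected client
default_connector = AbstractConnector()      # production path: default client
```

With this shape, a test can hand the connector a fake portal and assert on it, while existing callers that pass no argument keep their current behavior.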

Minor improvements:

  • Added missing import for Optional in cdip_connector/core/connector_base.py.


Copilot AI left a comment

Pull request overview

This PR modernizes the repository’s Python packaging and CI setup (moving from Poetry to PEP 621 + Hatchling and introducing uv-based CI), and updates the connector base class to support PortalApi dependency injection to improve testability.

Changes:

  • Migrate packaging metadata to PEP 621 in pyproject.toml and switch build backend to Hatchling; remove poetry.lock.
  • Add GitHub Actions CI workflow to install dependencies via uv and run pytest.
  • Update AbstractConnector to accept an optional PortalApi instance and reuse it across concurrent connector tasks.

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pyproject.toml Replaces Poetry config with PEP 621 project metadata, dependency declarations, and Hatchling build backend.
poetry.lock Removed Poetry lockfile as part of packaging tool migration.
cdip_connector/core/connector_base.py Adds PortalApi dependency injection and reuses the same portal instance when spawning concurrent tasks.
CLAUDE.md Adds contributor guidance and architecture notes for Claude Code usage.
.github/workflows/ci.yml Introduces a uv-based CI job that installs deps and runs tests.

Comment on lines 66 to 70
         for idx in range(0, len(integrations), self.concurrency):
             tasks = [
-                asyncio.ensure_future(self.__class__().extract_load(integration))
+                asyncio.ensure_future(self.__class__(portal=self.portal).extract_load(integration))
                 for integration in integrations[idx: idx + self.concurrency]
             ]

Copilot AI Feb 10, 2026

main() constructs per-integration tasks with self.__class__(portal=self.portal). This will break any connector subclass that overrides __init__ without accepting a portal kwarg (a common pattern), making this a potentially breaking API change for downstream connectors. Consider instantiating without kwargs and then assigning portal, or provide a dedicated factory/clone method in the base class that injects the shared PortalApi without requiring subclass constructor changes.
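
One way the factory/clone suggestion above could look, as a rough sketch rather than the library's actual code. `_clone_for_task` is a hypothetical helper name, and `PortalApi`, `AbstractConnector`, and `LegacyConnector` are simplified stand-ins. The sketch assumes subclasses can still be constructed with no arguments, which the pre-change call `self.__class__()` already relied on.

```python
import asyncio
from typing import List, Optional


class PortalApi:
    """Stand-in for the real PortalApi client."""


class AbstractConnector:
    concurrency = 2

    def __init__(self, portal: Optional[PortalApi] = None):
        self.portal = portal if portal is not None else PortalApi()

    def _clone_for_task(self) -> "AbstractConnector":
        # Hypothetical helper: construct without kwargs so subclasses that
        # override __init__ keep working, then share this instance's portal.
        worker = self.__class__()
        worker.portal = self.portal
        return worker

    async def extract_load(self, integration: str) -> int:
        # Placeholder work: report which portal object this task used.
        return id(self.portal)

    async def main(self, integrations: List[str]) -> List[int]:
        results: List[int] = []
        for idx in range(0, len(integrations), self.concurrency):
            batch = integrations[idx: idx + self.concurrency]
            results += await asyncio.gather(
                *(self._clone_for_task().extract_load(i) for i in batch)
            )
        return results


class LegacyConnector(AbstractConnector):
    def __init__(self):  # downstream-style override with no portal kwarg
        super().__init__()


def run_demo():
    connector = LegacyConnector()
    seen = asyncio.run(connector.main(["a", "b", "c"]))
    return connector, seen


if __name__ == "__main__":
    connector, seen = run_demo()
    print(all(portal_id == id(connector.portal) for portal_id in seen))
```

The trade-off in this sketch is that each worker briefly builds a default `PortalApi` before it is replaced, which may or may not be acceptable depending on how costly the real client is to construct.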

Copilot uses AI. Check for mistakes.
Comment on lines +1 to 4
+ [project]
  name = "cdip_connector"
- version = "1.5.7"
+ version = "1.7.0"
  description = "SMART Integrate Connector Library"

Copilot AI Feb 10, 2026

The PR title (“Fix partitioning regression”) doesn’t match the actual changes here (packaging migration to Hatch/PEP 621, new CI workflow, contributor docs, and connector DI). Please align the PR title (or description) with the scope of the changes so release notes/reviews aren’t misleading.

Comment on lines +20 to +22
- run: uv sync --group dev

- run: uv run pytest

Copilot AI Feb 10, 2026

CI runs uv sync --group dev without enforcing that uv.lock is up-to-date (e.g., via --locked/--frozen). That means dependency drift can slip in (pyproject changes not reflected in the lockfile) while CI still passes. Consider making CI fail when the lockfile is missing/out-of-sync.

Comment on lines +8 to +14
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: astral-sh/setup-uv@v5


Copilot AI Feb 10, 2026

The workflow doesn’t pin a Python version. Since ubuntu-latest can change its default Python over time, CI results may unexpectedly change or fail even if the project hasn’t. Consider explicitly selecting a Python version (or a test matrix) consistent with requires-python.

- **`cdip_settings.py`** — All configuration via environment variables using `environs`. Includes Keycloak auth settings, API endpoints, GCP settings, and job partitioning (JOB_COMPLETION_INDEX/COUNT). Can load from a custom env file via `CDIP_SDK_ENVFILE`.
- **`schemas/__init__.py`** — Re-exports all schemas from `gundi_core.schemas` for backward compatibility.
- **`routing.py`** — Pub/Sub topic name definitions for the observation processing pipeline.
- **`cloudstorage.py`** — Abstract `CloudStorage` with `GoogleCloudStorage` and `LocalStorage` implementations for camera trap image handling.

Copilot AI Feb 10, 2026

CLAUDE.md references a GoogleCloudStorage implementation, but the current class in cdip_connector/core/cloudstorage.py is named GoogleCouldStorage (note the missing “l”). This mismatch can mislead contributors; either update the doc to match the actual symbol name or rename the class for consistency.

Suggested change
- - **`cloudstorage.py`** — Abstract `CloudStorage` with `GoogleCloudStorage` and `LocalStorage` implementations for camera trap image handling.
+ - **`cloudstorage.py`** — Abstract `CloudStorage` with `GoogleCouldStorage` and `LocalStorage` implementations for camera trap image handling.
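
The other option raised in this comment, renaming the class, could be done without breaking existing imports by keeping the misspelled name as an alias. The following is only a sketch under that assumption: the class bodies are omitted and are not the real contents of cdip_connector/core/cloudstorage.py.

```python
class CloudStorage:
    """Simplified stand-in for the abstract base class."""


class GoogleCloudStorage(CloudStorage):
    """Correctly spelled name (implementation omitted in this sketch)."""


# Backward-compatible alias so code importing the old, misspelled name
# keeps working after the rename.
GoogleCouldStorage = GoogleCloudStorage
```

With an alias like this, CLAUDE.md could document `GoogleCloudStorage` while downstream connectors that import `GoogleCouldStorage` continue to run unchanged.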

"opentelemetry-propagator-gcp>=1.3.0,<2",
"opentelemetry-instrumentation-requests>=0.35b0,<1",
"opentelemetry-exporter-otlp-proto-http>=1.14.0,<2",
"opentelemetry-instrumentation-httpx>=0.35b0,<1",

Copilot AI Feb 10, 2026

cdip_connector imports and uses httpx (e.g., AsyncClient in connector_base.py), but httpx is not declared in [project].dependencies. Relying on it as a transitive dependency (e.g., via gundi-client) can break installs if upstream changes; add an explicit httpx requirement here.

Suggested change
- "opentelemetry-instrumentation-httpx>=0.35b0,<1",
+ "opentelemetry-instrumentation-httpx>=0.35b0,<1",
+ "httpx>=0.23.0,<1",

@victorlujanearthranger merged commit a6e3c28 into main on Mar 30, 2026
2 checks passed
3 participants