Feature/copernicus pr1 core #6

Merged
rfievet merged 3 commits into main from feature/copernicus-pr1-core on Feb 18, 2026

rfievet commented Jan 19, 2026

Summary

This PR adds the foundational Copernicus Data Space Ecosystem client for fetching Sentinel-2 satellite imagery. This is the first of 3 PRs to add complete Copernicus support to Galileo as an alternative data source to Google Earth Engine.

What's Included

Core Infrastructure

  • OAuth2 Authentication: Client credentials flow with automatic token refresh and 5-minute expiry buffer
  • Caching System: Deterministic cache keys for search results and downloaded files with automatic reuse
  • Input Validation: Comprehensive validation for bounding boxes, dates, resolution, and cloud cover parameters
  • Robust Downloads: HTTP streaming with retry logic, exponential backoff, and resume capability using Range requests
  • Session Management: Connection pooling via requests.Session for improved performance
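The 5-minute expiry buffer can be illustrated with a minimal sketch. This is not the code from client.py; `TokenCache`, `fetch_token`, and the injectable `clock` are hypothetical names used only to show the idea of refreshing a token before it actually expires.

```python
import time
from typing import Callable, Optional, Tuple

EXPIRY_BUFFER_S = 300  # 5-minute buffer, as described in this PR


class TokenCache:
    """Caches an access token and refreshes it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        # fetch_token is a hypothetical callable returning
        # (access_token, expires_in_seconds), e.g. from the token endpoint.
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh if there is no token yet, or if it expires within the buffer.
        if self._token is None or now >= self._expires_at - EXPIRY_BUFFER_S:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

Injecting the clock keeps the refresh logic testable without waiting for real time to pass, which fits the mocked OAuth tests described in this PR.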

S2 Functionality

  • Product Search: OData API queries with spatial/temporal/cloud cover filtering
  • Product Download: Complete S2 product downloads (500MB-1GB per product) with progress tracking
  • Metadata Creation: JSON metadata files for products when download is disabled
  • Interactive Mode: User confirmation prompts before large downloads
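For readers unfamiliar with OData filtering, here is a rough sketch of how a spatial/temporal/cloud-cover filter string might be assembled. The function name `build_s2_filter` is hypothetical, and the filter syntax follows the Copernicus Data Space catalogue documentation as best understood here; the query builder in s2.py may differ.

```python
from typing import Sequence


def build_s2_filter(
    bbox: Sequence[float],
    start_date: str,
    end_date: str,
    max_cloud_cover: float,
    product_type: str,
) -> str:
    """Build an OData $filter expression for Sentinel-2 products.

    Assumes bbox is [min_lon, min_lat, max_lon, max_lat] and dates are
    YYYY-MM-DD strings.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    # WKT polygon ring in "lon lat" order, closed by repeating the first point.
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]
    ring = ",".join(f"{lon} {lat}" for lon, lat in corners)
    clauses = [
        "Collection/Name eq 'SENTINEL-2'",
        f"OData.CSC.Intersects(area=geography'SRID=4326;POLYGON(({ring}))')",
        f"ContentDate/Start ge {start_date}T00:00:00.000Z",
        f"ContentDate/Start le {end_date}T23:59:59.999Z",
        "Attributes/OData.CSC.DoubleAttribute/any(att:att/Name eq 'cloudCover' "
        f"and att/OData.CSC.DoubleAttribute/Value le {max_cloud_cover:.2f})",
        "Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' "
        f"and att/OData.CSC.StringAttribute/Value eq '{product_type}')",
    ]
    return " and ".join(clauses)
```

The resulting string would be passed as the `$filter` query parameter of the catalogue's Products endpoint.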

Supporting Files

  • .env.example - Template for Copernicus credentials
  • tests/test_copernicus_s2_basic.py - 18 unit tests with mocked API calls
  • tests/manual_test_s2_download.py - Manual integration test script

Architecture

src/data/copernicus/
├── client.py          # Main client with OAuth and S2 support
├── s2.py              # S2-specific search and download logic
├── utils.py           # Validation and helper functions
├── download_utils.py  # Robust download with retry/resume
└── __init__.py        # Module exports

The client uses a modular design where:

  • CopernicusClient handles authentication and coordination
  • s2.py contains all S2-specific logic
  • download_utils.py provides reusable download functionality
  • utils.py contains shared validation and conversion functions
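As an illustration of what the reusable download functionality involves, the two building blocks of retry-with-resume can be sketched as small pure helpers. `backoff_delays` and `resume_headers` are hypothetical names, not functions taken from download_utils.py, and the default delays are assumptions.

```python
import os
from typing import Dict, Iterator


def backoff_delays(
    max_retries: int, base: float = 1.0, cap: float = 60.0
) -> Iterator[float]:
    """Yield exponentially growing wait times, capped at `cap` seconds."""
    for attempt in range(max_retries):
        yield min(cap, base * 2 ** attempt)


def resume_headers(partial_path: str) -> Dict[str, str]:
    """Return a Range header that continues from the bytes already on disk."""
    if os.path.exists(partial_path):
        offset = os.path.getsize(partial_path)
        if offset > 0:
            # Ask the server for everything from `offset` to the end.
            return {"Range": f"bytes={offset}-"}
    # Nothing downloaded yet: request the whole file.
    return {}
```

A download loop would sleep for each delay between failed attempts and send `resume_headers(path)` with every retry. A resumed response should come back as 206 Partial Content; a plain 200 means the server ignored the range, so the partial file has to be overwritten instead of appended to.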

Usage Example

from src.data.copernicus import CopernicusClient

# Initialize client (reads credentials from .env)
client = CopernicusClient()

# Fetch Sentinel-2 data
s2_files = client.fetch_s2(
    bbox=[6.15, 49.11, 6.16, 49.12],
    start_date="2024-01-01",
    end_date="2024-01-31",
    max_cloud_cover=20.0,
    product_type="S2MSI2A",  # Level-2A (atmospherically corrected)
    download_data=True,
    max_products=3
)

print(f"Downloaded {len(s2_files)} S2 products")

Testing

Unit Tests (18/18 passing)

  • Input validation (bbox, dates, resolution, cloud cover)
  • Cache key generation and determinism
  • OAuth token flow (mocked)
  • Error handling and edge cases
  • S2 search query construction
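The validation and cache-key tests listed above check properties along these lines. This is a sketch under assumptions: `validate_bbox` and `cache_key` are hypothetical stand-ins for the helpers in utils.py, the bbox order is taken to be [min_lon, min_lat, max_lon, max_lat] (consistent with the usage example), and SHA-256 over sorted JSON is one possible way to get deterministic keys, not necessarily the one used in this PR.

```python
import hashlib
import json
from typing import Any, Sequence


def validate_bbox(bbox: Sequence[float]) -> None:
    """Raise ValueError unless bbox is a valid [min_lon, min_lat, max_lon, max_lat]."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly 4 values")
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError("longitudes must satisfy -180 <= min_lon < max_lon <= 180")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min_lat < max_lat <= 90")


def cache_key(**params: Any) -> str:
    """Return a short key that depends only on the parameter values."""
    # sort_keys makes the serialization independent of argument order.
    payload = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Determinism here means that the same search parameters always map to the same key regardless of the order in which they are passed, so repeated requests can reuse cached results.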

Integration Tests (Verified)

  • ✅ Real OAuth2 authentication with Copernicus API
  • ✅ Product search returning actual S2 products
  • ✅ Complete download of 733MB S2 product
  • ✅ ZIP file integrity verified (82 files including spectral bands)
  • ✅ Caching working correctly (instant subsequent requests)
  • ✅ Download speed: 20.6 MB/s with connection pooling
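A ZIP integrity check like the one reported above could be done with the standard library. This is a sketch, not the script used for the manual test; `verify_zip` is a hypothetical helper.

```python
import zipfile
from typing import IO, Union


def verify_zip(source: Union[str, IO[bytes]]) -> int:
    """Check every member's CRC and return the number of archive entries."""
    with zipfile.ZipFile(source) as zf:
        # testzip() reads all members and returns the first corrupt name, or None.
        bad_member = zf.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        return len(zf.namelist())
```

Note that `namelist()` counts directory entries as well as files, so the number can differ from a count of files alone.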

Test Coverage

# Run unit tests
uv run pytest tests/test_copernicus_s2_basic.py -v

# Run manual integration test (requires credentials)
python tests/manual_test_s2_download.py           # Search only
python tests/manual_test_s2_download.py --download # With download

Dependencies Added

  • python-dotenv>=1.0.0 - For loading .env configuration files
  • requests>=2.31.0 - For HTTP requests (previously only its type stubs were listed)

What's NOT in This PR

As specified in the PR split plan, this PR intentionally excludes:

Breaking Changes

None. This is a new feature that doesn't affect existing functionality.

Credentials Setup

Users need free Copernicus credentials:

  1. Register at https://dataspace.copernicus.eu/
  2. Create OAuth client credentials
  3. Add to .env file:
COPERNICUS_CLIENT_ID=your_client_id
COPERNICUS_CLIENT_SECRET=your_client_secret
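A client can fail early with a clear message when these variables are missing. The sketch below assumes the .env file has already been loaded into the environment (for example with python-dotenv's `load_dotenv()`); `load_credentials` is a hypothetical helper, not part of the API added in this PR.

```python
import os
from typing import Mapping, Tuple

REQUIRED_VARS = ("COPERNICUS_CLIENT_ID", "COPERNICUS_CLIENT_SECRET")


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (client_id, client_secret) or raise if either is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing Copernicus credentials: " + ", ".join(missing)
        )
    return env["COPERNICUS_CLIENT_ID"], env["COPERNICUS_CLIENT_SECRET"]
```

Accepting the environment as a parameter makes the check easy to exercise in unit tests without touching real credentials.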

Next Steps

Related Issues

Part of the Copernicus Data Space Ecosystem integration to provide an alternative to Google Earth Engine for data fetching.

- Add OAuth2 authentication with token refresh
- Add S2 product search and download
- Add caching for search results and downloads
- Add robust download with retry and resume
- Add input validation and utilities
- Add basic S2 tests
- Use client.session.get() for connection pooling
- Improves download speed (20.6 MB/s vs 6.6 MB/s)
- Maintains session consistency across all requests
- Add python-dotenv>=1.0.0 for .env file loading
- Add requests>=2.31.0 for HTTP requests
- Fixes CI/CD build failure: ModuleNotFoundError: No module named 'dotenv'
@rfievet rfievet marked this pull request as ready for review January 19, 2026 14:53
@rfievet rfievet merged commit 57676fa into main Feb 18, 2026
1 check passed