diff --git a/CHANGELOG.md b/CHANGELOG.md index 55946d6..46f34eb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +## [0.6.0] - 2025-12-17 + +### ⚠️ BREAKING CHANGES + +- **Event subscription API:** `manager.on()` now returns a `Subscription` handle with `unsubscribe()`. The previous `manager.off()` API has been removed. Event names are now a typed `DownloadEventType` `StrEnum`. +- **HTTP client dependency:** `DownloadManager`, `WorkerPool`, and `DownloadWorker` now depend on a `BaseHttpClient` abstraction (default `AiohttpClient`) instead of raw `aiohttp.ClientSession`. Custom clients must implement `BaseHttpClient`. + +### Added + +- **Selective cancellation:** `DownloadManager.cancel(download_id) -> CancelResult` with cooperative cancellation for queued downloads and task-level cancellation for in-progress downloads. `DownloadCancelledEvent` now carries `cancelled_from` to distinguish queued vs in-progress cancellations. +- **Cancellation observability:** `DownloadStats` tracks `cancelled` count for manager.stats. +- **File exists strategy policy:** `DestinationResolver` + `FileExistsPolicy` centralise file-exists handling with `default_file_exists_strategy`. +- **Typed events:** `DownloadEventType` enum for event names and `Subscription` handle returned from `manager.on()`. +- **HTTP client abstraction:** `BaseHttpClient` ABC and `AiohttpClient` implementation with SSL factory helpers; supports context-managed and manual lifecycle. +- **Benchmarks:** Added minimal `benchmarks/` suite for future extension, with HTTP fixture server, and pytest-benchmark dependency for throughput profiling. + +### Changed + +- **Worker wiring:** Workers create and own `DestinationResolver`, simplifying pool wiring; skips are emitted when resolver returns `None`. +- **Exception hierarchy:** New `RheoError` base with `InfrastructureError` → `HttpClientError` → `ClientNotInitialisedError`; `DownloadManagerError` now derives from `RheoError`. +- **CI:** Poetry 2.x install flags updated (`--all-groups`), reordered cache steps, and added version diagnostics. +- **Docs & examples:** Updated to show `DownloadEventType`, subscription handles, and selective cancellation usage. + +### Removed + +- `manager.off()` (superseded by `Subscription.unsubscribe()`). + ## [0.5.0] - 2025-12-09 ### ⚠️ BREAKING CHANGES diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 9690ef8..3d73a16 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,6 +1,6 @@ # Contributing to Rheo -Thanks for your interest in contributing! We care about code quality and clear communication, but we're not fussy about how you get there. Make the codebase better, and we'll be very happy and grateful. +Thanks for your interest in contributing! We care about code quality and clear communication, but we're not fussy about how you get there. Make the codebase better, and we'll be very grateful. ## Development Setup @@ -90,7 +90,7 @@ Don't worry about memorising everything though, reviews will guide you. ## Commit Messages -Good commit messages are appreciated but not a blocker at all. If you want to follow conventional commits (`feat:`, `fix:`, etc.), great. If not, just be clear about what changed. +Good commit messages are appreciated but not a blocker at all. Good enough: @@ -103,7 +103,7 @@ Update README examples ## Running Examples ```bash -make examples # Verify examples still work +poetry run examples/{example_file}.py ``` ## Getting Help diff --git a/README.md b/README.md index 6207e92..3ac6798 100644 --- a/README.md +++ b/README.md @@ -50,7 +50,8 @@ asyncio.run(main()) - Retry logic with exponential backoff - Real-time speed & ETA tracking - File exists handling (skip, overwrite, or error) -- Event-driven architecture with `manager.on()`/`off()` +- Event-driven architecture with typed `DownloadEventType` and `Subscription` handles (`manager.on()` returns a handle with `unsubscribe()`) +- HTTP client abstraction (`BaseHttpClient`, default `AiohttpClient`) - CLI tool (`rheo download`) - Full type hints diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 4818a0f..5397898 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -87,7 +87,7 @@ Key pieces: - Entry point for the library - Orchestrates high-level download operations -- Initialises HTTP client +- Owns or accepts an injected HTTP client via `BaseHttpClient` (default `AiohttpClient`); only calls `open()`/`close()` on clients it owns - Creates and owns all event wiring (queue and worker events to tracker) - Creates a single shared `EventEmitter` and exposes it as an event facade via `on()` (returns `Subscription` handle) - Delegates worker lifecycle to `WorkerPool` @@ -104,6 +104,7 @@ Key pieces: - Re-queues unstarted downloads during shutdown - Maintains task lifecycle and cleanup - Supports selective cancellation of individual downloads (queued or in-progress) +- Accepts `BaseHttpClient` for workers (manager passes the shared client) **DownloadWorker**: @@ -253,7 +254,7 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir ### Infrastructure Layer -**What it does**: Cross-cutting concerns like logging. +**What it does**: Cross-cutting concerns like logging and HTTP client abstraction. **Logging**: @@ -262,7 +263,14 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir - Injected as dependency (testable) - Each component gets its own logger -**Why**: Centralised, consistent logging. Easy to test without output noise. +**HTTP Client**: + +- `BaseHttpClient` ABC defines open/close/closed/get contract +- Default implementation: `AiohttpClient` with SSL factory helpers +- Manager only opens/closes clients it owns; injected clients respected +- Shared across pool and workers for connection reuse + +**Why**: Centralised, consistent logging and a swappable HTTP client layer for testability and future client implementations. ### CLI Layer @@ -372,7 +380,7 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir ```text 1. Worker completes file download successfully 2. If FileConfig has hash_config: - a. Worker emits worker.validation_started event + a. Worker emits `download.validating` event b. Worker calls validator.validate(file_path, hash_config) c. Validator calculates hash in thread pool (via asyncio.to_thread): - Opens file in binary mode @@ -381,16 +389,15 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir - Returns hexadecimal hash d. Validator compares hashes using hmac.compare_digest (constant-time) e. If hashes match: - - Worker emits worker.validation_completed event - Worker emits download.completed event f. If hashes don't match: - - Worker emits worker.validation_failed event + - Worker emits download.failed event with validation context - Worker deletes corrupted file - Worker raises HashMismatchError (not retried by default) 3. Tracker observes validation events, updates DownloadInfo.validation ``` -Note: Validation events still use `worker.*` namespace and will be renamed to `download.*` in a future release. +Note: Validation events use the `download.*` namespace. ### Cancellation Flow diff --git a/docs/README.md b/docs/README.md index 4d0673f..821c00b 100644 --- a/docs/README.md +++ b/docs/README.md @@ -52,8 +52,9 @@ That's it. The manager handles worker pools, state tracking, and cleanup automat - **Speed & ETA tracking**: Real-time download speed with moving averages and estimated completion time - **Graceful shutdown**: Stop downloads cleanly or cancel immediately - **File exists handling**: Skip, overwrite, or error when destination exists (configurable per-file or globally) -- **Event system**: React to download lifecycle events (queued, started, progress, speed, completed, failed, skipped, cancelled, retry, validation) +- **Event system**: Typed events (`DownloadEventType`) with `Subscription` handles from `manager.on()` (queued, started, progress, completed, failed, skipped, cancelled, retry, validation) - **Progress tracking**: Track bytes downloaded, completion status, errors, validation results, destination paths, and final average speeds +- **HTTP client abstraction**: `BaseHttpClient` with default `AiohttpClient` implementation - **Async/await**: Built on asyncio for efficient I/O - **Type hints**: Full type annotations throughout - **Dependency injection**: Easy to test and customise @@ -360,31 +361,32 @@ files = [ ### Retry on Transient Errors -Automatic retry with exponential backoff - just provide a config: +Automatic retry with exponential backoff for transient errors (500, 503, timeouts, etc.). + +> **Note**: Retry configuration is not yet available at the `DownloadManager` level. Currently, enabling retries requires direct worker instantiation as shown below. Manager-level retry configuration is planned for a future release. ```python from rheo.domain.retry import RetryConfig from rheo.downloads import RetryHandler, DownloadWorker -# Simple: just specify retry config (sensible defaults for everything else) +# Configure retry behaviour config = RetryConfig(max_retries=3, base_delay=1.0, max_delay=60.0) retry_handler = RetryHandler(config) -# Create worker with retry support +# Create worker with retry support (bypasses manager) async with aiohttp.ClientSession() as session: worker = DownloadWorker( client=session, retry_handler=retry_handler, ) - # Worker will automatically retry transient errors (500, 503, timeouts, etc.) - await worker.download(url, destination_path) + await worker.download(url, destination_path, download_id="my-download") ``` -**Advanced**: Customize retry policy for specific status codes: +**Advanced**: Customise retry policy for specific status codes: ```python from rheo.domain.retry import RetryConfig, RetryPolicy -from rheo.downloads import RetryHandler, ErrorCategoriser +from rheo.downloads import RetryHandler # Custom policy - treat 404 as transient (normally permanent) policy = RetryPolicy( diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index a868fee..e1ddab0 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -11,7 +11,7 @@ The library and CLI are working for real use. - Concurrent downloads with worker pool - Priority queue system - Selective cancellation (cancel individual downloads by ID) -- Event-driven architecture with `manager.on()`/`off()` subscription +- Event-driven architecture with typed events and `Subscription` handles from `manager.on()` - Download tracking and state management - Comprehensive error handling - Retry logic with exponential backoff diff --git a/pyproject.toml b/pyproject.toml index b2133b8..78be67a 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "rheopy" -version = "0.5.0" +version = "0.6.0" description = "Concurrent HTTP download orchestration with async I/O" authors = [ {name = "plutopulp", email = "plutopulped@gmail.com"}