27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

+## [0.6.0] - 2025-12-17
+
+### ⚠️ BREAKING CHANGES
+
+- **Event subscription API:** `manager.on()` now returns a `Subscription` handle with `unsubscribe()`. The previous `manager.off()` API has been removed. Event names are now a typed `DownloadEventType` `StrEnum`.
+- **HTTP client dependency:** `DownloadManager`, `WorkerPool`, and `DownloadWorker` now depend on a `BaseHttpClient` abstraction (default `AiohttpClient`) instead of raw `aiohttp.ClientSession`. Custom clients must implement `BaseHttpClient`.
+
+### Added
+
+- **Selective cancellation:** `DownloadManager.cancel(download_id) -> CancelResult` with cooperative cancellation for queued downloads and task-level cancellation for in-progress downloads. `DownloadCancelledEvent` now carries `cancelled_from` to distinguish queued vs in-progress cancellations.
+- **Cancellation observability:** `DownloadStats` tracks `cancelled` count for manager.stats.
+- **File exists strategy policy:** `DestinationResolver` + `FileExistsPolicy` centralise file-exists handling with `default_file_exists_strategy`.
+- **Typed events:** `DownloadEventType` enum for event names and `Subscription` handle returned from `manager.on()`.
+- **HTTP client abstraction:** `BaseHttpClient` ABC and `AiohttpClient` implementation with SSL factory helpers; supports context-managed and manual lifecycle.
+- **Benchmarks:** Added minimal `benchmarks/` suite for future extension, with HTTP fixture server, and pytest-benchmark dependency for throughput profiling.
+
+### Changed
+
+- **Worker wiring:** Workers create and own `DestinationResolver`, simplifying pool wiring; skips are emitted when resolver returns `None`.
+- **Exception hierarchy:** New `RheoError` base with `InfrastructureError` → `HttpClientError` → `ClientNotInitialisedError`; `DownloadManagerError` now derives from `RheoError`.
+- **CI:** Poetry 2.x install flags updated (`--all-groups`), reordered cache steps, and added version diagnostics.
+- **Docs & examples:** Updated to show `DownloadEventType`, subscription handles, and selective cancellation usage.
+
+### Removed
+
+- `manager.off()` (superseded by `Subscription.unsubscribe()`).
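
Reviewer note: for anyone migrating away from `manager.off()`, the handle-based pattern this changelog describes can be sketched roughly as below. This is a self-contained illustration, not Rheo's implementation. `TinyEmitter`, the enum members, and the handler signature are assumptions made for the example; only the names `on()`, `Subscription`, `unsubscribe()`, and `DownloadEventType` come from the changelog.

```python
from enum import Enum
from typing import Any, Callable

Handler = Callable[[Any], None]


class DownloadEventType(str, Enum):
    # Member names and values are illustrative. The changelog describes a
    # StrEnum; (str, Enum) is used so the sketch also runs on Python < 3.11.
    COMPLETED = "download.completed"
    FAILED = "download.failed"


class Subscription:
    """Handle returned by on(); unsubscribe() detaches the handler."""

    def __init__(self, unsubscribe: Callable[[], None]) -> None:
        self._unsubscribe = unsubscribe
        self._active = True

    def unsubscribe(self) -> None:
        # Idempotent: a second call is a no-op.
        if self._active:
            self._active = False
            self._unsubscribe()


class TinyEmitter:
    """Hypothetical stand-in for the manager's event facade."""

    def __init__(self) -> None:
        self._handlers: dict[DownloadEventType, list[Handler]] = {}

    def on(self, event_type: DownloadEventType, handler: Handler) -> Subscription:
        handlers = self._handlers.setdefault(event_type, [])
        handlers.append(handler)
        return Subscription(lambda: handlers.remove(handler))

    def emit(self, event_type: DownloadEventType, payload: Any) -> None:
        # Iterate over a copy so handlers may unsubscribe while being called.
        for handler in list(self._handlers.get(event_type, [])):
            handler(payload)


emitter = TinyEmitter()
seen: list[str] = []
subscription = emitter.on(DownloadEventType.COMPLETED, seen.append)
emitter.emit(DownloadEventType.COMPLETED, "first")
subscription.unsubscribe()  # takes the place of the removed off() call
emitter.emit(DownloadEventType.COMPLETED, "second")  # not delivered
```

Returning a handle means the caller no longer has to keep the original handler reference around just to detach it later, which is presumably part of the motivation for dropping `off()`.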

## [0.5.0] - 2025-12-09

### ⚠️ BREAKING CHANGES
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing to Rheo

-Thanks for your interest in contributing! We care about code quality and clear communication, but we're not fussy about how you get there. Make the codebase better, and we'll be very happy and grateful.
+Thanks for your interest in contributing! We care about code quality and clear communication, but we're not fussy about how you get there. Make the codebase better, and we'll be very grateful.

## Development Setup

@@ -90,7 +90,7 @@ Don't worry about memorising everything though, reviews will guide you.

## Commit Messages

-Good commit messages are appreciated but not a blocker at all. If you want to follow conventional commits (`feat:`, `fix:`, etc.), great. If not, just be clear about what changed.
+Good commit messages are appreciated but not a blocker at all.

Good enough:

@@ -103,7 +103,7 @@ Update README examples
## Running Examples

```bash
-make examples  # Verify examples still work
+poetry run examples/{example_file}.py
```

## Getting Help
3 changes: 2 additions & 1 deletion README.md
@@ -50,7 +50,8 @@ asyncio.run(main())
- Retry logic with exponential backoff
- Real-time speed & ETA tracking
- File exists handling (skip, overwrite, or error)
-- Event-driven architecture with `manager.on()`/`off()`
+- Event-driven architecture with typed `DownloadEventType` and `Subscription` handles (`manager.on()` returns a handle with `unsubscribe()`)
+- HTTP client abstraction (`BaseHttpClient`, default `AiohttpClient`)
- CLI tool (`rheo download`)
- Full type hints

21 changes: 14 additions & 7 deletions docs/ARCHITECTURE.md
@@ -87,7 +87,7 @@ Key pieces:

- Entry point for the library
- Orchestrates high-level download operations
-- Initialises HTTP client
+- Owns or accepts an injected HTTP client via `BaseHttpClient` (default `AiohttpClient`); only calls `open()`/`close()` on clients it owns
- Creates and owns all event wiring (queue and worker events to tracker)
- Creates a single shared `EventEmitter` and exposes it as an event facade via `on()` (returns `Subscription` handle)
- Delegates worker lifecycle to `WorkerPool`
@@ -104,6 +104,7 @@ Key pieces:
- Re-queues unstarted downloads during shutdown
- Maintains task lifecycle and cleanup
- Supports selective cancellation of individual downloads (queued or in-progress)
+- Accepts `BaseHttpClient` for workers (manager passes the shared client)
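
Reviewer note: the two cancellation modes listed for the pool (queued vs in-progress) could look roughly like the sketch below. It is a toy model under stated assumptions: `TinyPool`, its attributes, `should_start()`, and the `CancelResult` member names are hypothetical. Only `CancelResult` as a return type and the cooperative vs task-level distinction come from this PR.

```python
import asyncio
from enum import Enum


class CancelResult(str, Enum):
    # Member names are illustrative; the real enum is not shown in this diff.
    QUEUED = "queued"
    IN_PROGRESS = "in_progress"
    NOT_FOUND = "not_found"


class TinyPool:
    """Hypothetical pool that tracks queued ids and active tasks."""

    def __init__(self) -> None:
        self.queued: set[str] = set()
        self.cancelled: set[str] = set()
        self.active: dict[str, asyncio.Task] = {}

    def cancel(self, download_id: str) -> CancelResult:
        task = self.active.get(download_id)
        if task is not None:
            # Task-level: CancelledError is raised inside the running download.
            task.cancel()
            return CancelResult.IN_PROGRESS
        if download_id in self.queued:
            # Cooperative: only record the id; a worker checks it at dequeue.
            self.cancelled.add(download_id)
            return CancelResult.QUEUED
        return CancelResult.NOT_FOUND

    def should_start(self, download_id: str) -> bool:
        # Called by a worker after taking an item off the queue.
        self.queued.discard(download_id)
        return download_id not in self.cancelled


async def demo() -> tuple[list[CancelResult], bool]:
    pool = TinyPool()
    pool.queued.add("a")
    pool.active["b"] = asyncio.create_task(asyncio.sleep(60))
    results = [pool.cancel("a"), pool.cancel("b"), pool.cancel("missing")]
    # Let the cancelled task finish; CancelledError is returned, not raised.
    await asyncio.gather(*pool.active.values(), return_exceptions=True)
    return results, pool.should_start("a")


results, started = asyncio.run(demo())
```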

**DownloadWorker**:

@@ -253,7 +254,7 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir

### Infrastructure Layer

-**What it does**: Cross-cutting concerns like logging.
+**What it does**: Cross-cutting concerns like logging and HTTP client abstraction.

**Logging**:

@@ -262,7 +263,14 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir
- Injected as dependency (testable)
- Each component gets its own logger

-**Why**: Centralised, consistent logging. Easy to test without output noise.
+**HTTP Client**:
+
+- `BaseHttpClient` ABC defines open/close/closed/get contract
+- Default implementation: `AiohttpClient` with SSL factory helpers
+- Manager only opens/closes clients it owns; injected clients respected
+- Shared across pool and workers for connection reuse
+
+**Why**: Centralised, consistent logging and a swappable HTTP client layer for testability and future client implementations.
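
Reviewer note: the ownership rule in the HTTP client bullets (the manager opens and closes only clients it created itself) can be illustrated with the sketch below. Everything here is an assumption apart from the `open`/`close`/`closed`/`get` contract named in the diff: the method signatures, `FakeClient`, and `TinyManager` are hypothetical, and no network calls are made.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class BaseHttpClient(ABC):
    """Assumed shape of the contract: open/close/closed/get."""

    @abstractmethod
    async def open(self) -> None: ...

    @abstractmethod
    async def close(self) -> None: ...

    @property
    @abstractmethod
    def closed(self) -> bool: ...

    @abstractmethod
    async def get(self, url: str) -> bytes: ...


class FakeClient(BaseHttpClient):
    """In-memory test double (hypothetical)."""

    def __init__(self) -> None:
        self._closed = True

    async def open(self) -> None:
        self._closed = False

    async def close(self) -> None:
        self._closed = True

    @property
    def closed(self) -> bool:
        return self._closed

    async def get(self, url: str) -> bytes:
        if self._closed:
            raise RuntimeError("client not initialised")
        return f"body of {url}".encode()


class TinyManager:
    """Hypothetical manager that demonstrates the ownership rule only."""

    def __init__(self, client: Optional[BaseHttpClient] = None) -> None:
        self._owns_client = client is None
        self.client = client if client is not None else FakeClient()

    async def __aenter__(self) -> "TinyManager":
        if self._owns_client:
            await self.client.open()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        if self._owns_client:
            await self.client.close()


async def demo() -> tuple[bool, bool]:
    injected = FakeClient()
    await injected.open()
    async with TinyManager(injected) as manager:
        await manager.client.get("https://example.com/file.zip")
    injected_still_open = not injected.closed  # manager left it alone
    await injected.close()

    owned = TinyManager()
    async with owned:
        pass
    return injected_still_open, owned.client.closed


result = asyncio.run(demo())
```

A test double like `FakeClient` is the kind of swap the "testability" rationale points at: the manager and workers can be exercised without a real HTTP session.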

### CLI Layer

@@ -372,7 +380,7 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir
```text
1. Worker completes file download successfully
2. If FileConfig has hash_config:
-   a. Worker emits worker.validation_started event
+   a. Worker emits `download.validating` event
   b. Worker calls validator.validate(file_path, hash_config)
   c. Validator calculates hash in thread pool (via asyncio.to_thread):
      - Opens file in binary mode
@@ -381,16 +389,15 @@ config4 = FileConfig(url="https://example.com/file.zip", destination_subdir="dir
      - Returns hexadecimal hash
   d. Validator compares hashes using hmac.compare_digest (constant-time)
   e. If hashes match:
-      - Worker emits worker.validation_completed event
      - Worker emits download.completed event
   f. If hashes don't match:
-      - Worker emits worker.validation_failed event
+      - Worker emits download.failed event with validation context
      - Worker deletes corrupted file
      - Worker raises HashMismatchError (not retried by default)
3. Tracker observes validation events, updates DownloadInfo.validation
```

-Note: Validation events still use `worker.*` namespace and will be renamed to `download.*` in a future release.
+Note: Validation events use the `download.*` namespace.
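
Reviewer note: the hashing and comparison steps in the flow above (chunked read in a thread, then a constant-time compare) can be sketched with the standard library alone. The function names, the 64 KiB chunk size, and the SHA-256 default are assumptions for illustration; this is not the project's validator, and event emission and file deletion are left out.

```python
import asyncio
import hashlib
import hmac
import tempfile
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed; the real chunk size is not shown in the diff


def calculate_hash(path: Path, algorithm: str = "sha256") -> str:
    """Read the file in binary chunks and return the hexadecimal digest."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


async def validate(path: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    # Hashing is blocking file I/O plus CPU work, so run it off the event loop.
    actual = await asyncio.to_thread(calculate_hash, path, algorithm)
    # compare_digest avoids leaking where the first differing character is.
    return hmac.compare_digest(actual, expected_hex.lower())


async def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "file.bin"
        path.write_bytes(b"hello rheo")
        expected = hashlib.sha256(b"hello rheo").hexdigest()
        matches = await validate(path, expected)
        mismatch = await validate(path, "0" * 64)
    return matches, mismatch


result = asyncio.run(demo())
```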

### Cancellation Flow

18 changes: 10 additions & 8 deletions docs/README.md
@@ -52,8 +52,9 @@ That's it. The manager handles worker pools, state tracking, and cleanup automat
- **Speed & ETA tracking**: Real-time download speed with moving averages and estimated completion time
- **Graceful shutdown**: Stop downloads cleanly or cancel immediately
- **File exists handling**: Skip, overwrite, or error when destination exists (configurable per-file or globally)
-- **Event system**: React to download lifecycle events (queued, started, progress, speed, completed, failed, skipped, cancelled, retry, validation)
+- **Event system**: Typed events (`DownloadEventType`) with `Subscription` handles from `manager.on()` (queued, started, progress, completed, failed, skipped, cancelled, retry, validation)
- **Progress tracking**: Track bytes downloaded, completion status, errors, validation results, destination paths, and final average speeds
+- **HTTP client abstraction**: `BaseHttpClient` with default `AiohttpClient` implementation
- **Async/await**: Built on asyncio for efficient I/O
- **Type hints**: Full type annotations throughout
- **Dependency injection**: Easy to test and customise
@@ -360,31 +361,32 @@ files = [

### Retry on Transient Errors

-Automatic retry with exponential backoff - just provide a config:
+Automatic retry with exponential backoff for transient errors (500, 503, timeouts, etc.).
+
+> **Note**: Retry configuration is not yet available at the `DownloadManager` level. Currently, enabling retries requires direct worker instantiation as shown below. Manager-level retry configuration is planned for a future release.

```python
from rheo.domain.retry import RetryConfig
from rheo.downloads import RetryHandler, DownloadWorker

-# Simple: just specify retry config (sensible defaults for everything else)
+# Configure retry behaviour
config = RetryConfig(max_retries=3, base_delay=1.0, max_delay=60.0)
retry_handler = RetryHandler(config)

-# Create worker with retry support
+# Create worker with retry support (bypasses manager)
async with aiohttp.ClientSession() as session:
    worker = DownloadWorker(
        client=session,
        retry_handler=retry_handler,
    )
-    # Worker will automatically retry transient errors (500, 503, timeouts, etc.)
-    await worker.download(url, destination_path)
+    await worker.download(url, destination_path, download_id="my-download")
```
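
Reviewer note: to make the `base_delay` and `max_delay` values above concrete, here is a minimal sketch of a capped exponential backoff schedule. It assumes the delay doubles per attempt with no jitter; the diff does not show the formula `RetryHandler` actually uses, and `backoff_delay` is a hypothetical helper rather than part of the library.

```python
def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Delay before retry `attempt` (0-based): base_delay * 2**attempt, capped."""
    return min(base_delay * (2 ** attempt), max_delay)


# With base_delay=1.0 and max_delay=60.0 the cap takes effect at attempt 6.
delays = [backoff_delay(n) for n in range(8)]
# → [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Under this assumption, `max_retries=3` would wait 1, 2, and 4 seconds, so `max_delay=60.0` would only matter for larger retry counts.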

-**Advanced**: Customize retry policy for specific status codes:
+**Advanced**: Customise retry policy for specific status codes:

```python
from rheo.domain.retry import RetryConfig, RetryPolicy
-from rheo.downloads import RetryHandler, ErrorCategoriser
+from rheo.downloads import RetryHandler

# Custom policy - treat 404 as transient (normally permanent)
policy = RetryPolicy(
2 changes: 1 addition & 1 deletion docs/ROADMAP.md
@@ -11,7 +11,7 @@ The library and CLI are working for real use.
- Concurrent downloads with worker pool
- Priority queue system
- Selective cancellation (cancel individual downloads by ID)
-- Event-driven architecture with `manager.on()`/`off()` subscription
+- Event-driven architecture with typed events and `Subscription` handles from `manager.on()`
- Download tracking and state management
- Comprehensive error handling
- Retry logic with exponential backoff
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "rheopy"
-version = "0.5.0"
+version = "0.6.0"
description = "Concurrent HTTP download orchestration with async I/O"
authors = [
{name = "plutopulp", email = "plutopulped@gmail.com"}