Increase performance with dynamic concurrent downloads #2

@joeguilmette

Description

Proposal: Adaptive Concurrency + Retry Support

1. Dynamic concurrency controller
Start file downloads with a conservative limit (e.g., 2 workers).
Every few seconds, check aggregate throughput (MB/s). If the rate is still rising and no errors have occurred, increment concurrency by 1 (cap at a sensible maximum, say 16 or a server-provided limit).
When throughput plateaus or error rate rises (HTTP 429/5xx, timeouts), back off: halve the worker count or subtract a few slots, and slowly ramp up again once conditions stabilize (AIMD style).
Continue monitoring so we sustain the highest stable MB/s instead of a fixed user-supplied number.
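The controller described above can be sketched as a small AIMD loop. This is a minimal illustration, not the final design: the class name, the 16-worker cap, and the "rate still rising" check are all assumptions standing in for whatever the CLI's download loop actually measures.

```python
class AIMDController:
    """Additive-increase / multiplicative-decrease worker-count controller.

    Hypothetical sketch: names, thresholds, and the cap are illustrative
    defaults, not a committed interface.
    """

    def __init__(self, start=2, cap=16):
        self.workers = start      # conservative starting limit
        self.cap = cap            # sensible maximum (or server-provided)
        self.last_rate = 0.0      # aggregate MB/s from the previous window

    def update(self, rate_mbps, errors):
        """Call every few seconds with the window's aggregate MB/s and the
        count of 429/5xx/timeout errors seen in that window."""
        if errors > 0:
            # Multiplicative decrease: halve on trouble (never below 1).
            self.workers = max(1, self.workers // 2)
        elif rate_mbps > self.last_rate:
            # Additive increase: throughput still rising, add one worker.
            self.workers = min(self.cap, self.workers + 1)
        # Otherwise throughput has plateaued: hold the current count and
        # let the next error-free, rising window ramp it up again.
        self.last_rate = rate_mbps
        return self.workers
```

A ramp-up then back-off might look like: `update(10.0, 0)` grows 2 workers to 3, `update(20.0, 0)` grows to 4, and a window with errors, `update(20.0, 2)`, halves back to 2.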

2. Retry handling
Track individual file/batch failures and automatically requeue them a few times (with exponential backoff) before giving up.
Same for DB export: retry failed chunk/process calls and the final SQL download.
Only abort the entire run if retries are exhausted or a fatal error occurs; otherwise return success along with a list of any files that ultimately failed.
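The retry behavior above could look something like the sketch below: requeue each failing item a few times with exponential backoff, and collect terminal failures instead of aborting the run. The function names, attempt count, and delay knobs are hypothetical placeholders for whatever the actual file/chunk download calls are.

```python
import random
import time

def download_with_retries(fetch, item, max_attempts=4, base_delay=1.0):
    """Retry `fetch(item)` with exponential backoff plus jitter.

    `fetch` is a hypothetical callable standing in for a single file,
    DB chunk, or SQL-download request; the knobs are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(item)
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted; caller records the failure
            # Roughly 1s, 2s, 4s, ... (for base_delay=1) with jitter so
            # parallel workers don't retry in lockstep.
            time.sleep(base_delay * (2 ** (attempt - 1) + random.random()))

def download_all(fetch, items, **retry_kwargs):
    """Return (succeeded, failed) rather than aborting the whole run."""
    succeeded, failed = [], []
    for item in items:
        try:
            succeeded.append(download_with_retries(fetch, item, **retry_kwargs))
        except Exception:
            failed.append(item)
    return succeeded, failed
```

A run then exits successfully when `failed` is empty, and otherwise reports the leftover items so only those need to be re-fetched instead of rerunning everything.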
This gives us two improvements: the CLI self-tunes to the fastest safe concurrency, and failures are handled gracefully instead of forcing a full rerun. Let me know if you want implementation details or a phased rollout plan.
