feat(proxy): add ProxyRotator with multi-strategy rotation, health tracking, and auto-failover #38

Open
evelaa123 wants to merge 1 commit into CloakHQ:main from evelaa123:feature/proxy-rotation

@evelaa123
Contributor

Summary

Adds ProxyRotator — a thread-safe proxy pool with automatic rotation, health tracking, and failover for both Python and JavaScript/TypeScript.

Background

The design was inspired by Scrapling's ProxyRotator, which provides cyclic proxy rotation with custom strategy callbacks for its spider framework. CloakBrowser's implementation is written from scratch and differs significantly:

  • Built-in strategies instead of callback-based: four strategies are built in (round-robin, random, least-used, least-failures) rather than requiring the user to write a strategy function.
  • Health tracking with automatic cooldown: Scrapling's rotator is stateless — it just cycles through a list. Ours tracks per-proxy failure counts, consecutive failures, and automatically sidelines broken proxies for a configurable cooldown period, then recovers them.
  • Sticky sessions: Scrapling has no concept of reusing the same proxy for N consecutive requests. Ours supports sticky_count to keep the same proxy for a batch of requests (useful for session-based websites).
  • Dynamic pool management: proxies can be added/removed at runtime without restarting. Scrapling's list is static.
  • Thread safety: all operations are protected by threading.Lock (Python). Scrapling's rotator is not thread-safe.
  • Credential masking: stats() and all log output automatically mask usernames and passwords. Scrapling does not mask credentials.
  • SOCKS5 validation: rejects socks5://user:pass@host:port at construction time with a clear error message, because Chromium does not support SOCKS5 authentication. This prevents a confusing launch-time crash.
  • Bare proxy format: accepts user:pass@host:port without a scheme prefix, which is common in proxy provider dashboards.
  • Dual implementation: both Python and TypeScript, matching CloakBrowser's architecture. Scrapling is Python-only.
  • Context manager / withSession: auto-reports success on normal return, failure on exception. Scrapling relies on the spider framework's retry system instead.

In short: Scrapling's ProxyRotator was the starting inspiration for "a class that rotates proxies," but the actual implementation, API, and feature set are entirely different and written from scratch for CloakBrowser's use case (standalone browser automation, not a spider framework).

Features

Rotation Strategies

  • round-robin (default): cycles through proxies in order, skipping any that are in cooldown
  • random: picks a random available proxy each time
  • least-used: picks the proxy with the fewest total uses (distributes load evenly)
  • least-failures: picks the proxy with the fewest total failures (avoids unreliable proxies)
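
For readers who want to see the selection logic spelled out, here is a minimal sketch of the four strategies. It is not the code from this PR: `ProxyState`, `pick`, and the underscore strategy names other than `round_robin` are assumptions made for illustration, and the pool passed in is assumed to be already filtered to proxies that are not in cooldown.

```python
import random
from dataclasses import dataclass


@dataclass
class ProxyState:
    """Hypothetical per-proxy record; field names mirror the health-tracking list."""
    url: str
    use_count: int = 0
    fail_count: int = 0


def pick(pool, strategy, rr_index=0):
    """Select one proxy from `pool` and return (chosen, next_round_robin_index)."""
    if not pool:
        raise RuntimeError("no proxies available")
    if strategy == "round_robin":
        # Walk the list in order and wrap around at the end.
        chosen = pool[rr_index % len(pool)]
        return chosen, (rr_index + 1) % len(pool)
    if strategy == "random":
        return random.choice(pool), rr_index
    if strategy == "least_used":
        # Ties go to the earliest proxy in the list (behavior of min()).
        return min(pool, key=lambda p: p.use_count), rr_index
    if strategy == "least_failures":
        return min(pool, key=lambda p: p.fail_count), rr_index
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Only round-robin needs state between calls (the index); the other three can be computed from the per-proxy counters alone.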

Health Tracking

Each proxy tracks: use_count, fail_count, consecutive_fails, last_used, last_failed, cooldown_until.

When a proxy reaches max_failures consecutive failures (default: 3), it is automatically placed on cooldown for cooldown seconds (default: 300). During cooldown it is skipped by all strategies. When cooldown expires, the proxy becomes available again. Calling report_success() immediately resets the consecutive failure counter and removes any cooldown.

If all proxies are in cooldown simultaneously, next() raises RuntimeError with a clear message.
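
The cooldown rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `Health`, `record_failure`, and `record_success` are hypothetical names, and the clock value is passed in explicitly so the behavior is easy to reason about.

```python
from dataclasses import dataclass


@dataclass
class Health:
    """Hypothetical subset of the per-proxy fields listed above."""
    fail_count: int = 0
    consecutive_fails: int = 0
    cooldown_until: float = 0.0

    def available(self, now):
        # A proxy is usable again once its cooldown deadline has passed.
        return now >= self.cooldown_until


def record_failure(health, now, max_failures=3, cooldown=300.0):
    """Count a failure and start a cooldown once the consecutive limit is hit."""
    health.fail_count += 1
    health.consecutive_fails += 1
    if health.consecutive_fails >= max_failures:
        health.cooldown_until = now + cooldown


def record_success(health):
    """A success clears the consecutive counter and lifts any cooldown."""
    health.consecutive_fails = 0
    health.cooldown_until = 0.0
```

In real use `now` would come from a clock such as `time.monotonic()`; note that `fail_count` is a lifetime total and is not reset by a success.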

Sticky Sessions

Set sticky_count=N to reuse the same proxy for N consecutive next() calls before rotating. Useful for websites that track sessions by IP. If the sticky proxy fails, rotation is forced immediately.
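
A sticky counter can be layered on top of any strategy. The sketch below uses plain cycling for brevity; `StickyPicker` and `force_rotate` are hypothetical names chosen for this example and are not part of the API in this PR.

```python
import itertools


class StickyPicker:
    """Hypothetical helper: hands out each proxy `sticky_count` times before rotating."""

    def __init__(self, proxies, sticky_count=1):
        self._cycle = itertools.cycle(proxies)
        self._sticky_count = sticky_count
        self._current = None
        self._remaining = 0  # how many more times the current proxy may be reused

    def next(self):
        if self._remaining == 0:
            # Sticky budget used up (or first call): move on to the next proxy.
            self._current = next(self._cycle)
            self._remaining = self._sticky_count
        self._remaining -= 1
        return self._current

    def force_rotate(self):
        """Drop the sticky proxy so the following next() call picks a new one."""
        self._remaining = 0
```

With `sticky_count=2` and two proxies, five calls would yield the first proxy twice, the second twice, then the first again. Calling `force_rotate()` on failure corresponds to the "rotation is forced immediately" behavior.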

Dynamic Pool

  • add(proxy) — add a proxy to the pool at runtime
  • remove(proxy) — remove a proxy (raises if not found or would empty the pool)
  • Round-robin index is clamped after removal to prevent index errors
  • Sticky state is cleared if the sticky proxy is removed
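
To illustrate why the index needs clamping: if the round-robin index points at the last slot and that proxy is removed, the next lookup would run off the end of the list. A minimal sketch, assuming a plain list pool and `ValueError` as the exception type (the PR text only says that `remove` "raises"):

```python
def remove_proxy(pool, rr_index, proxy):
    """Remove `proxy` from `pool` in place and return a still-valid round-robin index.

    Hypothetical helper; the exception type is an assumption.
    """
    if proxy not in pool:
        raise ValueError("proxy not found in pool")
    if len(pool) == 1:
        raise ValueError("cannot remove the last proxy from the pool")
    pool.remove(proxy)
    # Clamp so the index never points past the shortened list.
    return min(rr_index, len(pool) - 1)
```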

Context Manager / withSession

Python:

with rotator.session() as proxy:
    browser = launch(proxy=proxy)
    # if this block raises, report_failure() is called automatically
    # if it completes normally, report_success() is called

JavaScript:

await rotator.withSession(async (proxy) => {
    const browser = await launch({ proxy });
    // same auto-reporting behavior
});
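
For reference, the auto-reporting behavior can be built with `contextlib.contextmanager`. This is a sketch of the pattern rather than the code in this PR; `proxy_session` and the `RecordingRotator` stand-in are hypothetical and exist only to make the example self-contained.

```python
from contextlib import contextmanager


class RecordingRotator:
    """Stand-in rotator that records which report method was called."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self.events = []

    def next(self):
        return self._proxies[0]

    def report_success(self, proxy):
        self.events.append(("success", proxy))

    def report_failure(self, proxy):
        self.events.append(("failure", proxy))


@contextmanager
def proxy_session(rotator):
    """Yield a proxy, then report success or failure based on how the block exits."""
    proxy = rotator.next()
    try:
        yield proxy
    except Exception:
        rotator.report_failure(proxy)
        raise  # the caller still sees the original exception
    else:
        rotator.report_success(proxy)
```

The exception is re-raised after reporting, so the caller's own error handling is unaffected.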

Credential Masking

stats() output and all internal logging automatically masks credentials:

  • http://user:pass@host:8080 → http://***:***@host:8080
  • user:pass@host:8080 (bare format) → ***:***@host:8080
  • socks5://host:8080 (no creds) → unchanged
  • Dict key format http://host:8080||username → http://host:8080||***
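
A regex-based sketch of the string masking rules above, for illustration only. `mask_proxy` here is a hypothetical standalone function, not the `maskProxy` exported by this PR; it covers the URL and bare formats but not the dict key format, and it assumes passwords contain no `@` or `/`.

```python
import re

# Optional scheme, then "user:pass@". The userinfo part may not contain "/" or "@",
# which is what keeps "socks5://host:8080" from being treated as credentials.
_CREDS = re.compile(r"^(?P<scheme>[a-zA-Z][a-zA-Z0-9+.-]*://)?[^:@/\s]+:[^@/\s]+@")


def mask_proxy(proxy):
    """Replace the username and password in a proxy string with ***."""
    return _CREDS.sub(
        lambda m: f"{m.group('scheme') or ''}***:***@",
        proxy,
        count=1,
    )
```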

SOCKS5 Validation

Chromium does not support SOCKS5 proxy authentication. Rather than failing with a cryptic error at browser launch time, ProxyRotator validates proxies at construction and add() time:

  • socks5://host:port — accepted (no auth)
  • socks5://user:pass@host:port — raises ValueError immediately
  • {"server": "socks5://host:port", "username": "u", "password": "p"} — raises ValueError immediately
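
The check itself is small. Below is a minimal sketch using `urllib.parse.urlsplit`; the function name `validate_proxy` and the error message are assumptions for this example, not the PR's actual code.

```python
from urllib.parse import urlsplit


def validate_proxy(proxy):
    """Raise ValueError for SOCKS5 proxies that carry credentials; otherwise return None."""
    if isinstance(proxy, dict):
        server = proxy.get("server", "")
        has_auth = bool(proxy.get("username") or proxy.get("password"))
    else:
        server = proxy
        has_auth = False
    # Bare "user:pass@host:port" strings have no scheme; assume http for parsing.
    parts = urlsplit(server if "://" in server else f"http://{server}")
    has_auth = has_auth or parts.username is not None or parts.password is not None
    if parts.scheme.startswith("socks5") and has_auth:
        raise ValueError("SOCKS5 proxies with credentials are not supported by Chromium")
```

Running the check in both the constructor and `add()` means a bad entry is rejected before any browser process is started.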

Integration with launch()

from cloakbrowser import ProxyRotator, launch

rotator = ProxyRotator([
    "http://user:pass@proxy1:8080",
    "http://user:pass@proxy2:8080",
], strategy="round_robin")

browser = launch(proxy=rotator.next())
page = browser.new_page()
page.goto("https://example.com")
rotator.report_success(rotator.current())
browser.close()

The current() method returns the proxy that was last selected by next(), so you can report success/failure after using it.

API Reference

Python (cloakbrowser.proxy_rotator)

class ProxyRotator:
    def __init__(self, proxies, strategy="round_robin", cooldown=300.0, max_failures=3, sticky_count=1)
    def next(self) -> str | dict
    def current(self) -> str | dict | None
    def report_success(self, proxy) -> None
    def report_failure(self, proxy) -> None
    def session(self) -> ContextManager[str | dict]
    def stats(self) -> list[dict]
    def reset(self) -> None
    def add(self, proxy) -> None
    def remove(self, proxy) -> None
    @property
    def available_count(self) -> int

JavaScript/TypeScript (cloakbrowser)

class ProxyRotator {
    constructor(proxies: ProxyValue[], options?: ProxyRotatorOptions)
    next(): ProxyValue
    current(): ProxyValue | null
    reportSuccess(proxy: ProxyValue): void
    reportFailure(proxy: ProxyValue): void
    withSession<T>(fn: (proxy: ProxyValue) => Promise<T>): Promise<T>
    stats(): ProxyStats[]
    reset(): void
    add(proxy: ProxyValue): void
    remove(proxy: ProxyValue): void
    get availableCount(): number
    get size(): number
    static proxyKey(proxy: ProxyValue): string
}

Tests

Suite                          Count  Status
Python unit (pytest)           49
JS unit (vitest)               45
Python real-proxy integration  7
JS real-proxy integration      9
Total                          110

Tests cover: all four strategies, health tracking with cooldown expiration, sticky sessions, context manager success/failure paths, dynamic pool add/remove edge cases, credential masking for all proxy formats, SOCKS5 validation, thread safety (10 threads × 100 calls), bare proxy format handling, round-robin index clamping after removal, and real browser integration (headless Chromium verifying external IP via ipify.org).
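
As an illustration of what the thread-safety case exercises, here is a sketch of a 10 threads × 100 calls check against a lock-protected round-robin picker. `LockedRoundRobin` and `hammer` are hypothetical names for this example and do not reproduce the tests in this PR.

```python
import threading
from collections import Counter


class LockedRoundRobin:
    """Minimal round-robin picker whose index update is guarded by a Lock."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._index = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            # Read and advance the index as one atomic step.
            proxy = self._proxies[self._index]
            self._index = (self._index + 1) % len(self._proxies)
            return proxy


def hammer(rotator, threads=10, calls=100):
    """Call rotator.next() from several threads and count what was handed out."""
    counts = Counter()
    counts_lock = threading.Lock()

    def worker():
        local = Counter(rotator.next() for _ in range(calls))
        with counts_lock:
            counts.update(local)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counts
```

With four proxies and 1,000 total calls, a correctly locked round-robin hands out each proxy exactly 250 times, which gives the test a deterministic assertion despite the thread scheduling.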

Files

Added

  • cloakbrowser/proxy_rotator.py — Python implementation
  • js/src/proxy-rotator.ts — TypeScript implementation
  • tests/test_proxy_rotator.py — Python unit tests (49 tests)
  • js/tests/proxy-rotator.test.ts — JS unit tests (45 tests)
  • tests/test_proxy_real.py — Python real-proxy integration tests
  • tests/test_proxy_real.mjs — JS real-proxy integration tests
  • examples/proxy_rotation.py — Python usage example
  • examples/proxy_verify.py — Python proxy verification script
  • js/examples/proxy-rotation.ts — TypeScript usage example

Modified

  • cloakbrowser/__init__.py — export ProxyRotator
  • cloakbrowser/browser.py — integrate rotator with launch()
  • js/src/index.ts — export ProxyRotator, maskProxy, resolveProxyRotator
  • js/src/proxy.ts — integrate rotator resolution
  • js/src/types.ts — add ProxyRotator-related types
  • js/src/playwright.ts — resolve rotator in Playwright launcher
  • js/src/puppeteer.ts — resolve rotator in Puppeteer launcher
  • tests/test_proxy.py — add bare proxy format parsing tests

…acking, and auto-failover

- Round-robin, random, least-used, least-failures strategies
- Per-proxy health tracking with cooldown and auto-recovery
- Sticky sessions (reuse same proxy for N requests)
- Dynamic pool management (add/remove at runtime)
- Context manager / withSession with auto success/failure reporting
- Thread-safe implementation with Lock
- Credential masking in stats and logs
- SOCKS5 without auth supported; SOCKS5 with auth rejected (Chromium limitation)
- Bare proxy format support (user:pass@host:port)
- Full test coverage: 49 pytest, 45 vitest, 7 real-proxy, 9 real-proxy JS
- Integration with launch(proxy=rotator)
@Durafen force-pushed the main branch 2 times, most recently from e062dcc to 0aa4ea5 on March 12, 2026 00:59
@Cloak-HQ
Contributor

Hey, thanks for the detailed PR and the solid test coverage — this is clearly well thought out.

After reviewing, we're going to pass on merging this one. The core reason: proxy rotation is orchestration logic that lives between browser sessions, not inside the launcher. Since Chrome binds the proxy at process start (it's a --proxy-server flag — there's no mid-session swap), the rotator's job is purely deciding what to pass into the next launch() call. That's the caller's responsibility, not the wrapper's.

Compare this to humanize, which patches browser behavior during a session and can't exist outside of it. The rotator can — it would work identically as a standalone utility or even just a few lines in user code.

We want to keep the wrapper focused on what only it can do: launching a stealthy browser. For multi-session workflows, check out CloakBrowser Manager — self-hosted profile management with persistent sessions, proxies, and noVNC.

One thing we do want to take from this PR: the bare proxy format fix (user:pass@host:port without a scheme). We've already landed that separately. Thanks for surfacing it.
