
Unexpected browser termination causing improper Circuit Breaker activation #3

@LXSCA7


Context: The Playwright engine intermittently reports that the browser has closed unexpectedly. This trips the Circuit Breaker prematurely and disrupts the scraping flow even when the network is stable.

Goal: Implement logic to distinguish between a Process Crash and a Timeout. The Circuit Breaker should trip primarily on Timeouts or network-related failures, since these suggest that the target is throttling us or that the network is unstable.
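
A minimal sketch of the crash-versus-timeout distinction, assuming the engine uses Playwright's Python bindings (the issue does not say which language is in use). The names `FailureKind`, `classify_failure`, `CRASH_MARKERS`, and `NETWORK_MARKERS` are hypothetical, and the message fragments are assumptions that should be checked against the errors the engine actually logs. The sketch matches on class names so it runs without Playwright installed; real code could use `isinstance()` against Playwright's `TimeoutError` instead.

```python
from enum import Enum


class FailureKind(Enum):
    TIMEOUT = "timeout"
    NETWORK = "network"
    CRASH = "crash"
    OTHER = "other"


# Assumed message fragments (lowercase); verify against real engine logs.
CRASH_MARKERS = ("has been closed", "target closed", "crashed")
NETWORK_MARKERS = ("net::err_", "ns_error_", "econnreset", "econnrefused")


def classify_failure(exc: BaseException) -> FailureKind:
    """Map an exception to a coarse failure kind."""
    # Collect every class name in the exception's hierarchy so that a
    # subclass of a TimeoutError is still recognised as a timeout.
    class_names = {cls.__name__ for cls in type(exc).__mro__}
    message = str(exc).lower()

    # Timeouts are checked first: they are the primary trip signal.
    if "TimeoutError" in class_names:
        return FailureKind.TIMEOUT
    if any(marker in message for marker in NETWORK_MARKERS):
        return FailureKind.NETWORK
    if any(marker in message for marker in CRASH_MARKERS):
        return FailureKind.CRASH
    return FailureKind.OTHER
```

Checking timeouts before crash markers is a deliberate choice here: if a timeout message also mentions a closed target, it still counts toward the breaker.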

Steps to Investigate & Implement:
    • Error Classification: Update the engine to parse Playwright errors and identify TimeoutError specifically.
    • Conditional Tripping: Modify the Circuit Breaker logic to only increment the failure count on timeouts/network errors.
    • Process Recovery: Implement a silent restart/retry for pure browser crashes that aren't related to timeouts.
    • Resource Monitoring: Check if high-volume scraping is causing memory leaks that lead to these crashes.
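
The conditional-tripping and process-recovery steps could be sketched roughly as follows, in Python. Everything here is hypothetical and not taken from the existing engine: `CircuitBreaker`, `CircuitOpenError`, `run_guarded`, the `is_crash` and `should_trip` predicates, the `restart_browser` callback, and the default thresholds. The idea is that a crash triggers a bounded number of silent restarts and never touches the failure count, while only errors accepted by `should_trip` (timeouts and network errors) increment it.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a task is attempted while the breaker is open."""


class CircuitBreaker:
    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        self.failures = 0


def run_guarded(
    task: Callable[[], T],
    breaker: CircuitBreaker,
    is_crash: Callable[[BaseException], bool],
    should_trip: Callable[[BaseException], bool],
    restart_browser: Callable[[], None],
    max_restarts: int = 2,
) -> T:
    """Run task, restarting on crashes and counting only trip-worthy errors."""
    if breaker.is_open:
        raise CircuitOpenError("circuit is open; skipping task")
    restarts = 0
    while True:
        try:
            result = task()
        except Exception as exc:
            if is_crash(exc) and restarts < max_restarts:
                # Browser crash: restart and retry without touching the breaker.
                restarts += 1
                restart_browser()
                continue
            if should_trip(exc):
                # Timeout or network failure: count it toward tripping.
                breaker.record_failure()
            raise
        breaker.record_success()
        return result
```

Capping restarts with `max_restarts` keeps a browser that crashes on every launch from looping forever; once the cap is reached the crash is re-raised to the caller, still without counting toward the breaker.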
