Deep Dive: Python + FastAPI + Playwright #7

@chhot2u

Python + FastAPI + Playwright — Deep Dive Analysis

Overview

An async Python backend: FastAPI for the API layer, Playwright for Python for browser automation, and a React or Vue SPA frontend. The stack offers a strong developer experience and direct access to Python's rich ecosystem.


Architecture

┌─────────────────────────────────────────────┐
│          Frontend (React/Vue SPA)           │
│  - Task dashboard with real-time updates    │
│  - WebSocket for live screenshots           │
│  - Configuration panels                     │
├─────────────────────────────────────────────┤
│          FastAPI Server                     │
│  ┌───────────────────────────────────────┐  │
│  │   REST API + WebSocket endpoints      │  │
│  │   - OAuth2 / JWT authentication       │  │
│  │   - Pydantic validation               │  │
│  │   - Background task management        │  │
│  ├───────────────────────────────────────┤  │
│  │   Task Queue                          │  │
│  │   - asyncio.Queue (simple)            │  │
│  │   - OR Celery + Redis (distributed)   │  │
│  │   - OR arq (async Redis queue)        │  │
│  ├───────────────────────────────────────┤  │
│  │   Playwright Worker Pool              │  │
│  │   - playwright-python async API       │  │
│  │   - 1 Browser, 100 contexts           │  │
│  │   - Per-context proxy                 │  │
│  ├───────────────────────────────────────┤  │
│  │   Data Layer                          │  │
│  │   - SQLAlchemy + PostgreSQL           │  │
│  │   - Redis (cache/queue)               │  │
│  │   - S3 for artifacts                  │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
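
The simplest queue option in the diagram, asyncio.Queue, can be sketched as a fixed pool of worker coroutines draining one in-process queue. This is a minimal illustration rather than the project's implementation: `worker`, `run_queue`, and `fake_browser_task` are hypothetical names, and the handler stands in for a real Playwright job.

```python
import asyncio
from typing import Awaitable, Callable


async def worker(
    queue: asyncio.Queue,
    handler: Callable[[str], Awaitable[str]],
    results: dict[str, str],
) -> None:
    """Pull task IDs off the queue until the worker is cancelled."""
    while True:
        task_id = await queue.get()
        try:
            results[task_id] = await handler(task_id)
        except Exception as exc:
            # Record the failure and keep the worker alive for the next item.
            results[task_id] = f"error: {exc}"
        finally:
            queue.task_done()


async def run_queue(
    task_ids: list[str],
    handler: Callable[[str], Awaitable[str]],
    num_workers: int = 4,
) -> dict[str, str]:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict[str, str] = {}
    for task_id in task_ids:
        queue.put_nowait(task_id)

    workers = [
        asyncio.create_task(worker(queue, handler, results))
        for _ in range(num_workers)
    ]
    await queue.join()  # returns once every queued item has been marked done
    for w in workers:
        w.cancel()
    # Collect the cancelled workers so no task is left pending.
    await asyncio.gather(*workers, return_exceptions=True)
    return results


async def fake_browser_task(task_id: str) -> str:
    # Stand-in for a Playwright job (assumption for illustration).
    await asyncio.sleep(0.01)
    return f"{task_id}: ok"
```

An in-process queue like this is lost on restart and cannot be shared between server processes, which is the usual reason to move to the Redis-backed options listed in the diagram.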

Key Dependencies

# pyproject.toml
[project]
dependencies = [
    "fastapi>=0.115",
    "uvicorn[standard]>=0.32",
    "playwright>=1.49",
    "sqlalchemy[asyncio]>=2.0",
    "asyncpg>=0.30",
    "redis>=5.2",
    "pydantic>=2.10",
    "python-jose[cryptography]>=3.3",
]

Proxy per Task Implementation

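import asyncio

from playwright.async_api import async_playwright

# TaskConfig and TaskResult are the project's own models; they are not defined in this snippet.


class TaskRunner:
    def __init__(self):
        self.playwright = None
        self.browser = None
        # Caps the number of browser contexts open at the same time.
        self.semaphore = asyncio.Semaphore(100)

    async def init(self):
        # Keep a reference to the Playwright driver so it can be stopped on shutdown.
        self.playwright = await async_playwright().start()
        self.browser = await self.playwright.chromium.launch(headless=True)

    async def close(self):
        # Shut down the browser and the Playwright driver.
        await self.browser.close()
        await self.playwright.stop()

    async def run_task(self, task: TaskConfig) -> TaskResult:
        async with self.semaphore:
            # Each task gets an isolated context with its own proxy, locale, and timezone.
            context = await self.browser.new_context(
                proxy={
                    "server": task.proxy.server,
                    "username": task.proxy.username,
                    "password": task.proxy.password,
                },
                locale=task.proxy.locale,
                timezone_id=task.proxy.timezone,
            )

            try:
                # Create the page inside the try block so the context is closed even if this fails.
                page = await context.new_page()
                results = []
                for step in task.steps:
                    # execute_step is assumed to be implemented elsewhere on this class.
                    result = await self.execute_step(page, step)
                    results.append(result)
                return TaskResult(success=True, data=results)
            except Exception as e:
                return TaskResult(success=False, error=str(e))
            finally:
                await context.close()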
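
The runner above relies on TaskConfig and TaskResult without showing them. One possible shape is sketched below, inferred only from the attributes the runner reads (`task.proxy.server`, `task.steps`, and so on). The field types, the defaults, and the `context_options` helper are assumptions for illustration, not the project's actual models.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ProxyConfig:
    server: str  # e.g. "http://proxy.example.com:8080" (placeholder host)
    username: Optional[str] = None
    password: Optional[str] = None
    locale: str = "en-US"
    timezone: str = "UTC"


@dataclass
class TaskConfig:
    proxy: ProxyConfig
    # The step format is not specified in the source; a list of dicts is assumed.
    steps: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class TaskResult:
    success: bool
    data: Optional[list[Any]] = None
    error: Optional[str] = None


def context_options(proxy: ProxyConfig) -> dict[str, Any]:
    """Build keyword arguments for browser.new_context() from a ProxyConfig."""
    proxy_settings: dict[str, str] = {"server": proxy.server}
    # Include credentials only when both are set, so unauthenticated proxies work too.
    if proxy.username and proxy.password:
        proxy_settings["username"] = proxy.username
        proxy_settings["password"] = proxy.password
    return {
        "proxy": proxy_settings,
        "locale": proxy.locale,
        "timezone_id": proxy.timezone,
    }
```

With a helper like this, the context could be created as `await browser.new_context(**context_options(task.proxy))`, which keeps the proxy mapping in one place and easy to unit test without launching a browser.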

Concurrency Model

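async def run_batch(tasks: list[TaskConfig]) -> list[TaskResult]:
    runner = TaskRunner()
    await runner.init()

    try:
        # Schedule every task at once; the runner's semaphore lets at most 100 run concurrently.
        results = await asyncio.gather(
            *[runner.run_task(task) for task in tasks],
            return_exceptions=True,
        )
    finally:
        # Release the browser process even if the batch fails or is cancelled.
        await runner.browser.close()

    # return_exceptions=True can leave raw exceptions in the list; convert them to failed results.
    return [
        r if isinstance(r, TaskResult) else TaskResult(success=False, error=str(r))
        for r in results
    ]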
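
To see why scheduling everything with asyncio.gather is safe here, the following sketch measures how many coroutines are inside a semaphore-guarded section at once. It uses only the standard library, with `asyncio.sleep` standing in for browser work; `ConcurrencyProbe` and `run_probe` are hypothetical names introduced for this illustration.

```python
import asyncio


class ConcurrencyProbe:
    """Tracks how many coroutines are inside the guarded section at once."""

    def __init__(self, limit: int):
        self.semaphore = asyncio.Semaphore(limit)
        self.active = 0
        self.peak = 0

    async def run(self, task_id: int) -> int:
        async with self.semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                await asyncio.sleep(0.01)  # stand-in for browser work
                return task_id
            finally:
                self.active -= 1


async def run_probe(num_tasks: int, limit: int) -> tuple[list[int], int]:
    probe = ConcurrencyProbe(limit)
    # All coroutines are scheduled immediately; the semaphore gates entry.
    results = await asyncio.gather(*(probe.run(i) for i in range(num_tasks)))
    return list(results), probe.peak
```

Running `asyncio.run(run_probe(20, 5))` schedules 20 coroutines but never lets more than 5 into the guarded section, and gather returns results in submission order. Waiting coroutines are cheap, but for very large batches a queue with a fixed worker count avoids creating every coroutine up front.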

Strengths

  • Easiest to learn: Python + FastAPI is very approachable
  • Rich ecosystem: pandas, BeautifulSoup, ML libraries for data processing
  • Pydantic validation: Automatic request/response validation
  • Async native: FastAPI and Playwright for Python both provide async APIs
  • Auto-generated docs: Swagger UI out of the box
  • Playwright features: Full API — auto-wait, selectors, network interception
  • Data processing: Python excels at transforming extracted data
  • Rapid prototyping: Fastest to get a working prototype

Weaknesses

  • GIL limitations: CPU-bound work does not run in parallel across threads (use multiprocessing)
  • Higher memory per task: Python objects are heavier than their Go/Rust equivalents
  • Slower than compiled: Roughly 3-10x slower than Go/Rust for CPU work
  • Deployment complexity: Python environments can be fragile
  • Not a desktop app: Server-only (PyInstaller can package a desktop build, but the result is heavy)
  • Type checking optional: mypy helps, but type hints are not enforced at runtime
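
The multiprocessing workaround for the GIL can be sketched with `loop.run_in_executor`, which hands a function to an executor and awaits the result without blocking the event loop. This is a hedged illustration: `word_count` is a hypothetical stand-in for a real CPU-bound transform, and `transform_all` is not part of the original design.

```python
import asyncio
from concurrent.futures import Executor
from typing import Optional


def word_count(html: str) -> int:
    """Hypothetical CPU-bound transform applied to extracted page text."""
    return len(html.split())


async def transform_all(
    pages: list[str], executor: Optional[Executor] = None
) -> list[int]:
    """Run word_count for each page in the given executor.

    Pass a concurrent.futures.ProcessPoolExecutor to spread work across
    cores. With executor=None the loop's default thread pool is used, which
    keeps the event loop responsive but does not bypass the GIL.
    """
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(executor, word_count, page) for page in pages]
    return list(await asyncio.gather(*futures))
```

When a ProcessPoolExecutor is used, the function and its arguments must be picklable, so the transform has to be defined at module top level rather than as a lambda or nested function.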

Resource Estimates (100 tasks)

Resource   Estimate
RAM        ~2.5 GB
CPU        4-8 cores recommended
Disk       ~50 MB app + Playwright browsers
Startup    ~2 s
Redis      ~50 MB

When to Choose This Stack

✅ Team knows Python best
✅ Need data processing after extraction (pandas, ML)
✅ Want fastest prototype to working product
✅ Need auto-generated API docs
✅ Server deployment is fine (no desktop requirement)

❌ Avoid if you need a desktop app, heavy CPU-bound processing, or the lowest possible latency


Verdict: 7.40/10

Best developer experience and fastest to prototype. Ideal for Python teams.

See issue #1 for the full comparison.
