Feature request: Playwright-based auth provider for headless/WSL environments #127

@ReviveBusiness

Problem

The current authentication pipeline relies on Chrome DevTools Protocol (CDP) to extract cookies from a running Chrome instance. This works when Chrome is running on the same machine with --remote-debugging-port=9222, but breaks in several common scenarios:

  • WSL users: Chrome runs on Windows, CDP must cross the WSL/Windows boundary via 127.0.0.1:9222. If Chrome is restarted without the debug flags (Windows Update, user launches from shortcut, etc.), CDP becomes unreachable and auth is dead until manual intervention.
  • Headless Linux servers: No desktop Chrome available at all.
  • CI/CD environments: No browser to connect to.

The Chrome extension + Native Messaging Host pipeline is also fragile — it requires a .bat wrapper, registry keys, and a VBS startup script, all of which can break independently.

Proposed Solution: Playwright Persistent Browser Context

Playwright for Python supports persistent browser contexts, which keep the browser profile (cookies, localStorage, and other site data) in a user data directory on disk. This enables:

  1. One-time interactive login: Launch headed Chromium, user authenticates to Google normally, persistent context saved to disk.
  2. Headless refresh: Cron job (every 15 min) opens the persistent context headlessly, navigates to notebooklm.google.com, extracts fresh cookies via context.cookies(), and writes them to the nlm profile directory.
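
The two-phase flow above needs some way to decide when a headed login is required. A minimal sketch, assuming an empty or missing profile directory means "never logged in" and that an expired Google session shows up as a redirect to accounts.google.com (both function names are hypothetical, not part of nlm or Playwright):

```python
from pathlib import Path
from urllib.parse import urlparse


def needs_interactive_login(profile_dir: Path) -> bool:
    """Return True when the persistent profile is missing or empty."""
    return not profile_dir.is_dir() or not any(profile_dir.iterdir())


def redirected_to_login(final_url: str) -> bool:
    """Return True when navigation ended on Google's sign-in host.

    Assumption: an expired session redirects to accounts.google.com.
    """
    return urlparse(final_url).hostname == "accounts.google.com"
```

The refresh script could pass `headless=not needs_interactive_login(profile_dir)` to `launch_persistent_context`, and check `redirected_to_login(page.url)` after `page.goto(...)` so the cron job fails loudly instead of writing stale cookies.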

Key advantages over CDP:

  • Self-contained: Browser runs in the same environment as nlm — no cross-boundary communication
  • Survives reboots: Persistent context is on disk; only needs re-login if Google session fully expires
  • No Chrome flags required: Uses its own Chromium instance, not the user's Chrome
  • Cookie extraction is trivial: context.cookies() returns all cookies, no CDP connection needed
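
To illustrate the last point: `context.cookies()` returns a list of dicts with `name` and `value` keys (among others), so filtering is plain Python. A rough sketch, where the helper names are hypothetical and the cookie list is the one named in the implementation pattern below:

```python
# Auth cookies named in this issue; the exact set nlm needs is an assumption.
AUTH_COOKIE_NAMES = ("SID", "HSID", "SSID", "APISID", "SAPISID")


def select_auth_cookies(cookies):
    """Keep only the cookie dicts whose name is in AUTH_COOKIE_NAMES."""
    return [c for c in cookies if c["name"] in AUTH_COOKIE_NAMES]


def missing_auth_cookies(cookies):
    """List the expected auth cookie names that are absent."""
    present = {c["name"] for c in cookies}
    return [name for name in AUTH_COOKIE_NAMES if name not in present]


def to_cookie_header(cookies):
    """Join cookie dicts into a single Cookie header value."""
    return "; ".join(f'{c["name"]}={c["value"]}' for c in cookies)
```

A non-empty result from `missing_auth_cookies` is a cheap signal that the session is no longer valid and the refresh should not overwrite the files on disk.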

Implementation pattern:

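import re
from pathlib import Path

from playwright.sync_api import sync_playwright

# Expand "~" explicitly so the profile lands in the home directory
# regardless of the working directory the script is started from.
CONTEXT_DIR = Path("~/.config/nlm-auth/browser-context").expanduser()
AUTH_COOKIE_NAMES = {"SID", "HSID", "SSID", "APISID", "SAPISID"}

with sync_playwright() as p:
    # First run: headed (user logs in)
    # Subsequent runs: headless (cron refresh)
    # launch_persistent_context returns a BrowserContext, not a Browser.
    context = p.chromium.launch_persistent_context(
        str(CONTEXT_DIR),
        headless=True,  # False for first login
    )
    try:
        page = context.pages[0] if context.pages else context.new_page()
        page.goto("https://notebooklm.google.com")
        page.wait_for_load_state("networkidle")

        # Extract cookies and keep only the auth cookies
        cookies = context.cookies(["https://notebooklm.google.com"])
        auth_cookies = [c for c in cookies if c["name"] in AUTH_COOKIE_NAMES]

        # Extract CSRF token from page (None if the page is not logged in)
        html = page.content()
        match = re.search(r'"SNlM0e":"([^"]+)"', html)
        csrf_token = match.group(1) if match else None

        # Write to profiles/default/cookies.json + metadata.json
    finally:
        # Close even on failure so the profile directory is not left locked
        context.close()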
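
For the final write step, one option is an atomic replace, so that nlm never reads a half-written file while the cron job is running. This is only a sketch: `write_json_atomic` is a hypothetical helper, and the JSON layout nlm expects in cookies.json and metadata.json is not specified here.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload) -> None:
    """Write payload as JSON to a temp file, then rename it over path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the file readable by the owner only, which suits cookies.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name, suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise
```

Usage would look like `write_json_atomic(profile_dir / "cookies.json", auth_cookies)`, where `profile_dir` points at the nlm profile directory.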

Cron entry:

*/15 * * * * playwright-python nlm-auth-refresh.py >> /tmp/nlm-auth-refresh.log 2>&1

Related

  • The auth.json file is required by base.py Layer 2 recovery (_try_reload_or_headless_auth). Any refresh daemon needs to write auth.json in addition to cookies.json — without it, Layer 2 is silently skipped even when valid cookies exist on disk.
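
Since the failure here is silent, a refresh daemon could at least check that both files exist after each run. A small sketch, assuming both files live directly in the profile directory; it only checks for presence, because the contents of auth.json are not described in this issue, and `missing_profile_files` is a hypothetical name:

```python
from pathlib import Path

# File names taken from this issue; their location is an assumption.
REQUIRED_FILES = ("cookies.json", "auth.json")


def missing_profile_files(profile_dir: Path) -> list[str]:
    """Return the required file names that are absent from profile_dir."""
    return [
        name for name in REQUIRED_FILES
        if not (profile_dir / name).is_file()
    ]
```

Logging a non-empty result at the end of every refresh would surface the "Layer 2 skipped" case instead of leaving it to be discovered later.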

Scope

This is a feature request / discussion, not a PR. The pattern above has been validated in a WSL2 environment, where it keeps auth alive with zero manual intervention after the initial login, for as long as the Google session itself remains valid.

Would you consider adding Playwright as a built-in auth provider (alongside builtin and openclaw)? Happy to discuss implementation details.
