Add JS-render fallback for URL ingest when static fetch indexes blocked/interstitial pages

## Problem
Some URL ingests are currently reported as successful but index blocked/interstitial text instead of article content.

Observed with this Reuters URL:
- https://www.reuters.com/technology/chinas-ai-startup-zhipu-releases-new-flagship-model-glm-5-2026-02-11/

Current behavior:
- `raged ingest --url <reuters-url> --collection docs` reports success.
- Stored chunk text is `Please enable JS and disable any ad blocker`.
- Reader-proxy retry (`r.jina.ai`) can also be blocked (HTTP 451), and that blocked payload gets indexed.

Impact:
- Retrieval quality is poor because embeddings are generated from blocked text.
- This looks like "ingestion did not land" to users, even when rows/chunks exist.

## Reproduction
1. Ingest the Reuters URL above.
2. Query for terms like `zhipu glm 5 reuters`.
3. Inspect stored chunk text: interstitial/blocked content instead of article body.

## Proposal
Implement a fallback strategy for URL ingest:
1. Fast path: current static HTTP fetch.
2. Blocked-content detection: heuristics run on the fetched text (e.g. `enable JS`, `disable ad blocker`, anti-bot templates, boilerplate text too short to be an article).
3. Fallback renderer: Playwright for flagged pages.
4. If still blocked: mark ingestion as blocked/unreadable with explicit reason and avoid embedding junk content.
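
A minimal sketch of steps 1 to 4, assuming TypeScript and injected fetch functions. Every name here (`detectBlocked`, `fetchWithFallback`, `IngestOutcome`), the phrase list, and both length thresholds are hypothetical and would need tuning against real pages. The `fetchRendered` argument is where a Playwright-backed fetcher would plug in.

```typescript
// Hypothetical sketch: none of these names come from the existing codebase.
type FetchMode = "static" | "playwright";
type Fetcher = (url: string) => Promise<string>;

interface IngestOutcome {
  status: "ok" | "blocked";
  mode: FetchMode;        // which fetch produced the final result
  reason: string | null;  // detection reason, if any fetch was flagged
  text: string;           // empty when status is "blocked", so nothing gets embedded
}

// Assumed phrase list and thresholds; tune against observed interstitials.
const BLOCK_PATTERNS: Array<[string, RegExp]> = [
  ["enable-js", /enable\s+(js|javascript)/i],
  ["ad-blocker", /disable\s+(any\s+)?ad\s*blocker/i],
  ["anti-bot", /verify(ing)? you are (a )?human|captcha|access denied/i],
];
const MIN_TEXT_LENGTH = 200;     // below this, treat as boilerplate
const PHRASE_SCAN_LIMIT = 2000;  // only scan short pages, so long articles that
                                 // merely mention "captcha" are not flagged

// Returns a reason string when the text looks blocked, otherwise null.
function detectBlocked(text: string): string | null {
  const trimmed = text.trim();
  if (trimmed.length < PHRASE_SCAN_LIMIT) {
    for (const [reason, pattern] of BLOCK_PATTERNS) {
      if (pattern.test(trimmed)) return reason;
    }
  }
  if (trimmed.length < MIN_TEXT_LENGTH) return "too-short";
  return null;
}

// Static fetch first; render only when flagged; never return junk text.
async function fetchWithFallback(
  url: string,
  fetchStatic: Fetcher,
  fetchRendered: Fetcher | null, // null when the fallback is disabled
): Promise<IngestOutcome> {
  const staticText = await fetchStatic(url);
  const staticReason = detectBlocked(staticText);
  if (staticReason === null) {
    return { status: "ok", mode: "static", reason: null, text: staticText };
  }
  if (fetchRendered === null) {
    return { status: "blocked", mode: "static", reason: staticReason, text: "" };
  }
  const renderedText = await fetchRendered(url);
  const renderedReason = detectBlocked(renderedText);
  if (renderedReason === null) {
    // Keep the static reason so logs show why the fallback was triggered.
    return { status: "ok", mode: "playwright", reason: staticReason, text: renderedText };
  }
  return { status: "blocked", mode: "playwright", reason: renderedReason, text: "" };
}
```

With this shape, the Reuters interstitial text from the Problem section would be flagged by the `enable-js` pattern, and a blocked outcome carries an empty `text` so the caller has nothing to embed.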

## Acceptance Criteria
- JS-rendered pages are ingested with meaningful text when accessible.
- If blocked, ingestion records explicit blocked status/reason.
- Feature is gated by config/env flag to avoid global overhead.
- Observability includes fetch mode (`static` | `playwright`), detection reason, and extracted text length.
- Tests cover static success, blocked-page detection, and fallback success/failure.
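
The gating and observability criteria could look roughly like the sketch below. The env flag name `RAGED_URL_JS_FALLBACK`, the `url_ingest_fetch` event name, and both function names are assumptions made for illustration; the issue does not specify them.

```typescript
// Hypothetical sketch: flag name, event name, and field names are assumptions.
interface FetchLogRecord {
  event: "url_ingest_fetch";
  url: string;
  fetchMode: "static" | "playwright";
  detectionReason: string | null;
  extractedTextLength: number;
  status: "ok" | "blocked";
}

// Takes an env-like map so it can be exercised without touching the real environment.
function isJsFallbackEnabled(env: Record<string, string | undefined>): boolean {
  const raw = (env["RAGED_URL_JS_FALLBACK"] ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true";
}

// Builds one JSON log line per fetch attempt.
function formatFetchLog(
  url: string,
  fetchMode: "static" | "playwright",
  detectionReason: string | null,
  extractedText: string,
  status: "ok" | "blocked",
): string {
  const record: FetchLogRecord = {
    event: "url_ingest_fetch",
    url,
    fetchMode,
    detectionReason,
    extractedTextLength: extractedText.length,
    status,
  };
  return JSON.stringify(record);
}
```

In a Node process the flag check would be called as `isJsFallbackEnabled(process.env)`; leaving the variable unset keeps the fallback off, so the default path carries no rendering overhead.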

## Notes
- Prefer Playwright over Selenium for Node ecosystem fit.
- Respect `robots.txt`, site terms, and applicable legal constraints.
