Ctrip flight harvesting skill for logged-in desktop Chrome sessions, with browser-driven collection, SQLite caching, and conservative planner defaults for domestic and international routes.
- drives the real Ctrip browser flow instead of Playwright or direct HTTP scraping
- harvests one-way and round-trip flight result pages from desktop Chrome
- supports route freshness checks through a shared SQLite cache
- exports raw and normalized JSON for downstream analysis
- includes Claude-compatible agent metadata under `.claude/agents/`

Key files:

- `SKILL.md`: primary skill instructions
- `.claude/agents/flight-ghost-harvest.md`: Anthropic-compatible agent entrypoint
- `scripts/plan_harvest_run.py`: unified planner for freshness-aware harvest runs
- `scripts/harvest_ctrip_chrome.py`: browser-side Ctrip result extraction
- `scripts/check_db_freshness.py`: route freshness lookup
- `scripts/sync_flights_db.py`: SQLite sync for normalized output
- `references/database-guide.md`: how the agent should read the shared database

1. Ask the user for the route and, unless they explicitly want one-way only, the approximate stay length.
2. Check the shared SQLite cache:

       python3 scripts/check_db_freshness.py SHA TYOA

3. Plan or execute the harvest:

       python3 scripts/plan_harvest_run.py SHA TYOA --stay-days 5
       python3 scripts/plan_harvest_run.py SHA TYOA --stay-days 5 --execute

4. Inspect raw or normalized output under `out/`.
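
The steps above can also be scripted. The following is a minimal sketch, not part of the repository: `build_commands` and `run_workflow` are hypothetical helper names, and the sketch assumes it is run from the repository root. It only reproduces the command lines shown above; what each script's exit code means is not documented here, so the codes are returned as-is rather than interpreted.

```python
import subprocess


def build_commands(origin, destination, stay_days, execute=False):
    """Return the argv lists for the freshness check and the planner.

    Hypothetical helper: it mirrors the command lines from the steps above
    and introduces no flags beyond --stay-days and --execute.
    """
    check = ["python3", "scripts/check_db_freshness.py", origin, destination]
    plan = [
        "python3",
        "scripts/plan_harvest_run.py",
        origin,
        destination,
        "--stay-days",
        str(stay_days),
    ]
    if execute:
        # Without this flag the second command is the plan-only form.
        plan.append("--execute")
    return check, plan


def run_workflow(origin, destination, stay_days, execute=False):
    """Run the freshness check, then the planner; return both exit codes."""
    codes = []
    for argv in build_commands(origin, destination, stay_days, execute):
        completed = subprocess.run(argv)
        codes.append(completed.returncode)
    return codes
```

Calling `run_workflow("SHA", "TYOA", 5)` corresponds to steps 2 and 3 in plan-only form; passing `execute=True` uses the `--execute` variant instead.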

- The planner defaults to conservative browser actions: `--expand-limit 0` and `--scroll-steps 0`.
- The shared SQLite database lives at `~/.local/share/flight-harvest/flights.db`.
- For direct database inspection, start with `references/database-guide.md`.
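
For a first look at the shared database before reading the guide, a small read-only sketch such as the one below may help. It makes no assumptions about the schema: it only lists whatever tables exist. `list_tables` is a hypothetical helper, not something shipped in `scripts/`, and the read-only URI mode is a choice made here so that inspection cannot create or modify the cache.

```python
import sqlite3
from contextlib import closing
from pathlib import Path

# Database location as stated in the notes above.
DB_PATH = Path.home() / ".local" / "share" / "flight-harvest" / "flights.db"


def list_tables(db_path=DB_PATH):
    """Return the sorted names of user tables in the SQLite file.

    The file is opened read-only, so a missing database raises
    sqlite3.OperationalError instead of creating an empty file.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Once the table names are known, `references/database-guide.md` remains the place to learn what the columns mean.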