OpenCloserOrg/BypassCaptchaForBots

BypassCaptchaForBots

Step-by-step tutorial and OpenClaw-ready skill for extracting booking availability from CAPTCHA-heavy pages (like Calendly) using Firecrawl.

What this repo gives you

  • A practical flow that worked in production testing
  • A reusable OpenClaw skill folder with scripts and examples
  • A Node.js CLI script with retries, timeout controls, debug logging, and query prompts
  • Guidance for wiring OpenCloser endpoints (/api/convertCalLinkToJSON, /api/LinkToText)

Firecrawl discount

You can save 10% on Firecrawl with this link:

https://firecrawl.link/openclawdev

Architecture (what worked)

Use a two-step flow:

  1. POST /v2/scrape to load the booking page and get a scrapeId
  2. POST /v2/scrape/{scrapeId}/interact with a deterministic prompt to extract times

Why: a plain scrape often returns the CAPTCHA/reCAPTCHA text but no reliable slot data. Interact can click through the page and read the rendered UI.
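The two-step flow can be sketched roughly as follows. This is a minimal illustration, not the repo's script: the base URL, the request body fields (`url`, `formats`, `prompt`), and the places where `scrapeId` is looked up in the response are assumptions to verify against the Firecrawl docs, and `fetchImpl` is a hypothetical parameter added so the flow can be exercised without network access. It assumes Node 18+ for the global `fetch`.

```javascript
// Sketch only: body fields and response shapes below are assumptions.
const FIRECRAWL_BASE = "https://api.firecrawl.dev"; // assumed base URL

async function postJson(fetchImpl, path, apiKey, body) {
  const res = await fetchImpl(`${FIRECRAWL_BASE}${path}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`POST ${path} failed with HTTP ${res.status}`);
  return res.json();
}

async function scrapeThenInteract({ url, prompt, apiKey, fetchImpl = fetch }) {
  // Step 1: load the booking page and obtain a scrapeId.
  const scrape = await postJson(fetchImpl, "/v2/scrape", apiKey, {
    url,
    formats: ["markdown"],
  });
  // The exact location of scrapeId in the response is a guess; try a few spots.
  const scrapeId =
    scrape.scrapeId ?? scrape.data?.scrapeId ?? scrape.data?.metadata?.scrapeId;
  if (!scrapeId) throw new Error("No scrapeId found in scrape response");

  // Step 2: ask the rendered page for slots with a deterministic prompt.
  const path = `/v2/scrape/${encodeURIComponent(scrapeId)}/interact`;
  return postJson(fetchImpl, path, apiKey, { prompt });
}
```

Keeping the two calls in separate steps makes it possible to give each its own timeout and to retry only the interact step.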

Key hurdles we hit today

  • Interact timing out after a successful scrape
  • Output drifting away from the strict YYYY-MM-DD | times format
  • A 20s runtime cap that was too low for real pages
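To make the format-drift hurdle concrete, here is a hedged sketch of a parser that accepts the strict `YYYY-MM-DD | times` line and also tolerates one looser variant. The function name `parseSlots` and the loose pattern (bulleted lines with a colon or dash separator) are illustrative assumptions, not the shapes the repo's parser actually handles.

```javascript
// Strict shape: "2025-01-05 | 09:00, 09:30".
const STRICT_LINE = /^(\d{4}-\d{2}-\d{2})\s*\|\s*(.+)$/;
// Assumed drifted shape: "- 2025-01-06: 9:00am, 10:30am".
const LOOSE_LINE = /^[-*•\s]*(\d{4}-\d{2}-\d{2})\s*[:\-]\s*(.+)$/;
// A time such as "09:00" or "9:00am".
const TIME = /\b\d{1,2}:\d{2}\s*(?:am|pm)?/gi;

function parseSlots(text) {
  const slots = {};
  for (const raw of String(text).split(/\r?\n/)) {
    const line = raw.trim();
    // Try the strict shape first, then fall back to the looser one.
    const match = STRICT_LINE.exec(line) ?? LOOSE_LINE.exec(line);
    if (!match) continue; // ignore chatter such as "Here are the times:"
    const times = (match[2].match(TIME) ?? []).map((t) => t.trim());
    if (times.length > 0) slots[match[1]] = times;
  }
  return slots;
}
```

Lines that carry a date but no recognizable time are skipped rather than stored as empty days, so callers can treat a missing date as "no slots found".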

Fixes that improved reliability

  • Increase max runtime to 45s+
  • Split timeouts:
    • scrape timeout: ~20s
    • interact timeout: ~30s
  • Retry on interact timeout with a simpler fallback prompt
  • Add a parser fallback for alternate response shapes
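The timeout split and the retry with a simpler prompt could look roughly like this. It is a sketch under assumptions: `withTimeout` and `runWithBudget` are hypothetical helper names, `scrape` and `interact` are caller-supplied async functions, and the default numbers simply mirror the values listed above.

```javascript
// Reject with a TimeoutError if the task does not settle within ms.
function withTimeout(task, ms, label) {
  const controller = new AbortController();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // lets the task cancel its request if it honors the signal
      const err = new Error(`${label} timed out after ${ms}ms`);
      err.name = "TimeoutError";
      reject(err);
    }, ms);
  });
  return Promise.race([task(controller.signal), timeout]).finally(() => clearTimeout(timer));
}

async function runWithBudget({
  scrape,
  interact,
  prompts, // [primaryPrompt, simplerFallbackPrompt]
  scrapeTimeoutMs = 20000,
  interactTimeoutMs = 30000,
  maxRuntimeMs = 45000,
  retries = 1,
}) {
  const deadline = Date.now() + maxRuntimeMs;
  const remaining = () => deadline - Date.now();

  const scrapeId = await withTimeout(
    (signal) => scrape(signal),
    Math.min(scrapeTimeoutMs, remaining()),
    "scrape"
  );

  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    const budget = Math.min(interactTimeoutMs, remaining());
    if (budget <= 0) break; // overall runtime cap reached
    // First attempt uses the primary prompt; retries use the simpler one.
    const prompt = prompts[Math.min(attempt, prompts.length - 1)];
    try {
      return await withTimeout((signal) => interact(scrapeId, prompt, signal), budget, "interact");
    } catch (err) {
      if (err.name !== "TimeoutError") throw err; // only timeouts are retried
      lastError = err;
    }
  }
  throw lastError ?? new Error("max runtime exhausted before interact could run");
}
```

Capping each step by the remaining overall budget keeps a slow scrape from silently consuming the time reserved for interact retries.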

OpenClaw setup

  1. Clone this repo in your workspace
  2. Configure environment:
export FIRECRAWL_API_KEY="fc-..."
  3. Run script:
node skills/firecrawl-calendly-bypass/scripts/fc-calendly-query.js \
  --url "https://calendly.com/mike-opencloser/30min" \
  --max-runtime-ms 45000

YouTube quickstart (copy/paste)

git clone https://github.com/OpenCloserOrg/BypassCaptchaForBots.git
cd BypassCaptchaForBots
export FIRECRAWL_API_KEY="fc-..."
node skills/firecrawl-calendly-bypass/scripts/fc-calendly-query.js --url "https://calendly.com/mike-opencloser/30min"

Use a custom extraction instruction:

node skills/firecrawl-calendly-bypass/scripts/fc-calendly-query.js \
  --url "https://calendly.com/mike-opencloser/30min" \
  --query "From the visible booking calendar, list next 7 days with times." \
  --max-runtime-ms 60000 --scrape-timeout-ms 20000 --interact-timeout-ms 30000 --retries 1
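For readers wiring their own wrapper, the flags used in the commands above could be parsed along these lines. This is an illustrative sketch, not the code in fc-calendly-query.js: `parseFlags` is a hypothetical name, and the defaults are assumptions borrowed from the timeout values recommended earlier in this README.

```javascript
// Assumed defaults; the real script may use different values.
const NUMERIC_FLAGS = {
  "max-runtime-ms": 45000,
  "scrape-timeout-ms": 20000,
  "interact-timeout-ms": 30000,
  retries: 1,
};

function parseFlags(argv) {
  const opts = { url: undefined, query: undefined, ...NUMERIC_FLAGS };
  // Flags are expected as "--name value" pairs.
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (!flag.startsWith("--") || value === undefined) {
      throw new Error(`Expected "--flag value" at position ${i}, got "${flag}"`);
    }
    const name = flag.slice(2);
    if (!Object.hasOwn(opts, name)) throw new Error(`Unknown flag: ${flag}`);
    if (Object.hasOwn(NUMERIC_FLAGS, name)) {
      const n = Number(value);
      if (!Number.isInteger(n) || n < 0) {
        throw new Error(`${flag} must be a non-negative integer`);
      }
      opts[name] = n;
    } else {
      opts[name] = value;
    }
  }
  if (!opts.url) throw new Error("--url is required");
  return opts;
}
```

A caller would typically pass `process.argv.slice(2)` and hand the resulting object to whatever runs the scrape and interact steps.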

Packaged skill artifact

Already included:

  • dist/firecrawl-calendly-bypass.skill

You can import that skill package directly into supported OpenClaw skill flows.

OpenCloser endpoint pattern

Recommended endpoint behavior:

  • /api/convertCalLinkToJSON?url=<booking-url>&internalKey=...
    • scrape
    • interact
    • parse output
    • return structured JSON
  • /api/LinkToText?url=<booking-url>&internalKey=...
    • scrape markdown only
    • return readable text + metadata

See example implementation:

  • skills/firecrawl-calendly-bypass/examples/example-opencloser-endpoint.js
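As a rough picture of the endpoint behavior described in this section, here is a framework-agnostic routing sketch. It is not the example file referenced above: `handleRequest` and the injected `scrape`, `interact`, and `parseSlots` dependencies are hypothetical stand-ins, and the response shapes (`slots`, `text`, `metadata`) are assumptions.

```javascript
// Maps a request URL to { status, body }; callers adapt this to their HTTP framework.
async function handleRequest(rawUrl, { internalKey, scrape, interact, parseSlots }) {
  const reqUrl = new URL(rawUrl, "http://localhost"); // base only used for relative paths
  const target = reqUrl.searchParams.get("url");

  // Reject callers that do not present the expected internal key.
  if (!internalKey || reqUrl.searchParams.get("internalKey") !== internalKey) {
    return { status: 401, body: { error: "invalid internalKey" } };
  }
  if (!target) return { status: 400, body: { error: "missing url parameter" } };

  if (reqUrl.pathname === "/api/convertCalLinkToJSON") {
    const page = await scrape(target); // scrape
    const raw = await interact(page.scrapeId); // interact
    const slots = parseSlots(raw); // parse output
    return { status: 200, body: { url: target, slots } }; // structured JSON
  }

  if (reqUrl.pathname === "/api/LinkToText") {
    const page = await scrape(target); // scrape markdown only
    return {
      status: 200,
      body: { url: target, text: page.markdown, metadata: page.metadata ?? {} },
    };
  }

  return { status: 404, body: { error: "unknown endpoint" } };
}
```

Returning a plain `{ status, body }` object keeps the routing logic testable without starting a server.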

Safety notes

  • Do not commit API keys or internal keys
  • Pass secrets via env vars
  • Rotate keys that were ever pasted into public logs/chats
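One way to follow these notes in a Node.js script is sketched below. Both helpers, `requireEnv` and `maskSecret`, are hypothetical names introduced for illustration; they are not part of the repo.

```javascript
// Read a secret from the environment and fail fast if it is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Keep only a short prefix so debug logs never contain the full key.
function maskSecret(secret) {
  return secret.length <= 6 ? "***" : `${secret.slice(0, 3)}***`;
}
```

For example, a debug line could log `maskSecret(requireEnv("FIRECRAWL_API_KEY"))` instead of the key itself.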

YouTube demo outline (quick)

  1. Explain scrape vs interact
  2. Show timeout split and retries
  3. Run CLI script with logs
  4. Show JSON response with next slots
  5. Show fallback parser behavior
  6. Show endpoint integration in OpenCloser
