Designed for NotebookLM workflows. Automatically extract all links from any web page — no more manual copy-paste!
- 🤖 Puppeteer-powered — Uses a real Chrome browser to render pages, fully supports React / Next.js / Vue SPAs
- 📑 Tab content extraction — Automatically clicks tabs to reveal hidden content
- 🏷️ Smart categorization — Auto-classifies links as YouTube, Google Docs/Drive, GitHub, PDF, Social, Homepage, Internal, or External
- 🌊 Deep crawl — Supports depth 1 / 2 / 3 recursive sub-page crawling
- 🔍 Real-time search & filter — Quickly filter by title or URL
- 📋 One-click copy — Copy a single link, title + link, or all links at once
- 📥 Markdown export — Generate structured documents ready for NotebookLM
- 🌐 Bilingual UI — Supports both English and Chinese interfaces
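
To give a feel for the smart categorization feature, here is a minimal sketch of a rule-based classifier. It is an illustration, not PageFinder's actual code: the function name `classifyLink`, the `SOCIAL_HOSTS` list, the rule order, and the reading of "Homepage" as "a link to another site's root" are all assumptions.

```javascript
// Hypothetical classifier: the rules and their order are assumptions,
// not PageFinder's actual logic.
const SOCIAL_HOSTS = ['twitter.com', 'x.com', 'facebook.com', 'linkedin.com', 'instagram.com'];

// True when host equals domain or is a subdomain of it.
function hostMatches(host, domain) {
  return host === domain || host.endsWith('.' + domain);
}

function classifyLink(href, pageUrl) {
  let link, page;
  try {
    page = new URL(pageUrl);
    link = new URL(href, page); // resolves relative hrefs against the page
  } catch {
    return 'Invalid';
  }
  const host = link.hostname;

  if (hostMatches(host, 'youtube.com') || host === 'youtu.be') return 'YouTube';
  if (host === 'docs.google.com' || host === 'drive.google.com') return 'Google Docs/Drive';
  if (hostMatches(host, 'github.com')) return 'GitHub';
  if (link.pathname.toLowerCase().endsWith('.pdf')) return 'PDF';
  if (SOCIAL_HOSTS.some((d) => hostMatches(host, d))) return 'Social';
  if (host === page.hostname) return 'Internal';
  // Assumption: "Homepage" means the root of some other site.
  if (link.pathname === '/' && !link.search) return 'Homepage';
  return 'External';
}
```

Because the first matching rule wins, a PDF hosted on GitHub would be labeled `GitHub` in this sketch; a different rule order would give a different result.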
- Node.js v18 or higher
git clone https://github.com/bevantu/PageFinder.git
cd PageFinder
npm install
⚠️ The first `npm install` will download Puppeteer's bundled Chromium (~300MB). Please be patient.
node server.js

Then open http://localhost:3737 in your browser.
💡 The first request launches the built-in Chrome (~3-5 seconds). Subsequent requests are faster.
- Enter a URL — Paste the page URL you want to extract links from
- Choose crawl depth:
  - Page only — Analyze only the current page (fastest)
  - Depth 2/3 — Recursively enter sub-pages (great for course index or syllabus pages)
- Content type filter — Pre-select which link types you want (multi-select, applied before crawl)
- Quick preview — Single-page fast mode
- Deep crawl — Full multi-level crawl mode
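
The crawl depth option can be pictured as a breadth-first traversal that stops after a fixed number of levels. The sketch below is an assumption about how such a crawl might work, not a copy of `server.js`: `crawl`, `sameOrigin`, and the injected `fetchLinks` function are hypothetical names, and restricting recursion to same-origin pages is a design choice made here for illustration.

```javascript
// Breadth-first crawl limited to maxDepth levels.
// `fetchLinks` is a hypothetical async function (url) => string[] supplied
// by the caller (for example, a wrapper around a headless browser).
async function crawl(startUrl, maxDepth, fetchLinks) {
  const visited = new Set([startUrl]);
  const found = new Map(); // link URL -> depth at which it was first seen
  let frontier = [startUrl];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const pageUrl of frontier) {
      let links = [];
      try {
        links = await fetchLinks(pageUrl);
      } catch {
        continue; // skip pages that fail to load
      }
      for (const link of links) {
        if (!found.has(link)) found.set(link, depth);
        // Assumption: only same-origin pages are entered recursively.
        if (!visited.has(link) && sameOrigin(link, startUrl)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return found;
}

// True when both URLs parse and share scheme, host, and port.
function sameOrigin(a, b) {
  try {
    return new URL(a).origin === new URL(b).origin;
  } catch {
    return false;
  }
}
```

With `maxDepth` set to 1 this collects links from the starting page only; each extra level fetches every newly discovered same-origin page once, so the number of requests can grow quickly on index-style pages.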
Click Export to generate a .md file for direct upload to NotebookLM, or click Copy All to copy all URLs to your clipboard.
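
As a rough idea of what a structured Markdown export could look like, the sketch below groups links by category and emits one heading per group. The link shape `{ title, url, category }`, the function name `toMarkdown`, and the output layout are assumptions for illustration; the file PageFinder actually generates may differ.

```javascript
// Hypothetical link shape: { title, url, category }.
function toMarkdown(pageTitle, links) {
  // Group links by category, preserving first-seen order.
  const groups = new Map();
  for (const link of links) {
    if (!groups.has(link.category)) groups.set(link.category, []);
    groups.get(link.category).push(link);
  }

  const lines = [`# ${pageTitle}`, ''];
  for (const [category, items] of groups) {
    lines.push(`## ${category}`, '');
    for (const { title, url } of items) {
      // Fall back to the URL when the title is empty, and escape square
      // brackets so a title cannot break the Markdown link syntax.
      const safeTitle = (title || url).replace(/[\[\]]/g, '\\$&');
      lines.push(`- [${safeTitle}](${url})`);
    }
    lines.push('');
  }
  return lines.join('\n');
}
```

The returned string could be written to a `.md` file with `fs.writeFileSync` or sent to the browser as a download.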
| Layer | Technology |
|---|---|
| Backend | Node.js + Express |
| Crawler | Puppeteer (Chromium) + Axios + Cheerio |
| Frontend | HTML / Vanilla CSS / Vanilla JS |
| Design | Dark UI + Glassmorphism |
PageFinder/
├── server.js # Backend: crawler + API
├── package.json
├── public/
│ ├── index.html # Frontend page
│ ├── style.css # Styles
│ └── app.js # Frontend logic
├── README.md # English README
└── README_CN.md # Chinese README
MIT © 2026