A full-stack web scraping and data visualization platform that automatically collects website data, stores unique entries, and displays them on an interactive map with real-time filtering.
- 🕷️ Puppeteer-powered scraping of listed websites
- 🔄 Node-cron scheduled jobs for periodic updates
- 🧹 Duplicate prevention with unique data hashing
- 📊 Pagination & filters for efficient browsing
- 🗺️ Dynamic map view with color-coded status indicators
- 🔴⚫ Real-time status toggles (red → black)
- 📍 Case detail deep-linking with URL persistence
- 📱 Fully responsive Tailwind CSS layout
| Frontend tech | Usage |
|---|---|
| Next.js 15 | React framework (App Router) |
| TypeScript | Type-safe development |
| Tailwind CSS | Utility-first styling |
| shadcn/ui | Beautifully designed components |
| React Query | Data fetching & caching |
| Leaflet | Interactive map visualization |

| Backend tech | Usage |
|---|---|
| Express.js | REST API server |
| Node.js | Runtime environment |
| Puppeteer | Web scraping automation |
| Node-cron | Scheduled tasks |
| MongoDB | Data persistence |
```
webscrape-analytics/
├── .github/
│   └── workflows/
│       └── lint.yml          # CI/CD pipeline
├── .vscode/                  # IDE settings
├── client/                   # Next.js frontend
│   ├── app/
│   │   ├── page.tsx          # Case listings
│   │   ├── loading.tsx       # Streaming loading state
│   │   ├── global.css        # Global CSS
│   │   └── layout.tsx
│   ├── components/           # UI components
│   └── lib/                  # Utilities
├── server/                   # Express backend
│   ├── src/
│   │   ├── config/           # Config
│   │   ├── scheduler.ts      # Cron jobs
│   │   ├── bot.ts            # Scraping automation
│   │   ├── app.ts            # Application entry point
│   │   └── types/            # Type definitions
│   └── index.ts              # Server entry
├── pnpm-workspace.yaml       # Monorepo config
└── package.json              # Root dependencies
```

```typescript
import cron from 'node-cron';
import puppeteer from 'puppeteer';

// Run at minute 0 of every 6th hour
cron.schedule('0 */6 * * *', async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://target-site.com');

    // Extract the title and link of each listed item
    const data = await page.evaluate(() => {
      return Array.from(document.querySelectorAll('.items'))
        .map(el => ({
          title: el.querySelector('h2')?.textContent?.trim(),
          url: el.querySelector('a')?.href
        }));
    });

    await processAndStore(data); // Dedupe & save (helper not shown here)
  } finally {
    await browser.close(); // Always release the browser, even if scraping fails
  }
});
```

Node.js v20+
MongoDB
PNPM (npm install -g pnpm)
Next.js
TypeScript
Puppeteer

```bash
git clone https://github.com/jsdevrazuislam/Murder-News-Map
cd Murder-News-Map
```

1. Create .env in /server:
```env
MONGO_URI=mongodb://localhost:27017/webscrape
GEMINI_API_KEY=your_key_here
PORT=8000
```
2. Create .env.local in /client:
```env
NEXT_PUBLIC_URL=http://localhost:3001
NEXT_PUBLIC_MAP_BOX_ACCESS_TOKEN=your_key_here
```

```bash
pnpm install
pnpm start
```
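The server variables from step 1 could be checked once at startup so that a missing value fails fast instead of surfacing later as a connection error. The sketch below is an assumption about how that might look, not code from this repository: `ServerConfig` and `loadServerConfig` are hypothetical names, and falling back to port 8000 simply mirrors the sample `.env` above.

```typescript
// Hypothetical typed view of the server's environment variables.
interface ServerConfig {
  mongoUri: string;
  geminiApiKey: string;
  port: number;
}

// Validate raw environment values and convert them to typed config.
// Throws with a descriptive message when a value is missing or invalid.
function loadServerConfig(
  env: Record<string, string | undefined>,
): ServerConfig {
  const mongoUri = env.MONGO_URI;
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!mongoUri) throw new Error('MONGO_URI is not set');
  if (!geminiApiKey) throw new Error('GEMINI_API_KEY is not set');

  // Assumed default: 8000, matching the sample .env file.
  const port = Number(env.PORT ?? '8000');
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return { mongoUri, geminiApiKey, port };
}
```

In a Node.js entry point this would typically be called as `loadServerConfig(process.env)` after the `.env` file has been loaded.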