jsdevrazuislam/Murder-News-Map

WebScrape Analytics Dashboard

A full-stack web scraping and data visualization platform that automatically collects website data, stores unique entries, and displays them on an interactive map with real-time filtering.

Key Features

Automated Data Pipeline

  • 🕷️ Puppeteer-powered scraping of listed websites
  • 🔄 Node-cron scheduled jobs for periodic updates
  • 🧹 Duplicate prevention with unique data hashing
  • 📊 Pagination & filters for efficient browsing
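The duplicate-prevention step can be pictured as hashing each scraped entry and skipping hashes already seen. The following is a minimal sketch, not the project's actual code: the `ScrapedItem` shape, the `hashItem` and `filterNew` helpers, and the choice of SHA-256 over normalized title and URL are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a scraped entry; the real project's fields may differ.
interface ScrapedItem {
  title: string;
  url: string;
}

// Build a stable fingerprint from normalized fields so that trivial
// differences (letter case, surrounding whitespace) map to the same hash.
function hashItem(item: ScrapedItem): string {
  const normalized = `${item.title.trim().toLowerCase()}|${item.url.trim()}`;
  return createHash("sha256").update(normalized).digest("hex");
}

// Keep only items whose hash has not been seen before.
// `seen` stands in for a unique index on the hash field in the database.
function filterNew(items: ScrapedItem[], seen: Set<string>): ScrapedItem[] {
  const fresh: ScrapedItem[] = [];
  for (const item of items) {
    const hash = hashItem(item);
    if (!seen.has(hash)) {
      seen.add(hash);
      fresh.push(item);
    }
  }
  return fresh;
}
```

In a MongoDB-backed setup the same idea is usually enforced with a unique index on the stored hash, so that a repeated insert is rejected by the database rather than by an in-memory set.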

Interactive Visualization

  • 🗺️ Dynamic map view with color-coded status indicators
  • 🔴⚫ Real-time status toggles (red → black)
  • 📍 Case detail deep-linking with URL persistence
  • 📱 Fully responsive Tailwind CSS layout
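Deep-linking with URL persistence generally means encoding the current view in the query string and reading it back on load. The sketch below shows one way this could work; the `ViewState` fields and the parameter names `case`, `status`, and `page` are hypothetical and not taken from the project.

```typescript
// Hypothetical view state; field and parameter names are illustrative only.
interface ViewState {
  caseId?: string;
  status?: "red" | "black";
  page?: number;
}

// Serialize the state into a query string, omitting defaults.
function toQuery(state: ViewState): string {
  const params = new URLSearchParams();
  if (state.caseId) params.set("case", state.caseId);
  if (state.status) params.set("status", state.status);
  if (state.page && state.page > 1) params.set("page", String(state.page));
  return params.toString();
}

// Parse a query string back into state, falling back to safe defaults
// when a value is missing or malformed.
function fromQuery(query: string): ViewState {
  const params = new URLSearchParams(query);
  const status = params.get("status");
  const page = Number(params.get("page") ?? "1");
  return {
    caseId: params.get("case") ?? undefined,
    status: status === "red" || status === "black" ? status : undefined,
    page: Number.isInteger(page) && page > 0 ? page : 1,
  };
}
```

Validating on the way in matters because anyone can edit a shared link; an unknown status or a non-numeric page simply falls back to the default view.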

Tech Stack

Frontend

Tech          Usage
Next.js 15    App Router
TypeScript    Type-safe development
Tailwind CSS  Utility-first styling
shadcn/ui     Beautifully designed components
React Query   Data fetching & caching
Leaflet       Interactive map visualization

Backend

Tech          Usage
Express.js    REST API server
Node.js       Runtime environment
Puppeteer     Web scraping automation
Node-cron     Scheduled tasks
MongoDB       Data persistence

Project Structure (Monorepo)

webscrape-analytics/
├── .github/
│   └── workflows/
│       └── lint.yml          # CI/CD pipeline
├── .vscode/                  # IDE settings
├── client/                   # Next.js frontend
│   ├── app/
│   │   ├── page.tsx          # Case listings
│   │   ├── loading.tsx       # Streaming loading state
│   │   ├── global.css        # Global CSS
│   │   └── layout.tsx
│   ├── components/           # UI components
│   └── lib/                  # Utilities
├── server/                   # Express backend
│   ├── src/
│   │   ├── config/           # Configuration
│   │   ├── scheduler.ts      # Cron jobs
│   │   ├── bot.ts            # Scraping automation
│   │   ├── app.ts            # Application entry point
│   │   └── types/            # Type definitions
│   └── index.ts              # Server entry
├── pnpm-workspace.yaml       # Monorepo config
└── package.json              # Root dependencies

Core Functionality

Scraping Workflow

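import cron from 'node-cron';
import puppeteer from 'puppeteer';

// Run at minute 0 of every 6th hour (00:00, 06:00, 12:00, 18:00).
cron.schedule('0 */6 * * *', async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://target-site.com');

    // Runs in the page context: collect the title and link of each item.
    const data = await page.evaluate(() => {
      return Array.from(document.querySelectorAll('.items'))
        .map(el => ({
          title: el.querySelector('h2')?.textContent?.trim(),
          url: el.querySelector('a')?.href
        }));
    });

    await processAndStore(data); // Dedupe & save (helper not shown here)
  } catch (err) {
    console.error('Scrape job failed:', err);
  } finally {
    await browser.close(); // Always release the browser, even on failure
  }
});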
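Because the page.evaluate() call above uses optional chaining, either field can come back undefined when an element lacks an h2 or a link. A small cleanup pass before storage could look like the sketch below; `cleanScraped` and both interfaces are hypothetical helpers, not part of the project.

```typescript
// Shape returned by the scraping step: either field may be missing.
interface RawItem {
  title?: string | null;
  url?: string | null;
}

interface CleanItem {
  title: string;
  url: string;
}

// Drop rows with a missing title or an unusable URL and normalize the rest.
function cleanScraped(raw: RawItem[]): CleanItem[] {
  const out: CleanItem[] = [];
  for (const item of raw) {
    const title = item.title?.trim();
    if (!title || !item.url) continue;
    try {
      const parsed = new URL(item.url); // throws on malformed input
      if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
      out.push({ title, url: parsed.href });
    } catch {
      continue; // skip entries whose URL cannot be parsed
    }
  }
  return out;
}
```

Filtering here keeps incomplete rows out of the dedupe and save step, so a hash is never computed over an undefined field.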

Getting Started

Prerequisites

Node.js v20+
MongoDB
PNPM (npm install -g pnpm)
Next.js
TypeScript
Puppeteer

Installation

git clone https://github.com/jsdevrazuislam/Murder-News-Map
cd Murder-News-Map

Environment Setup

1. Create .env in /server:
MONGO_URI=mongodb://localhost:27017/webscrape
GEMINI_API_KEY=your_key_here
PORT=8000

2. Create .env.local in /client:
NEXT_PUBLIC_URL=http://localhost:3001
NEXT_PUBLIC_MAP_BOX_ACCESS_TOKEN=your_key_here
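A missing server variable is easier to diagnose when it fails at startup rather than on the first database call. The sketch below shows one way to validate the server values listed above; `loadConfig` and `ServerConfig` are hypothetical names, and the fallback to port 8000 simply mirrors the sample .env.

```typescript
// Hypothetical typed view of the server's environment variables.
interface ServerConfig {
  mongoUri: string;
  geminiApiKey: string;
  port: number;
}

// Read and validate configuration from an environment-like object.
// Throws a descriptive error as soon as a required value is missing.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const mongoUri = env["MONGO_URI"];
  const geminiApiKey = env["GEMINI_API_KEY"];
  if (!mongoUri) throw new Error("MONGO_URI is not set");
  if (!geminiApiKey) throw new Error("GEMINI_API_KEY is not set");

  // Assumed default: fall back to 8000, matching the sample .env.
  const port = Number(env["PORT"] ?? "8000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT is not a valid port number: ${env["PORT"]}`);
  }
  return { mongoUri, geminiApiKey, port };
}
```

On the server this would typically be called once as loadConfig(process.env) after the .env file has been loaded.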

Running the App

pnpm install
pnpm start