Releases: RaySatish/Market-Surveillance-System
v1.0.0 — SoftwareX Submission
Market Surveillance for Trade Abuse Detection — v1.0.0
First stable release, accompanying the SoftwareX paper submission.
What It Does
Real-time market surveillance pipeline that detects wash trading and pump & dump schemes in live cryptocurrency trade data from Binance. Processes 10,000+ trades/min with sub-second alert latency.
Architecture
Binance WebSocket → Kafka → Spark Structured Streaming → Kafka Alert Topics → PostgreSQL → Streamlit Dashboard
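The first hop of this pipeline can be illustrated with a minimal sketch of turning one raw Binance trade-stream message into a flat record before it is published to Kafka. The short keys (`s`, `t`, `p`, `q`, `T`, `m`) are those of Binance's public trade stream payload. The function name `normalize_trade` and the output field names are hypothetical, since the project's actual schema is not shown here.

```python
import json
from typing import Optional


def normalize_trade(raw: str) -> Optional[dict]:
    """Parse one trade-stream message into a flat record, or return None if invalid.

    Output field names are illustrative assumptions, not the project's schema.
    """
    try:
        msg = json.loads(raw)
        record = {
            "symbol": msg["s"],                # e.g. "BTCUSDT"
            "trade_id": int(msg["t"]),
            "price": float(msg["p"]),          # Binance sends prices as strings
            "quantity": float(msg["q"]),       # ...and quantities too
            "trade_time_ms": int(msg["T"]),
            "buyer_is_maker": bool(msg["m"]),
        }
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, or wrong types: reject the message.
        return None
    # Basic sanity check: non-positive prices or quantities are rejected.
    if record["price"] <= 0 or record["quantity"] <= 0:
        return None
    return record
```

In a setup like the one described, a message for which this returns `None` would be a natural candidate for the dead-letter queue rather than the main trade topic.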
Detection
- Wash Trade Detector — Cross-window Z-score on volume per symbol using tumbling time windows
- Pump & Dump Detector — Stateful pattern matching: PUMP (price rise + volume surge) → DUMP (price reversal within configurable window)
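The logic of the two detectors above can be sketched in plain Python. This is not the project's Spark implementation: it assumes per-window aggregates (volume, fractional price change, closing price) have already been computed, and every class name, parameter, and threshold below is a hypothetical choice made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class VolumeZScore:
    """Flags a window whose volume is far above that symbol's recent windows."""

    def __init__(self, history=20, threshold=3.0, min_windows=5):
        self.threshold = threshold
        self.min_windows = min_windows
        # One bounded history of past window volumes per symbol.
        self.volumes = defaultdict(lambda: deque(maxlen=history))

    def update(self, symbol, window_volume):
        past = self.volumes[symbol]
        z = None
        if len(past) >= self.min_windows:
            sigma = pstdev(past)
            if sigma > 0:
                z = (window_volume - mean(past)) / sigma
        # The current window joins the history only after it has been scored.
        past.append(window_volume)
        return z is not None and z > self.threshold


class PumpDumpTracker:
    """Two-state machine for one symbol: IDLE -> PUMPED -> alert or timeout."""

    def __init__(self, rise=0.05, surge=3.0, drop=0.04, max_windows=6):
        self.rise, self.surge = rise, surge
        self.drop, self.max_windows = drop, max_windows
        self.state, self.peak, self.age = "IDLE", None, 0

    def update(self, price_change, volume_ratio, price):
        """Feed one window; returns True when a dump follows a recorded pump."""
        if self.state == "IDLE":
            # PUMP: price rise and volume surge in the same window.
            if price_change >= self.rise and volume_ratio >= self.surge:
                self.state, self.peak, self.age = "PUMPED", price, 0
            return False
        self.age += 1
        self.peak = max(self.peak, price)
        # DUMP: price has fallen back from the peak by at least `drop`.
        if price <= self.peak * (1 - self.drop):
            self.state = "IDLE"
            return True
        # No reversal inside the configured window: forget the pump.
        if self.age >= self.max_windows:
            self.state = "IDLE"
        return False
```

Keeping one `PumpDumpTracker` per symbol mirrors the per-symbol state a streaming engine would hold. The `max_windows` parameter stands in for the configurable reversal window mentioned above.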
Key Features
- 🔴 Live Binance WebSocket ingestion (BTCUSDT, ETHUSDT, SOLUSDT) with REST backfill on reconnect
- ⚡ Spark Structured Streaming for both detectors with exactly-once checkpointing
- 🗄️ PostgreSQL persistence with UPSERT-based deduplication
- 📊 Streamlit dashboard with live auto-refresh, severity filtering, and alert timeline charts
- 🧪 Synthetic test mode with injected abuse patterns (`--test`)
- 🔁 One-command reproducibility: `bash reproduce.sh`
- 📈 Sensitivity sweep tables for threshold tuning (wash + pump/dump)
- 🛡️ Fault tolerance: dead-letter queue, structured logging, trade validation
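The UPSERT-based deduplication listed above can be illustrated with a small sketch. To stay runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL; the `INSERT ... ON CONFLICT ... DO UPDATE` form shown is accepted by both engines. The table layout, the `alert_id` key, and the function name are assumptions, not the project's actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE alerts (
        alert_id    TEXT PRIMARY KEY,  -- hypothetical deterministic key
        symbol      TEXT NOT NULL,
        severity    TEXT NOT NULL,
        detected_at TEXT NOT NULL
    )
    """
)


def upsert_alert(conn, alert):
    """Insert an alert, or overwrite the existing row that has the same alert_id."""
    conn.execute(
        """
        INSERT INTO alerts (alert_id, symbol, severity, detected_at)
        VALUES (:alert_id, :symbol, :severity, :detected_at)
        ON CONFLICT (alert_id) DO UPDATE SET
            severity    = excluded.severity,
            detected_at = excluded.detected_at
        """,
        alert,
    )
    conn.commit()


alert = {
    "alert_id": "wash:BTCUSDT:2024-01-01T00:00",
    "symbol": "BTCUSDT",
    "severity": "HIGH",
    "detected_at": "2024-01-01T00:00:05",
}
upsert_alert(conn, alert)
upsert_alert(conn, alert)  # a replayed alert does not create a second row
```

If the key is derived from the detector, symbol, and window, an alert that is redelivered after a restart lands on the same row, which is one way a sink can stay idempotent when upstream delivery is at-least-once.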
Quick Start
# Prerequisites: Python 3.10+, Docker Desktop, Java 11+
pip install -r requirements.txt
# Run everything (real Binance data)
bash reproduce.sh
# Or with synthetic test data
bash reproduce.sh --test
# Stop
bash reproduce.sh --stop
Tech Stack
| Component | Technology |
|---|---|
| Ingestion | Binance WebSocket + REST API |
| Messaging | Apache Kafka (Docker) |
| Processing | Apache Spark Structured Streaming (PySpark 3.5) |
| Storage | PostgreSQL (Docker) |
| Dashboard | Streamlit + Plotly |
| Language | Python 3.10+ |
Codebase
- 10 Python modules, ~4,700 lines of code
- MIT License