A production-ready data warehouse for quantitative finance analysis. It is a unified platform combining equity market data, fixed-income Treasury data, and ML-powered sentiment analysis, built on DuckDB for fast analytical queries.
- Fast Analytics: DuckDB columnar storage optimized for OLAP workloads
- Technical Indicators: Pre-computed RSI, VWAP, returns, and volume analytics
- Real-time Data: 6,000+ OHLCV bars and 125,000+ trade records
- Complete ETL Pipeline: Automated data generation, validation, and loading
- Treasury Yields: US 2Y, 5Y, 10Y, 30Y yield curves
- ETF Tracking: TLT, IEF, SHY, LQD, HYG bond ETFs
- Spread Analysis: Yield curve inversion detection
- Correlations: Treasury-ETF relationship analysis
- News Analysis: ML-powered sentiment scoring (FinBERT)
- Trading Signals: Sentiment-based buy/sell signals
- Event Detection: FOMC, CPI, NFP high-impact news
- Performance Tracking: Signal backtesting and P&L
- Unified REST API: Single API serving all data types
- Production Ready: Comprehensive tests, CI/CD, logging
- No API Keys Needed: Synthetic data generation for offline development
- Easy Integration: Connects with sentiment analysis projects
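The spread analysis feature listed above can be illustrated with a small sketch. It is not the project's implementation: the `YieldSnapshot` class, the function names, and the yield values are all hypothetical, and the inversion rule used here (2Y yield above 10Y yield) is just one common definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YieldSnapshot:
    """Treasury yields in percent for one date (hypothetical structure)."""
    date: str
    y2: float
    y5: float
    y10: float
    y30: float


def spread_10y_2y_bps(snap: YieldSnapshot) -> float:
    """Return the 10Y-2Y spread in basis points (1 percentage point = 100 bps)."""
    return round((snap.y10 - snap.y2) * 100, 2)


def is_inverted(snap: YieldSnapshot) -> bool:
    """Treat the curve as inverted when the 2Y yield exceeds the 10Y yield."""
    return snap.y2 > snap.y10


# Illustrative values only, not real or generated market data.
snapshots = [
    YieldSnapshot("2024-01-02", y2=4.10, y5=4.00, y10=4.25, y30=4.40),
    YieldSnapshot("2024-01-03", y2=4.60, y5=4.30, y10=4.20, y30=4.35),
]

for snap in snapshots:
    label = "INVERTED" if is_inverted(snap) else "normal"
    print(snap.date, spread_10y_2y_bps(snap), label)
```

A negative spread and the inversion flag always agree under this definition, so a stored spread column alone would be enough to detect inversions.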
# Clone and setup
git clone https://github.com/King0508/market-data-analytics-db.git
cd market-data-analytics-db
python -m venv venv
venv\Scripts\activate # Windows (on Mac/Linux: source venv/bin/activate)
pip install -r requirements.txt
# Generate equity market data
python etl/generate_data.py
python etl/load_data.py
# Generate Treasury & fixed-income data
python etl/generate_treasury_data.py --days 365
# Run equity analytics
python analytics/run_analysis.py
# Start unified REST API
python -m api.main
# Visit http://localhost:8000/docs for interactive API documentation

# Test API (in another terminal)
curl http://localhost:8000/
# Should return:
{
  "name": "Quantitative Finance Data Warehouse API",
  "version": "2.0.0",
  "features": {
    "equity_market_data": true,
    "treasury_fixed_income": true,
    "sentiment_analysis": true
  }
}

- symbols - Security master data (ticker, name, sector, market cap)
- bars - OHLCV price data with 6,000+ daily bars
- trades - 125,000+ individual trade records
- treasury_yields - US Treasury yields (2Y, 5Y, 10Y, 30Y) with 1,000+ records
- fixed_income_etfs - Bond ETF prices (TLT, IEF, SHY, LQD, HYG) with 1,300+ records
- news_sentiment - ML-analyzed news with FinBERT sentiment scores
- sentiment_aggregates - Hourly sentiment rollups
- sentiment_signals - Trading signals with backtest results
- market_events - FOMC, CPI, NFP event tracking
- features_returns_rsi - Returns (1d, 5d, 20d) + RSI indicators (14, 28 period)
- features_vwap_volume - VWAP, volume analytics, and anomaly detection
- daily_metrics - Daily aggregated stats per symbol
- v_latest_yields - Current Treasury yield snapshot
- v_yield_curve - Real-time yield curve
- v_treasury_etf_correlation - Treasury-ETF relationship metrics
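To make the `features_vwap_volume` entry above more concrete, here is a minimal sketch of the two calculations it names. The function names and sample data are hypothetical, and the z-score is only one possible anomaly measure; the project's SQL may define these differently.

```python
from statistics import mean, pstdev


def vwap(trades):
    """Volume-weighted average price for an iterable of (price, size) pairs."""
    trades = list(trades)
    total_size = sum(size for _, size in trades)
    if total_size == 0:
        raise ValueError("VWAP is undefined when total traded size is zero")
    return sum(price * size for price, size in trades) / total_size


def volume_zscores(volumes):
    """Z-score of each volume against the whole sample (population std dev)."""
    mu = mean(volumes)
    sigma = pstdev(volumes)
    if sigma == 0:
        # All volumes are identical, so nothing stands out.
        return [0.0 for _ in volumes]
    return [(v - mu) / sigma for v in volumes]


# Illustrative values only.
sample_trades = [(100.0, 200), (101.0, 100), (99.0, 100)]
sample_volumes = [100, 100, 100, 100, 500]

print(vwap(sample_trades))
print(volume_zscores(sample_volumes))
```

A day could then be flagged as anomalous when its z-score exceeds a chosen cutoff; the cutoff itself is a tuning decision this sketch does not make.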
GET /symbols - List all securities
GET /bars/{ticker} - OHLCV price data
GET /trades/{ticker} - Trade records
GET /analytics/rsi/{ticker} - RSI analysis
GET /analytics/vwap/{ticker} - VWAP analysis
GET /analytics/signals - Trading signals (overbought/oversold)
GET /treasury/yields/latest - Current Treasury yields (all maturities)
GET /treasury/yields/{maturity} - Historical yields (2Y, 5Y, 10Y, 30Y)
GET /treasury/yields/curve - Yield curve visualization
GET /treasury/etfs/latest - Current bond ETF prices
GET /treasury/etfs/{ticker} - Historical ETF data
GET /treasury/analytics/spread - Yield spread analysis (e.g., 10Y-2Y)
GET /treasury/analytics/correlation - Treasury-ETF correlation
GET /treasury/summary - Treasury data statistics
GET /sentiment/news/recent - Recent news with ML sentiment
GET /sentiment/news/high-impact - FOMC, CPI, Fed speaker news
GET /sentiment/aggregates/timeseries - Hourly sentiment trends
GET /sentiment/signals/recent - Trading signals from sentiment
GET /sentiment/signals/performance - Signal backtest results
GET /sentiment/analytics/sentiment-distribution - Sentiment breakdown
GET /sentiment/summary - Sentiment data statistics
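The endpoints above can be called from Python with only the standard library. The sketch below assumes the API is running at the default address from the quick start; the helper names `endpoint_url` and `fetch_json` are hypothetical, and the shape of each JSON response is not documented here, so the example simply prints whatever comes back.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default address used in the quick start


def endpoint_url(template, **path_params):
    """Fill a path template such as "/bars/{ticker}" and prepend BASE_URL."""
    safe = {key: quote(str(value), safe="") for key, value in path_params.items()}
    return BASE_URL + template.format(**safe)


def fetch_json(template, **path_params):
    """GET an endpoint and decode its JSON body (needs the API running locally)."""
    with urlopen(endpoint_url(template, **path_params), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        print(fetch_json("/treasury/yields/{maturity}", maturity="10Y"))
    except (OSError, ValueError) as exc:
        # URLError subclasses OSError; ValueError covers a non-JSON body.
        print(f"Request to {BASE_URL} failed: {exc}")
```

Path parameters are percent-encoded before substitution so that an unexpected character in a ticker cannot change the request path.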
# Latest prices with RSI signals
from analytics.run_analysis import AnalyticsEngine
with AnalyticsEngine() as engine:
    signals = engine.get_rsi_signals()
    print(signals)
# Output:
# symbol   price    rsi_14   rsi_signal
# MSFT     176.52   80.64    OVERBOUGHT
# META     269.93   16.93    OVERSOLD
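For readers unfamiliar with how an `rsi_14` value and its label might be derived, here is a self-contained sketch. It uses Wilder's smoothing and the conventional 70/30 thresholds; both are assumptions, since the project's actual formula and cutoffs are not shown here. The function names and the `NEUTRAL` label are hypothetical.

```python
def rsi(closes, period=14):
    """Relative Strength Index of the last close, using Wilder's smoothing."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closing prices")
    deltas = [curr - prev for prev, curr in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed with simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    # Then apply Wilder's recursive smoothing to the remaining changes.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_gain == 0 and avg_loss == 0:
        return 50.0  # flat prices: midpoint chosen here by convention
    if avg_loss == 0:
        return 100.0
    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)


def rsi_signal(value, overbought=70.0, oversold=30.0):
    """Map an RSI value to a label; the thresholds are assumed defaults."""
    if value >= overbought:
        return "OVERBOUGHT"
    if value <= oversold:
        return "OVERSOLD"
    return "NEUTRAL"


rising = [100.0 + i for i in range(20)]   # strictly rising closes
falling = [100.0 - i for i in range(20)]  # strictly falling closes
print(rsi(rising), rsi_signal(rsi(rising)))
print(rsi(falling), rsi_signal(rsi(falling)))
```

Under these assumed thresholds, the two rows in the sample output above (80.64 and 16.93) would be labeled OVERBOUGHT and OVERSOLD respectively.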
- Database: DuckDB (embedded analytical database)
- ETL: Python with pandas, data validation
- API: FastAPI + Uvicorn
- Testing: pytest with 35+ test cases
- CI/CD: GitHub Actions
- Quick Start Guide - Get running in 5 minutes
- Usage Examples - Code samples and API usage
- Architecture - System design and components
- Contributing - Development guidelines
- Backtesting: Historical market data for strategy testing
- Research: Quantitative analysis and indicator development
- Education: Learn SQL, databases, and financial analytics
- Prototyping: Quick setup for financial applications
- Portfolio Projects: Demonstrate full-stack data engineering skills
pytest tests/ -v # Run all tests
pytest --cov=. --cov-report=html # With coverage report
make test # Using Makefile

market-data-analytics-db/
├── etl/ # ETL pipeline (generate, validate, load)
├── sql/ # Database schema and views
├── api/ # FastAPI REST endpoints
├── analytics/ # Pre-built analytical queries
├── tests/ # Test suite (pytest)
├── examples/ # Usage examples
├── docs/ # Detailed documentation
└── data/ # Generated CSV data
Contributions welcome! Please check CONTRIBUTING.md for guidelines.
MIT License - See LICENSE for details
This project demonstrates production-ready data engineering practices including ETL design, database optimization, API development, testing, and CI/CD automation. It is suited to quantitative finance portfolios and to learning modern data stack technologies.
Built with ❤️ for the quantitative finance community