Skip to content

Production-ready quantitative finance data warehouse with ETL pipeline, REST API, and technical indicators (RSI, VWAP). Built with DuckDB, Python, and FastAPI.

License

Notifications You must be signed in to change notification settings

King0508/duckdb-quant-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

45 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Quantitative Finance Data Warehouse

CI/CD Pipeline Python 3.9+ License: MIT

A production-ready data warehouse for quantitative finance analysis. Unified platform combining equity markets, fixed-income Treasuries, and ML-powered sentiment analysis. Built with DuckDB for blazing-fast analytical queries.

✨ Features

πŸ“ˆ Equity Markets

  • ⚑ Fast Analytics: DuckDB columnar storage optimized for OLAP workloads
  • πŸ“Š Technical Indicators: Pre-computed RSI, VWAP, returns, and volume analytics
  • πŸ’Ή Real-time Data: 6,000+ OHLCV bars and 125,000+ trade records
  • πŸ”„ Complete ETL Pipeline: Automated data generation, validation, and loading

🏦 Fixed-Income / Treasuries

  • πŸ“‰ Treasury Yields: US 2Y, 5Y, 10Y, 30Y yield curves
  • πŸ’° ETF Tracking: TLT, IEF, SHY, LQD, HYG bond ETFs
  • πŸ“Š Spread Analysis: Yield curve inversion detection
  • πŸ”— Correlations: Treasury-ETF relationship analysis

πŸ€– Sentiment Analytics (Integration Ready)

  • πŸ“° News Analysis: ML-powered sentiment scoring (FinBERT)
  • 🎯 Trading Signals: Sentiment-based buy/sell signals
  • πŸ”” Event Detection: FOMC, CPI, NFP high-impact news
  • πŸ“ˆ Performance Tracking: Signal backtesting and P&L

πŸ› οΈ Infrastructure

  • 🌐 Unified REST API: Single API serving all data types
  • βœ… Production Ready: Comprehensive tests, CI/CD, logging
  • πŸ“ˆ No API Keys Needed: Synthetic data generation for offline development
  • πŸ”Œ Easy Integration: Connects with sentiment analysis projects

πŸš€ Quick Start

# Clone and setup
git clone https://github.com/King0508/market-data-analytics-db.git
cd market-data-analytics-db
python -m venv venv
venv\Scripts\activate  # Windows (on Mac/Linux: source venv/bin/activate)
pip install -r requirements.txt

# Generate equity market data
python etl/generate_data.py
python etl/load_data.py

# Generate Treasury & fixed-income data
python etl/generate_treasury_data.py --days 365

# Run equity analytics
python analytics/run_analysis.py

# Start unified REST API
python -m api.main
# Visit http://localhost:8000/docs for interactive API documentation

Verify Installation

# Test API (in another terminal)
curl http://localhost:8000/

# Should return:
{
  "name": "Quantitative Finance Data Warehouse API",
  "version": "2.0.0",
  "features": {
    "equity_market_data": true,
    "treasury_fixed_income": true,
    "sentiment_analysis": true
  }
}

πŸ“Š What's Inside

πŸ“¦ Database Schema

Equity Tables

  • symbols - Security master data (ticker, name, sector, market cap)
  • bars - OHLCV price data with 6,000+ daily bars
  • trades - 125,000+ individual trade records

Treasury Tables

  • treasury_yields - US Treasury yields (2Y, 5Y, 10Y, 30Y) with 1,000+ records
  • fixed_income_etfs - Bond ETF prices (TLT, IEF, SHY, LQD, HYG) with 1,300+ records

Sentiment Tables (Integration Ready)

  • news_sentiment - ML-analyzed news with FinBERT sentiment scores
  • sentiment_aggregates - Hourly sentiment rollups
  • sentiment_signals - Trading signals with backtest results
  • market_events - FOMC, CPI, NFP event tracking

πŸ“Š Analytical Views

  • features_returns_rsi - Returns (1d, 5d, 20d) + RSI indicators (14, 28 period)
  • features_vwap_volume - VWAP, volume analytics, and anomaly detection
  • daily_metrics - Daily aggregated stats per symbol
  • v_latest_yields - Current Treasury yield snapshot
  • v_yield_curve - Real-time yield curve
  • v_treasury_etf_correlation - Treasury-ETF relationship metrics

🌐 API Endpoints

Equity Endpoints

GET /symbols                    - List all securities
GET /bars/{ticker}              - OHLCV price data
GET /trades/{ticker}            - Trade records
GET /analytics/rsi/{ticker}     - RSI analysis
GET /analytics/vwap/{ticker}    - VWAP analysis
GET /analytics/signals          - Trading signals (overbought/oversold)

Treasury Endpoints

GET /treasury/yields/latest     - Current Treasury yields (all maturities)
GET /treasury/yields/{maturity} - Historical yields (2Y, 5Y, 10Y, 30Y)
GET /treasury/yields/curve      - Yield curve visualization
GET /treasury/etfs/latest       - Current bond ETF prices
GET /treasury/etfs/{ticker}     - Historical ETF data
GET /treasury/analytics/spread  - Yield spread analysis (e.g., 10Y-2Y)
GET /treasury/analytics/correlation - Treasury-ETF correlation
GET /treasury/summary           - Treasury data statistics

Sentiment Endpoints (Integration Ready)

GET /sentiment/news/recent      - Recent news with ML sentiment
GET /sentiment/news/high-impact - FOMC, CPI, Fed speaker news
GET /sentiment/aggregates/timeseries - Hourly sentiment trends
GET /sentiment/signals/recent   - Trading signals from sentiment
GET /sentiment/signals/performance - Signal backtest results
GET /sentiment/analytics/sentiment-distribution - Sentiment breakdown
GET /sentiment/summary          - Sentiment data statistics

πŸ“ˆ Example Output

# Latest prices with RSI signals
from analytics.run_analysis import AnalyticsEngine

with AnalyticsEngine() as engine:
    signals = engine.get_rsi_signals()
    print(signals)

# Output:
#   symbol  price  rsi_14  rsi_signal
#   MSFT   176.52   80.64  OVERBOUGHT  ⚠️
#   META   269.93   16.93  OVERSOLD    βœ…

πŸ› οΈ Tech Stack

  • Database: DuckDB (embedded analytical database)
  • ETL: Python with pandas, data validation
  • API: FastAPI + Uvicorn
  • Testing: pytest with 35+ test cases
  • CI/CD: GitHub Actions

πŸ“š Documentation

🎯 Use Cases

  • Backtesting: Historical market data for strategy testing
  • Research: Quantitative analysis and indicator development
  • Education: Learn SQL, databases, and financial analytics
  • Prototyping: Quick setup for financial applications
  • Portfolio Projects: Demonstrate full-stack data engineering skills

πŸ§ͺ Testing

pytest tests/ -v                    # Run all tests
pytest --cov=. --cov-report=html   # With coverage report
make test                           # Using Makefile

πŸ“¦ Project Structure

market-data-analytics-db/
β”œβ”€β”€ etl/              # ETL pipeline (generate, validate, load)
β”œβ”€β”€ sql/              # Database schema and views
β”œβ”€β”€ api/              # FastAPI REST endpoints
β”œβ”€β”€ analytics/        # Pre-built analytical queries
β”œβ”€β”€ tests/            # Test suite (pytest)
β”œβ”€β”€ examples/         # Usage examples
β”œβ”€β”€ docs/             # Detailed documentation
└── data/             # Generated CSV data

🀝 Contributing

Contributions welcome! Please check CONTRIBUTING.md for guidelines.

πŸ“ License

MIT License - See LICENSE for details

πŸŽ“ About

This project demonstrates production-ready data engineering practices including ETL design, database optimization, API development, testing, and CI/CD automation. Perfect for quantitative finance portfolios and learning modern data stack technologies.


Built with ❀️ for the quantitative finance community

About

Production-ready quantitative finance data warehouse with ETL pipeline, REST API, and technical indicators (RSI, VWAP). Built with DuckDB, Python, and FastAPI.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published