SQL-Driven Financial Analytics, Risk Simulation, and Reporting Platform
The Asset Intelligence Workbench is a portfolio project designed to feel like a real internal analyst tool. It combines structured SQL-backed data storage, market data ingestion, return and risk analytics, Monte Carlo simulation, live news sentiment context, and a detailed PDF report in a single workflow.
The goal was to build something closer to a decision-support tool for finance, treasury, strategic finance, or risk teams that could plausibly be used inside a modern company.
https://asset-intelligence-workbench-eq2zbxelprxwjnnxendyqp.streamlit.app/
The local development SQLite database lives at data/app.db by default.
- Override it with `SQLITE_DB_PATH` when needed.
- Override the writable runtime fallback root with `SQLITE_RUNTIME_DIR` or `SQLITE_RUNTIME_PATH` when needed.
- Override the live market-data request timeout with `MARKET_DATA_PROVIDER_TIMEOUT_SECONDS` when needed.
- Configure market-data freshness with `MARKET_DATA_METADATA_FRESHNESS_HOURS` and `MARKET_DATA_PRICES_FRESHNESS_HOURS`.
- The app resolves relative SQLite paths against the repository root, not the current working directory.
- The app validates the resolved SQLite path on startup and logs the configured URL, resolved absolute path, and whether a runtime fallback was used.
- If the configured SQLite path is not writable, the app logs a warning and falls back to a writable runtime location rather than failing hard.
- The fallback order is: configured path -> runtime path -> unique ephemeral temp directory.
Hosted environments such as Streamlit Cloud often mount the repository in a read-only location. In that case the app will automatically move SQLite runtime writes to a writable temp/app-data directory instead of writing into the deployed repo.
- Local development keeps the configured path when it is writable.
- Hosted or restricted runtimes automatically switch to the writable runtime fallback when needed.
- A hard startup failure now occurs only if no writable SQLite location can be established at all.
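The fallback order above can be pictured with a small sketch. The function and variable names here are illustrative, not the project's actual module API; only the environment variable names come from the configuration list above:

```python
import os
import tempfile
from pathlib import Path

def _is_writable(directory: Path) -> bool:
    """Check writability by creating and removing a probe file."""
    try:
        directory.mkdir(parents=True, exist_ok=True)
        probe = directory / ".write_probe"
        probe.touch()
        probe.unlink()
        return True
    except OSError:
        return False

def resolve_sqlite_path(repo_root: Path) -> Path:
    """Configured path -> runtime path -> unique ephemeral temp directory."""
    configured = Path(os.environ.get("SQLITE_DB_PATH", "data/app.db"))
    if not configured.is_absolute():
        # Relative paths resolve against the repo root, not the CWD.
        configured = repo_root / configured
    if _is_writable(configured.parent):
        return configured
    runtime_root = os.environ.get("SQLITE_RUNTIME_DIR") or os.environ.get("SQLITE_RUNTIME_PATH")
    if runtime_root and _is_writable(Path(runtime_root)):
        return Path(runtime_root) / configured.name
    # Last resort: a unique ephemeral temp directory.
    return Path(tempfile.mkdtemp(prefix="sqlite-runtime-")) / configured.name
```

A hard failure would then only be raised if even the temp-directory step cannot produce a writable location.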
The market-data path is now separated into a small storage-backed pipeline:
- provider layer: `src/data/providers/market_data_provider.py`
- ingestion service: `src/data/ingestion/service.py`
- SQL repository: `src/data/storage/repository.py`
- SQL query interface for app/report reads: `src/data/queries/market_data_queries.py`
The intended flow is:
API -> normalize -> persist to SQL -> query from SQL -> analytics/reporting/app
The app and reporting layers should read market data from SQL through the query/repository layer rather than calling the provider directly.
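The persist-then-query contract can be sketched with plain `sqlite3`. The table name, dataclass, and function names below are assumptions for illustration, not the project's actual repository API:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class PriceRow:
    symbol: str
    day: str    # ISO date string
    close: float

def persist_prices(conn: sqlite3.Connection, rows: list[PriceRow]) -> None:
    """Ingestion side: normalized provider output is written to SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "symbol TEXT, day TEXT, close REAL, PRIMARY KEY (symbol, day))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO prices VALUES (?, ?, ?)",
        [(r.symbol, r.day, r.close) for r in rows],
    )

def query_prices(conn: sqlite3.Connection, symbol: str) -> list[tuple[str, float]]:
    """Read side: app/report layers query SQL, never the provider directly."""
    cur = conn.execute(
        "SELECT day, close FROM prices WHERE symbol = ? ORDER BY day", (symbol,)
    )
    return cur.fetchall()
```

Keeping reads behind a query layer like this is what makes cached data reusable across the app, analytics, and PDF report without re-hitting the provider.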
The app now also validates a canonical asset dataset before downstream analysis runs:
entered asset -> normalize -> resolve stored vs fetch -> persist/read back -> validate identity + price history -> downstream analytics
To reset the local database:

- Stop the Streamlit app or any Python process using the database.
- Delete `data/app.db`.
- Restart the app so the schema is recreated automatically.
Older local database files from the previous layout can also be deleted if you are no longer using them:
- `data/processed/asset_intelligence.db`
- `data/app.db-wal`
- `data/app.db-shm`
- `data/app.db-journal`
- `data/processed/asset_intelligence.db-wal`
- `data/processed/asset_intelligence.db-shm`
- `data/processed/asset_intelligence.db-journal`
If you are deploying to a hosted environment with a read-only repo mount, you can still set SQLITE_DB_PATH explicitly, but in most cases the built-in runtime fallback will handle this automatically. If you want to control the hosted write location, set SQLITE_RUNTIME_DIR.
Run this before starting the app if you want to verify the exact local DB target and writeability outside Streamlit:
`python -m src.database.debug_sqlite_write`

The user can enter or select a financial instrument such as a stock, ETF, or crypto asset by typing its ticker/CUSIP at the top right. The system then:
- pulls and stores historical market data on the asset
- retrieves and stores recent news-based sentiment data using an API
- structures both datasets in a SQL database
- calculates return and downside risk metrics
- runs a simulation analysis
- displays the output in a Streamlit app
- generates a multi-page analyst-style PDF briefing
Historical asset data is pulled from external APIs and normalized into a local SQL-backed store.
The project uses a structured relational layer for assets, historical prices, data sources, and news articles. This makes the workflow reusable and closer to a real analytics system than a notebook-only project.
For return analytics, the platform calculates:
- daily returns
- cumulative returns
- total return
- annualized return
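The four return metrics above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's actual analytics module, and it assumes 252 trading days per year:

```python
import numpy as np

def return_metrics(prices: np.ndarray, periods_per_year: int = 252) -> dict:
    daily = prices[1:] / prices[:-1] - 1.0             # daily simple returns
    cumulative = np.cumprod(1.0 + daily) - 1.0         # cumulative return path
    total = cumulative[-1]                             # total return over the window
    years = len(daily) / periods_per_year
    annualized = (1.0 + total) ** (1.0 / years) - 1.0  # geometric annualization
    return {
        "daily": daily,
        "cumulative": cumulative,
        "total_return": float(total),
        "annualized_return": float(annualized),
    }
```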
For risk analytics, the platform calculates:
- annualized volatility
- rolling volatility
- maximum drawdown
- historical Value at Risk (VaR)
- Expected Shortfall
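The downside-risk metrics can similarly be sketched from a vector of daily returns. VaR and Expected Shortfall here are historical, read off the empirical return distribution; this is an illustration, not the project's actual implementation:

```python
import numpy as np

def risk_metrics(daily_returns: np.ndarray, alpha: float = 0.05) -> dict:
    ann_vol = daily_returns.std(ddof=1) * np.sqrt(252)   # annualized volatility
    wealth = np.cumprod(1.0 + daily_returns)
    drawdown = wealth / np.maximum.accumulate(wealth) - 1.0
    max_drawdown = drawdown.min()                        # most negative peak-to-trough dip
    var = np.quantile(daily_returns, alpha)              # historical VaR at alpha
    expected_shortfall = daily_returns[daily_returns <= var].mean()
    return {
        "annualized_volatility": float(ann_vol),
        "max_drawdown": float(max_drawdown),
        "var": float(var),
        "expected_shortfall": float(expected_shortfall),
    }
```

By construction Expected Shortfall is at least as severe as VaR, since it averages only the returns in the tail beyond the VaR cutoff.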
The project includes forward simulation logic so the workbench can produce scenario-oriented outputs rather than just historical description.
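A forward simulation of this kind typically follows a Monte Carlo pattern such as the geometric-Brownian-motion sketch below. The project's actual simulation model may differ; `mu` and `sigma` here stand for daily drift and volatility estimates:

```python
import numpy as np

def simulate_paths(s0: float, mu: float, sigma: float,
                   horizon_days: int, n_paths: int, seed: int = 42) -> np.ndarray:
    """Simulate forward price paths under a GBM assumption."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=(n_paths, horizon_days))
    log_steps = (mu - 0.5 * sigma**2) + sigma * shocks  # daily log returns
    log_paths = np.cumsum(log_steps, axis=1)
    return s0 * np.exp(log_paths)                       # shape: (n_paths, horizon_days)
```

Scenario outputs such as percentile bands can then be read off the terminal prices, e.g. `np.percentile(paths[:, -1], [5, 50, 95])`.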
Recent news coverage is ingested, stored, and scored. This adds directional context without overcomplicating the architecture.
The ML subsystem is framed as a dynamic weighting engine rather than a toy price predictor. It combines three explicit research pillars:
- history / technical / market-structure signals
- risk / downside / scenario signals
- sentiment / external-information signals
The model target is forward 20-trading-day return rather than raw price level. That choice is more finance-oriented because returns are comparable across assets, less dominated by nominal price scale, and more aligned with decision-support workflows.
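Constructing that target is a one-liner worth making explicit, because the alignment is easy to get wrong. In this sketch each row's label is the return realized over the *next* 20 bars, so the final 20 rows have no label and must be dropped before training:

```python
import numpy as np

def forward_return_target(close: np.ndarray, horizon: int = 20) -> np.ndarray:
    """Forward `horizon`-bar simple return, aligned with close[:-horizon]."""
    return close[horizon:] / close[:-horizon] - 1.0
```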
The workflow is intentionally structured:
- raw market and sentiment data are ingested and stored in SQL
- engineered features are built into history, risk, and sentiment pillars
- standardized `history_score`, `risk_score`, and `sentiment_score` are created before model training
- a ridge-based weighting engine learns an interpretable baseline allocation across signals
- a random-forest challenger model is trained as a nonlinear comparison benchmark
- outputs are evaluated with time-aware train/test splits and expanding-window validation
- scored results are written back to SQL for reuse by the app and the PDF report
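The ridge-based weighting step can be pictured as regressing the forward-return target on the three standardized pillar scores and reading the coefficients as interpretable signal weights. A closed-form sketch is shown here for self-containment; the project's own training code may use scikit-learn and differs in detail:

```python
import numpy as np

def ridge_weights(X: np.ndarray, y: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y.

    X columns would be the standardized pillar scores
    (history_score, risk_score, sentiment_score); y is the forward return.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)
```

Because the pillar scores are standardized, the learned coefficients are directly comparable and can be reported as the "learned pillar weights" surfaced in the app.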
The app and report expose:
- composite ML score
- directional signal
- confidence indicator
- pillar contribution breakdown
- learned pillar weights
- top feature importance
This keeps the ML layer grounded as an investment-research support tool rather than a black-box trading bot.
The reporting layer generates a formal PDF-style asset briefing with:
- summary
- performance overview
- risk analysis
- simulation section
- sentiment section
- methodology notes
asset-intelligence-workbench/
├── app/ # Streamlit UI
├── data/ # Local database / processed data
├── sql/ # Schema and SQL artifacts
├── src/
│ ├── analytics/ # Returns, risk, simulation logic
│ ├── database/ # ORM models, loaders, queries, connections
│ ├── ingestion/ # Market and sentiment ingestion workflows
│ ├── reporting/ # Report context + PDF generation
│ ├── utils/ # App-facing data preparation and config
│ └── visuals/ # Charts and visual helpers
├── tests/ # Focused unit tests
├── requirements.txt
└── README.md
## Running the ML layer locally
1. Activate the project environment.
2. Start the app with `streamlit run app\streamlit_app.py`.
3. Load or ingest a ticker with enough historical data.
4. Open the `Machine Learning Signal Calibration` section in the app.
5. Generate a PDF report to include the ML signal page in the briefing.
Focused verification commands:
```bash
python -m unittest tests.test_features tests.test_ml_models tests.test_ml_integration tests.test_app_data
```