This folder contains a fully modular workflow for the Market Data Storage and Querying with SQLite3 and Parquet assignment. Each concern lives in its own module so you can mix and match components depending on whether you are demonstrating ingestion, SQL analytics, Parquet analytics, or cross-format comparisons.
| File / Dir | Purpose |
|---|---|
| `data_loader.py` | Loads the raw CSVs, validates ticker references, and returns ready-to-ingest DataFrames. |
| `sqlite_storage.py` | Creates the SQLite schema, inserts data, and exposes the four SQL query tasks. |
| `parquet_storage.py` | Materializes a ticker-partitioned Parquet dataset and implements the three Parquet analytics tasks plus storage comparisons. |
| `assignment10.py` | Command-line entry point that orchestrates ingestion, querying, and reporting. |
| `market_data.db` | SQLite3 database generated by the script (deleted and recreated when running with `--rebuild`). |
| `market_data/` | Parquet directory partitioned by ticker for fast symbol-specific scans. |
| `query_tasks.md` | Narrative of SQL and Parquet query outputs along with observed runtimes. |
| `comparison.md` | Discussion of SQLite vs. Parquet storage characteristics and recommended use cases. |
| `tests/` | Unit tests covering ingestion and representative queries for both storage layers. |
Source CSVs (`market_data_multi.csv`, `tickers.csv`) plus the schema definition remain unchanged and live alongside these modules.
```shell
cd assignment-10
python3 assignment10.py --rebuild
```

- `--rebuild` drops the existing SQLite database and Parquet directory before re-loading from the CSVs. Omit the flag to reuse prior artifacts and skip the expensive load step.
- Use `--results /path/to/file.json` to change the location of the JSON summary (`results.json` by default).
Dependencies: pandas (already available) and pyarrow (install once via `pip3 install pyarrow`). No other services are needed.
- TSLA windowed extract – Pulls every Tesla trade between 2025-11-17 and 2025-11-18 and captures the query latency for the storage comparison.
- Average daily volume per ticker – Aggregates each ticker’s average volume by trade date.
- Top 3 tickers by total return – Uses first open and last close prices over the sample period to rank tickers.
- First/last trade price per day – Returns closing prices for the first and last minute bar per ticker per day.
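The TSLA windowed extract above can be sketched in a few lines of Python. This is an illustrative version only: the table name (`market_data` here) and column names are assumptions, since the real schema lives in the assignment's schema definition.

```python
import sqlite3
import time

def tsla_window(db_path="market_data.db"):
    """Hypothetical sketch of the TSLA windowed extract with latency capture."""
    query = """
        SELECT *
        FROM market_data              -- assumed table name; see the schema file
        WHERE ticker = 'TSLA'
          AND timestamp BETWEEN '2025-11-17' AND '2025-11-18 23:59:59'
        ORDER BY timestamp
    """
    start = time.perf_counter()
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(query).fetchall()
    # Elapsed wall-clock time feeds the SQLite vs. Parquet comparison.
    return rows, time.perf_counter() - start
```

Timing the `fetchall()` inside the measurement window matters here: SQLite executes lazily, so timing only `execute()` would understate the real query cost.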
- AAPL 5-minute rolling mean – Computes rolling averages over the `close` column using the partitioned Parquet dataset.
- 5-day rolling volatility – Resamples to daily closes, calculates daily returns, and builds a rolling standard deviation per ticker.
- SQLite vs Parquet comparison – Compares runtime and file sizes for the TSLA window query between the two storage formats.
Open `results.json` after running the script to see the latest row counts, timings, and storage comparisons. Re-run the script whenever you need fresh metrics or want to regenerate the SQLite/Parquet artifacts from the source CSVs.
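A quick way to inspect the summary is to load it with the standard library. The key names below are illustrative, not the script's actual output layout, which is whatever `assignment10.py` emits.

```python
import json

def load_results(path="results.json"):
    """Load the JSON summary written by assignment10.py."""
    with open(path) as fh:
        return json.load(fh)

# Example: print every top-level metric in the summary.
if __name__ == "__main__":
    for key, value in load_results().items():
        print(f"{key}: {value}")
```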