GitHub: https://github.com/JasonZe41/Stock_Dashboard
This project implements a Lambda architecture for stock data analysis, combining batch processing of historical data with real-time processing of current market data.
**Batch Layer**
- Data Source: Historical stock data from 1999-2022
- Storage: Apache Hive for data warehousing
- Processing:
  - Raw stock data imported into Hive tables
  - Stock metrics calculated and stored in a dedicated metrics table
  - Data mapped to HBase for quick access
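The metric computation itself happens in Hive, but the idea can be sketched in JavaScript. The function names and the choice of metrics here (daily return, trailing moving average) are illustrative assumptions, not the project's actual Hive queries:

```javascript
// Sketch of the kind of per-symbol metrics the batch layer might compute
// from daily closing prices before writing them to the metrics table.

// Daily return for each day: (today - yesterday) / yesterday
function dailyReturns(closes) {
  const returns = [];
  for (let i = 1; i < closes.length; i++) {
    returns.push((closes[i] - closes[i - 1]) / closes[i - 1]);
  }
  return returns;
}

// Trailing moving average over a fixed window of closing prices
function movingAverage(closes, window) {
  const out = [];
  for (let i = window - 1; i < closes.length; i++) {
    let sum = 0;
    for (let j = i - window + 1; j <= i; j++) sum += closes[j];
    out.push(sum / window);
  }
  return out;
}
```

In the actual pipeline these aggregations run once over the 1999-2022 import; the results land in the dedicated metrics table and are then mapped to HBase.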
**Speed Layer**
- Data Source: Real-time stock data from the Polygon API (post-2022)
- Processing:
  - Real-time data fetched based on user queries
  - Data processed through the Kafka messaging system
  - The StreamStocks consumer processes messages and updates HBase
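A hedged sketch of the frontend's side of this handoff, using kafkajs as an example client. The topic name (`stock-stream`), the message shape, and the exact fields read from a Polygon aggregate bar are assumptions, not the project's confirmed code:

```javascript
// Shape a Polygon aggregate bar (t/o/h/l/c/v fields) into the key/value
// pair published to the stream topic for the StreamStocks consumer.
function toKafkaMessage(symbol, bar) {
  return {
    key: symbol,
    value: JSON.stringify({
      symbol,
      date: new Date(bar.t).toISOString().slice(0, 10),
      open: bar.o,
      high: bar.h,
      low: bar.l,
      close: bar.c,
      volume: bar.v,
    }),
  };
}

// Publish one bar to Kafka (kafkajs is an assumed client choice here).
async function publishBar(brokers, symbol, bar) {
  const { Kafka } = require('kafkajs');
  const kafka = new Kafka({ clientId: 'stock-frontend', brokers });
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: 'stock-stream', // assumed topic name
    messages: [toKafkaMessage(symbol, bar)],
  });
  await producer.disconnect();
}
```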
**Serving Layer**
- Node.js frontend application
  - Allows users to query stock metrics by symbol and date
  - Automatically routes to the batch or speed layer based on the query date
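The date-based routing can be sketched minimally. The cutoff value is an assumption derived from the 1999-2022 range stated above, not a constant taken from the app's source:

```javascript
// Queries for dates covered by the batch import go to the batch views in
// HBase; later dates go through the speed layer.
const BATCH_CUTOFF = new Date('2022-12-31'); // assumed end of batch range

function routeQuery(dateString) {
  const date = new Date(dateString);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid date: ${dateString}`);
  }
  return date <= BATCH_CUTOFF ? 'batch' : 'speed';
}
```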
**Running the Web Application**
- SSH into the cluster:
  ```
  ssh -i /Users/jasonze/.ssh/id_ed25519 -C2qTnNf -D 9876 sshuser@hbase-mpcs53014-2024-ssh.azurehdinsight.net
  ```
- Navigate to the application directory:
  ```
  cd /home/sshuser/yanze41/app3
  ```
- Install dependencies and start the server:
  ```
  npm install
  node app.js 3041 http://10.0.0.26:8090 $KAFKABROKERS
  ```
**Running the Speed Layer**
- SSH into the cluster (use the same command as above)
- Navigate to the target directory:
  ```
  cd /home/sshuser/yanze41/target
  ```
- Submit the Spark job:
  ```
  spark-submit \
    --driver-java-options "-Dlog4j.configuration=file:///home/sshuser/ss.log4j.properties" \
    --class StreamStocks \
    uber-speedLayerKafka-1.0-SNAPSHOT.jar \
    $KAFKABROKERS
  ```

**Usage**
- Access the web interface
- Enter a stock symbol (e.g., AAPL, GOOGL)
- Select a date:
  - Dates between 1999 and 2022: data is served from the batch layer (HBase)
  - Dates after 2022: real-time data is fetched from the Polygon API through the speed layer
**Data Flow**
- User submits a query through the frontend
- System checks the date:
  - Historical data: retrieved directly from HBase
  - Recent data: fetched from the Polygon API, processed through Kafka, and stored in HBase
- Results are displayed to the user with calculated metrics
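The flow above can be sketched as a single handler. The HBase, Polygon, and Kafka helpers are hypothetical stand-ins injected as dependencies; only the control flow mirrors the steps listed in this README:

```javascript
// Hedged sketch of the query flow: route by date, then either read the
// batch view from HBase or fetch live data and push it through Kafka so
// the StreamStocks consumer can persist it.
async function handleQuery(symbol, dateString, deps) {
  const { getFromHBase, fetchFromPolygon, publishToKafka } = deps;
  const cutoff = new Date('2022-12-31'); // assumed end of the batch range
  if (new Date(dateString) <= cutoff) {
    // Historical data: served directly from HBase
    return getFromHBase(symbol, dateString);
  }
  // Recent data: fetched from Polygon, published to Kafka for HBase
  // persistence, and returned to the user
  const bar = await fetchFromPolygon(symbol, dateString);
  await publishToKafka(symbol, bar);
  return bar;
}
```

Injecting the helpers keeps the routing logic testable without a live cluster, which is one plausible way to structure the real app.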