Anjula-valluru/BlueChip_StockPredictor

Stock Predictor — Auto-Regression & Auto-ARIMA (Python Implementation)

Project Overview

This project performs time-series stock prediction using Auto-Regression (AR) and Auto-ARIMA models.
Originally developed in R, it has been converted into Python with equivalent functionality for reproducibility and deployment.

The goal is to predict future closing prices from historical values and to compare stock performance across multiple companies.


Key Features

  • Data Preprocessing:

    • Handling missing values (NaN)
    • Checking normality (Shapiro-Wilk test, Boxplot, Histogram, Density, Q-Q plot)
    • Outlier detection (IQR method)
    • Stationarity checks (Rolling mean/std, Augmented Dickey-Fuller test)
    • Differencing for non-stationary series
  • Exploratory Data Analysis (EDA):

    • Trend, seasonal, and residual decomposition
    • Visualization of ACF (Autocorrelation) and PACF (Partial Autocorrelation) plots
    • Statistical checks for time-series stationarity
  • Model Building:

    • p-th order Auto-Regression Analysis (manual AR models)
    • Auto-ARIMA for automatic (p, d, q) order selection
    • ARIMA model training with expanding-window (one-step ahead) forecasting
    • Multi-company analysis across 12 stocks:
      • Apple, TCS, Tesla, Dr Reddy’s Lab, Abbott, IBM, Nvidia, Google, Accenture, Microsoft, Amazon, HP
  • Evaluation:

    • Accuracy metrics: RMSE, MAE, MAPE, ME, MPE
    • Covariance matrices for lagged features
    • Error comparisons across different p values for individual companies
  • Outputs:

    • Predicted vs Actual plots
    • Stationarity diagnostics
    • Accuracy JSON reports
    • Covariance CSVs
    • Multi-company forecasts
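
The evaluation metrics and the IQR outlier fences listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's `utils.py`: the function names `accuracy_metrics` and `iqr_bounds`, the 1.5 multiplier, the error sign convention (actual minus predicted), and the sample prices are all assumptions made for the example.

```python
import numpy as np


def accuracy_metrics(actual, predicted):
    """Return ME, MAE, RMSE, MPE and MAPE for two equal-length sequences.

    Errors are computed as actual - predicted (an assumed convention);
    the percentage metrics assume no actual value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = actual - predicted
    pct_error = 100.0 * error / actual
    return {
        "ME": float(np.mean(error)),
        "MAE": float(np.mean(np.abs(error))),
        "RMSE": float(np.sqrt(np.mean(error ** 2))),
        "MPE": float(np.mean(pct_error)),
        "MAPE": float(np.mean(np.abs(pct_error))),
    }


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = np.percentile(np.asarray(values, dtype=float), [25, 75])
    iqr = q3 - q1
    return float(q1 - k * iqr), float(q3 + k * iqr)


if __name__ == "__main__":
    # Made-up prices; the last value is deliberately extreme.
    closes = [100.0, 102.0, 101.0, 103.0, 250.0]
    low, high = iqr_bounds(closes)
    outliers = [c for c in closes if c < low or c > high]
    print("IQR fences:", low, high, "outliers:", outliers)
    print(accuracy_metrics([100.0, 200.0], [110.0, 190.0]))
```

Returning the metrics as a plain dictionary keeps them easy to dump into a JSON report like the ones listed under Outputs.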

Dataset Link

https://www.kaggle.com/datasets/minatverma/nse-stocks-data


Technologies & Libraries

  • Python stack:

    • pandas, numpy, matplotlib
    • statsmodels (ADF test, ARIMA)
    • pmdarima (Auto-ARIMA)
    • scipy (Shapiro-Wilk)
    • scikit-learn (metrics)
  • From R original:

    • ggplot2, zoo, tseries, forecast, tidyverse

Project Structure

stock_auto_arima_py/
├── README.md
├── requirements.txt
│
├── src/stock_arima/
│   ├── __init__.py
│   ├── utils.py         # Accuracy metrics, IQR bounds
│   ├── io.py            # CSV ingestion with Date parsing
│   ├── eda.py           # Stationarity, rolling stats, ACF/PACF, plots
│   └── modeling.py      # ARIMA, Auto-ARIMA, iterative predictions
│
└── scripts/
    ├── run_univariate.py      # Run full EDA + ARIMA pipeline for companies
    └── multivariate_tests.py  # Multivariate stationarity checks (ADF on all cols)

How to Run

Install dependencies

pip install -r requirements.txt

Univariate EDA + Forecast (single company)

python scripts/run_univariate.py --csv "Path/stocks_ida.csv" --company "apple" --col "Close" --outdir outputs/apple

Batch Forecast (multiple companies)

python scripts/run_univariate.py --csv "Path/stocks_ida.csv" --companies "apple,TCS,tesla,dr reddy lab,abott,IBM,nvdia,google,accenture,micro soft,amazon,Hp" --col "Close" --outdir outputs/batch

Multivariate Stationarity Check

python scripts/multivariate_tests.py --csv "Path/stocks_ida.csv" --company "apple" --start-col-index 3 --outdir outputs/multi

Results & Insights

  • Stationarity: Most series were non-stationary at first; differencing made them stationary.
  • Order selection: Auto-ARIMA selected the (p, d, q) order for each series, and the choices were cross-checked against PACF plots.
  • Prediction style: Forecasts were produced one day at a time on an expanding window, rather than all at once, to mirror how predictions would be made in practice.
  • Best-performing stocks: Microsoft and Accenture showed increasing predicted trends and were identified as good investment candidates.
  • Error metrics: RMSE values were tabulated for all 12 companies for comparison. Examples (from the original R study):
    • Apple: p=2, RMSE=0.3757
    • Tesla: p=0, RMSE=0.2453
    • Microsoft: p=2, RMSE=0.7382
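
To make the day-by-day prediction style and the per-p RMSE comparison concrete, here is a minimal sketch that uses a plain least-squares AR(p) fit in NumPy instead of the project's ARIMA code. The function names and the synthetic series are assumptions for illustration, and the RMSE values it prints are unrelated to the figures quoted above.

```python
import numpy as np


def fit_ar(history, p):
    """Least-squares AR(p) fit; returns [intercept, phi_1, ..., phi_p]."""
    y = np.asarray(history, dtype=float)
    # The row for time t holds [1, y[t-1], ..., y[t-p]]; for p=0 it is just [1].
    rows = [np.concatenate(([1.0], y[t - p:t][::-1])) for t in range(p, len(y))]
    coef, *_ = np.linalg.lstsq(np.array(rows), y[p:], rcond=None)
    return coef


def predict_next(history, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    y = np.asarray(history, dtype=float)
    lags = y[len(y) - p:][::-1]  # most recent value first; empty when p=0
    return float(coef[0] + np.dot(coef[1:], lags))


def walk_forward(series, p, n_test):
    """Expanding-window forecast: refit on everything seen so far, predict
    one step ahead, then let the window grow by one actual observation."""
    y = np.asarray(series, dtype=float)
    split = len(y) - n_test
    preds = []
    for t in range(split, len(y)):
        coef = fit_ar(y[:t], p)
        preds.append(predict_next(y[:t], coef))
    preds = np.array(preds)
    rmse = float(np.sqrt(np.mean((y[split:] - preds) ** 2)))
    return preds, rmse


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic AR(2) series standing in for a differenced Close column.
    y = np.zeros(300)
    for t in range(2, 300):
        y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()
    for p in (0, 1, 2, 3):
        _, rmse = walk_forward(y, p, n_test=50)
        print(f"p={p}  RMSE={rmse:.4f}")
```

With p=0 the model reduces to predicting the running mean, which is one way to read an order of p=0 such as the Tesla entry.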

Conclusion

This project demonstrates how time-series forecasting with Auto-Regression and Auto-ARIMA can provide actionable insights into stock price movements.
It balances statistical rigor (stationarity, decomposition, ACF/PACF) with predictive power (Auto-ARIMA, iterative ARIMA), offering a reusable Python-based framework for financial analytics.
