GitHub - codeToad101/stockPredicts

S&P 500 Directional Prediction Model

This project investigates short-horizon directional prediction of the S&P 500 using historical market data and supervised machine learning. The goal is not to construct a profitable trading system, but to examine what information from the recent past can meaningfully inform near-future market direction under proper time-series constraints.

Overview

The model predicts whether the S&P 500 will move up or down over a fixed forward horizon using a rolling window of historical features (returns, momentum, volatility indicators, etc.).

Key constraints of the design:

The model never uses future information relative to a prediction point
All features are computed strictly from past data
Evaluation is performed using chronological (rolling) train/test splits to reflect real-world deployment

Problem Setup

-->Task: Binary classification (Up / Down)
-->Target: Direction of the S&P 500 over a fixed forward window
-->Input: Fixed-length rolling window of historical market features
-->Evaluation: Out-of-sample test data occurring strictly after the training period
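The target definition above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `make_direction_target`, the 2-row horizon, and the toy prices are all assumptions made for the example.

```python
import pandas as pd

def make_direction_target(close: pd.Series, horizon: int = 5) -> pd.Series:
    """Label each row 1.0 (Up) if the close `horizon` rows later is higher, else 0.0 (Down).

    The last `horizon` rows have no known future price, so they are left as NaN
    and should be dropped before training.
    """
    future_close = close.shift(-horizon)            # price `horizon` rows ahead
    target = (future_close > close).astype(float)   # 1.0 = Up, 0.0 = Down
    return target.where(future_close.notna()).rename("target")

# Toy prices, purely illustrative.
close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 101.0])
target = make_direction_target(close, horizon=2)
print(target.tolist())  # [0.0, 1.0, 1.0, 0.0, nan, nan]
```

The label is the only place where future prices appear; features must never be shifted in the same direction.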

Data & Features

-->Historical S&P 500 price data from YFinance

Feature set includes, but is not limited to:
-->Lagged returns
-->Moving averages
-->Rolling volatility measures
-->Momentum indicators

Feature windows are fixed in length and computed independently for each prediction timestamp, ensuring temporal causality is preserved.
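A rough sketch of how such backward-looking features might be computed with pandas is shown below. The column names, the 10-row window, and the synthetic price series are assumptions for illustration; the notebook may define its features differently. Every operation uses only the current and earlier rows (`shift` with a positive offset, trailing `rolling` windows).

```python
import numpy as np
import pandas as pd

def make_features(close: pd.Series, window: int = 10) -> pd.DataFrame:
    """Build features that depend only on prices at or before each row."""
    ret = close / close.shift(1) - 1.0               # return from t-1 to t
    feats = pd.DataFrame(index=close.index)
    feats["ret_lag1"] = ret                          # most recent realised return
    feats["ret_lag2"] = ret.shift(1)                 # return one step earlier
    feats["ma_ratio"] = close / close.rolling(window).mean() - 1.0  # price vs trailing mean
    feats["volatility"] = ret.rolling(window).std()  # trailing std of returns
    feats["momentum"] = close / close.shift(window) - 1.0           # change over `window` rows
    return feats

# Synthetic random-walk prices stand in for real S&P 500 data.
rng = np.random.default_rng(0)
close = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 60))))
features = make_features(close, window=10)

# Causality check: changing the final price must not alter any earlier row.
bumped = close.copy()
bumped.iloc[-1] *= 2.0
features_bumped = make_features(bumped, window=10)
pd.testing.assert_frame_equal(features.iloc[:-1], features_bumped.iloc[:-1])
```

The closing check is a cheap way to catch accidental look-ahead: if a feature at row t changes when a later price is perturbed, it is leaking future information.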

Model

Supervised classification model (RandomForest)
Trained once on historical data
Applied to future periods without retraining during evaluation
The decision threshold can be adjusted to trade precision against recall for either class, which supports different strategies and helps offset class imbalance

This design emphasizes interpretability, reproducibility, and methodological correctness over model complexity.
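The train-once workflow and threshold adjustment might look like the sketch below. It uses scikit-learn's `RandomForestClassifier` on synthetic data; the hyperparameters, the 0.65 threshold, and the helper `predict_with_threshold` are assumptions for illustration, not values taken from the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the feature matrix and Up/Down labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

split = 400  # rows before `split` are "the past", rows after are "the future"
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[:split], y[:split])  # trained once, never refit during evaluation

# Column 1 of predict_proba is the probability of class 1 (Up).
proba_up = model.predict_proba(X[split:])[:, 1]

def predict_with_threshold(proba_up: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Call Up (1) only when the predicted probability reaches `threshold`."""
    return (proba_up >= threshold).astype(int)

default_calls = predict_with_threshold(proba_up, 0.5)
strict_calls = predict_with_threshold(proba_up, 0.65)  # fewer, more confident Up calls
print(default_calls.sum(), strict_calls.sum())
```

Raising the threshold never increases the number of Up calls; it shifts the balance toward precision on the Up class at the cost of recall.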

Evaluation Methodology

To avoid look-ahead bias and data leakage, training and test sets are split chronologically. The model is evaluated only on future data it has never seen. Performance is reported on a held-out test period.
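A chronological split can be sketched in a few lines. The function `chronological_split` and its `gap` parameter are hypothetical; the gap is an optional extra that the README does not describe. Because the label looks forward by a fixed horizon, dropping that many rows before the test period keeps training labels from overlapping it.

```python
import numpy as np
import pandas as pd

def chronological_split(df: pd.DataFrame, test_fraction: float = 0.2, gap: int = 0):
    """Split rows by time: earliest rows train, latest rows test, no shuffling.

    `gap` removes that many rows from the end of the training set.
    """
    df = df.sort_index()
    n_test = int(round(len(df) * test_fraction))
    cut = len(df) - n_test
    train = df.iloc[: max(cut - gap, 0)]
    test = df.iloc[cut:]
    return train, test

# Hypothetical daily frame: 250 business days of placeholder values.
dates = pd.bdate_range("2020-01-01", periods=250)
frame = pd.DataFrame({"feature": np.arange(250.0)}, index=dates)

train, test = chronological_split(frame, test_fraction=0.2, gap=5)
print(len(train), len(test))  # 195 50
```

Every training row precedes every test row, which a shuffled `train_test_split` would not guarantee.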

Representative Test Set Performance

Accuracy: ~0.74
Weighted F1: ~0.73

Performance varies across market regimes, which is expected in non-stationary financial time series.
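For reference, both metrics can be computed with scikit-learn as sketched below. The labels are a made-up toy example, not the project's predictions. A majority-class baseline is included as an assumption about what a fair comparison might look like.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels (1 = Up, 0 = Down), purely illustrative.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]

accuracy = accuracy_score(y_true, y_pred)
# "weighted" averages per-class F1 scores, weighting each class by its support.
weighted_f1 = f1_score(y_true, y_pred, average="weighted")

# Baseline: always predict the most frequent class in y_true.
majority_class = max(set(y_true), key=y_true.count)
baseline_accuracy = accuracy_score(y_true, [majority_class] * len(y_true))

print(f"accuracy={accuracy:.2f} weighted_f1={weighted_f1:.2f} baseline={baseline_accuracy:.2f}")
```

In this toy case the accuracy is 0.70 against a baseline of 0.60, so the margin over "always Up" is smaller than the margin over a coin flip.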

Interpretation of Results

With test accuracy around 0.74, the model performs meaningfully better than random guessing, suggesting that recent market structure contains limited but non-zero predictive signal. Because the classes are imbalanced, the majority-class rate is a stricter baseline than 50%. Performance is asymmetric across classes, reflecting class imbalance and regime-dependent behavior. Results should be interpreted as statistical signal detection, not as evidence of a consistently tradable edge.

Limitations

Financial markets are non-stationary; learned patterns may decay over time. No transaction costs, slippage, or risk management are modeled. Directional accuracy alone is insufficient for profitability. Results are sensitive to feature window length and market regime.
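The point that directional accuracy alone is insufficient can be made concrete with a small expected-value calculation. All numbers below are hypothetical and are not derived from this project's results: a hit rate of 60%, an average gain of 0.4% on correct calls, an average loss of 0.7% on wrong calls, and a cost of 0.05% per trade.

```python
def expected_return_per_trade(hit_rate: float, avg_win: float,
                              avg_loss: float, cost: float) -> float:
    """Expected net return of one trade: p * win - (1 - p) * loss - cost."""
    return hit_rate * avg_win - (1.0 - hit_rate) * avg_loss - cost

# Hypothetical inputs, chosen only to illustrate the arithmetic.
edge = expected_return_per_trade(hit_rate=0.60, avg_win=0.004,
                                 avg_loss=0.007, cost=0.0005)
print(f"{edge:.4%}")  # -0.0900%
```

Here the strategy is right more often than wrong yet loses money on average, because the losses are larger than the gains and each trade pays a cost.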

This project was built to:

Practice proper time-series ML methodology
Avoid common pitfalls such as shuffled validation and data leakage
Explore the practical limits of short-term market predictability
Serve as a clean, reproducible reference for financial ML experiments

Future Work:

Expanding-window or walk-forward retraining
Regime-aware or adaptive modeling
Probabilistic calibration and uncertainty estimation
Incorporation of macroeconomic or cross-asset signals
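As a starting point for the first item, expanding-window retraining could be prototyped with scikit-learn's `TimeSeriesSplit`, which yields folds where the training indices always precede the test indices. This is a sketch on synthetic data with assumed settings (4 splits, 50 trees), not part of the current project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import TimeSeriesSplit

# Synthetic, time-ordered stand-in for features and Up/Down labels.
rng = np.random.default_rng(7)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

splitter = TimeSeriesSplit(n_splits=4)  # each fold trains on all earlier rows
folds = []
fold_accuracy = []
for train_idx, test_idx in splitter.split(X):
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X[train_idx], y[train_idx])                        # retrain on the grown window
    fold_accuracy.append(model.score(X[test_idx], y[test_idx]))  # accuracy on the next block
    folds.append((train_idx, test_idx))

print([round(a, 2) for a in fold_accuracy])
```

Reporting accuracy per fold, rather than one pooled number, also makes regime dependence visible.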

Disclaimer

This project is for educational and research purposes only. It does not constitute financial advice and should not be used for live trading.

Repository files

.gitignore
README.md
stoxnow.ipynb
