This project implements multi-class classification on the Fashion-MNIST dataset using two approaches built from scratch with NumPy:
- Logistic Regression Classifier with L2 regularization
- Neural Network with One Hidden Layer featuring various activation functions and dropout
The Fashion-MNIST dataset consists of 70,000 grayscale images (28x28 pixels) across 10 clothing categories: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.
- `Ex2.py` - Main implementation file containing all classifiers and utilities
- `train.csv` - Training dataset (56,000 examples with labels)
- `test.csv` - Test dataset (14,000 examples without labels)
- `Report_208520262_208980888.pdf` - Detailed analysis and results report
numpy
pandas
matplotlib
scikit-learn
tqdm
- Ensure you have Python 3.x installed
- Install required packages:
`pip install numpy pandas matplotlib scikit-learn tqdm`
- Place `train.csv` and `test.csv` in the same directory as `Ex2.py`
Execute the main script:
`python Ex2.py`

This will automatically run through all three parts:
- Displays a 10x4 grid showing 4 examples from each of the 10 fashion categories
- Each row represents a different clothing class with proper labels
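Assuming the labels live in a NumPy array `y`, the index selection behind such a row-per-class grid can be sketched as follows (the function name is illustrative, not the one in `Ex2.py`):

```python
import numpy as np

def pick_examples_per_class(y, n_classes=10, n_per_class=4):
    """Return an (n_classes, n_per_class) array of dataset indices,
    one row of example indices per class label."""
    rows = []
    for c in range(n_classes):
        idx = np.flatnonzero(y == c)[:n_per_class]  # first 4 images of class c
        rows.append(idx)
    return np.stack(rows)

# Usage: a dummy label vector containing each class at least 4 times
y = np.tile(np.arange(10), 5)      # 50 labels, classes 0-9 repeated
grid = pick_examples_per_class(y)  # grid[c] indexes 4 images of class c
```

Each row of `grid` can then index into the image array when drawing the 10x4 matplotlib grid.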
Hyperparameter Search:
- Batch sizes: [128, 256, 512]
- Learning rates: [0.001, 0.01, 0.05]
- Regularization coefficients: [1e-07, 0.001]
Process:
- Automatically splits training data (80% train, 20% validation)
- Normalizes pixel values using min-max normalization
- Applies one-hot encoding to labels
- Tests all hyperparameter combinations
- Selects best model based on validation accuracy
- Generates predictions on test set
Output: `lr_pred.csv` - Contains predictions for the test dataset
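A minimal sketch of the pipeline above (min-max normalization, one-hot labels, and one mini-batch gradient step of L2-regularized softmax regression); the function and variable names are illustrative, not the actual ones in `Ex2.py`:

```python
import numpy as np

def one_hot(y, n_classes=10):
    Y = np.zeros((y.size, n_classes))
    Y[np.arange(y.size), y] = 1.0
    return Y

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgd_step(W, b, Xb, Yb, lr=0.01, reg=1e-7):
    """One mini-batch gradient step on cross-entropy + L2."""
    P = softmax(Xb @ W + b)        # (batch, 10) class probabilities
    G = (P - Yb) / Xb.shape[0]     # gradient of mean cross-entropy w.r.t. logits
    W -= lr * (Xb.T @ G + reg * W) # L2 penalty applied to weights only
    b -= lr * G.sum(axis=0)
    return W, b

# Usage: min-max normalize pixels to [0, 1], then take one gradient step
X = np.random.rand(256, 784) * 255.0
y = np.random.randint(0, 10, size=256)
X = (X - X.min()) / (X.max() - X.min())
W, b = np.zeros((784, 10)), np.zeros(10)
W, b = sgd_step(W, b, X, one_hot(y))
```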
Hyperparameter Search:
- Fixed settings: batch size 128, learning rate 0.5, regularization coefficient 1e-08
- Activation functions: [ReLU, Sigmoid, Tanh]
- Hidden layer sizes: [256, 128, 10]
- Dropout keep probabilities: [1.0, 0.9, 0.8, 0.5]
Process:
- Implements forward and backward propagation from scratch
- Tests all hyperparameter combinations with progress bars
- Selects best model based on validation accuracy
- Applies dropout during training for regularization
- Generates predictions on test set
Output: `NN_pred.csv` - Contains predictions for the test dataset
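The select-best-by-validation-accuracy logic used by both parts can be sketched as below; `train` is a placeholder for the actual training routine, assumed to return a model and its validation accuracy:

```python
import itertools

def grid_search(train, grid):
    """Try every hyperparameter combination and keep the one with the
    highest validation accuracy. `train(**params)` is assumed to
    return (model, val_accuracy)."""
    best = (None, -1.0, None)  # (model, val_acc, params)
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        model, val_acc = train(**params)
        if val_acc > best[1]:
            best = (model, val_acc, params)
    return best

# Usage with a stub "trainer" that scores larger hidden layers higher:
grid = {"hidden": [256, 128, 10], "keep_prob": [1.0, 0.9, 0.8, 0.5]}
model, acc, params = grid_search(
    lambda hidden, keep_prob: (None, hidden / 256.0), grid)
```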
- Numerically Stable Softmax: Uses `softmax(z - max(z))` to prevent overflow
- Vectorized Operations: Efficient NumPy implementations for all computations
- Mini-batch Gradient Descent: Configurable batch sizes for optimization
- L2 Regularization: Prevents overfitting in both models
- Dropout: Neural network includes dropout for additional regularization
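The stability trick listed above can be seen directly in a small sketch: subtracting the row maximum leaves the softmax output mathematically unchanged while keeping `exp` in a representable range.

```python
import numpy as np

def stable_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift so the largest logit is 0
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

z = np.array([[1000.0, 1001.0, 1002.0]])  # naive exp(1000) overflows to inf
p = stable_softmax(z)                     # finite probabilities summing to 1
```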
Logistic Regression:
- Multi-class classification using softmax activation
- Cross-entropy loss function
- L2 regularization term
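As a sketch of the objective described above (the helper name and the `reg` coefficient are illustrative): for predicted probabilities `P`, one-hot labels `Y`, and weights `W`, the loss is the mean cross-entropy plus an L2 penalty.

```python
import numpy as np

def ce_l2_loss(P, Y, W, reg=1e-7):
    """Mean cross-entropy over the batch plus an L2 penalty on the weights."""
    eps = 1e-12  # avoid log(0)
    ce = -np.mean(np.sum(Y * np.log(P + eps), axis=1))
    return ce + 0.5 * reg * np.sum(W * W)

# With uniform predictions over 3 classes and zero weights,
# the loss reduces to log(3):
P = np.full((4, 3), 1.0 / 3.0)
Y = np.eye(3)[[0, 1, 2, 0]]
loss = ce_l2_loss(P, Y, np.zeros((5, 3)))
```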
Neural Network:
- Input layer: 784 features (28x28 flattened images)
- Hidden layer: Variable size with selectable activation function
- Output layer: 10 classes with softmax activation
- Dropout applied to hidden layer during training
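A sketch of this forward pass with inverted dropout (activations are scaled by the keep probability at train time, so no rescaling is needed at test time); all names here are illustrative, not the ones in `Ex2.py`:

```python
import numpy as np

def forward(X, W1, b1, W2, b2, keep_prob=0.8, train=True, rng=None):
    """784 -> hidden (ReLU, dropout) -> 10 (softmax)."""
    rng = rng or np.random.default_rng(0)
    H = np.maximum(0.0, X @ W1 + b1)            # ReLU hidden layer
    if train and keep_prob < 1.0:
        mask = rng.random(H.shape) < keep_prob  # keep a unit with prob keep_prob
        H = H * mask / keep_prob                # inverted dropout scaling
    Z = H @ W2 + b2
    Z = Z - Z.max(axis=1, keepdims=True)        # numerically stable softmax
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

X = np.random.default_rng(1).random((32, 784))
W1, b1 = np.zeros((784, 256)), np.zeros(256)
W2, b2 = np.zeros((256, 10)), np.zeros(10)
P = forward(X, W1, b1, W2, b2)
```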
- Best Configuration: Batch size=128, Learning rate=0.001, Regularization=1e-07
- Performance: ~87% training accuracy, ~85% validation accuracy
- Key Finding: Lower learning rates (0.001) provided more stable training
- Best Configuration: Hidden size=256, ReLU activation, No dropout
- Performance: ~86% training accuracy, ~85% validation accuracy
- Key Findings:
- ReLU and Tanh outperformed Sigmoid activation
- Larger hidden layers (256) achieved better performance
- Dropout showed minimal impact on performance
- `lr_pred.csv` - Logistic regression predictions (one prediction per line, 0-9)
- `NN_pred.csv` - Neural network predictions (one prediction per line, 0-9)
- The code includes comprehensive hyperparameter search with progress tracking
- All models are implemented from scratch using only NumPy for core computations
- Training includes real-time loss and accuracy monitoring
- Best models are automatically selected based on validation performance
- Results visualization includes training curves for the optimal configurations