A PyTorch implementation of a Convolutional Neural Network (CNN) for classifying the Fashion-MNIST dataset. This project demonstrates image classification with deep learning, using batch normalization and proper training/validation procedures.
The Fashion-MNIST dataset consists of 70,000 grayscale images of 10 different clothing categories:
- T-shirt/top
- Trouser
- Pullover
- Dress
- Coat
- Sandal
- Shirt
- Sneaker
- Bag
- Ankle boot
Each image is 28x28 pixels, and in this implementation, we resize them to 16x16 for faster training.
The CNN model includes:
- First Conv Block: Conv2d(1→16) + BatchNorm + ReLU + MaxPool
- Second Conv Block: Conv2d(16→32) + BatchNorm + ReLU + MaxPool
- Fully Connected: Linear(512→10) + BatchNorm
- Output: 10 classes (Fashion-MNIST categories)
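The architecture above can be sketched as follows. This is a minimal sketch: the kernel sizes, padding, and layer names are assumptions, since main.py may differ; the channel counts and the 512→10 fully connected layer match the description.

```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    """Two-block CNN sketch (layer names and kernel sizes are assumed)."""
    def __init__(self, out_1=16, out_2=32):
        super().__init__()
        # First conv block: 1 -> 16 channels; 16x16 -> 8x8 after pooling
        self.cnn1 = nn.Conv2d(1, out_1, kernel_size=5, padding=2)
        self.bn1 = nn.BatchNorm2d(out_1)
        # Second conv block: 16 -> 32 channels; 8x8 -> 4x4 after pooling
        self.cnn2 = nn.Conv2d(out_1, out_2, kernel_size=5, padding=2)
        self.bn2 = nn.BatchNorm2d(out_2)
        self.pool = nn.MaxPool2d(2)
        # Fully connected: 32 * 4 * 4 = 512 features -> 10 classes
        self.fc = nn.Linear(out_2 * 4 * 4, 10)
        self.bn_fc = nn.BatchNorm1d(10)

    def forward(self, x):
        x = self.pool(torch.relu(self.bn1(self.cnn1(x))))
        x = self.pool(torch.relu(self.bn2(self.cnn2(x))))
        x = x.view(x.size(0), -1)  # flatten to (batch, 512)
        return self.bn_fc(self.fc(x))
```

Note how the 512 input features of the linear layer fall out of the shapes: a 16x16 input halved twice by max pooling gives 4x4 feature maps, and 32 channels × 4 × 4 = 512.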
Key Features:
- Batch normalization for better training stability
- ReLU activation functions
- Max pooling for dimensionality reduction
- Cross-entropy loss for multi-class classification
- SGD optimizer with learning rate 0.1
- Python 3.7+
- PyTorch 2.0+
- torchvision
- matplotlib
- numpy
- pillow
- Clone this repository or download the files
- Install the required dependencies:
pip install -r requirements.txt

Run the main script to start training:

python main.py

The script will:
- Download the Fashion-MNIST dataset automatically
- Display sample images from the dataset
- Initialize the CNN model with batch normalization
- Train for 5 epochs with validation
- Plot training loss and validation accuracy
- Save the trained model
- Image Size: 16x16 pixels
- Batch Size: 100
- Learning Rate: 0.1
- Epochs: 5
- Optimizer: SGD
- Loss Function: Cross-Entropy
The script generates:
- sample_data.png: Sample images from the dataset
- training_results.png: Training loss and validation accuracy plot
- fashion_mnist_cnn.pth: Trained model checkpoint
The model typically achieves:
- Training convergence within 5 epochs
- Validation accuracy of ~85-90%
- Fast training due to reduced image size (16x16)
.
├── main.py # Main training script
├── requirements.txt # Python dependencies
├── README.md # Project documentation
├── fashion/ # Dataset directory (created automatically)
├── sample_data.png # Generated sample images
├── training_results.png # Generated training plots
└── fashion_mnist_cnn.pth # Saved model checkpoint
- Automatic dataset download
- Image resizing to 16x16 pixels
- Tensor conversion with proper normalization
- Convolutional layers with batch normalization
- Proper forward pass implementation
- GPU support (if available)
- Proper train/validation split
- Batch processing with DataLoader
- Real-time training progress monitoring
- Model checkpointing
- Sample data visualization
- Training metrics plotting
- Results visualization
You can modify the following parameters in main.py:
IMAGE_SIZE = 16 # Image dimensions
BATCH_SIZE = 100 # Batch size for training
LEARNING_RATE = 0.1 # Learning rate for SGD
NUM_EPOCHS = 5       # Number of training epochs

The script automatically detects and uses a GPU if available:
- CUDA-enabled GPU will be used automatically
- Falls back to CPU if GPU is not available
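This device selection is the standard PyTorch idiom:

```python
import torch

# Prefer a CUDA GPU when one is present; otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```

The model and each batch are then moved to this device with `.to(device)`.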
Fashion-MNIST is a dataset of Zalando's article images consisting of:
- 60,000 training examples
- 10,000 test examples
- 10 classes
- 28x28 grayscale images
Original dataset: https://github.com/zalandoresearch/fashion-mnist
This project is based on educational material and is intended for learning purposes.
- Original Fashion-MNIST dataset by Zalando Research
- PyTorch community for the excellent deep learning framework
- Educational content adapted from IBM's deep learning course materials
- CUDA out of memory: Reduce batch size or use CPU
- Slow training: Enable GPU or reduce image size
- Dependency issues: Make sure all packages are installed correctly
- Use GPU for faster training
- Increase batch size if you have more memory
- Adjust learning rate based on convergence behavior
- Monitor validation accuracy to avoid overfitting
The training script provides:
- Real-time loss and accuracy monitoring
- Final model performance metrics
- Visual plots of training progress
- Saved model for future use
Expected results:
- Training loss should decrease over epochs
- Validation accuracy should improve and stabilize
- Model should achieve reasonable classification performance