Anishyou/Imageclassifier

🖼️ PyTorch Image Classifier

A step-by-step deep learning project for image classification using PyTorch. Learn to build, train, and deploy CNN models from scratch, then apply transfer learning to reach roughly 85-92% accuracy on CIFAR-10.

Sample Predictions


✨ Features

  • 📚 8 Progressive Learning Steps - From data loading to deploying on your own images
  • 🧠 Custom CNN Architecture - Build a convolutional neural network from scratch
  • 🔄 Transfer Learning - Use pre-trained ResNet18 for ~90% accuracy
  • 📈 Data Augmentation - Boost performance with image transformations
  • ⚡ GPU Support - Automatic CUDA/MPS detection for fast training
  • 🖼️ Classify Your Own Images - Use the trained model on any image
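
The automatic CUDA/MPS detection mentioned above can be sketched as follows. This is a minimal illustration, not the repository's code; the helper name `pick_device` is hypothetical.

```python
import torch

def pick_device() -> torch.device:
    """Return the fastest available backend: CUDA, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps exists in PyTorch 1.12+; guard for older builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(4, 3, 32, 32).to(device)   # a dummy CIFAR-sized batch
print(device, x.shape)
```

The model and every batch need to be moved to the same device with `.to(device)` before the forward pass.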

📋 Learning Path

| Step | File | What You'll Learn | Difficulty |
|------|------|-------------------|------------|
| 1 | steps/step1_data_loading.py | Datasets, transforms, DataLoaders | ⭐ |
| 2 | steps/step2_build_model.py | CNN architecture (Conv, Pool, FC layers) | ⭐⭐ |
| 3 | steps/step3_train_model.py | Training loop, loss functions, optimizers | ⭐⭐ |
| 4 | steps/step4_evaluate_and_predict.py | Evaluation, predictions, confusion matrix | ⭐⭐ |
| 5 | steps/step5_data_augmentation.py | Image augmentation techniques | ⭐⭐ |
| 6 | steps/step6_transfer_learning.py | Pre-trained models, fine-tuning | ⭐⭐⭐ |
| 7 | steps/step7_learning_rate_scheduler.py | Learning rate scheduling strategies | ⭐⭐⭐ |
| 8 | steps/step8_your_own_images.py | Classify your own images! | ⭐ |
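
Step 7 covers learning rate scheduling, which has no snippet elsewhere in this README. As a rough illustration, and not necessarily the strategy that step uses, a step-decay schedule with PyTorch's `StepLR` might look like this. The model, the starting rate of 0.1, and the decay settings are all assumptions chosen for the example.

```python
import torch
from torch import nn, optim

model = nn.Linear(4, 2)                                  # stand-in model
optimizer = optim.SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by 0.1 every 3 epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

lrs = []
for epoch in range(7):
    lrs.append(optimizer.param_groups[0]["lr"])          # rate used during this epoch
    # ... one epoch of training would run here ...
    optimizer.step()                                     # optimizer steps before the scheduler
    scheduler.step()                                     # advance the schedule once per epoch

print([f"{lr:.4f}" for lr in lrs])
```

With these settings the rate is 0.1 for epochs 0-2, 0.01 for epochs 3-5, and 0.001 at epoch 6.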

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/Anishyou/Imageclassifier.git
cd Imageclassifier

2. Install Dependencies

pip install -r requirements.txt

3. Run the Steps

cd steps

# Learn the fundamentals
python step1_data_loading.py      # Understand data loading
python step2_build_model.py       # Explore CNN architecture

# Train and evaluate
python step3_train_model.py       # Train the model (~10 min CPU, ~2 min GPU)
python step4_evaluate_and_predict.py  # See results

# Advanced techniques
python step5_data_augmentation.py     # Data augmentation
python step6_transfer_learning.py     # Transfer learning with ResNet
python step7_learning_rate_scheduler.py  # LR scheduling

# Use on your own images
python step8_your_own_images.py   # Classify any image!

📊 Dataset: CIFAR-10

| Property | Value |
|----------|-------|
| Total Images | 60,000 (50k train, 10k test) |
| Image Size | 32×32 RGB |
| Classes | 10 |

Classes: ✈️ airplane, 🚗 automobile, 🐦 bird, 🐱 cat, 🦌 deer, 🐕 dog, 🐸 frog, 🐴 horse, 🚢 ship, 🚚 truck


🏗️ Model Architecture

Custom CNN (from scratch)

Input (3×32×32)
    ↓
Conv1 (32 filters) → BatchNorm → ReLU → MaxPool → (32×16×16)
    ↓
Conv2 (64 filters) → BatchNorm → ReLU → MaxPool → (64×8×8)
    ↓
Conv3 (128 filters) → BatchNorm → ReLU → MaxPool → (128×4×4)
    ↓
Flatten (2048)
    ↓
FC1 (256) → ReLU → Dropout(0.5)
    ↓
FC2 (10) → Output (class scores)

Parameters: ~596K trainable parameters
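
The diagram above can be approximated in code as shown below. This is a sketch, not the repository's `ImageClassifier`: the class name `SketchCNN` is hypothetical, and 3×3 kernels with `padding=1` and 2×2 max pooling are assumptions, since the diagram does not state them. The parameter count of this sketch depends on those choices, so it need not match the figure quoted above.

```python
import torch
from torch import nn

class SketchCNN(nn.Module):
    """Hypothetical stand-in for the diagram; 3x3 kernels with padding=1 are assumed."""

    def __init__(self, num_classes: int = 10):
        super().__init__()

        def block(c_in: int, c_out: int) -> nn.Sequential:
            # The conv keeps the spatial size (padding=1); MaxPool halves it.
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )

        self.features = nn.Sequential(block(3, 32), block(32, 64), block(64, 128))
        self.classifier = nn.Sequential(
            nn.Flatten(),                    # 128 * 4 * 4 = 2048
            nn.Linear(128 * 4 * 4, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),     # raw class scores (logits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = SketchCNN()
logits = model(torch.randn(2, 3, 32, 32))
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(logits.shape, n_params)
```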


📈 Results

| Model | Accuracy | Training Time |
|-------|----------|---------------|
| Custom CNN (10 epochs) | ~70-75% | ~10 min (GPU) |
| With Data Augmentation | ~75-80% | ~15 min (GPU) |
| Transfer Learning (ResNet18) | ~85-92% | ~20 min (GPU) |

💾 Trained Models

Pre-trained model weights are included in the models/ folder:

| File | Description | How to Use |
|------|-------------|------------|
| models/best_model.pth | Custom CNN trained on CIFAR-10 | Load with step2_build_model.ImageClassifier |
| models/feature_extractor_best.pth | ResNet18 transfer learning | Load with torchvision.models.resnet18 |

Loading a Trained Model

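import sys
import torch

sys.path.append('steps')  # make the step modules importable from the project root
from step2_build_model import ImageClassifier

# Load custom CNN; map_location lets GPU-saved weights load on a CPU-only machine
model = ImageClassifier(num_classes=10)
checkpoint = torch.load('models/best_model.pth', map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # inference mode: disables dropout, uses BatchNorm running statistics

# Check accuracy achieved
print(f"Best accuracy: {checkpoint['best_acc']:.2f}%")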
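
The loading code reads two keys from the checkpoint, `model_state_dict` and `best_acc`. How the training step writes the file is not shown here, so the round trip below is an assumption about the format, using a tiny stand-in model and a made-up accuracy value.

```python
import os
import tempfile
import torch
from torch import nn

model = nn.Linear(8, 10)          # stand-in for the real network
best_acc = 72.5                   # illustrative value, not a measured result

# Save a checkpoint with the two keys the loading code expects.
path = os.path.join(tempfile.mkdtemp(), "best_model.pth")
torch.save({"model_state_dict": model.state_dict(), "best_acc": best_acc}, path)

# Restore into a fresh model with the same architecture.
restored = nn.Linear(8, 10)
checkpoint = torch.load(path, map_location="cpu")
restored.load_state_dict(checkpoint["model_state_dict"])
restored.eval()
print(f"Best accuracy: {checkpoint['best_acc']:.2f}%")
```

Saving the `state_dict` rather than the whole model object keeps the file independent of where the class definition lives.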

🖼️ Classify Your Own Images

cd steps
python step8_your_own_images.py --image path/to/your/image.jpg

Or use in code (from project root):

from PIL import Image
import torch
import torchvision.transforms as transforms
import sys
sys.path.append('steps')
from step2_build_model import ImageClassifier

# Classes
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Load model (map_location lets GPU-saved weights load on a CPU-only machine)
model = ImageClassifier(num_classes=10)
checkpoint = torch.load('models/best_model.pth', map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Prepare image
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
])

image = Image.open('your_image.jpg').convert('RGB')
input_tensor = transform(image).unsqueeze(0)

# Predict
with torch.no_grad():
    output = model(input_tensor)
    _, predicted = output.max(1)
    
print(f"Prediction: {classes[predicted.item()]}")
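
The snippet above prints only the top class. To also see how confident the model is, the raw scores can be passed through a softmax. The sketch below uses hand-written stand-in scores in place of `model(input_tensor)` so it runs without the trained weights.

```python
import torch

classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Stand-in scores; in practice this is model(input_tensor) with shape (1, 10).
output = torch.tensor([[0.1, 0.3, 2.0, 4.0, 0.2, 3.0, 0.0, 0.5, 0.1, 0.4]])

probs = torch.softmax(output, dim=1)          # scores -> probabilities summing to 1
top_probs, top_idx = probs.topk(3, dim=1)     # three most likely classes

for p, i in zip(top_probs[0], top_idx[0]):
    print(f"{classes[i.item()]}: {p.item():.1%}")
```

Keep in mind that a CIFAR-10 model always picks one of its ten classes, even for images that belong to none of them.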

📁 Project Structure

Imageclassifier/
├── 📂 steps/                         # All learning step files
│   ├── step1_data_loading.py         # Data loading tutorial
│   ├── step2_build_model.py          # CNN architecture
│   ├── step3_train_model.py          # Training loop
│   ├── step4_evaluate_and_predict.py # Evaluation
│   ├── step5_data_augmentation.py    # Augmentation
│   ├── step6_transfer_learning.py    # Transfer learning
│   ├── step7_learning_rate_scheduler.py  # LR scheduling
│   └── step8_your_own_images.py      # Use your own images
├── 📂 models/                        # Trained model weights
│   ├── best_model.pth                # Custom CNN weights
│   └── feature_extractor_best.pth    # Transfer learning weights
├── 📂 outputs/                       # Generated images & plots
│   └── (training curves, predictions, etc.)
├── 📂 data/                          # CIFAR-10 dataset (auto-downloaded)
├── 📄 requirements.txt               # Dependencies
├── 📄 LICENSE                        # MIT License
└── 📄 README.md                      # This file

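💡 Key Concepts

🔹 Data Transforms
import torchvision.transforms as transforms

# CIFAR-10 per-channel statistics, matching the values used in the inference example
mean, std = (0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)
transform = transforms.Compose([
    transforms.ToTensor(),               # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean, std)      # per-channel (x - mean) / std
])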
🔹 Training Loop
for epoch in range(epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()           # Reset gradients
        outputs = model(images)         # Forward pass
        loss = criterion(outputs, labels)  # Compute loss
        loss.backward()                 # Backward pass
        optimizer.step()                # Update weights
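
The loop above leaves `model`, `criterion`, `optimizer`, and `train_loader` undefined. A self-contained version on synthetic data could look like the sketch below; the linear model, the Adam optimizer, and the random dataset are placeholders chosen for the example, not the repository's setup.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Synthetic stand-in for CIFAR-10: 256 random "images" with random labels.
data = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
train_loader = DataLoader(data, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))   # placeholder model
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

epoch_losses = []
for epoch in range(3):
    model.train()                              # enable dropout / BatchNorm updates
    running = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()                  # reset gradients
        loss = criterion(model(images), labels)
        loss.backward()                        # backward pass
        optimizer.step()                       # update weights
        running += loss.item() * images.size(0)
    epoch_losses.append(running / len(data))   # mean loss over the epoch
    print(f"epoch {epoch + 1}: loss {epoch_losses[-1]:.4f}")
```
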
🔹 Evaluation Mode
model.eval()
with torch.no_grad():
    outputs = model(images)
    _, predicted = outputs.max(1)
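
Once predictions are collected, accuracy and a confusion matrix follow directly. The sketch below uses small hand-written tensors with three classes in place of real model output; it is one way to compute these metrics, not necessarily how step 4 does it.

```python
import torch

num_classes = 3
# Stand-in values; in practice these are gathered over the whole test loader.
labels    = torch.tensor([0, 0, 1, 1, 2, 2, 2, 1])
predicted = torch.tensor([0, 1, 1, 1, 2, 0, 2, 1])

accuracy = (predicted == labels).float().mean().item() * 100

# Confusion matrix: rows are true classes, columns are predicted classes.
flat = labels * num_classes + predicted
confusion = torch.bincount(flat, minlength=num_classes ** 2)
confusion = confusion.reshape(num_classes, num_classes)

print(f"Accuracy: {accuracy:.1f}%")
print(confusion)
```

Off-diagonal entries show which classes get confused with each other, which a single accuracy number hides.
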
🔹 Transfer Learning
import torch.nn as nn
from torchvision import models

# Load pre-trained ResNet18
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Replace final layer for 10 classes (in_features is 512 for ResNet18)
model.fc = nn.Linear(model.fc.in_features, 10)

🛠️ Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • torchvision
  • matplotlib
  • numpy
  • tqdm
  • Pillow

🤝 Contributing

Contributions are welcome! Feel free to:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments


Made with ❤️ for learning deep learning
