Traffic Sign Recognition with Machine Learning

This repository contains the code for our final project in Iowa State University's COMS 5730 Machine Learning course. The project compares classical machine learning and deep learning models on the German Traffic Sign Recognition Benchmark (GTSRB) under challenging conditions, with a focus on partially occluded signs.

Dataset

The project uses the German Traffic Sign Recognition Benchmark (GTSRB) dataset, which contains over 50,000 images across 43 traffic sign classes. The images are real-world photos taken under varied lighting and weather conditions, and the dataset is widely used in traffic sign recognition research.

Project Overview

The goal of this project is to evaluate different machine learning models for traffic sign classification, especially under occlusion scenarios that autonomous vehicles may encounter in real-world environments. The project introduces random occlusion augmentation by adding black squares to images, simulating scenarios where signs might be partially blocked by objects like tree branches or other vehicles.
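
The occlusion augmentation described above can be sketched in a few lines of NumPy. This is an illustration only, not the code in utils/augmentation.py: the function name `add_random_occlusions` and the square size are assumptions, and the default of 7 squares follows the count given under Key Features.

```python
import numpy as np


def add_random_occlusions(image, num_squares=7, square_size=4, rng=None):
    """Return a copy of `image` with black squares pasted at random positions.

    `image` is an (H, W) or (H, W, C) array. The square size is an
    illustrative assumption, not a value taken from the project.
    """
    rng = np.random.default_rng() if rng is None else rng
    occluded = image.copy()
    height, width = occluded.shape[:2]
    if square_size > min(height, width):
        raise ValueError("square_size must fit inside the image")
    for _ in range(num_squares):
        # Pick the top-left corner so the whole square stays inside the image.
        top = int(rng.integers(0, height - square_size + 1))
        left = int(rng.integers(0, width - square_size + 1))
        occluded[top:top + square_size, left:left + square_size] = 0
    return occluded


# Example: occlude a synthetic all-white 32x32 RGB image.
image = np.full((32, 32, 3), 255, dtype=np.uint8)
occluded = add_random_occlusions(image, rng=np.random.default_rng(0))
```

Squares may overlap, so the occluded area varies from image to image. Passing a seeded generator makes the augmentation reproducible.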

We evaluated:

CNN: Custom Convolutional Neural Network
ResNet-50: Pre-trained Residual Network with transfer learning
Vision Transformer (ViT): Pre-trained ViT with fine-tuning
Support Vector Machine (SVM): With PCA dimensionality reduction
K-Nearest Neighbors (KNN): With PCA dimensionality reduction

Performance was measured using accuracy and inference time, both critical factors for autonomous driving applications.
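
A minimal sketch of how the two metrics could be computed is shown below. The helper names `mean_inference_time_ms` and `accuracy`, the warm-up count, and the stand-in classifier are assumptions for illustration; the project's own measurement code may differ.

```python
import time


def mean_inference_time_ms(predict, samples, warmup=5):
    """Average wall-clock time of `predict` per sample, in milliseconds."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Warm-up calls keep one-time setup costs out of the measurement.
    for sample in samples[:warmup]:
        predict(sample)
    start = time.perf_counter()
    for sample in samples:
        predict(sample)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(samples)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Example with a stand-in classifier that labels integers by parity.
samples = list(range(100))
labels = [s % 2 for s in samples]
predict = lambda s: s % 2
acc = accuracy([predict(s) for s in samples], labels)
latency_ms = mean_inference_time_ms(predict, samples)
```

When a model runs on a GPU, the device generally has to be synchronized before the timer is read. Otherwise asynchronous kernel launches make the measured time look shorter than it is.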

Final Report

For a comprehensive analysis of our methodology, experiments, and findings, please see our detailed final report. The report includes thorough explanations of:

Our data augmentation approach with occlusion
Model architectures and implementation details
Experimental setup and evaluation metrics
Complete results with visualizations
In-depth discussion and conclusions

Project Structure

.
├── algos/                  # Model implementations
│   ├── cnn.py              # Custom CNN implementation
│   ├── knn.py              # K-Nearest Neighbors implementation
│   ├── model.py            # Base model abstract class
│   ├── resnet.py           # ResNet50 implementation
│   ├── svm.py              # Support Vector Machine implementation
│   └── vit.py              # Vision Transformer implementation
├── utils/                  # Utility functions
│   ├── augmentation.py     # Data augmentation script
│   ├── dataset.py          # Custom dataset implementation
│   ├── logger.py           # Logging utilities
│   └── pca.py              # PCA dimensionality reduction
├── main.py                 # Main script to run experiments
├── experiment.py           # Experiment configuration
├── plotter.py              # Script for creating visualizations
├── prepare_datasets.sh     # Bash script for dataset preparation
├── prepare_datasets.ps1    # PowerShell script for dataset preparation
└── environment.yaml        # Conda environment specification
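
The `model.py` entry suggests that all algorithms share a common interface. One way such a base class could look is sketched below. The class name `Model`, its methods `train`, `predict`, and `evaluate`, and the toy subclass are hypothetical and are not taken from the repository.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical common interface for all classifiers in an experiment."""

    @abstractmethod
    def train(self, features, labels):
        """Fit the model to the training data."""

    @abstractmethod
    def predict(self, features):
        """Return one predicted label per input sample."""

    def evaluate(self, features, labels):
        """Accuracy on a labelled set, shared by every subclass."""
        predictions = self.predict(features)
        correct = sum(int(p == y) for p, y in zip(predictions, labels))
        return correct / len(labels)


class MajorityClass(Model):
    """Toy subclass: always predicts the most frequent training label."""

    def train(self, features, labels):
        self.majority = max(set(labels), key=labels.count)

    def predict(self, features):
        return [self.majority for _ in features]


baseline = MajorityClass()
baseline.train([[0], [1], [2]], [1, 1, 0])
score = baseline.evaluate([[3], [4]], [1, 0])
```

With a shared interface like this, an experiment runner can loop over CNN, ResNet, ViT, SVM, and KNN objects without special-casing any of them.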

Key Features

Data Augmentation: Implements occlusion-based augmentation by adding 7 random black squares to images, simulating real-world partial obstruction of traffic signs.
Dimensionality Reduction: Uses PCA to reduce feature dimensions for traditional ML models (SVM, KNN).
Transfer Learning: Leverages pre-trained models (ResNet50, ViT) and fine-tunes them for the traffic sign classification task.
Performance Tracking: Comprehensive logging of training metrics, testing accuracy, and inference time.
Visualization: Generates comparison plots for model performance.
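
The dimensionality-reduction step for the classical models can be illustrated with a small NumPy sketch. The project's utils/pca.py may well rely on a library implementation instead (scikit-learn is listed in the acknowledgements). The functions `fit_pca`, `transform`, and `knn_predict`, the component count, and the toy data are assumptions for illustration only.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the training mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def transform(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T


def knn_predict(train_X, train_y, query_X, k=3):
    """Classify each query by majority vote among its k nearest neighbours."""
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


# Toy data: two well-separated clusters of flattened 8x8 "images".
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 0.1, size=(20, 64))
X1 = rng.normal(1.0, 0.1, size=(20, 64))
train_X = np.vstack([X0, X1])
train_y = np.array([0] * 20 + [1] * 20)

mean, components = fit_pca(train_X, n_components=2)
train_Z = transform(train_X, mean, components)
test_X = np.vstack([np.full((1, 64), 0.05), np.full((1, 64), 0.95)])
# Reuse the training mean and components on the test set.
test_Z = transform(test_X, mean, components)
predictions = knn_predict(train_Z, train_y, test_Z)
```

The projection is fitted on the training set only and then reused for test data, which avoids leaking test statistics into the features.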

Custom CNN Architecture

Our custom CNN architecture consists of stacked convolutional layers with batch normalization, ReLU activation, dropout, and max pooling.
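
A hedged PyTorch sketch of a network built from these components follows. It is not the architecture in algos/cnn.py: the class name `SmallTrafficSignCNN`, the number of blocks, channel widths, dropout rate, and 32x32 input size are all assumptions. Only the layer types and the 43 output classes come from this README.

```python
import torch
from torch import nn


def conv_block(in_channels, out_channels, dropout):
    """Conv -> BatchNorm -> ReLU -> Dropout -> MaxPool, halving height and width."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
        nn.Dropout2d(dropout),
        nn.MaxPool2d(kernel_size=2),
    )


class SmallTrafficSignCNN(nn.Module):
    """Illustrative stack of conv blocks followed by a linear classifier."""

    def __init__(self, num_classes=43, dropout=0.25):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 32, dropout),
            conv_block(32, 64, dropout),
            conv_block(64, 128, dropout),
        )
        # A 32x32 input is halved three times, leaving a 4x4 feature map.
        self.classifier = nn.Linear(128 * 4 * 4, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))


model = SmallTrafficSignCNN()
model.eval()  # use running BatchNorm statistics and disable dropout
with torch.no_grad():
    logits = model(torch.zeros(2, 3, 32, 32))
```

The forward pass returns one row of 43 class scores per input image. Training would typically pair these logits with a cross-entropy loss.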

Experimental Results

Our experiments revealed significant differences in model performance:

Accuracy

ResNet-50: Achieved the highest testing accuracy of 98.18%
CNN: Reached 95.93% accuracy
ViT: Obtained 91.77% accuracy
SVM: Achieved 73.23% accuracy
KNN: Lowest performer with 50.51% accuracy

Inference Time

SVM: Extremely fast at 0.0005 ms per prediction
CNN: Efficient at 0.9763 ms
KNN: 1.3251 ms
ViT: 3.9087 ms
ResNet-50: Slowest at 4.5004 ms per prediction

Key Findings

ResNet-50 offered the highest accuracy but at the cost of higher inference time, making it suitable for applications where precision is more critical than speed.
CNN provided the best balance between accuracy and speed, making it most suitable for real-time traffic sign recognition systems.
SVM, despite its lower accuracy, could be valuable in scenarios where extremely fast inference time is prioritized over accuracy.
Data augmentation with occlusion helped improve model robustness to real-world scenarios.
Transfer learning from pre-trained models proved highly effective for this task.
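
The accuracy/speed trade-off behind these findings can be checked directly from the reported numbers. In the sketch below, the helper names `fastest_above` and `pareto_front` and the 95% accuracy floor are assumptions chosen for illustration. The figures themselves are copied from the Accuracy and Inference Time sections.

```python
# Test accuracy (%) and per-prediction inference time (ms) as reported above.
results = {
    "ResNet-50": (98.18, 4.5004),
    "CNN": (95.93, 0.9763),
    "ViT": (91.77, 3.9087),
    "SVM": (73.23, 0.0005),
    "KNN": (50.51, 1.3251),
}


def fastest_above(results, min_accuracy):
    """Name of the lowest-latency model meeting the accuracy floor, or None."""
    eligible = {name: r for name, r in results.items() if r[0] >= min_accuracy}
    if not eligible:
        return None
    return min(eligible, key=lambda name: eligible[name][1])


def pareto_front(results):
    """Models not beaten on both accuracy and latency by any other model."""
    front = []
    for name, (acc, ms) in results.items():
        dominated = any(
            other_acc >= acc and other_ms <= ms and (other_acc, other_ms) != (acc, ms)
            for other_acc, other_ms in results.values()
        )
        if not dominated:
            front.append(name)
    return front


choice = fastest_above(results, min_accuracy=95.0)  # "CNN"
front = pareto_front(results)  # ["ResNet-50", "CNN", "SVM"]
```

Under these numbers, ViT and KNN are each matched or beaten by the CNN on both axes. This is why the findings single out ResNet-50, the CNN, and the SVM.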

Setup Instructions

Environment Setup

Clone this repository:

```
git clone https://github.com/nvan21/Traffic-Sign-Classification.git
cd Traffic-Sign-Classification
```

Create the conda environment:
```
conda env create -f environment.yaml
```

Activate the environment:

```
conda activate traffic-sign-classification
```

Dataset Preparation

Run the preparation script (choose based on your operating system):

For Linux/macOS:

```
chmod +x prepare_datasets.sh
./prepare_datasets.sh
```

For Windows:

```
.\prepare_datasets.ps1
```

These scripts will:

Download the GTSRB dataset from Kaggle
Extract the dataset
Create augmented versions of the training data with random occlusions
Perform PCA dimensionality reduction

Usage

Running the Experiments

To run all model experiments:

```
python main.py
```

Visualizing Results

After running the experiments, generate the performance comparison visualizations:

```
python plotter.py
```

This will create visualizations in the images/ directory showing:

Accuracy and loss curves for each model
Comparison of inference times
Comparison of testing accuracies

Conclusion

While ResNet-50 offers the highest testing accuracy, the custom CNN struck an excellent balance between high accuracy and low inference time, making it the most suitable model for high-speed traffic sign recognition. The Vision Transformer showed promising results despite not matching CNN performance. SVM demonstrated remarkably fast inference time, which could be useful in situations where speed is prioritized over accuracy.

Acknowledgements

The German Traffic Sign Recognition Benchmark dataset creators
PyTorch and scikit-learn libraries
Course instructors and teaching assistants

Authors: Nathan Van Utrecht, Patrick Whitehouse
Course: COMS 5730 Machine Learning
Date: August 2024 - December 2024
