
🤟 ASL Alphabet Recognition Model

A deep learning model for American Sign Language (ASL) alphabet recognition using MobileNetV3Large architecture with transfer learning. This project achieves high accuracy in classifying ASL hand signs for letters A-Z and special characters.

🎯 Overview

This project implements a state-of-the-art deep learning model for recognizing American Sign Language alphabet gestures. The model uses MobileNetV3Large as the base architecture with custom classification layers, trained in two phases:

  1. Phase 1: Training the classifier head with frozen base model
  2. Phase 2: Fine-tuning the entire network with reduced learning rate

The model is optimized for both accuracy and deployment, with support for:

  • Keras format for training and evaluation
  • TensorFlow Lite format for mobile and edge device deployment

✨ Features

  • 🧠 Transfer Learning: Leverages pre-trained MobileNetV3Large on ImageNet
  • 🎨 Data Augmentation: Random rotation, zoom, contrast, and brightness adjustments
  • ⚖️ Class Balancing: Automatic class weight calculation for imbalanced datasets
  • 📊 Comprehensive Evaluation: Detailed metrics, confusion matrix, and visualizations
  • 📱 Mobile-Ready: TensorFlow Lite export for on-device inference
  • 🚀 GPU Acceleration: Mixed precision training support for faster training
  • 📈 Learning Rate Scheduling: Adaptive learning rate reduction on plateau

📊 Dataset

The model is trained on the ASL Alphabet Dataset, which includes:

  • 26 letters (A-Z)
  • 3 special characters (space, delete, nothing)
  • Total: 29 classes

Data Split

  • Training: 70% of the dataset
  • Validation: 15% of the dataset
  • Test: 15% of the dataset

Expected dataset structure:

dataset/
├── A/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
├── B/
├── C/
...
├── Z/
├── space/
├── del/
└── nothing/
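The 70/15/15 split can be sketched with `tf.keras.utils.image_dataset_from_directory`: load 70% for training, then divide the remaining 30% holdout in half. This is an illustrative reconstruction, not the notebook's exact code; `load_splits` and its defaults are assumptions.

```python
import tensorflow as tf

def load_splits(data_dir, img_size=(200, 200), batch_size=64, seed=42):
    """Load 70% for training; split the remaining 30% holdout evenly
    into validation and test (15% / 15% of the full dataset each)."""
    common = dict(
        image_size=img_size,
        batch_size=batch_size,
        label_mode="categorical",
        validation_split=0.3,
        seed=seed,  # same seed so the two subsets do not overlap
    )
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common)
    holdout = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common)
    # Split the 30% holdout in half, batch-wise
    n = tf.data.experimental.cardinality(holdout).numpy()
    val_ds = holdout.take(n // 2)
    test_ds = holdout.skip(n // 2)
    return train_ds, val_ds, test_ds
```

Note the split of the holdout is batch-wise, so the validation/test boundary is only exact when the holdout divides evenly into batches.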

🏗️ Model Architecture

The model consists of:

  1. Base Model: MobileNetV3Large (pre-trained on ImageNet)

    • Input shape: 200x200x3
    • Pooling: Global Average Pooling
    • Initial state: Frozen (Phase 1)
  2. Custom Head:

    • Dropout layer (0.2)
    • Dense layer (29 units, softmax activation)
  3. Training Configuration:

    • Phase 1: Adam optimizer (lr=0.001), 15 epochs
    • Phase 2: Adam optimizer (lr=0.00002), 15 epochs
    • Loss: Categorical Crossentropy
    • Callbacks: ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
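The architecture above can be sketched as follows. This is a minimal reconstruction from the listed configuration, not the notebook's exact code; `build_model` is an illustrative name, and `include_preprocessing=False` is an assumption based on the pipeline rescaling inputs to [0, 1] itself.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(num_classes=29, img_size=(200, 200), weights="imagenet"):
    """Phase 1 model: frozen MobileNetV3Large base + dropout + softmax head."""
    base = tf.keras.applications.MobileNetV3Large(
        input_shape=img_size + (3,),
        include_top=False,
        weights=weights,
        pooling="avg",              # Global Average Pooling
        include_preprocessing=False,  # inputs already rescaled to [0, 1]
    )
    base.trainable = False  # frozen for Phase 1

    inputs = tf.keras.Input(shape=img_size + (3,))
    x = base(inputs, training=False)  # keep BatchNorm stats in inference mode
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```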

🚀 Installation

Prerequisites

  • Python 3.8+
  • TensorFlow 2.x
  • CUDA-compatible GPU (optional, but recommended)

Install Dependencies

pip install numpy pandas matplotlib seaborn scikit-learn tensorflow pillow

Or install from a requirements file:

pip install -r requirements.txt

💻 Usage

Running the Notebook

  1. Open the notebook:

    jupyter notebook asl-model.ipynb
  2. Update dataset path in the notebook to point to your ASL dataset location

  3. Run all cells to:

    • Load and prepare the dataset
    • Train the model
    • Evaluate performance
    • Export models

Using the Trained Model

import tensorflow as tf
import numpy as np
from PIL import Image

# Load the model
model = tf.keras.models.load_model('models/model.keras')

# Load and preprocess the image (force RGB, match the 200x200 model input)
img = Image.open('test_image.jpg').convert('RGB').resize((200, 200))
img_array = np.asarray(img, dtype=np.float32) / 255.0  # rescale to [0, 1]
img_array = np.expand_dims(img_array, axis=0)          # add a batch dimension

# Make prediction
predictions = model.predict(img_array)
predicted_class = np.argmax(predictions[0])

# Load class names
with open('models/training_set_labels.txt', 'r') as f:
    class_names = [line.strip() for line in f.readlines()]

print(f"Predicted: {class_names[predicted_class]}")
print(f"Confidence: {predictions[0][predicted_class]:.2%}")

🎓 Training Process

The training follows a two-phase approach:

Phase 1: Transfer Learning (15 epochs)

  • Base model layers are frozen
  • Only the classification head is trained
  • Higher learning rate (0.001)
  • Class weights applied for imbalanced data
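The class weights can be computed with scikit-learn (already in the requirements). This is a sketch; `make_class_weights` is an illustrative helper name, not the notebook's exact code.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

def make_class_weights(labels):
    """Map class index -> weight, inversely proportional to class frequency.
    `labels` holds one integer class index per training image."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    weights = compute_class_weight("balanced", classes=classes, y=labels)
    return dict(zip(classes.tolist(), weights.tolist()))

# Passed to training via: model.fit(..., class_weight=make_class_weights(labels))
```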

Phase 2: Fine-tuning (15 epochs)

  • All layers are unfrozen
  • Entire network is fine-tuned
  • Lower learning rate (0.00002)
  • Learning rate reduction on plateau
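Phase 2 can be sketched as below. The `patience` and `factor` values for the callbacks are assumptions, not values taken from the notebook; only the callback types and learning rates come from the sections above.

```python
import tensorflow as tf

def prepare_finetune(model, learning_rate=2e-5):
    """Phase 2: unfreeze every layer and recompile with a reduced LR.
    Recompiling is required for the trainable change to take effect."""
    model.trainable = True  # propagates to all sublayers
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Callbacks named in the Model Architecture section (parameters are assumptions)
callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        "best_model_final.keras", monitor="val_accuracy", save_best_only=True),
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.2, patience=3),
]
```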

Data Augmentation

Applied during training to improve generalization:

  • Random rotation (±10%)
  • Random zoom (±10%)
  • Random contrast (±20%)
  • Random brightness (±20%)
  • Rescaling to [0, 1]
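With Keras preprocessing layers, the pipeline above roughly corresponds to the sketch below (note `RandomBrightness` requires a reasonably recent TensorFlow release; the layer order follows the list above):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Augmentation layers are active only when called with training=True,
# so validation and test images pass through unchanged (except rescaling).
data_augmentation = tf.keras.Sequential([
    layers.RandomRotation(0.1),    # ±10% of a full rotation
    layers.RandomZoom(0.1),        # ±10%
    layers.RandomContrast(0.2),    # ±20%
    layers.RandomBrightness(0.2),  # ±20%, expects pixel values in [0, 255]
    layers.Rescaling(1.0 / 255),   # map pixels to [0, 1]
])

# Typically applied inside the model, or via the dataset pipeline:
# train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
```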

📈 Results

The model achieves high accuracy on the test set with robust performance across all ASL alphabet classes.

Training Outputs

  • best_model_phase1.keras: Best model from Phase 1
  • best_model_final.keras: Final best model after Phase 2
  • training_results.png: Visualization of training metrics
  • training_history.json: Complete training history
  • model_metadata.json: Model information and metadata

Visualization

Training plots include:

  • Training vs Validation Accuracy
  • Training vs Validation Loss
  • Final metrics summary

📦 Model Export

The notebook automatically exports models in multiple formats:

1. Keras Format (.keras)

  • Full model with architecture and weights
  • Use for continued training or Python inference
  • Location: models/model.keras

2. TensorFlow Lite Format (.tflite)

  • Optimized for mobile and edge devices
  • Smaller file size with quantization
  • Location: models/model.tflite

3. Supporting Files

  • training_set_labels.txt: Class names mapping
  • model_metadata.json: Model configuration and metrics
  • training_history.json: Complete training logs
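The TensorFlow Lite export can be sketched with the standard converter API and default (dynamic-range) quantization, which is what shrinks the file size. `export_tflite` is an illustrative wrapper, not the notebook's exact code.

```python
import tensorflow as tf

def export_tflite(model, out_path="model.tflite"):
    """Convert a trained Keras model to TensorFlow Lite with default
    (dynamic-range) quantization for a smaller on-device binary."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_bytes = converter.convert()
    with open(out_path, "wb") as f:
        f.write(tflite_bytes)
    return tflite_bytes
```

The resulting file is loaded on-device with `tf.lite.Interpreter` (or the platform's TFLite runtime), using the same 200x200, [0, 1]-scaled input as the Keras model.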

📋 Requirements

numpy>=1.19.0
pandas>=1.2.0
matplotlib>=3.3.0
seaborn>=0.11.0
scikit-learn>=0.24.0
tensorflow>=2.8.0
pillow>=8.0.0

🔧 Configuration

Key hyperparameters that can be adjusted:

BATCH_SIZE = 64           # Batch size for training
IMG_SIZE = (200, 200)     # Input image dimensions
EPOCHS_PHASE1 = 15        # Training epochs for Phase 1
EPOCHS_PHASE2 = 15        # Training epochs for Phase 2
LEARNING_RATE_1 = 0.001   # Phase 1 learning rate
LEARNING_RATE_2 = 0.00002 # Phase 2 learning rate

🎯 Use Cases

  • Mobile Applications: Real-time ASL recognition on smartphones
  • Educational Tools: Interactive ASL learning applications
  • Accessibility Solutions: Communication aids for deaf and hard-of-hearing individuals
  • Research: Baseline for gesture recognition research

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is available for educational and research purposes.

🙏 Acknowledgments

  • ASL Alphabet Dataset on Kaggle
  • TensorFlow and Keras teams
  • MobileNetV3 architecture by Google Research

📞 Contact

For questions or feedback, please open an issue on GitHub.


Made with ❤️ for the deaf and hard-of-hearing community