This project uses deep learning to classify Indian Sign Language (ISL) gestures. It recognizes ISL gestures with two models, ResNet50 and MobileNetV2, and leverages transfer learning with pre-trained weights to reduce training time and improve classification accuracy.
- `Indian_Sign_Langauge_Prediction.ipynb`: main notebook for training and evaluation
- `README.md`: project documentation
- Build a deep learning model for recognizing ISL gestures.
- Use pre-trained models like ResNet50 and MobileNetV2 for transfer learning.
- Evaluate the models on both the training and test datasets, reporting accuracy.
- Transfer Learning: Utilizes pre-trained models (ResNet50 and MobileNetV2) to enhance classification performance with less training data.
- Data Augmentation: Augments the training data with rotations, shifts, zooms, and flips to make the model more robust.
- Model Evaluation: Evaluates the models on both training and test datasets, reporting accuracy.
- Visualization: Visualizes the training and validation accuracy curves for both models.
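The augmentation strategy described above (rotations, shifts, zooms, and flips) can be sketched with Keras' `ImageDataGenerator`. The specific parameter values below are illustrative assumptions, not the notebook's exact configuration:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; the values in the notebook may differ.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # normalize pixel values to [0, 1]
    rotation_range=15,       # random rotations up to 15 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.1,          # random zooms
    horizontal_flip=True,    # random horizontal flips
)
```

A generator configured this way is typically passed to `model.fit` via `flow_from_directory` or `flow`, so each epoch sees slightly different versions of the training images.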
- ResNet50 achieved 88.65% accuracy on the test set.
- MobileNetV2 achieved 87.23% accuracy on the test set.
- Both models performed well, with training accuracies above 98%.
- Python 3.x
- TensorFlow 2.x
- Keras
- OpenCV
- Matplotlib
- scikit-learn
- KaggleHub
- Indian Sign Language Dataset: A collection of images of ISL gestures, categorized by letters, numbers, and other common gestures.
Note: The dataset is preprocessed by resizing images, encoding labels, and applying augmentation techniques.
- Clone or download this repository.
- Install the required dependencies:
pip install tensorflow opencv-python matplotlib scikit-learn kagglehub
- Run the `Indian_Sign_Langauge_Prediction.ipynb` notebook to train the models.
- The models will be trained on the dataset and display evaluation results, including training and test accuracy.
- Use the plots generated by `matplotlib` to visualize model performance.
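The accuracy curves can be produced with a small matplotlib helper like the one below. The metric key names assume standard Keras `History` naming (`accuracy` / `val_accuracy`); the helper itself is a sketch, not the notebook's exact plotting code:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

def plot_accuracy(history, title):
    """Plot training vs. validation accuracy from a Keras History object."""
    plt.figure()
    plt.plot(history.history["accuracy"], label="train")
    plt.plot(history.history["val_accuracy"], label="validation")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.title(title)
    plt.legend()
    plt.savefig(f"{title}_accuracy.png")
```

Call it once per model (e.g. `plot_accuracy(resnet_history, "ResNet50")`) to compare the two training runs side by side.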
- ResNet50 is a deep residual network that uses skip connections to solve the vanishing gradient problem in deep networks. This architecture is well-suited for image classification tasks as it can capture hierarchical features effectively.
- Pre-trained on ImageNet, the model is fine-tuned on the ISL dataset to classify gestures.
- Layers before index 80 were frozen; fine-tuning was performed on layer 80 onward.
- MobileNetV2 is an efficient convolutional neural network architecture designed for mobile and edge devices. It uses depthwise separable convolutions to reduce computation while maintaining accuracy.
- The model is pre-trained on ImageNet and fine-tuned on the ISL dataset for gesture classification.
- Fine-tuning was done from layer 50 to allow the model to adapt to the new dataset.
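The same freezing pattern applies to MobileNetV2, with the cutoff at layer 50. A minimal sketch (again with `weights=None` to avoid a download; the project fine-tunes ImageNet weights):

```python
from tensorflow.keras.applications import MobileNetV2

# weights=None avoids a download here; the project fine-tunes ImageNet weights.
base = MobileNetV2(weights=None, include_top=False, input_shape=(224, 224, 3))

# Freeze layers before index 50; the remaining layers adapt to the ISL data.
for layer in base.layers[:50]:
    layer.trainable = False

frozen = sum(1 for l in base.layers if not l.trainable)
```

The same kind of classification head used for ResNet50 would then be stacked on top before compiling and training.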
- Sign Language Interpreters: Automate interpretation of sign language in real-time applications.
- Educational Institutions: Develop learning tools for students to learn sign language.
- Tech Companies: Integrate gesture recognition in apps and systems for accessibility.
- Experiment with other models like VGG16 or InceptionV3 to compare performance.
- Implement a real-time sign language recognition system using a webcam.
- Integrate the model into a mobile or web app for broader accessibility.
For queries or collaboration:
GitHub: ritup04
Email: ritupal1626@gmail.com