The IoT solution to learning ASL
Anish Susarla
·
Vincent Campanaro
·
Sarthak Dayal
Imagine a world where learning ASL is as intuitive and enjoyable as playing your favorite video game. No more struggling through ASL courses only to find they aren't personalized to your learning needs, goals, and experiences; no more hunting for a native signer who can give you feedback on your signing; and no more (totally not one of our team members) trying to train your own computer vision model just to teach yourself.
The reality is, ASL learning has long been a complicated endeavor: not customizable, not intuitive, and frankly, not fun. But that's about to change.
Introducing SignSense: the first IoT device that makes ASL learning personalized, interactive, and enjoyable.
SignSense introduces an innovative IoT-powered ASL learning system that combines advanced computer vision, customized learning pathways, and interactive haptic feedback. Here's what it offers:
- Personalized Learning Experience: After a brief survey about the user's ASL experience and goals, a custom learning pathway is generated, composed of lessons and sublessons, each containing specific tasks. For example, a lesson on the ASL alphabet includes sublessons for individual letters, featuring informative animations and practical exercises.
- Real-time Gesture Recognition: During practice sessions, our computer vision model analyzes users' hand positions through their device's camera, providing instant feedback on sign accuracy.
- Interactive Haptic Feedback: Users wear a smart glove whose per-finger lights show which part of the sign or movement was performed incorrectly, offering tactile guidance toward proper sign formation.
- Adaptive Learning Loop: If a user struggles with a sign, a custom animation demonstrates the correct transition from their attempt to the proper form. Users progress to the next sublesson only after mastering the current one.
- Gamified Achievement System: Upon completing their learning pathway, users earn a personalized digital trophy, encouraging continued engagement with our extensive lesson database.
This comprehensive system creates an immersive, adaptive, and rewarding ASL learning experience, making sign language acquisition more accessible and enjoyable for users of all levels.
Our solution leverages OpenCV and MediaPipe for real-time hand gesture recognition and sign language interpretation, ensuring high accuracy and efficiency in ASL learning.
Using MediaPipe's hand tracking module, we accurately identify and interpret user hand gestures, precisely tracking finger positions and movements. This allows for:
- Accurate Sign Detection: The system can recognize and evaluate complex ASL signs, even in varying lighting conditions or backgrounds.
- Real-time Feedback: Users receive instant feedback on their sign accuracy, allowing for immediate corrections and improvements.
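To give a concrete sense of this step, here's a simplified sketch of how MediaPipe hand landmarks can be turned into a feature vector for recognition. The helper names and the wrist-relative normalization are illustrative choices, not our exact pipeline:

```python
import numpy as np

def landmarks_to_vector(landmarks):
    """Flatten 21 (x, y, z) hand landmarks into a wrist-relative
    feature vector so recognition is translation- and scale-invariant."""
    pts = np.asarray(landmarks, dtype=np.float32)   # shape (21, 3)
    assert pts.shape == (21, 3)
    pts = pts - pts[0]                              # wrist (landmark 0) at origin
    scale = np.linalg.norm(pts[9]) or 1.0           # middle-finger MCP as scale reference
    return (pts / scale).ravel()                    # shape (63,)

def extract_hand_vector(bgr_frame):
    """Run MediaPipe Hands on one OpenCV frame; returns a 63-dim
    vector, or None if no hand is detected."""
    import cv2
    import mediapipe as mp
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=1) as hands:
        rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
        result = hands.process(rgb)
    if not result.multi_hand_landmarks:
        return None
    lm = result.multi_hand_landmarks[0].landmark
    return landmarks_to_vector([(p.x, p.y, p.z) for p in lm])
```

Normalizing relative to the wrist is what lets the same sign be recognized regardless of where the hand sits in the camera frame.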
We've integrated custom-trained machine learning models specifically for ASL recognition, on two different levels:
- MediaPipe for static phrases: For static words/phrases like the alphabet and numbers, MediaPipe and their hand landmarks detection model is sufficient to classify the image as a certain number/letter.
- LSTM for Complex Phrases: For more complicated ASL phrases, we employ a Long Short-Term Memory (LSTM) neural network. This advanced model excels at capturing the temporal dependencies in sign language, allowing for accurate interpretation of complete sentences and complex expressions.
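To illustrate the temporal modeling, here's a from-scratch NumPy sketch of a single LSTM cell folding a sequence of landmark vectors into one summary state. It's a toy stand-in for the trained network, with placeholder sizes and random weights, but it shows the gating that captures temporal dependencies:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. Stacked gate order: input, forget,
    candidate, output."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # shape (4H,)
    i = sigmoid(z[:H])                # input gate
    f = sigmoid(z[H:2 * H])           # forget gate
    g = np.tanh(z[2 * H:3 * H])       # candidate cell state
    o = sigmoid(z[3 * H:])            # output gate
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

def run_sequence(frames, H=32, seed=0):
    """Fold a (T, 63) sequence of per-frame landmark vectors through
    the cell; the final hidden state would feed a softmax classifier."""
    rng = np.random.default_rng(seed)
    D = frames.shape[1]
    W = rng.normal(0.0, 0.1, (4 * H, D))
    U = rng.normal(0.0, 0.1, (4 * H, H))
    b = np.zeros(4 * H)
    h, c = np.zeros(H), np.zeros(H)
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
    return h
```

Because the cell state `c` carries information across frames, the model can distinguish signs that share hand shapes but differ in movement, which a per-frame classifier cannot.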
The computer vision system adapts to the user's skill level:
- Beginner-Friendly: For new learners, the system focuses on specific landmarks that are correlated with getting the basic structure of the sign correct.
- Advanced Recognition: As users progress, the system evaluates more nuanced aspects of signing, slowly building up to integrating all 21 landmarks of the MediaPipe model.
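A minimal sketch of how this skill-level gating might work: beginners are graded only on the five fingertips, advanced learners on all 21 points. The landmark subsets and threshold here are illustrative assumptions:

```python
import numpy as np

# MediaPipe landmark indices: fingertips are 4, 8, 12, 16, 20.
FINGERTIPS = [4, 8, 12, 16, 20]
ALL_LANDMARKS = list(range(21))

def sign_error(user, reference, level="beginner", threshold=0.15):
    """Mean per-landmark distance between the user's hand and the
    reference pose, computed over the subset for this skill level.
    Returns (error, passed)."""
    idx = FINGERTIPS if level == "beginner" else ALL_LANDMARKS
    u = np.asarray(user, dtype=np.float32)[idx]
    r = np.asarray(reference, dtype=np.float32)[idx]
    err = float(np.linalg.norm(u - r, axis=1).mean())
    return err, err <= threshold
```

Grading on fewer landmarks early on means a beginner passes as soon as the overall hand shape is right, while small joint-angle errors only start to matter at the advanced level.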
Our hardware pipeline creates a seamless interaction between the user's hand gestures and tactile feedback, utilizing computer vision, wireless communication, and a smart glove for an immersive ASL learning experience.
- Computer Vision Analysis: When the user wears the glove and performs an ASL sign, our computer vision system, powered by OpenCV and MediaPipe, captures and analyzes the hand gesture.
- Keypoint Extraction: The system identifies and maps 21 key points on the hand, corresponding to joints and fingertips.
- Gesture Classification: These keypoints are then compared to our database of correct ASL signs to determine accuracy.
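As a simplified illustration of this comparison step, a nearest-template classifier over flattened keypoint vectors might look like this (the template store and Euclidean metric are assumptions for the sketch, not necessarily our production classifier):

```python
import numpy as np

def classify_sign(vector, templates):
    """Nearest-template classification: compare a flattened keypoint
    vector against one stored reference vector per sign.
    Returns (best_label, distance_to_it)."""
    best_label, best_dist = None, float("inf")
    for label, ref in templates.items():
        d = float(np.linalg.norm(np.asarray(vector) - np.asarray(ref)))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist
```

The returned distance doubles as a confidence signal: a large distance to even the best match means the sign was not formed cleanly.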
We use MQTT (Message Queuing Telemetry Transport) for efficient, real-time communication between our gesture recognition system and the smart glove:
- The gesture recognition results are published to a specific MQTT topic.
- Our wireless Arduino board, integrated into the glove, is subscribed to this topic and receives the data instantly.
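A sketch of the publishing side using the paho-mqtt client. The topic name, payload format, and broker host are placeholders, not our exact configuration:

```python
import json

TOPIC = "signsense/glove/feedback"   # hypothetical topic name

def build_feedback_payload(finger_errors):
    """Encode per-finger correctness as JSON: 1 = wrong position
    (LED red). Order: thumb, index, middle, ring, pinky."""
    assert len(finger_errors) == 5
    return json.dumps({"fingers": [int(bool(e)) for e in finger_errors]})

def publish_feedback(finger_errors, host="localhost"):
    """Publish one recognition result so the glove's subscribed
    Arduino can react."""
    import paho.mqtt.client as mqtt   # assumes the paho-mqtt package
    client = mqtt.Client()
    client.connect(host)
    client.publish(TOPIC, build_feedback_payload(finger_errors), qos=1)
    client.disconnect()
```

Publishing with QoS 1 trades a little latency for a delivery guarantee, so a momentary Wi-Fi hiccup doesn't silently drop the feedback for a sign attempt.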
The Arduino-powered smart glove provides immediate, tactile feedback based on the gesture analysis:
- LED Feedback: Each finger of the glove is equipped with LEDs. If a finger is in the incorrect position for the intended sign, its corresponding LED lights up red.
- Custom Animations: The same feedback is mirrored as a custom animation in our web application.
Putting it all together, the interaction flow is:
- User performs an ASL sign while wearing the smart glove.
- Computer vision system captures and analyzes the hand gesture.
- Keypoints are extracted and compared to the correct sign.
- Results are sent via MQTT to the Arduino in the glove.
- The Arduino activates specific LEDs based on the received data.
- User receives instant visual and tactile feedback, with incorrect finger positions highlighted in red on the glove.
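The per-finger verdicts that drive the LEDs can be sketched like this. The finger-to-landmark grouping follows MediaPipe's indexing; the distance threshold is an illustrative assumption:

```python
import numpy as np

# MediaPipe indices for each finger's landmarks (base joint to tip).
FINGER_LANDMARKS = {
    "thumb":  [1, 2, 3, 4],
    "index":  [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring":   [13, 14, 15, 16],
    "pinky":  [17, 18, 19, 20],
}

def finger_error_flags(user, reference, threshold=0.12):
    """Per-finger pass/fail: a finger is flagged (its LED lights red)
    when the mean distance of its landmarks from the reference pose
    exceeds the threshold. Returns [thumb, index, middle, ring, pinky]."""
    u = np.asarray(user, dtype=np.float32)
    r = np.asarray(reference, dtype=np.float32)
    flags = []
    for idx in FINGER_LANDMARKS.values():
        err = np.linalg.norm(u[idx] - r[idx], axis=1).mean()
        flags.append(int(err > threshold))
    return flags   # e.g. a flag at position 1 lights the index-finger LED
```

This five-element list is exactly the kind of compact message that travels over MQTT to the glove, where the Arduino maps each flag to one finger's LED.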
This precise, finger-specific feedback system allows users to immediately understand which aspects of their sign need adjustment, creating a highly interactive and effective learning environment for ASL.


