👉 Try the live application here!
A real-time American Sign Language (ASL) translator that uses your webcam to detect hand gestures and convert them into text and speech.
| Feature | Status |
|---|---|
| Real-time hand landmark detection (MediaPipe) | ✅ |
| ASL A–Z letter classification (MLP) | ✅ |
| Letter → Word → Sentence builder | ✅ |
| Confidence score bar | ✅ |
| Text-to-Speech output | ✅ |
| Session history log + export | ✅ |
| Reverse mode (text → ASL GIFs) | ✅ |
| Accessibility mode (high contrast) | ✅ |
| Mobile responsive | ✅ |
| Screenshot / download | ✅ |
```
SignLanguage/
├── backend/
│   ├── main.py              # FastAPI server
│   ├── mediapipe_utils.py   # Hand landmark extraction
│   ├── inference.py         # ML prediction logic
│   └── model/
│       ├── train.py         # Training script
│       ├── label_map.json   # Class → letter map
│       └── classifier.pkl   # Trained model (after training)
├── frontend/
│   ├── index.html           # UI
│   ├── style.css            # Dark-mode design
│   ├── app.js               # Webcam + prediction logic
│   ├── tts.js               # Text-to-Speech
│   └── reverse.js           # Text → ASL GIF mode
├── notebooks/
│   └── EDA_and_Training.ipynb
├── dataset/                 # Place Kaggle dataset here
├── requirements.txt
└── README.md
```
- Python 3.10 or 3.11 (MediaPipe requirement)
- A working webcam
```powershell
# Create virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1

# Install packages
pip install -r requirements.txt
```

Download the ASL Alphabet dataset from Kaggle:
- URL: https://www.kaggle.com/datasets/grassknoted/asl-alphabet
- Extract to: `dataset/asl_alphabet_train/` (so you have `dataset/asl_alphabet_train/A/`, `dataset/asl_alphabet_train/B/`, etc.)
Or via the Kaggle API:

```powershell
pip install kaggle
# Place your kaggle.json in ~/.kaggle/
kaggle datasets download -d grassknoted/asl-alphabet
Expand-Archive asl-alphabet.zip -DestinationPath dataset/
```

Then train the model:

```powershell
python backend/model/train.py
```

This scans the dataset, extracts MediaPipe landmarks, trains an MLP classifier, and saves `backend/model/classifier.pkl`.
Training takes ~2–5 minutes on CPU.
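The training script itself is not reproduced here, but the core of what it does (fit an MLP on 63-value landmark vectors, then pickle it) can be sketched as below. The random arrays stand in for real MediaPipe landmark features, and the hidden-layer sizes follow the architecture described later in this README; treat this as an illustration, not the actual `train.py`.

```python
import pickle

import numpy as np
from sklearn.neural_network import MLPClassifier

# Illustrative stand-in for real extracted landmark features:
# 200 samples of 63 values (21 landmarks x 3 coords), 26 letter classes.
rng = np.random.default_rng(0)
X = rng.random((200, 63))
y = rng.integers(0, 26, size=200)

# Hidden-layer sizes mirror the architecture table (256 -> 128 -> 64).
clf = MLPClassifier(
    hidden_layer_sizes=(256, 128, 64),
    activation="relu",
    solver="adam",
    max_iter=50,
    random_state=0,
)
clf.fit(X, y)

# Persist the trained model the same way the project does.
with open("classifier.pkl", "wb") as f:
    pickle.dump(clf, f)
```

On the real dataset, `X` would come from running MediaPipe over each training image rather than from a random generator.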
```powershell
uvicorn backend.main:app --reload --port 8000
```

In a new terminal, serve the frontend:

```powershell
python -m http.server 3000 --directory frontend
```

Then open http://localhost:3000 in your browser.
- Allow webcam access when prompted
- Show your hand and sign any ASL letter (A–Z)
- Hold the gesture for ~1 second to confirm the letter
- Watch letters build into words in the word panel
- Press Spacebar to complete a word
- Press Enter to complete a sentence and hear it spoken aloud
- Switch to Reverse Mode to type text and see ASL GIFs
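The "hold for ~1 second" confirmation is essentially a debounce over consecutive predictions. The real logic lives in `app.js`; the idea, sketched in Python with an assumed `HOLD_TIME` threshold, looks roughly like this:

```python
import time

HOLD_TIME = 1.0  # seconds a letter must be held before it is accepted


class LetterConfirmer:
    """Accept a letter only after it is predicted continuously for hold_time."""

    def __init__(self, hold_time: float = HOLD_TIME):
        self.hold_time = hold_time
        self.current = None   # letter currently being held
        self.since = 0.0      # timestamp when that letter first appeared

    def update(self, letter, now=None):
        """Feed one per-frame prediction; return the letter once confirmed."""
        now = time.monotonic() if now is None else now
        if letter != self.current:
            self.current, self.since = letter, now  # new letter: restart timer
            return None
        if now - self.since >= self.hold_time:
            self.since = now        # reset so the letter is not emitted twice
            return letter           # confirmed
        return None


c = LetterConfirmer()
c.update("A", now=0.0)
print(c.update("A", now=1.1))  # prints A (held long enough)
```

Resetting `since` after a confirmation keeps a continuously held letter from being appended every frame.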
| Metric | Value |
|---|---|
| Model | scikit-learn MLP (3 hidden layers) |
| Features | 63 (21 MediaPipe hand landmarks × 3 coords) |
| Training accuracy | ~99% |
| Test accuracy | ~95–98% |
| Inference time | <5ms per frame |
| Real-time FPS | ~10–15 FPS |
```
Webcam Frame
      ↓
MediaPipe Hands (21 landmarks × 3 coords = 63 features)
      ↓
Normalize & flatten landmark vector
      ↓
MLP Classifier (hidden: 256→128→64, ReLU, Adam)
      ↓
Top-3 predictions + confidence scores
```
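The normalize-and-flatten step in the pipeline above can be sketched in pure NumPy. The exact scheme in `mediapipe_utils.py` may differ (e.g. in its choice of scale factor); this shows one common approach, wrist-relative and scale-normalized:

```python
import numpy as np


def landmarks_to_features(landmarks: np.ndarray) -> np.ndarray:
    """Turn a (21, 3) array of hand landmarks into a 63-value feature vector.

    Translation-invariant: coordinates are taken relative to the wrist
    (landmark 0). Roughly scale-invariant: divided by the largest distance
    from the wrist, so hand size and camera distance matter less.
    """
    rel = landmarks - landmarks[0]           # wrist-relative coordinates
    scale = np.linalg.norm(rel, axis=1).max()
    if scale > 0:
        rel = rel / scale
    return rel.flatten()                     # shape (63,)


# Example: a fake hand with 21 random landmark positions.
hand = np.random.default_rng(1).random((21, 3))
features = landmarks_to_features(hand)
print(features.shape)  # (63,)
```

After this transform, the wrist always maps to the origin and every coordinate lies in [-1, 1], which is what lets a simple MLP generalize across hand positions in the frame.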
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check + model status |
| `/predict` | POST | Send base64 JPEG → get predictions |
| `/model/info` | GET | Model metadata |
`POST /predict` request body:

```json
{
  "image": "<base64-encoded JPEG string>"
}
```

Response:

```json
{
  "predictions": [
    {"label": "A", "confidence": 0.97},
    {"label": "S", "confidence": 0.02},
    {"label": "T", "confidence": 0.01}
  ],
  "annotated_frame": "<base64-encoded annotated JPEG>",
  "hand_detected": true,
  "landmark_count": 21
}
```

| Package | Purpose |
|---|---|
| `mediapipe` | Hand landmark detection |
| `opencv-python` | Image processing |
| `scikit-learn` | MLP classifier |
| `fastapi` + `uvicorn` | REST API server |
| `tensorflow` | CNN training (notebook) |
| `gtts` | Server-side TTS fallback |
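A client calling `POST /predict` only needs to base64-encode a JPEG frame and send it as JSON. A stdlib-only sketch (the URL and port assume the `uvicorn` command shown earlier; `build_payload` and `predict` are illustrative helper names, not part of the project):

```python
import base64
import json
import urllib.request


def build_payload(jpeg_bytes: bytes) -> bytes:
    """Encode a JPEG frame into the JSON body expected by /predict."""
    return json.dumps(
        {"image": base64.b64encode(jpeg_bytes).decode("ascii")}
    ).encode("utf-8")


def predict(jpeg_bytes: bytes, url: str = "http://localhost:8000/predict"):
    """POST a frame to the running backend and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=build_payload(jpeg_bytes),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# With the backend running:
# predict(open("frame.jpg", "rb").read())["predictions"]
```

The browser frontend does the same thing with `canvas.toDataURL("image/jpeg")` plus `fetch`; only the encoding contract matters.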
- Full sentence-level NLP post-processing
- LSTM model for dynamic gestures (J, Z)
- Real-time video call plugin (Zoom/Meet)
- Indian Sign Language (ISL) support
- Federated learning for privacy
Built for academic credit; the project demonstrates real-world accessibility impact, computer vision, and ML model deployment.
