Visioniyam is an innovative assistive technology solution designed to empower individuals with mobility impairments by enabling them to control computers using head movements and eye blinks. This comprehensive project combines cutting-edge computer vision technology with modern web development to create an accessible and user-friendly interface for computer control.
- Facial Landmark Detection: Utilizes Google's MediaPipe to track 478 key facial landmarks in real-time
- Head Movement Control: Translates head movements into precise mouse cursor movements
- Eye Blink Detection: Left and right eye blinks trigger left and right mouse clicks
- Real-time Calibration: Customizable sensitivity settings for optimal user experience
- Cross-platform Support: Works on Windows, macOS, and Linux systems
- Modern React Frontend: Responsive web interface with dark/light theme support
- User Authentication: Secure signup/login system with JWT tokens
- Interactive Documentation: Step-by-step guides and tutorials
- Download Management: Easy access to desktop application downloads
- Email Notifications: Welcome emails and password reset functionality
- Standalone Executable: Cross-platform desktop application built with Python
- Real-time Video Feed: Live camera preview with facial landmark visualization
- Calibration System: Guided calibration process for optimal performance
- Customizable Controls: Adjustable sensitivity and movement thresholds
- Framework: Express.js with MongoDB integration
- Authentication: JWT-based authentication with bcrypt password hashing
- Email Service: Nodemailer integration for user communications
- Database: MongoDB with Mongoose ODM
- Security: CORS enabled, input validation, and secure cookie handling
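The backend signs tokens with Node's `jsonwebtoken`, but the HS256 scheme underneath is simple enough to sketch with Python's standard library alone. This is an illustration of the mechanism, not the project's actual implementation; the function names and secret are hypothetical:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT spec requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload, secret):
    """Build an HS256 JWT: base64url(header).base64url(payload).signature"""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"

def verify_jwt(token, secret):
    """Return the payload if the signature checks out, else None."""
    head, body, sig = token.split(".")
    expected = hmac.new(secret.encode(), f"{head}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), sig):
        return None
    return json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))

token = sign_jwt({"sub": "user-id-123"}, "change-me")
print(verify_jwt(token, "change-me"))     # → {'sub': 'user-id-123'}
print(verify_jwt(token, "wrong-secret"))  # → None
```

Because the signature covers the header and payload, a tampered token fails `compare_digest` and is rejected without ever trusting its contents.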
- Framework: React 18 with React Router for navigation
- Styling: Tailwind CSS with custom CSS modules
- Animations: Framer Motion for smooth UI transitions
- Components: Modular component architecture with reusable UI elements
- State Management: React hooks for state management
- Computer Vision: OpenCV and MediaPipe for facial landmark detection
- GUI: Tkinter for desktop interface
- Mouse Control: PyAutoGUI for system-level mouse control
- Build System: cx_Freeze for creating standalone executables
```
visioniyam/
├── backend/              # Node.js backend server
│   ├── controllers/      # Route controllers (auth, views)
│   ├── models/           # Database models (User)
│   ├── routes/           # API routes (users, views)
│   ├── utils/            # Utility functions (email, error handling)
│   ├── views/            # Email templates (Pug)
│   └── configuration/    # Database connection
├── frontend/             # React.js web application
│   ├── src/
│   │   ├── component/    # React components
│   │   ├── CSS/          # Stylesheets
│   │   ├── images/       # Static assets
│   │   └── animation/    # Lottie animations
│   └── public/           # Public assets
├── python/               # Desktop application
│   ├── main.py           # Main application logic
│   └── setup.py          # Build configuration
└── README.md             # Project documentation
```
- Node.js (v14 or higher)
- Python 3.8+
- MongoDB
- Webcam/camera access
Backend:

```
cd backend
npm install
npm start
```

Frontend:

```
cd frontend
npm install
npm start
```

Desktop application:

```
cd python
pip install -r requirements.txt
python main.py
```

- Visit the live application at https://visioniyam.vercel.app/
- Create an account or sign in
- Download the desktop application
- Follow the setup instructions
- Launch the Visioniyam desktop application
- Click "Calibrate" to set up your facial landmarks
- Follow the on-screen calibration instructions
- Click "Start Capturing" to begin mouse control
- Use head movements to move the cursor
- Blink left eye for left-click, right eye for right-click
- MediaPipe Integration: Real-time facial landmark tracking
- Landmark Mapping: 478 key points mapped to facial features
- Movement Translation: Distance calculations converted to mouse movements
- Blink Detection: Eye landmark distance analysis for click detection
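The "eye landmark distance analysis" above is commonly done with an eye aspect ratio (EAR): the ratio of the eye's vertical landmark distances to its horizontal one collapses sharply during a blink. The exact landmark indices and threshold used in `main.py` aren't shown here, so the coordinates and threshold below are illustrative:

```python
import math

def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered around the eye:
    two horizontal corner points (p1, p4) and two vertical pairs
    (p2/p6, p3/p5). Open eyes give a clearly larger ratio; a blink
    drops it sharply."""
    p1, p2, p3, p4, p5, p6 = eye
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)

BLINK_THRESHOLD = 0.2  # would be tuned during calibration

def is_blinking(eye):
    return eye_aspect_ratio(eye) < BLINK_THRESHOLD

# Hypothetical normalized landmark coordinates:
open_eye = [(0.30, 0.50), (0.35, 0.46), (0.40, 0.46),
            (0.45, 0.50), (0.40, 0.54), (0.35, 0.54)]
closed_eye = [(0.30, 0.50), (0.35, 0.495), (0.40, 0.495),
              (0.45, 0.50), (0.40, 0.505), (0.35, 0.505)]
print(is_blinking(open_eye), is_blinking(closed_eye))  # → False True
```

Computing the ratio per eye is what lets a left-eye blink and a right-eye blink be distinguished and mapped to separate click events.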
- Head Movement: Horizontal and vertical head movements mapped to cursor position
- Sensitivity Adjustment: Customizable movement thresholds
- Click Detection: Eye blink duration analysis for click events
- Smooth Movement: Interpolated cursor movements for natural feel
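A minimal sketch of how head offsets, sensitivity, and interpolated movement can fit together, assuming a calibrated neutral head position and using only the standard library (in the real app the result would be fed to PyAutoGUI; the class name and parameters here are illustrative):

```python
class CursorSmoother:
    """Maps head displacement from a calibrated neutral position to a
    screen coordinate, with exponential smoothing so the cursor glides
    instead of jittering. The real app would pass the result to
    pyautogui.moveTo(); here we only compute the target position."""

    def __init__(self, screen=(1920, 1080), sensitivity=8.0, alpha=0.3):
        self.screen = screen
        self.sensitivity = sensitivity  # screen-widths per unit of head offset
        self.alpha = alpha              # 0..1; higher = snappier, lower = smoother
        self.pos = (screen[0] / 2, screen[1] / 2)  # start at screen center

    def update(self, head_dx, head_dy):
        """head_dx / head_dy: offset of a reference landmark (e.g. the
        nose tip) from the calibrated neutral position, in normalized
        image units."""
        tx = self.screen[0] / 2 + head_dx * self.sensitivity * self.screen[0]
        ty = self.screen[1] / 2 + head_dy * self.sensitivity * self.screen[1]
        # Clamp to the screen, then blend toward the target.
        tx = min(max(tx, 0), self.screen[0] - 1)
        ty = min(max(ty, 0), self.screen[1] - 1)
        x = self.pos[0] + self.alpha * (tx - self.pos[0])
        y = self.pos[1] + self.alpha * (ty - self.pos[1])
        self.pos = (x, y)
        return self.pos

s = CursorSmoother()
print(s.update(0.0, 0.0))   # → (960.0, 540.0): neutral head stays centered
print(s.update(0.05, 0.0))  # moves partway toward the target, not all at once
```

Raising `sensitivity` here corresponds to the app's adjustable movement threshold, while `alpha` controls the smoothing trade-off between responsiveness and jitter.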
- Frontend: Deployed on Vercel
- Backend: Node.js server with MongoDB Atlas
- Domain: https://visioniyam.vercel.app/
- Build System: cx_Freeze for cross-platform executables
- Distribution: Direct download from web application
- Updates: Version management through web interface
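A cx_Freeze build is driven by `setup.py`. The project's actual build options aren't shown in this README, so the package list and names below are a hypothetical sketch of what such a configuration typically looks like:

```python
# setup.py — hypothetical sketch; the project's real options may differ
from cx_Freeze import setup, Executable

build_options = {
    # bundle the vision and control stack alongside the executable
    "packages": ["cv2", "mediapipe", "pyautogui", "tkinter"],
    "excludes": ["test", "unittest"],
}

setup(
    name="Visioniyam",
    version="1.0.0",
    description="Head-movement and blink-based mouse control",
    options={"build_exe": build_options},
    executables=[Executable("main.py", target_name="visioniyam")],
)
```

Running `python setup.py build` then produces a platform-specific executable under `build/`, which is what the web application serves as the downloadable desktop app.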
We welcome contributions to improve Visioniyam! Please feel free to:
- Report bugs and issues
- Suggest new features
- Submit pull requests
- Improve documentation
This project is open source and available under the MIT License.
Visioniyam was created by Team Kirmada:
- Shaurya Bansal - Full Stack Development
- Karan Manglani - Computer Vision & AI
- Arun Rathaur - Backend Development
Our mission is to bridge the digital divide by making technology accessible to everyone, regardless of physical limitations. Visioniyam represents a step forward in inclusive technology, empowering users to interact with computers through natural facial movements and expressions.
For support, questions, or feedback, please contact us through the web application or create an issue in this repository.
Visioniyam - Empowering accessibility through innovation