VisionWalk - A mobile AI app for the visually impaired, ensuring safety and independence with real-time alerts and text-to-speech technology.
For a more detailed explanation of the design and implementation, see our report here.
Navigating crowded streets and public spaces poses significant challenges for people with visual impairments. Difficulty detecting obstacles, traffic signs, and changes in the surrounding environment leads to a higher risk of accidents and limits independent movement. With recent advances in technology, AI-driven solutions have emerged as practical tools to help visually impaired people move safely and effectively.
The VisionWalk project aims to address these challenges by developing an AI-powered navigation assistant specifically designed to support visually impaired users in real-time mobility. Utilizing computer vision techniques and AI models, VisionWalk can identify critical traffic signs, obstacles, and potential hazards along the way, providing timely auditory alerts to guide users.
This application focuses on three primary functions: real-time traffic sign recognition, detection of static and dynamic obstacles, and real-time alerts and directions for safe navigation. Using the mobile device's camera, the system identifies and classifies traffic signs such as pedestrian restrictions and crosswalks, as well as obstacles like lamp posts, vehicles, and construction areas. It also estimates the distance and direction of moving obstacles and issues corresponding alerts, as sketched below.
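To make the alert behavior concrete, here is a minimal sketch of the decision step that turns an obstacle's estimated distance and direction into a spoken message. All names, thresholds, and wording here are illustrative assumptions, not VisionWalk's actual implementation:

```python
# Minimal sketch of the alert-decision step: turn an obstacle's estimated
# distance and bearing (relative to the user's heading) into a spoken alert.
# Thresholds, names, and wording are illustrative, not VisionWalk's values.
from typing import Optional

def build_alert(label: str, distance_m: float, bearing_deg: float) -> Optional[str]:
    """Return an alert message, or None if the obstacle is too far to matter."""
    if distance_m > 10.0:  # ignore obstacles beyond ~10 m
        return None
    if bearing_deg < -20.0:
        direction = "to your left"
    elif bearing_deg > 20.0:
        direction = "to your right"
    else:
        direction = "straight ahead"
    urgency = "Warning!" if distance_m < 3.0 else "Caution:"
    return f"{urgency} {label} {direction}, about {distance_m:.0f} meters away."

print(build_alert("lamp post", 2.4, -35.0))
# -> Warning! lamp post to your left, about 2 meters away.
```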
The system will be optimized for use on popular mobile platforms, ensuring accessibility for users without the need for special hardware. Through this project, we aspire to enhance the independence and safety of visually impaired individuals while contributing to a more inclusive society where AI technology improves the quality of life for vulnerable groups.
First, let's take a quick look at the technologies used in VisionWalk:
- Image Processing and AI Models:
    - Gemini 1.5 by Google: Detects objects and recognizes traffic signs (a minimal server-side sketch using it appears after this list).
    - TensorFlow Lite & PyTorch Mobile: Optimize model performance on mobile devices.
- Interaction Technology with AI:
    - Google Cloud Text-to-Speech (TTS) and Speech-to-Text (STT): Provides spoken alerts for users and lets them respond verbally (see the TTS sketch after this list).
    - Noise Suppression: Reduces background noise using features of Google Cloud Speech.
    - Voice Generation: Uses AI for voice interactions, similar to Siri or Alexa.
- Mobile Application Development:
    - The app is developed using React Native integrated with Expo, allowing it to run on both Android and iOS from a single codebase; this stack provides security and a wide range of functionality out of the box. The FastAPI framework is used to build the backend API (combined with Gemini in the first sketch below).
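To make the server side more concrete, below is a minimal sketch of how a FastAPI endpoint could forward a camera frame to Gemini 1.5 for sign and obstacle detection. The endpoint name, prompt, and model variant are assumptions for illustration; the repository's actual implementation lives in VisionWalkServer/src/vision_api.py.

```python
# Hypothetical sketch: a FastAPI endpoint that forwards a camera frame to
# Gemini 1.5 and returns its description of nearby signs and obstacles.
# Endpoint name, prompt, and model variant are illustrative assumptions.
import os

import google.generativeai as genai
from fastapi import FastAPI, File, UploadFile

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # your own API key
model = genai.GenerativeModel("gemini-1.5-flash")

app = FastAPI()

@app.post("/analyze")
async def analyze(image: UploadFile = File(...)):
    frame = await image.read()
    response = model.generate_content([
        "List any traffic signs, obstacles, or hazards a pedestrian should "
        "know about in this image, with their approximate position.",
        {"mime_type": image.content_type or "image/jpeg", "data": frame},
    ])
    return {"alert": response.text}
```

Returning plain text keeps the client thin: the description can be handed straight to TTS for playback.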
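And here is a minimal sketch of synthesizing a spoken alert with Google Cloud Text-to-Speech; the voice and encoding choices are illustrative, and the client requires a valid service-account key (see the setup notes below):

```python
# Hypothetical sketch: synthesize a spoken alert with Google Cloud TTS.
# Requires GOOGLE_APPLICATION_CREDENTIALS to point at a service-account key.
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(text="Warning! Crosswalk ahead."),
    voice=texttospeech.VoiceSelectionParams(
        language_code="en-US",
        ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
    ),
    audio_config=texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3,
    ),
)

with open("alert.mp3", "wb") as f:
    f.write(response.audio_content)  # playable MP3 bytes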
Beyond the core image-processing functionality, our application offers several additional features:
- Q&A Support: Enables users to interact with the system via voice.
- Location Tracking: Assists in determining the user's position and guiding them to their desired destination (see the distance/bearing sketch after this list).
- Activity History: Tracks the user's interactions with the system and stores basic profile information.
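As an illustration of the kind of computation location tracking involves, here is a sketch using the standard haversine and initial-bearing formulas; it is a generic example, not necessarily how the app computes guidance:

```python
# Generic sketch: great-circle distance and initial bearing between two GPS
# fixes via the standard haversine formula; not tied to the app's actual code.
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (distance in meters, initial bearing in degrees) from point 1 to point 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)

    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance, bearing

# Two approximate points in Ho Chi Minh City, a few hundred meters apart
print(distance_and_bearing(10.7798, 106.6990, 10.7772, 106.6958))
```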
A simulation of the VisionWalk application is illustrated in the design below:
To run the VisionWalk application, first set up the environment and install the necessary dependencies.
Make sure you have the following installed:
- Node.js (v14 or above) - Required for React Native development.
- Expo CLI - For managing the React Native app:

```bash
npm install -g expo-cli
```
You can visit the Expo repository and documentation to learn more about Expo.
Clone the repository:

```bash
git clone https://github.com/PrORain-HCMUS/VisionWalk
cd VisionWalk
```

To experience the VisionWalk application on your mobile device during the demo phase, please follow these steps:
- Connect to a local hotspot: Turn on Mobile Hotspot on the PC that hosts the server, then connect your phone (the device running the app) to that network.
- Create new credentials: GitHub doesn't allow us to upload our keys in VisionWalkServer/private, so you will need to obtain your own Google Cloud credentials for TTS and STT (typically a service-account JSON key, referenced via the GOOGLE_APPLICATION_CREDENTIALS environment variable). Sorry for the inconvenience!
- Set up the server: If you haven't already, start the server by running:

```bash
cd VisionWalkServer/src
python vision_api.py
```

- Set up the client: Next, start the client by running:
```bash
cd VisionWalkClient
npx expo start
```

Finally, we would like to introduce the contributors of this repo:
| # | Contributor | Student ID | GitHub |
|---|---|---|---|
| 1 | Dai-Hoa Le | 22120108 | JustinHoa |
| 2 | Tuong-Bach-Hy Nguyen | 22120455 | nguyentuongbachhy |
| 3 | Hai-Luu-Danh Lieu | 22120459 | lhldanh |
| 4 | Hoang-Vu Le | 22120461 | PrORain-HCMUS |





