VisionWalk

VisionWalk - A mobile AI app for the visually impaired, ensuring safety and independence with real-time alerts and text-to-speech technology.

For a deeper understanding and more detailed explanations, see our report.

VisionWalk logo

Introduction

Navigating crowded streets and public spaces poses significant challenges for individuals with visual impairments. Their inability to detect obstacles, traffic signs, or changes in their environment often leads to a higher risk of accidents and limits their independence in movement. With the advancements in technology, AI-driven solutions have emerged as a potential tool to assist visually impaired individuals in moving safely and effectively.

The VisionWalk project aims to address these challenges by developing an AI-powered navigation assistant specifically designed to support visually impaired users in real-time mobility. Utilizing computer vision techniques and AI models, VisionWalk can identify critical traffic signs, obstacles, and potential hazards along the way, providing timely auditory alerts to guide users.

This application focuses on three primary functions: recognizing traffic signs in real time, detecting static and dynamic obstacles, and providing real-time alerts and directions to ensure safe navigation. Using the mobile device's camera through the app, the system identifies and classifies traffic signs such as pedestrian restrictions and crosswalks, as well as obstacles like lamp posts, vehicles, and construction areas. Additionally, the application calculates the distance and direction of moving obstacles, issuing corresponding alerts to users.

The system will be optimized for use on popular mobile platforms, ensuring accessibility for users without the need for special hardware. Through this project, we aspire to enhance the independence and safety of visually impaired individuals while contributing to a more inclusive society where AI technology improves the quality of life for vulnerable groups.

Model and Technologies

First, let's take a quick look at the technologies used in VisionWalk:

Models and technologies used in VisionWalk

  1. Image Processing and AI Models:

    • Gemini 1.5 by Google: Detects objects and recognizes traffic signs.
    • TensorFlow Lite & PyTorch Mobile: Optimize model performance on mobile devices.
  2. Interaction Technology with AI:

    • Google Cloud Text-to-Speech (TTS) and Speech-to-Text (STT): Provide audio alerts to users and let them respond verbally.
    • Noise Suppression: Reduces background noise using features from Google Cloud Speech.
    • Voice Generation: Uses AI for voice interactions, similar to Siri or Alexa.
  3. Mobile Application Development:

    • The app is developed using React Native and integrated with Expo, allowing it to run on both Android and iOS. Expo provides security and a wide range of built-in functionality, and the FastAPI framework is used to build the backend API (a minimal server-side sketch follows this list).
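
To make this stack concrete, below is a minimal sketch of what a server-side endpoint could look like, assuming a FastAPI route named /analyze that forwards a camera frame to Gemini 1.5 and speaks the description back through Google Cloud TTS. The route name, prompt, and model variant are illustrative assumptions, not the contents of the actual vision_api.py.

    # Minimal sketch (assumed route /analyze, illustrative only): a camera
    # frame is described by Gemini, and the description is synthesized to
    # speech with Google Cloud TTS so the app can play it as an alert.
    import io

    import google.generativeai as genai
    from fastapi import FastAPI, File, UploadFile
    from fastapi.responses import Response
    from google.cloud import texttospeech
    from PIL import Image

    genai.configure(api_key="YOUR_GEMINI_API_KEY")  # assumed: your own key
    model = genai.GenerativeModel("gemini-1.5-flash")
    tts_client = texttospeech.TextToSpeechClient()
    app = FastAPI()

    @app.post("/analyze")
    async def analyze(image: UploadFile = File(...)) -> Response:
        # Ask Gemini to describe obstacles and traffic signs in the frame.
        frame = Image.open(io.BytesIO(await image.read()))
        prompt = ("Describe the traffic signs, obstacles, and hazards in this "
                  "street scene in one short sentence for a blind pedestrian.")
        description = model.generate_content([prompt, frame]).text

        # Turn the description into an MP3 audio alert.
        audio = tts_client.synthesize_speech(
            input=texttospeech.SynthesisInput(text=description),
            voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
            audio_config=texttospeech.AudioConfig(
                audio_encoding=texttospeech.AudioEncoding.MP3),
        )
        return Response(content=audio.audio_content, media_type="audio/mpeg")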

Pipeline and Architecture

Processing pipeline of VisionWalk


Architecture of the VisionWalk application

GUI and Functions

In addition to its main image-processing functionality, our application offers several other features:

  • Q&A Support: Enables users to interact with the system via voice (a sketch of this round trip follows the list).
  • Location Tracking: Determines the user's position and guides them to their desired destination.
  • Activity History: Tracks the user's interactions with the system and stores basic profile information.
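
To illustrate the Q&A round trip, here is a rough sketch that transcribes a recorded question with Google Cloud STT and synthesizes a spoken answer with TTS. The answer_question helper is a hypothetical stand-in for whatever logic the server actually uses, and the 16 kHz LINEAR16 recording format is an assumption about the client.

    # Sketch of the voice Q&A round trip: Speech-to-Text, answer, Text-to-Speech.
    from google.cloud import speech, texttospeech

    def answer_question(question: str) -> str:
        # Hypothetical placeholder for the server's actual answering logic.
        return f"You asked: {question}"

    def qa_round_trip(audio_bytes: bytes) -> bytes:
        # Transcribe the user's spoken question (assumed 16 kHz LINEAR16 audio).
        stt = speech.SpeechClient()
        result = stt.recognize(
            config=speech.RecognitionConfig(
                encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
                sample_rate_hertz=16000,
                language_code="en-US",
            ),
            audio=speech.RecognitionAudio(content=audio_bytes),
        )
        question = result.results[0].alternatives[0].transcript if result.results else ""

        # Speak the answer back as MP3 audio the app can play.
        tts = texttospeech.TextToSpeechClient()
        answer = tts.synthesize_speech(
            input=texttospeech.SynthesisInput(text=answer_question(question)),
            voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
            audio_config=texttospeech.AudioConfig(
                audio_encoding=texttospeech.AudioEncoding.MP3),
        )
        return answer.audio_content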

A simulation of the VisionWalk application can be seen in the design below:

VisionWalk's simulation

About Installation

To set up the VisionWalk application, follow the steps below to prepare the environment and install the necessary dependencies.

Prerequisites

Make sure you have the following installed:

  • Node.js (v14 or above) - Required for React Native development.

  • Expo CLI - For managing the React Native app.

    npm install -g expo-cli
    

You can visit the Expo repository to learn more about Expo.


Clone the Repository

git clone https://github.com/PrORain-HCMUS/VisionWalk
cd VisionWalk

Demo Instructions

To experience the VisionWalk application on your mobile device during the demo phase, please follow these steps:

  1. Connect to a local hotspot: Open a mobile hotspot on the PC that hosts the server, then connect the phone running the app to that network.

  2. Create new credentials: GitHub's policies don't allow us to upload our keys in VisionWalkServer/private, so you may need to obtain your own Google Cloud credentials for TTS and STT. Sorry for the inconvenience!

  3. Set up Server: If you haven't already, start the server by running:

cd VisionWalkServer/src
python vision_api.py

  4. Set up Client: Next, start the client by running:

cd VisionWalkClient
npx expo start
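
The phone app must reach the server over the hotspot network, so the client needs the hosting PC's IP address as its API base URL (how the client stores this URL is not shown here and may differ in the repo). A quick, illustrative way to find that IP from the server side:

    # Print the LAN IP of the PC hosting the server; the phone app points
    # its API base URL at this address. No packets are actually sent here;
    # connecting a UDP socket only selects the outgoing interface.
    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(("8.8.8.8", 80))
    print(s.getsockname()[0])  # e.g. 192.168.137.1 on a Windows mobile hotspot
    s.close()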

About the team:

Finally, we would like to introduce the contributors to this repo:

  # | Contributor          | Student ID | Github
  1 | Dai-Hoa Le           | 22120108   | JustinHoa
  2 | Tuong-Bach-Hy Nguyen | 22120455   | nguyentuongbachhy
  3 | Hai-Luu-Danh Lieu    | 22120459   | lhldanh
  4 | Hoang-Vu Le          | 22120461   | PrORain-HCMUS
