AudioLens

Bladen Lehnberg
221146
Interactive Development 300

AudioLens

AudioLens is an accessibility-focused mobile app designed for the visually impaired.

Explore the docs »

View Demo · Report Bug · Request Feature


Table of Contents
  1. About The Project
  2. Getting Started
  3. Features
  4. Contributors
  5. License

About the Project

AudioLens is a mobile app that helps visually impaired users by converting visual text into audio in real time. Using Google Cloud Vision for Optical Character Recognition (OCR) and Google’s Text-to-Speech (TTS) API, AudioLens lets users take a photo or select an image, detects the text in it, and reads that text aloud. The interface uses large, easy-to-read buttons and high-contrast colors to accommodate users with low vision. The app opens directly to a live camera view, so users can quickly capture text on a menu, a sign, or any document. AudioLens offers simple navigation and controls for anyone who needs help reading text in daily life.

(back to top)

Built With

React Native · Expo · Node.js · TypeScript · JavaScript · Google Cloud

Getting Started

Prerequisites

  • VS Code
  • NodeJS

Installation

Frontend Installation

  1. Clone the frontend repo
    git clone https://github.com/Bladeyboy54/AudioLens.git
  2. Install the node modules for React Native
    cd AudioLens
    npm i
  3. Create a file called .env
  4. In the .env file, add your Google Cloud SDK API key
    API_KEY=""
  5. Start the application in your IDE terminal
    npm start
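
A missing or empty key is an easy setup mistake to make in step 4, so it can help to fail early with a clear message. The sketch below is illustrative only: `requireApiKey` is a hypothetical helper, not part of the AudioLens source, and this README does not say how the app loads `.env`, so the function simply takes whatever environment object the project exposes.

```typescript
// Hypothetical helper (not from the AudioLens codebase): validates that the
// API_KEY entry defined in .env is present and non-empty before it is used.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["API_KEY"]?.trim();
  if (!key) {
    throw new Error('API_KEY is missing. Add API_KEY="<your key>" to the .env file.');
  }
  return key;
}
```

If the project exposes its variables through `process.env`, the call would be `requireApiKey(process.env)`. That is an assumption about the setup.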

(back to top)

Features

1. AI Integration

  • Real-time Text Recognition: Starts with a live camera feed to capture text instantly or allows users to select an image from their gallery for text recognition.
  • Text-to-Speech Conversion: Recognized text is converted into audio using Google’s Text-to-Speech (TTS) API, making it accessible for visually impaired users to hear the content.
  • Cloud Integration for High Accuracy: Google Cloud Vision API enables high-accuracy OCR for diverse text formats, from printed documents to handwritten notes.
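
The text-recognition step described above can be sketched against the Cloud Vision REST endpoint. This is a minimal illustration, not the app's actual implementation. The function names (`buildVisionRequest`, `extractText`, `recognizeText`) are hypothetical, and it assumes the image is already base64-encoded and that the API key is passed as a `key` query parameter.

```typescript
// Cloud Vision REST endpoint for image annotation.
const VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

interface VisionRequest {
  requests: Array<{
    image: { content: string }; // base64-encoded image bytes
    features: Array<{ type: "TEXT_DETECTION" }>;
  }>;
}

interface VisionResponse {
  responses?: Array<{
    fullTextAnnotation?: { text?: string };
    error?: { message?: string };
  }>;
}

// Builds the JSON body for a single-image text detection request.
function buildVisionRequest(base64Image: string): VisionRequest {
  return {
    requests: [
      { image: { content: base64Image }, features: [{ type: "TEXT_DETECTION" }] },
    ],
  };
}

// Returns the detected text, or an empty string when no text was found.
function extractText(response: VisionResponse): string {
  const first = response.responses?.[0];
  if (first?.error?.message) {
    throw new Error(first.error.message);
  }
  return first?.fullTextAnnotation?.text?.trim() ?? "";
}

// Sends the image to Cloud Vision and resolves with the recognized text.
async function recognizeText(base64Image: string, apiKey: string): Promise<string> {
  const res = await fetch(`${VISION_ENDPOINT}?key=${encodeURIComponent(apiKey)}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildVisionRequest(base64Image)),
  });
  if (!res.ok) {
    throw new Error(`Vision request failed with status ${res.status}`);
  }
  return extractText((await res.json()) as VisionResponse);
}
```

Keeping request building and response parsing separate from the network call makes both easy to unit test without contacting Google Cloud.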

2. User-Friendly UI

  • Simple Navigation: Intuitive navigation flow between camera, image preview, and text recognition screens, with a back button to return and retake images as needed.
  • High Accessibility Standards: Easy-to-read button text, large touch targets, and high-contrast color schemes designed with visually impaired users in mind.

3. Manual Inputs

  • Supported Image Sources: Accepts images from both live camera captures and photo gallery selections, giving users flexibility in how they capture text.
  • Manual Text-to-Speech Control: Users can manually convert recognized text to speech, giving control over when to hear the content aloud.
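
Manual text-to-speech control could be wired up roughly as follows, with the synthesis call made only when the user asks for it, for example from a button's press handler. This is a sketch under stated assumptions. The function names are hypothetical, the `en-US` voice and MP3 encoding are illustrative defaults, and playing the returned audio is left to whichever audio library the app uses.

```typescript
// Cloud Text-to-Speech REST endpoint for speech synthesis.
const TTS_ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize";

interface TtsRequest {
  input: { text: string };
  voice: { languageCode: string };
  audioConfig: { audioEncoding: "MP3" };
}

// Builds the JSON body for a synthesis request; rejects empty text so the
// app never sends a request when nothing was recognized.
function buildTtsRequest(text: string, languageCode = "en-US"): TtsRequest {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    throw new Error("No recognized text to read aloud");
  }
  return {
    input: { text: trimmed },
    voice: { languageCode },
    audioConfig: { audioEncoding: "MP3" },
  };
}

// Resolves with the base64-encoded MP3 audio returned by the API.
async function synthesizeSpeech(text: string, apiKey: string): Promise<string> {
  const res = await fetch(`${TTS_ENDPOINT}?key=${encodeURIComponent(apiKey)}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildTtsRequest(text)),
  });
  if (!res.ok) {
    throw new Error(`Text-to-Speech request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { audioContent?: string };
  if (!data.audioContent) {
    throw new Error("Text-to-Speech response contained no audio");
  }
  return data.audioContent;
}
```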

User Testing Results

Question | Average Rating | Notes
Ease of navigation | 9/10 | Users found the interface intuitive and easy to navigate.
Feature accessibility | 5/5 | All features were generally easy to locate.
Feature functionality | 9/10 | Minor improvements suggested for TTS screen layout.
Satisfaction with app experience | 8.5/10 | UI simplification and accessibility efforts were well-received.
Design (colors, font, layout) | 9/10 | Positive feedback on color choice for accessibility.
Accessibility for visually impaired users | 8/10 | Accessibility features praised, with room to enhance text conversion UI.

(back to top)

Contributors

Bladen Lehnberg

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)
