An AI-powered web application designed to assist users with visual, cognitive, and hearing impairments. This suite leverages the Google Gemini API to provide a set of powerful, on-demand tools in an accessible and user-friendly interface.
This application is divided into three main tools:
### Task Guide

This tool assists users who may have difficulty with memory, attention, or executive function by breaking down complex tasks into simple, manageable steps.
- Custom Task Generation: Users can describe any task (e.g., "how to bake a cake"), and the AI generates a step-by-step guide.
- Pre-defined Task Library: A list of common tasks is available for immediate selection.
- Visual & Auditory Support: Each step includes a title, a simple description, and an AI-generated photorealistic image. Text-to-speech is available for all text content.
- Interactive Progress Tracking: Users can mark steps as complete, earning points and triggering celebratory confetti animations.
- Task Management: Users can favorite, delete, and view recently accessed tasks.
- Voice Navigation: Control the guide with voice commands like "read step 2," "next step," or "go back."
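The tech stack notes that user intent is parsed via Gemini, and the app's actual handling is not shown here. As a rough illustration of the documented commands, a minimal local parser for "read step 2", "next step", and "go back" might look like this (names are hypothetical, not the app's API):

```typescript
// Minimal sketch of voice-command parsing for the three documented
// commands. The real app routes transcripts through Gemini for intent
// parsing; this local fallback is illustrative only.
type VoiceCommand =
  | { kind: "readStep"; step: number }
  | { kind: "nextStep" }
  | { kind: "previousStep" }
  | { kind: "unknown" };

function parseVoiceCommand(transcript: string): VoiceCommand {
  const text = transcript.trim().toLowerCase();
  const read = text.match(/^read step (\d+)$/);
  if (read) return { kind: "readStep", step: Number(read[1]) };
  if (text === "next step") return { kind: "nextStep" };
  if (text === "go back") return { kind: "previousStep" };
  return { kind: "unknown" };
}
```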
### Color Blindness Tool

This tool helps users with various forms of color blindness perceive images more clearly.
- Image Upload: Users can upload any image from their device.
- Color Correction Filters: Select from a range of color vision deficiencies (Protanopia, Deuteranopia, Tritanopia, etc.) and a special "Night Mode" for low-light visibility.
- AI-Powered Adjustment: The Gemini vision model analyzes and adjusts the image's colors to enhance differentiability based on the selected condition.
- AI Image Description: The AI provides a detailed textual description of the newly corrected image's content and prominent colors.
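The actual prompts sent to the image model live in `services/geminiService.ts` and are not reproduced here. As a hedged sketch, a per-condition prompt for the color-correction request could be built like this (the function name and wording are assumptions, not the repository's code):

```typescript
// Hypothetical prompt builder for the color-correction request.
// The real prompt text in geminiService.ts may differ.
type ColorVisionCondition =
  | "Protanopia"
  | "Deuteranopia"
  | "Tritanopia"
  | "Night Mode";

function buildColorCorrectionPrompt(condition: ColorVisionCondition): string {
  if (condition === "Night Mode") {
    return "Adjust this image for low-light visibility: raise shadow detail and increase contrast without blowing out highlights.";
  }
  return `Recolor this image so that hues which are hard to distinguish for a viewer with ${condition} become clearly differentiable, while preserving the scene's content and overall appearance.`;
}
```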
### Hearing Impairment Tool

This tool aids in visual communication by converting text or speech into images.
- Text-to-Image Generation: Type a description of anything you want to see.
- Speech-to-Image Generation: Use your voice to describe an image.
- Instant Visuals: The AI generates a high-quality image based on the prompt, providing a quick and effective way to communicate visually.
### Global Accessibility Features

- Color Inversion: A high-contrast mode can be toggled at any time.
- Brightness Control: Adjust the brightness of the entire application interface.
- Universal Text-to-Speech: "Speak" buttons are available next to most text elements.
- Global Voice Control: A persistent button allows for app-wide voice commands where applicable.
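The universal text-to-speech feature presumably wraps the browser's `SpeechSynthesis` API. One practical detail worth sketching: some browsers cut off very long utterances, so text is often split into shorter chunks before speaking. The helper names below are illustrative, not the app's actual API:

```typescript
// Sketch of a text-to-speech helper. Chunking works around the known
// browser behavior where very long SpeechSynthesisUtterance texts can
// be cut off mid-sentence.
function chunkForSpeech(text: string, maxLen = 200): string[] {
  // Split on sentence boundaries, then pack sentences into chunks.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if ((current + s).length > maxLen && current) {
      chunks.push(current.trim());
      current = "";
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

function speak(text: string): void {
  // Browser-only: accessed via globalThis so the module also loads
  // outside a browser (tests, SSR) without throwing.
  const synth = (globalThis as any).speechSynthesis;
  const Utterance = (globalThis as any).SpeechSynthesisUtterance;
  if (!synth || !Utterance) return;
  for (const chunk of chunkForSpeech(text)) {
    synth.speak(new Utterance(chunk));
  }
}
```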
## Tech Stack

- Frontend: React, TypeScript, Tailwind CSS
- AI Engine: Google Gemini API (`@google/genai`)
  - `gemini-2.5-flash`: Used for generating task steps, parsing user intent, and creating structured JSON.
  - `gemini-2.5-flash-image-preview`: Used for all image generation and color correction tasks.
- Browser APIs:
  - Web Speech API (`SpeechRecognition`): For voice command input.
  - Web Speech API (`SpeechSynthesis`): For text-to-speech output.
## Getting Started

To run this project locally, follow these steps.

Prerequisites:
- An active Google AI Studio API key.
- The project files from this repository.
In the project root directory, create a file named `.env.local` and add your API key:

```
GEMINI_API_KEY=your_api_key_here
```

Install all the required dependencies by running:

```
npm install
```

Run the following command to start the local development server:

```
npm run dev
```
After starting, the application will be available at the URL shown in your terminal (commonly `http://localhost:5173`).
## Project Structure

```
/
├── components/              # Reusable React components
│   ├── AccessibilityModeSelection.tsx
│   ├── ColorBlindnessTool.tsx
│   ├── GlobalAccessibilityControls.tsx
│   ├── HearingImpairmentTool.tsx
│   ├── Header.tsx
│   ├── StepCard.tsx
│   ├── TaskGuide.tsx
│   └── TaskSelection.tsx
├── services/                # Modules for external services
│   └── geminiService.ts     # All Gemini API calls are centralized here
├── types.ts                 # TypeScript type definitions
├── App.tsx                  # Main application component and state management
├── index.html               # The entry point of the web application
├── index.tsx                # React application bootstrap
└── README.md                # This file
```