A voice-controlled accessibility assistant that helps blind users navigate and interact with games through AI-powered screen analysis and voice commands.
This software runs alongside games to provide real-time accessibility support for blind users. Users can:
- Press a key to activate voice input
- Ask to hear all available options on screen
- Select specific UI elements through voice commands
- Receive audio descriptions of game states and menus
The system uses AI to analyze screenshots, understand game interfaces, and provide intelligent voice responses.
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Voice Input   │     │ Screen Capture  │     │ Voice Response  │
│                 │     │                 │     │                 │
│ • Hotkey detect │     │ • Screenshot    │     │ • Text-to-Speech│
│ • Speech-to-Text│     │ • Game detection│     │ • Audio output  │
│ • Command parse │     │ • Image process │     │ • Response queue│
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                        ┌─────────────────┐
                        │    AI Agent     │
                        │                 │
                        │ • Image analysis│
                        │ • UI detection  │
                        │ • Command proc. │
                        │ • Response gen. │
                        └─────────────────┘
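The data flow in the diagram can be sketched as a small pipeline in which each component is a plain callable. This is only an illustration: `Pipeline`, `handle_hotkey`, and the stub stages below are hypothetical names, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """Wires the four components together; every stage is a plain callable."""
    listen: Callable[[], str]             # Voice Input: returns transcribed text
    capture: Callable[[], bytes]          # Screen Capture: returns image bytes
    analyze: Callable[[str, bytes], str]  # AI Agent: command + image -> reply text
    speak: Callable[[str], None]          # Voice Response: speaks the reply

    def handle_hotkey(self) -> str:
        """Run one request/response cycle after the user presses the hotkey."""
        command = self.listen()
        screenshot = self.capture()
        reply = self.analyze(command, screenshot)
        self.speak(reply)
        return reply


# Stub stages so the sketch runs without a microphone, screen, or API key.
spoken: List[str] = []
pipeline = Pipeline(
    listen=lambda: "read options",
    capture=lambda: b"fake-image-bytes",
    analyze=lambda cmd, img: f"Heard '{cmd}' with a {len(img)}-byte screenshot",
    speak=spoken.append,
)
result = pipeline.handle_hotkey()
```

Keeping each stage behind a callable lets a team swap in its real implementation without touching the others, and lets tests run with stubs like the ones above.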
AGC-software/
├── README.md # This file
├── requirements.txt # Python dependencies
├── setup.py # Package setup
├── .env.example # Environment variables template
├── .gitignore # Git ignore rules
│
├── src/ # Main source code
│ ├── __init__.py
│ ├── main.py # Application entry point
│ ├── config/ # Configuration management
│ ├── voice_input/ # Voice input processing
│ ├── screen_capture/ # Screen capture and analysis
│ ├── ai_agent/ # AI processing and decision making
│ ├── local_server/ # Local server API with FastAPI
│ ├── voice_response/ # Text-to-speech and audio output
│ └── utils/ # Shared utilities
│
├── tests/ # Test suites
│ ├── unit/ # Unit tests
│ ├── integration/ # Integration tests
│ └── fixtures/ # Test data and mock files
│
├── docs/ # Documentation
│ ├── api/ # API documentation
│ ├── architecture/ # System design docs
│ ├── user_guide/ # User documentation
│ └── development/ # Development guides
│
├── scripts/ # Development and deployment scripts
│ ├── setup_dev.py # Development environment setup
│ ├── run_tests.py # Test runner
│ └── build.py # Build scripts
│
├── data/ # Data files
│ ├── models/ # AI models and weights
│ ├── audio/ # Audio samples and templates
│ └── game_configs/ # Game-specific configurations
│
└── hardware/ # Hardware integration
    ├── pi_audio_controller.py # Raspberry Pi Zero W audio controller
    ├── setup_pi.sh # Pi setup script
    ├── README.md # Hardware documentation
    ├── microchip/ # Microchip code (future)
    ├── controller/ # Controller firmware (future)
    └── specs/ # Hardware specifications
- Python 3.8+
- Microphone access
- Screen capture permissions
- OpenAI API key (or alternative AI service)
# Clone the repository
git clone <repository-url>
cd AGC-software
# Install dependencies
pip install -r requirements.txt
# Copy environment template
cp .env.example .env
# Edit .env with your API keys and settings
nano .env
# Run setup script
python scripts/setup_dev.py
# Start the application
python src/main.py
Team Members: 1-2 developers
- Responsibility: Handle hotkey detection, speech-to-text conversion, and command parsing
- Key Files:
  - hotkey_listener.py - Global hotkey detection
  - speech_to_text.py - Convert speech to text
  - command_parser.py - Parse user commands
- Dependencies: pynput, speech_recognition, pyaudio
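As a rough illustration of the command-parsing step, the sketch below maps a speech-to-text transcript to a structured command using regular expressions. The `Command` class, the action names, and the phrase patterns are all assumptions made for this example; the real `command_parser.py` may work differently.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    action: str                   # "list_options", "select", or "describe"
    target: Optional[str] = None  # UI element name, only used by "select"


# Hypothetical phrase patterns; a real grammar would be tuned on user speech.
_PATTERNS = [
    (re.compile(r"^(?:what are|read|list)(?: all)?(?: the)? options$"), "list_options"),
    (re.compile(r"^(?:select|choose|click)(?: the)? (?P<target>.+)$"), "select"),
    (re.compile(r"^(?:describe|what is on)(?: the)? screen$"), "describe"),
]


def parse_command(transcript: str) -> Optional[Command]:
    """Map a transcript to a Command, or return None if it is not recognized."""
    # Normalize: lowercase, drop punctuation, collapse repeated whitespace.
    text = re.sub(r"[^\w\s]", "", transcript.lower()).strip()
    text = re.sub(r"\s+", " ", text)
    for pattern, action in _PATTERNS:
        match = pattern.match(text)
        if match:
            return Command(action, match.groupdict().get("target"))
    return None
```

Returning `None` for unrecognized input lets the caller decide whether to ask the user to repeat the command or to pass the raw transcript on to the AI agent.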
Team Members: 1-2 developers
- Responsibility: Capture screenshots, detect active games, and preprocess images
- Key Files:
  - screenshot.py - Take and manage screenshots
  - game_detector.py - Identify running games
  - image_processor.py - Image preprocessing for AI
- Dependencies: pillow, pyautogui, opencv-python
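One simple way to identify a running game is to match the active window title against a table of known titles. The sketch below shows only that matching step; the `KNOWN_GAMES` table, the placeholder game names, and `detect_game` are hypothetical, and obtaining the window title itself is platform-specific and left out.

```python
from typing import Dict, Optional

# Hypothetical mapping from a lowercase title fragment to a game identifier.
KNOWN_GAMES: Dict[str, str] = {
    "example quest": "example_quest",
    "sample racer": "sample_racer",
}


def detect_game(window_title: str, known: Dict[str, str] = KNOWN_GAMES) -> Optional[str]:
    """Return the id of the first known game whose fragment appears in the title."""
    title = window_title.casefold()
    for fragment, game_id in known.items():
        if fragment in title:
            return game_id
    return None
```

For instance, `detect_game("Example Quest - Main Menu")` yields `"example_quest"`, while an unrelated window title yields `None`.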
Team Members: 2-3 developers
- Responsibility: Analyze images, understand UI elements, process commands, generate responses
- Key Files:
  - image_analyzer.py - Analyze screenshots with AI
  - ui_detector.py - Detect UI elements and options
  - command_processor.py - Process user commands
  - response_generator.py - Generate appropriate responses
- Dependencies: openai, transformers, torch, opencv-python
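The last step, turning detected UI elements into a sentence that can be read aloud, might look like the sketch below. The `UIElement` fields and the exact wording are assumptions for illustration; the detection itself (the AI call) is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UIElement:
    label: str                # visible text of the element
    kind: str = "button"      # hypothetical element type
    selected: bool = False    # whether the element currently has focus


def describe_options(elements: List[UIElement]) -> str:
    """Build a short spoken summary of the options found on screen."""
    if not elements:
        return "I could not find any options on this screen."
    parts = []
    for index, element in enumerate(elements, start=1):
        suffix = ", currently selected" if element.selected else ""
        parts.append(f"{index}: {element.label} {element.kind}{suffix}")
    noun = "option" if len(elements) == 1 else "options"
    return f"{len(elements)} {noun}. " + ". ".join(parts) + "."
```

Numbering the options gives the user a short handle ("select two") when labels are long or hard to pronounce.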
Team Members: 1 developer
- Responsibility: Convert text to speech and manage audio output
- Key Files:
  - text_to_speech.py - TTS conversion
  - audio_manager.py - Audio output management
  - response_queue.py - Queue and prioritize responses
- Dependencies: pyttsx3, pygame
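Queuing and prioritizing responses can be sketched with the standard-library `heapq` module. The `ResponseQueue` interface below (a numeric priority where a lower number is spoken sooner) is an assumption for this example, not necessarily how `response_queue.py` is written.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class ResponseQueue:
    """Priority queue for spoken responses; a lower number is spoken sooner."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Monotonic counter breaks ties so equal priorities stay first-in, first-out.
        self._counter = itertools.count()

    def put(self, text: str, priority: int = 5) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), text))

    def get(self) -> Optional[str]:
        """Pop the most urgent response, or return None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

With this design an urgent message queued at `priority=1` is spoken before routine descriptions that were queued earlier at the default priority.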
Team Members: 1-2 developers
- Responsibility: Game-specific configurations and optimizations
- Key Files:
  - game_configs.py - Game-specific settings
  - ui_templates.py - Common UI patterns
  - integration_manager.py - Manage game integrations
- Dependencies: Game-specific libraries as needed
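A game-specific configuration could be stored as JSON under `data/game_configs/` and loaded into a small typed object. The schema below (`name`, `window_title`, `menu_regions`) is purely hypothetical and only illustrates the idea.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GameConfig:
    name: str
    window_title: str
    # Hypothetical: named screen regions as [x, y, width, height].
    menu_regions: Dict[str, List[int]] = field(default_factory=dict)


def load_game_config(raw_json: str) -> GameConfig:
    """Parse a JSON document into a GameConfig, filling in optional fields."""
    data = json.loads(raw_json)
    return GameConfig(
        name=data["name"],
        window_title=data.get("window_title", data["name"]),
        menu_regions=data.get("menu_regions", {}),
    )
```

Requiring only `name` keeps a new game's config minimal; everything else falls back to a default until someone tunes it.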
Team Members: 1 developer
- Responsibility: Host a local FastAPI server that exposes AGC functions for external devices to query or control
- Key Files:
  - app.py - FastAPI application with audio processing and health endpoints
  - README.md - API documentation and usage examples
- Endpoints:
  - POST /audio/process - Upload audio file and receive analysis results
  - GET /health - Server status and monitoring
- Dependencies: fastapi, uvicorn, python-multipart
- Entry: python -m src.local_server.app or uvicorn src.local_server.app:app
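An external device can check that the server is up by calling `GET /health`. The sketch below uses only the standard library; the base URL `http://127.0.0.1:8000` is an assumption (uvicorn's default host and port), and because the response body format is not documented here, only the status code is checked.

```python
import urllib.error
import urllib.request


def server_is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    # The server is local, so bypass any system-wide HTTP proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as unhealthy.
        return False
```

A client such as the Pi controller could call this before uploading audio and signal an error to the user if it returns `False`.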
Team Members: 1-2 developers
- Responsibility: Raspberry Pi Zero W controller for physical voice input
- Key Files:
  - pi_audio_controller.py - Main Pi controller with GPIO button and LED
  - setup_pi.sh - Automated Pi setup script
  - README.md - Hardware documentation and wiring guide
- Features:
  - Button-triggered audio recording
  - LED status feedback
  - HTTP communication with local server
  - Continuous mode operation
- Dependencies: RPi.GPIO, pyaudio, requests
- Hardware: Raspberry Pi Zero W, push button, LED, USB microphone
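The button, LED, and upload behaviour listed above can be modelled as a small state machine that is testable without any hardware attached. The states, event names, and LED patterns below are assumptions for illustration; the real `pi_audio_controller.py` drives GPIO pins directly and may be organized differently.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    SENDING = "sending"


# Hypothetical LED behaviour for each state.
LED_PATTERN = {State.IDLE: "off", State.RECORDING: "on", State.SENDING: "blink"}

# (current state, event) -> next state
TRANSITIONS = {
    (State.IDLE, "button_down"): State.RECORDING,
    (State.RECORDING, "button_up"): State.SENDING,
    (State.SENDING, "upload_done"): State.IDLE,
}


class Controller:
    def __init__(self) -> None:
        self.state = State.IDLE

    def on_event(self, event: str) -> str:
        """Advance the state machine and return the LED pattern to show.

        Events that are not valid in the current state are ignored.
        """
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return LED_PATTERN[self.state]
```

Separating the transition table from the GPIO callbacks means the logic can be unit-tested on a development machine, with the Pi-specific code reduced to translating pin edges into events.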
- Choose a component to work on
- Read the component's README in its directory
- Set up your development environment
- Create a feature branch: git checkout -b feature/component-name-feature
- Implement your changes with tests
- Submit a pull request
- Follow PEP 8 for Python code
- Write docstrings for all functions and classes
- Include unit tests for new functionality
- Use type hints where appropriate
- Keep functions small and focused
# Run all tests
python scripts/run_tests.py
# Run specific component tests
python -m pytest tests/unit/voice_input/
# Run integration tests
python -m pytest tests/integration/
- Aurora
- Rachel Yeung
- Matthew
- Wade Rogers
- Erik Xie
- Mellanie Rodriguez
- Maggie Zhang
- Rojina Adhikari
- Riya Sahu