Recall AI is an intelligent memory system that captures user activity through periodic screenshots, extracts text with OCR, and filters out sensitive information. The cleaned text is then encrypted and converted into vector embeddings for semantic search and contextual recall.
**Key Innovation:** Users can interact with an integrated large language model (LLM) to ask questions and get meaningful responses based on their specific activities, enabling a context-aware, task-focused conversational experience.
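Conceptually, the capture-to-recall pipeline looks like the skeleton below. This is an illustrative sketch only: the function names are placeholders standing in for the real implementations under `backend/recall_ai/`, and the 30-second interval matches the default noted in Configuration.

```python
# Structural sketch of the Recall AI pipeline described above.
# All bodies are stubs; they mark where the real modules plug in.
import time

def capture_screenshot() -> bytes: ...        # MSS screen grab
def extract_text(image: bytes) -> str: ...    # Tesseract / PaddleOCR
def scrub(text: str) -> str: ...              # drop sensitive data
def encrypt(text: str) -> bytes: ...          # AES encryption for storage
def embed_and_store(text: str) -> None: ...   # FAISS / Qdrant embeddings

def run(interval_seconds: int = 30) -> None:
    while True:
        text = scrub(extract_text(capture_screenshot()))
        encrypt(text)          # encrypted copy is what gets persisted
        embed_and_store(text)  # embeddings power semantic search
        time.sleep(interval_seconds)
```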
**Features:**

- **Continuous Activity Capture** - Automated screenshot capture with MSS
- **Advanced OCR Processing** - Tesseract & PaddleOCR for text extraction
- **Privacy Protection** - Intelligent filtering of sensitive information
- **Data Encryption** - AES encryption for stored text data
- **Vector Embeddings** - Semantic search with HuggingFace transformers
- **Dual Storage Options** - FAISS (local) & Qdrant (scalable) vector databases
- **RAG Implementation** - Retrieval-Augmented Generation with the Groq LLM (see the sketch after this list)
- **Real-Time Streaming** - Async/sync model streaming responses
- **Comprehensive Logging** - Detailed activity and error tracking
- **File Watching** - Automatic processing with Watchdog
- **Modern Flutter UI** - Glassmorphism design
- **Dual Themes** - Animated dark/light mode switching
- **Enhanced Chat Interface** - Markdown support with syntax highlighting
- **Voice Input** - Speech-to-text with Windows integration
- **Settings Management** - Persistent configuration storage
- **Backend Switching** - Toggle between FAISS and Qdrant
- **Real-Time Streaming** - Live response display
- **Windows Desktop** - Optimized for Windows 10/11
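The RAG and streaming features above pair vector retrieval with a Groq-hosted LLM. A minimal streaming sketch, assuming the `groq` Python SDK with a `GROQ_API_KEY` in the environment; the model name and prompt format are illustrative, not the project's actual values:

```python
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# In Recall AI, this context would come from FAISS/Qdrant similarity search.
retrieved = "\n".join([
    "10:02 Edited recall_ai/src/mss_screen.py",
    "10:31 Read FastAPI docs on streaming responses",
])

stream = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # any Groq-hosted chat model
    messages=[
        {"role": "system", "content": f"Answer using this activity log:\n{retrieved}"},
        {"role": "user", "content": "What was I working on this morning?"},
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```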
**Installation:**

```bash
git clone https://github.com/Madhur-Prakash/Recall-AI.git
cd Recall-AI
```

**Backend setup:**

```bash
# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Setup environment variables
copy .env.sample .env
# Edit .env with your GROQ_API_KEY
```

**Frontend setup:**

```bash
# Navigate to frontend
cd ../frontend

# Install Flutter dependencies
flutter pub get

# Build for Windows
flutter build windows --release
```

**Tesseract OCR (Windows):**

- Download from the Tesseract Wiki
- Install with default settings
- Add `C:\Program Files\Tesseract-OCR` to PATH
- Verify the installation:

```bash
tesseract --version
```
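With Tesseract and the Python dependencies installed, a quick capture-and-OCR smoke test can be run from the backend virtual environment. This is a sketch assuming the `mss`, `Pillow`, and `pytesseract` packages; the project's own capture code in `mss_screen.py` may be wired differently:

```python
import mss
import pytesseract
from PIL import Image

with mss.mss() as sct:
    shot = sct.grab(sct.monitors[1])                   # primary monitor
    img = Image.frombytes("RGB", shot.size, shot.rgb)  # mss pixels -> PIL image

print(pytesseract.image_to_string(img)[:500])          # first 500 OCR'd chars
```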
**Start screenshot capture:**

```bash
cd backend
python recall_ai/src/mss_screen.py
```

**Start vector processing.** For FAISS (Local):

```bash
python recall_ai/config/gen_vector_embedding.py
```

For Qdrant (Cloud):

```bash
# Start Qdrant server first
docker run -p 6333:6333 qdrant/qdrant

# Then start processing
python recall_ai/config/quad_gen_vector_embedding.py
```

**Start the API server:**

```bash
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

**Run the frontend:**

```bash
cd frontend
flutter run -d windows
```

Or build a release executable:

```bash
flutter build windows --release
# Executable: build/windows/x64/runner/Release/recall_frontend.exe
```

**API endpoints:**

| Endpoint | Method | Description | Vector Store |
|----------|--------|-------------|--------------|
| `/chat` | GET | Chat with FAISS backend | FAISS |
| `/quad_chat` | GET | Chat with Qdrant backend | Qdrant |
| `/docs` | GET | Interactive API documentation | - |
| `/` | GET | Health check endpoint | - |

Example query:

```bash
curl "http://localhost:8000/chat?query=What was I working on yesterday?"
```
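The same endpoints can be called from Python; a minimal sketch assuming the backend is running locally on port 8000:

```python
import requests

resp = requests.get(
    "http://localhost:8000/chat",  # or /quad_chat for the Qdrant backend
    params={"query": "What was I working on yesterday?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.text)  # streamed responses are buffered into one body here
```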
curl "http://localhost:8000/chat?query=What was I working on yesterday?"RecallAI/
โโโ ๐ backend/ # Python FastAPI Backend
โ โโโ ๐ธ images_taken/ # Encrypted screenshot data
โ โโโ ๐๏ธ img_vector_store/ # FAISS vector database
โ โโโ ๐ logs/ # Application logs
โ โโโ ๐ง recall_ai/ # Core application
โ โ โโโ โ๏ธ config/ # Vector processing workers
โ โ โโโ ๐ ๏ธ helpers/ # Utility functions
โ โ โโโ ๐ src/ # Main application logic
โ โ โโโ ๐ข vector_embeddings/ # Embedding processing
โ โโโ ๐ app.py # FastAPI application
โ โโโ ๐ requirements.txt # Python dependencies
โ โโโ ๐ .env.sample # Environment template
โโโ ๐ฑ frontend/ # Flutter Desktop App
โ โโโ ๐ lib/ # Dart source code
โ โ โโโ ๐ models/ # Data models
โ โ โโโ ๐ผ๏ธ screens/ # UI screens
โ โ โโโ ๐ง services/ # Business logic
โ โ โโโ ๐ฏ main.dart # App entry point
โ โโโ ๐ช windows/ # Windows-specific files
โ โโโ ๐ฆ pubspec.yaml # Flutter dependencies
โโโ ๐ README.md # This file
โโโ ๐ LICENSE # MIT License
**Environment variables (`.env`):**

```env
GROQ_API_KEY="your_groq_api_key_here"
SESSION_SECRET_KEY="your_session_secret"
DEVELOPMENT_ENV="local"  # or "docker"
```

**Default settings:**

- Screenshot Interval: 30 seconds (configurable in `mss_screen.py`)
- Text File Limit: 34 files before processing (configurable in `gen_vector_embedding.py`)
- Vector Dimensions: 384 (sentence-transformers/all-MiniLM-L6-v2)
- Retrieval Count: 16 documents per query
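As a sketch of how these defaults fit together, the snippet below embeds text with the 384-dimensional MiniLM model and retrieves up to 16 neighbors from a FAISS index. It assumes the `sentence-transformers` and `faiss-cpu` packages; the project's actual workers in `recall_ai/config/` may differ:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
texts = ["meeting notes about the Q3 roadmap", "debugging mss_screen.py"]

embeddings = model.encode(texts, normalize_embeddings=True)  # shape (n, 384)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product = cosine on unit vectors
index.add(embeddings)

query = model.encode(["what was I debugging?"], normalize_embeddings=True)
scores, ids = index.search(query, min(16, len(texts)))  # up to 16 docs per query
print([texts[i] for i in ids[0]])
```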
**Frontend settings:**

- **Theme Mode**: Dark/Light with animated transitions
- **Voice Input**: Enable/disable speech recognition
- **Vector Store**: Switch between FAISS/Qdrant
- **Server URL**: Configure backend endpoint
**Privacy & security:**

- **Sensitive Data Filtering**: Automatic removal of passwords, API keys, tokens (see the sketch after this list)
- **Local Processing**: OCR and filtering happen locally
- **Encrypted Storage**: AES encryption for all text data
- **No Cloud Dependencies**: Can run completely offline (FAISS mode)
- **Data Encryption**: AES-256 encryption for stored text
- **Session Management**: Secure session handling
- **Input Validation**: Comprehensive request validation
- **Error Handling**: Secure error responses without data leakage
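For illustration, a filter-then-encrypt step along these lines could look like the sketch below. The regex patterns are assumptions, and `Fernet` (AES-128-CBC with HMAC) stands in for the AES-256 scheme used by the project:

```python
import re
from cryptography.fernet import Fernet

SENSITIVE_PATTERNS = [
    re.compile(r"(?i)(password|passwd|pwd)\s*[:=]\s*\S+"),
    re.compile(r"(?i)(api[_-]?key|token|secret)\s*[:=]\s*\S+"),
    re.compile(r"\b[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}\b"),  # email addresses
]

def scrub(text: str) -> str:
    """Replace anything matching a sensitive pattern before storage."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

key = Fernet.generate_key()  # in practice, load a persisted key instead
cipher = Fernet(key)

clean = scrub("login: alice  password: hunter2")
token = cipher.encrypt(clean.encode())
print(cipher.decrypt(token).decode())  # login: alice  [REDACTED]
```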
**Performance:**

- **Async Processing**: Non-blocking I/O operations
- **Streaming Responses**: Real-time LLM output (see the sketch after this list)
- **Vector Caching**: Efficient similarity search
- **Batch Processing**: Optimized embedding generation
- **Memory Management**: Automatic cleanup and rotation
- **Screenshot Processing**: ~2-3 seconds per image
- **OCR Extraction**: ~1-2 seconds per screenshot
- **Vector Search**: <100 ms for similarity queries
- **LLM Response**: ~1-3 seconds (depends on the Groq API)
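As a sketch of how a streaming chat endpoint can be wired in FastAPI, the example below returns a `StreamingResponse` from an async generator. The generator body is a stand-in; the real `/chat` endpoint streams Groq LLM output after vector retrieval:

```python
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def generate_answer(query: str):
    # Stand-in tokens; Recall AI would stream LLM output here.
    for token in ["You ", "were ", "editing ", "mss_screen.py"]:
        yield token
        await asyncio.sleep(0.05)  # simulate token-by-token arrival

@app.get("/chat")
async def chat(query: str):
    return StreamingResponse(generate_answer(query), media_type="text/plain")
```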
**Troubleshooting:**

Voice input not working:

```text
# Check Windows microphone permissions
# Settings → Privacy & Security → Microphone → Allow desktop apps

# Verify default microphone
# Settings → System → Sound → Input device

# Enable Windows Speech Recognition
# Settings → Time & Language → Speech
```

Tesseract not found:

```powershell
# Verify Tesseract installation
tesseract --version

# Check PATH environment variable
echo $env:PATH | Select-String "Tesseract"

# Reinstall if needed
# Download from: https://github.com/UB-Mannheim/tesseract/wiki
```

Chat responses failing:

```bash
# Check backend server status
curl http://localhost:8000/

# Verify Groq API key
# Check .env file configuration

# Test API connectivity
curl -H "Authorization: Bearer YOUR_API_KEY" https://api.groq.com/openai/v1/models
```

**Roadmap:**

- **Multi-Platform Support** - macOS and Linux compatibility
- **Audio Capture** - Meeting and call transcription
- **Analytics Dashboard** - Activity insights and patterns
- **Custom Models** - Local LLM integration
- **Mobile App** - iOS and Android clients
- **Cloud Sync** - Cross-device synchronization
- **UI Themes** - Additional theme options
- **Internationalization** - Multi-language support
- **Performance Optimization** - Faster processing pipelines
- **Advanced OCR** - Better accuracy with multiple engines
- **Enhanced Search** - Semantic and temporal filtering
- **Scalability** - Distributed processing support
Contributions are welcome! To contribute:
- **Bug Fixes** - Report and fix issues
- **New Features** - Implement planned enhancements
- **Documentation** - Improve guides and examples
- **UI/UX** - Design improvements
- **Testing** - Add test coverage
- **Localization** - Add language support
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.