Recall AI is an intelligent memory system that captures user activity through periodic screenshots, extracts text with OCR, and filters out sensitive information. The cleaned text is then encrypted and converted into vector embeddings for semantic search and contextual recall.
**Key Innovation:** Users can interact with an integrated large language model (LLM) to ask questions and get meaningful responses based on their specific activities, enabling a context-aware, task-focused conversational experience.
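Conceptually, the capture-to-recall pipeline looks like the skeleton below. This is an illustrative sketch only: the function names are placeholders standing in for the real implementations under `backend/recall_ai/`, and the 30-second interval matches the default noted in Configuration.

```python
# Structural sketch of the Recall AI pipeline described above.
# All bodies are stubs; they mark where the real modules plug in.
import time

def capture_screenshot() -> bytes: ...        # MSS screen grab
def extract_text(image: bytes) -> str: ...    # Tesseract / PaddleOCR
def scrub(text: str) -> str: ...              # drop sensitive data
def encrypt(text: str) -> bytes: ...          # AES encryption for storage
def embed_and_store(text: str) -> None: ...   # FAISS / Qdrant embeddings

def run(interval_seconds: int = 30) -> None:
    while True:
        text = scrub(extract_text(capture_screenshot()))
        encrypt(text)          # encrypted copy is what gets persisted
        embed_and_store(text)  # embeddings power semantic search
        time.sleep(interval_seconds)
```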
**Features:**

- **Continuous Activity Capture** - Automated screenshot capture with MSS
- **Advanced OCR Processing** - Tesseract & PaddleOCR for text extraction
- **Privacy Protection** - Intelligent filtering of sensitive information
- **Data Encryption** - AES encryption for stored text data
- **Vector Embeddings** - Semantic search with HuggingFace transformers
- **Dual Storage Options** - FAISS (local) & Qdrant (scalable) vector databases
- **RAG Implementation** - Retrieval-Augmented Generation with the Groq LLM (see the sketch after this list)
- **Real-Time Streaming** - Async/sync model streaming responses
- **Comprehensive Logging** - Detailed activity and error tracking
- **File Watching** - Automatic processing with Watchdog
- **Modern Flutter UI** - Glassmorphism design
- **Dual Themes** - Animated dark/light mode switching
- **Enhanced Chat Interface** - Markdown support with syntax highlighting
- **Voice Input** - Speech-to-text with Windows integration
- **Settings Management** - Persistent configuration storage
- **Backend Switching** - Toggle between FAISS and Qdrant
- **Real-Time Streaming** - Live response display
- **Windows Desktop** - Optimized for Windows 10/11
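The RAG and streaming features above pair vector retrieval with a Groq-hosted LLM. A minimal streaming sketch, assuming the `groq` Python SDK with a `GROQ_API_KEY` in the environment; the model name and prompt format are illustrative, not the project's actual values:

```python
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# In Recall AI, this context would come from FAISS/Qdrant similarity search.
retrieved = "\n".join([
    "10:02 Edited recall_ai/src/mss_screen.py",
    "10:31 Read FastAPI docs on streaming responses",
])

stream = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # any Groq-hosted chat model
    messages=[
        {"role": "system", "content": f"Answer using this activity log:\n{retrieved}"},
        {"role": "user", "content": "What was I working on this morning?"},
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```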
**Installation:**

```bash
git clone https://github.com/Madhur-Prakash/Recall-AI.git
cd Recall-AI
```

**Backend setup:**

```bash
# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Setup environment variables
copy .env.sample .env
# Edit .env with your GROQ_API_KEY
```

**Frontend setup:**

```bash
# Navigate to frontend
cd ../frontend

# Install Flutter dependencies
flutter pub get

# Build for Windows
flutter build windows --release
```

**Tesseract OCR (Windows):**

- Download from the Tesseract Wiki
- Install with default settings
- Add `C:\Program Files\Tesseract-OCR` to PATH
- Verify the installation:

```bash
tesseract --version
```
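With Tesseract and the Python dependencies installed, a quick capture-and-OCR smoke test can be run from the backend virtual environment. This is a sketch assuming the `mss`, `Pillow`, and `pytesseract` packages; the project's own capture code in `mss_screen.py` may be wired differently:

```python
import mss
import pytesseract
from PIL import Image

with mss.mss() as sct:
    shot = sct.grab(sct.monitors[1])                   # primary monitor
    img = Image.frombytes("RGB", shot.size, shot.rgb)  # mss pixels -> PIL image

print(pytesseract.image_to_string(img)[:500])          # first 500 OCR'd chars
```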
**Start screenshot capture:**

```bash
cd backend
python recall_ai/src/mss_screen.py
```

**Start vector processing.** For FAISS (Local):

```bash
python recall_ai/config/gen_vector_embedding.py
```

For Qdrant (Cloud):

```bash
# Start Qdrant server first
docker run -p 6333:6333 qdrant/qdrant

# Then start processing
python recall_ai/config/quad_gen_vector_embedding.py
```

**Start the API server:**

```bash
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

**Run the frontend:**

```bash
cd frontend
flutter run -d windows
```

Or build a release executable:

```bash
flutter build windows --release
# Executable: build/windows/x64/runner/Release/recall_frontend.exe
```

**API endpoints:**

| Endpoint | Method | Description | Vector Store |
|----------|--------|-------------|--------------|
| `/chat` | GET | Chat with FAISS backend | FAISS |
| `/quad_chat` | GET | Chat with Qdrant backend | Qdrant |
| `/docs` | GET | Interactive API documentation | - |
| `/` | GET | Health check endpoint | - |

Example query:

```bash
curl "http://localhost:8000/chat?query=What was I working on yesterday?"
```
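The same endpoints can be called from Python; a minimal sketch assuming the backend is running locally on port 8000:

```python
import requests

resp = requests.get(
    "http://localhost:8000/chat",  # or /quad_chat for the Qdrant backend
    params={"query": "What was I working on yesterday?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.text)  # streamed responses are buffered into one body here
```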
curl "http://localhost:8000/chat?query=What was I working on yesterday?"RecallAI/
โโโ ๐ backend/ # Python FastAPI Backend
โ โโโ ๐ธ images_taken/ # Encrypted screenshot data
โ โโโ ๐๏ธ img_vector_store/ # FAISS vector database
โ โโโ ๐ logs/ # Application logs
โ โโโ ๐ง recall_ai/ # Core application
โ โ โโโ โ๏ธ config/ # Vector processing workers
โ โ โโโ ๐ ๏ธ helpers/ # Utility functions
โ โ โโโ ๐ src/ # Main application logic
โ โ โโโ ๐ข vector_embeddings/ # Embedding processing
โ โโโ ๐ app.py # FastAPI application
โ โโโ ๐ requirements.txt # Python dependencies
โ โโโ ๐ .env.sample # Environment template
โโโ ๐ฑ frontend/ # Flutter Desktop App
โ โโโ ๐ lib/ # Dart source code
โ โ โโโ ๐ models/ # Data models
โ โ โโโ ๐ผ๏ธ screens/ # UI screens
โ โ โโโ ๐ง services/ # Business logic
โ โ โโโ ๐ฏ main.dart # App entry point
โ โโโ ๐ช windows/ # Windows-specific files
โ โโโ ๐ฆ pubspec.yaml # Flutter dependencies
โโโ ๐ README.md # This file
โโโ ๐ LICENSE # MIT License
**Environment variables (`.env`):**

```env
GROQ_API_KEY="your_groq_api_key_here"
SESSION_SECRET_KEY="your_session_secret"
DEVELOPMENT_ENV="local"  # or "docker"
```

**Default settings:**

- Screenshot Interval: 30 seconds (configurable in `mss_screen.py`)
- Text File Limit: 34 files before processing (configurable in `gen_vector_embedding.py`)
- Vector Dimensions: 384 (sentence-transformers/all-MiniLM-L6-v2)
- Retrieval Count: 16 documents per query
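As a sketch of how these defaults fit together, the snippet below embeds text with the 384-dimensional MiniLM model and retrieves up to 16 neighbors from a FAISS index. It assumes the `sentence-transformers` and `faiss-cpu` packages; the project's actual workers in `recall_ai/config/` may differ:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
texts = ["meeting notes about the Q3 roadmap", "debugging mss_screen.py"]

embeddings = model.encode(texts, normalize_embeddings=True)  # shape (n, 384)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product = cosine on unit vectors
index.add(embeddings)

query = model.encode(["what was I debugging?"], normalize_embeddings=True)
scores, ids = index.search(query, min(16, len(texts)))  # up to 16 docs per query
print([texts[i] for i in ids[0]])
```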
**Frontend settings:**

- **Theme Mode**: Dark/Light with animated transitions
- **Voice Input**: Enable/disable speech recognition
- **Vector Store**: Switch between FAISS/Qdrant
- **Server URL**: Configure backend endpoint
**Privacy & security:**

- **Sensitive Data Filtering**: Automatic removal of passwords, API keys, tokens (see the sketch after this list)
- **Local Processing**: OCR and filtering happen locally
- **Encrypted Storage**: AES encryption for all text data
- **No Cloud Dependencies**: Can run completely offline (FAISS mode)
- **Data Encryption**: AES-256 encryption for stored text
- **Session Management**: Secure session handling
- **Input Validation**: Comprehensive request validation
- **Error Handling**: Secure error responses without data leakage
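For illustration, a filter-then-encrypt step along these lines could look like the sketch below. The regex patterns are assumptions, and `Fernet` (AES-128-CBC with HMAC) stands in for the AES-256 scheme used by the project:

```python
import re
from cryptography.fernet import Fernet

SENSITIVE_PATTERNS = [
    re.compile(r"(?i)(password|passwd|pwd)\s*[:=]\s*\S+"),
    re.compile(r"(?i)(api[_-]?key|token|secret)\s*[:=]\s*\S+"),
    re.compile(r"\b[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}\b"),  # email addresses
]

def scrub(text: str) -> str:
    """Replace anything matching a sensitive pattern before storage."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

key = Fernet.generate_key()  # in practice, load a persisted key instead
cipher = Fernet(key)

clean = scrub("login: alice  password: hunter2")
token = cipher.encrypt(clean.encode())
print(cipher.decrypt(token).decode())  # login: alice  [REDACTED]
```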
**Performance:**

- **Async Processing**: Non-blocking I/O operations
- **Streaming Responses**: Real-time LLM output (see the sketch after this list)
- **Vector Caching**: Efficient similarity search
- **Batch Processing**: Optimized embedding generation
- **Memory Management**: Automatic cleanup and rotation
- **Screenshot Processing**: ~2-3 seconds per image
- **OCR Extraction**: ~1-2 seconds per screenshot
- **Vector Search**: <100 ms for similarity queries
- **LLM Response**: ~1-3 seconds (depends on the Groq API)
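As a sketch of how a streaming chat endpoint can be wired in FastAPI, the example below returns a `StreamingResponse` from an async generator. The generator body is a stand-in; the real `/chat` endpoint streams Groq LLM output after vector retrieval:

```python
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def generate_answer(query: str):
    # Stand-in tokens; Recall AI would stream LLM output here.
    for token in ["You ", "were ", "editing ", "mss_screen.py"]:
        yield token
        await asyncio.sleep(0.05)  # simulate token-by-token arrival

@app.get("/chat")
async def chat(query: str):
    return StreamingResponse(generate_answer(query), media_type="text/plain")
```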
**Troubleshooting:**

Voice input not working:

```text
# Check Windows microphone permissions
# Settings → Privacy & Security → Microphone → Allow desktop apps

# Verify default microphone
# Settings → System → Sound → Input device

# Enable Windows Speech Recognition
# Settings → Time & Language → Speech
```

Tesseract not found:

```powershell
# Verify Tesseract installation
tesseract --version

# Check PATH environment variable
echo $env:PATH | Select-String "Tesseract"

# Reinstall if needed
# Download from: https://github.com/UB-Mannheim/tesseract/wiki
```

Chat responses failing:

```bash
# Check backend server status
curl http://localhost:8000/

# Verify Groq API key
# Check .env file configuration

# Test API connectivity
curl -H "Authorization: Bearer YOUR_API_KEY" https://api.groq.com/openai/v1/models
```

**Roadmap:**

- **Multi-Platform Support** - macOS and Linux compatibility
- **Audio Capture** - Meeting and call transcription
- **Analytics Dashboard** - Activity insights and patterns
- **Custom Models** - Local LLM integration
- **Mobile App** - iOS and Android clients
- **Cloud Sync** - Cross-device synchronization
- **UI Themes** - Additional theme options
- **Internationalization** - Multi-language support
- **Performance Optimization** - Faster processing pipelines
- **Advanced OCR** - Better accuracy with multiple engines
- **Enhanced Search** - Semantic and temporal filtering
- **Scalability** - Distributed processing support
Contributions are welcome! To contribute:
- **Bug Fixes** - Report and fix issues
- **New Features** - Implement planned enhancements
- **Documentation** - Improve guides and examples
- **UI/UX** - Design improvements
- **Testing** - Add test coverage
- **Localization** - Add language support
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.