RecallAI is an AI-powered recall system that captures screen activity, extracts text via OCR, stores vector embeddings, and enables RAG-based querying of past actions.

Madhur-Prakash/Recall-AI

Recall AI

An Advanced FastAPI-Based Intelligent Memory System with Modern Flutter Frontend


Overview

Recall AI is an intelligent memory system that captures user activity through periodic screenshots, extracts the on-screen text with OCR, and filters out sensitive information such as passwords, API keys, and tokens. The cleaned text is encrypted for storage and converted into vector embeddings for semantic search and contextual recall.

Key Innovation: Users can interact with an integrated large language model (LLM) to ask questions and get meaningful responses based on their specific activities, enabling a context-aware, task-focused conversational experience.
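To make the "responses based on their specific activities" idea concrete, here is a minimal sketch of how retrieved activity snippets might be combined with a question before being sent to an LLM. The function name `build_prompt`, the prompt wording, and the `max_chars` budget are all assumptions for illustration; the project's actual prompt is not shown in this README.

```python
from typing import List


def build_prompt(question: str, snippets: List[str], max_chars: int = 2000) -> str:
    """Combine retrieved activity snippets and the user's question into one prompt."""
    context_lines = []
    used = 0
    for i, snippet in enumerate(snippets, start=1):
        line = f"[{i}] {snippet.strip()}"
        # Stop adding context once the (hypothetical) character budget is used up.
        if used + len(line) > max_chars:
            break
        context_lines.append(line)
        used += len(line)
    context = "\n".join(context_lines) if context_lines else "(no matching activity found)"
    return (
        "Answer the question using only the activity context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the snippets lets the model (or a UI) refer back to a specific piece of recorded activity.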


Features

Backend Capabilities

  • Continuous Activity Capture - Automated screenshot capture with MSS
  • Advanced OCR Processing - Tesseract & PaddleOCR for text extraction
  • Privacy Protection - Intelligent filtering of sensitive information
  • Data Encryption - AES encryption for stored text data
  • Vector Embeddings - Semantic search with HuggingFace transformers
  • Dual Storage Options - FAISS (local) & Qdrant (scalable) vector databases
  • RAG Implementation - Retrieval-Augmented Generation with Groq LLM
  • Real-Time Streaming - Async/sync model streaming responses
  • Comprehensive Logging - Detailed activity and error tracking
  • File Watching - Automatic processing with Watchdog
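The semantic search behind the vector-embedding and RAG features comes down to ranking stored vectors by similarity to a query vector. The sketch below shows that idea with a brute-force cosine similarity in plain Python. FAISS and Qdrant use optimized indexes instead, so this is an illustration of the concept rather than the project's code; `cosine_similarity` and `top_k` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k stored texts most similar to the query, best match first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a real deployment the vectors would come from the embedding model and the search would be delegated to the vector store.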

Frontend Features

  • Modern Flutter UI - Beautiful glassmorphism design
  • Dual Themes - Animated dark/light mode switching
  • Enhanced Chat Interface - Markdown support with syntax highlighting
  • Voice Input - Speech-to-text with Windows integration
  • Settings Management - Persistent configuration storage
  • Backend Switching - Toggle between FAISS/Qdrant
  • Real-Time Streaming - Live response display
  • Windows Desktop - Optimized for Windows 10/11

Technology Stack

Backend Technologies

| Technology | Purpose | Version |
|---|---|---|
| Python | Core Language | 3.8+ |
| FastAPI | Web Framework | Latest |
| Tesseract | OCR Engine | 5.0+ |
| OpenCV | Image Processing | Latest |
| HuggingFace | Embeddings | Transformers |
| FAISS | Vector Search | CPU |
| Qdrant | Vector Database | Latest |
| Groq | LLM Provider | API |

Frontend Technologies

| Technology | Purpose | Version |
|---|---|---|
| Flutter | UI Framework | 3.10.1+ |
| Dart | Programming Language | 3.0+ |
| Material | Design System | 3.0 |
| Windows | Target Platform | 10/11 |
| Glassmorphism | UI Effects | - |
| Speech | Voice Input | - |
| Markdown | Rich Text Support | - |
| Animations | Smooth Animations | - |

Quick Start

Prerequisites

Backend Requirements

  • Python (3.8+)
  • Tesseract OCR (5.0+)
  • Groq API key

Frontend Requirements

  • Flutter (3.10.1+)
  • Windows
  • Visual Studio

Installation

1. Clone Repository

git clone https://github.com/Madhur-Prakash/Recall-AI.git
cd Recall-AI

2. Backend Setup

# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Setup environment variables
copy .env.sample .env
# Edit .env with your GROQ_API_KEY

3. Frontend Setup

# Navigate to frontend
cd ../frontend

# Install Flutter dependencies
flutter pub get

# Build for Windows
flutter build windows --release

4. Tesseract Installation

  1. Download from Tesseract Wiki
  2. Install with default settings
  3. Add to PATH: C:\Program Files\Tesseract-OCR
  4. Verify: tesseract --version

Usage

Backend Services

1. Start Screen Capture

cd backend
python recall_ai/src/mss_screen.py
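
The capture script runs on a fixed interval (30 seconds by default, per Key Parameters). As a rough sketch of that kind of loop, the function below calls a capture callback repeatedly with a pause in between. `run_capture_loop`, `capture`, and `max_captures` are hypothetical names; the real script uses MSS to take the screenshot, which is replaced here by a stand-in callable so the loop logic can be read on its own.

```python
import time
from typing import Callable, List


def run_capture_loop(
    capture: Callable[[], str],
    interval_seconds: float = 30.0,
    max_captures: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call `capture` every `interval_seconds` and collect the returned file paths."""
    saved = []
    for index in range(max_captures):
        saved.append(capture())
        # Do not wait after the final capture.
        if index < max_captures - 1:
            sleep(interval_seconds)
    return saved
```

Passing `sleep` as a parameter makes the interval easy to shorten or stub out when testing.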

2. Start Vector Processing

For FAISS (Local):

python recall_ai/config/gen_vector_embedding.py

For Qdrant (server via Docker):

# Start Qdrant server first
docker run -p 6333:6333 qdrant/qdrant

# Then start processing
python recall_ai/config/quad_gen_vector_embedding.py

3. Start API Server

uvicorn app:app --reload --host 0.0.0.0 --port 8000

Frontend Application

Development Mode

cd frontend
flutter run -d windows

Production Build

flutter build windows --release
# Executable: build/windows/x64/runner/Release/recall_frontend.exe

API Endpoints

Chat Endpoints

| Endpoint | Method | Description | Vector Store |
|---|---|---|---|
| /chat | GET | Chat with FAISS backend | FAISS |
| /quad_chat | GET | Chat with Qdrant backend | Qdrant |
| /docs | GET | Interactive API documentation | - |
| / | GET | Health check endpoint | - |

Example Request

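# -G sends the data as a URL query string; --data-urlencode escapes the spaces and "?"
curl -G "http://localhost:8000/chat" --data-urlencode "query=What was I working on yesterday?"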
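
The same request can be made from Python. The query contains spaces and a question mark, so it has to be URL-encoded first. This is a minimal sketch using only the standard library; `build_chat_url` and `ask` are hypothetical helper names, and the base URL assumes the server was started on port 8000 as shown in the usage steps.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_chat_url(
    query: str,
    base_url: str = "http://localhost:8000",
    endpoint: str = "/chat",
) -> str:
    """Build a GET URL for a chat endpoint with the query safely encoded."""
    return f"{base_url.rstrip('/')}{endpoint}?{urlencode({'query': query})}"


def ask(query: str, **kwargs) -> str:
    """Send the query and return the raw response body as text."""
    with urlopen(build_chat_url(query, **kwargs), timeout=30) as response:
        return response.read().decode("utf-8")
```

Passing `endpoint="/quad_chat"` targets the Qdrant-backed route instead.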

Project Structure

RecallAI/
├── backend/                     # Python FastAPI Backend
│   ├── images_taken/            # Encrypted screenshot data
│   ├── img_vector_store/        # FAISS vector database
│   ├── logs/                    # Application logs
│   ├── recall_ai/               # Core application
│   │   ├── config/              # Vector processing workers
│   │   ├── helpers/             # Utility functions
│   │   ├── src/                 # Main application logic
│   │   └── vector_embeddings/   # Embedding processing
│   ├── app.py                   # FastAPI application
│   ├── requirements.txt         # Python dependencies
│   └── .env.sample              # Environment template
├── frontend/                    # Flutter Desktop App
│   ├── lib/                     # Dart source code
│   │   ├── models/              # Data models
│   │   ├── screens/             # UI screens
│   │   ├── services/            # Business logic
│   │   └── main.dart            # App entry point
│   ├── windows/                 # Windows-specific files
│   └── pubspec.yaml             # Flutter dependencies
├── README.md                    # This file
└── LICENSE                      # MIT License

Configuration

Backend Settings

Environment Variables (.env)

GROQ_API_KEY="your_groq_api_key_here"
SESSION_SECRET_KEY="your_session_secret"
DEVELOPMENT_ENV="local"  # or "docker"
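
As an illustration of how these variables might be read at startup, the sketch below checks them and fails early if a key is missing. `load_settings` and `REQUIRED_KEYS` are hypothetical names. Treating both keys as required, and defaulting `DEVELOPMENT_ENV` to "local", are assumptions rather than documented behavior.

```python
import os
from typing import Dict, Mapping

# Assumed to be mandatory for this sketch; the README does not say which keys are optional.
REQUIRED_KEYS = ("GROQ_API_KEY", "SESSION_SECRET_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the backend settings from an environment-like mapping."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "groq_api_key": env["GROQ_API_KEY"],
        "session_secret_key": env["SESSION_SECRET_KEY"],
        # Falls back to "local" when unset; that default is an assumption.
        "development_env": env.get("DEVELOPMENT_ENV", "local"),
    }
```

Accepting any mapping keeps the function testable without touching the real process environment.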

Key Parameters

  • Screenshot Interval: 30 seconds (configurable in mss_screen.py)
  • Text File Limit: 34 files before processing (configurable in gen_vector_embedding.py)
  • Vector Dimensions: 384 (sentence-transformers/all-MiniLM-L6-v2)
  • Retrieval Count: 16 documents per query
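
The text file limit means embedding work is deferred until enough OCR output has accumulated. A minimal sketch of such a threshold check follows. `pending_batch` is a hypothetical helper, and whether the real worker triggers at exactly 34 files or at 34 or more is not stated here, so "at least `limit`" is an assumption.

```python
from pathlib import Path
from typing import List


def pending_batch(text_dir: Path, limit: int = 34) -> List[Path]:
    """Return a sorted batch of .txt files once at least `limit` are waiting, else []."""
    files = sorted(text_dir.glob("*.txt"))
    if len(files) < limit:
        return []
    return files[:limit]
```

A worker would call this each time a new file appears and embed the returned batch when it is non-empty.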

Frontend Settings

Available Options

  • Theme Mode: Dark/Light with animated transitions
  • Voice Input: Enable/disable speech recognition
  • Vector Store: Switch between FAISS/Qdrant
  • Server URL: Configure backend endpoint

Privacy & Security

Privacy Protection

  • Sensitive Data Filtering: Automatic removal of passwords, API keys, tokens
  • Local Processing: OCR and filtering happen locally
  • Encrypted Storage: AES encryption for all text data
  • No Cloud Dependencies for Storage: Capture, OCR, and vector search can run fully offline in FAISS mode; answering queries still calls the Groq API
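
To show what sensitive-data filtering can look like in practice, here is a small regex-based redaction sketch. The patterns and the `redact` function are illustrative assumptions only. The project's actual filter rules are not listed in this README and are likely more extensive.

```python
import re

# Illustrative patterns only; not the project's real rule set.
# Matches "key: value" or "key=value" pairs whose key looks like a secret.
SECRET_PAIR = re.compile(
    r"(?i)\b(password|passwd|api[_-]?key|token|secret)\b(\s*[:=]\s*)(\S+)"
)
# Matches HTTP-style bearer credentials.
BEARER = re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._\-]+")


def redact(text: str) -> str:
    """Replace secret-looking values with a placeholder, keeping the key names."""
    text = SECRET_PAIR.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return BEARER.sub("Bearer [REDACTED]", text)
```

Keeping the key name while dropping the value preserves some context for later recall without storing the secret itself.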

Security Features

  • Data Encryption: AES-256 encryption for stored text
  • Session Management: Secure session handling
  • Input Validation: Comprehensive request validation
  • Error Handling: Secure error responses without data leakage

Performance

Optimization Features

  • Async Processing: Non-blocking I/O operations
  • Streaming Responses: Real-time LLM output
  • Vector Caching: Efficient similarity search
  • Batch Processing: Optimized embedding generation
  • Memory Management: Automatic cleanup and rotation
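
As a sketch of the async streaming idea, the example below yields an answer piece by piece from an async generator, the way a streaming LLM client would deliver tokens. `stream_tokens` and `collect` are hypothetical names, and the word-by-word split with a simulated delay stands in for real network chunks.

```python
import asyncio
from typing import AsyncIterator, List


async def stream_tokens(answer: str, delay: float = 0.0) -> AsyncIterator[str]:
    """Yield an answer word by word, standing in for a streaming LLM client."""
    for word in answer.split():
        await asyncio.sleep(delay)  # simulated network latency
        yield word + " "


async def collect(answer: str) -> List[str]:
    """Consume the stream; a UI would render each chunk as it arrives."""
    chunks = []
    async for chunk in stream_tokens(answer):
        chunks.append(chunk)
    return chunks
```

Because each `await` hands control back to the event loop, other requests can be served while one response is still streaming.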

Benchmarks

  • Screenshot Processing: ~2-3 seconds per image
  • OCR Extraction: ~1-2 seconds per screenshot
  • Vector Search: <100ms for similarity queries
  • LLM Response: ~1-3 seconds (depends on Groq API)

Troubleshooting

Common Issues

Voice Input Not Working

# Check Windows microphone permissions
# Settings → Privacy & Security → Microphone → Allow desktop apps

# Verify default microphone
# Settings → System → Sound → Input device

# Enable Windows Speech Recognition
# Settings → Time & Language → Speech
OCR Issues

# Verify Tesseract installation
tesseract --version

# Check PATH environment variable
echo $env:PATH | Select-String "Tesseract"

# Reinstall if needed
# Download from: https://github.com/UB-Mannheim/tesseract/wiki

Connection Errors

# Check backend server status
curl http://localhost:8000/

# Verify Groq API key
# Check .env file configuration

# Test API connectivity
curl -H "Authorization: Bearer YOUR_API_KEY" https://api.groq.com/openai/v1/models

Future Enhancements

Planned Features

  • Multi-Platform Support - macOS and Linux compatibility
  • Audio Capture - Meeting and call transcription
  • Analytics Dashboard - Activity insights and patterns
  • Custom Models - Local LLM integration
  • Mobile App - iOS and Android clients
  • Cloud Sync - Cross-device synchronization
  • UI Themes - Additional theme options
  • Internationalization - Multi-language support

Technical Improvements

  • Performance Optimization - Faster processing pipelines
  • Advanced OCR - Better accuracy with multiple engines
  • Enhanced Search - Semantic and temporal filtering
  • Scalability - Distributed processing support

Contributing

Contributions are welcome! To contribute:

Areas for Contribution

  • Bug Fixes - Report and fix issues
  • New Features - Implement planned enhancements
  • Documentation - Improve guides and examples
  • UI/UX - Design improvements
  • Testing - Add test coverage
  • Localization - Add language support

Contribution Process

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

Madhur Prakash

Building the future of intelligent memory systems


Star this repository if you found it helpful!

Made with ❤️ and lots of ☕
