KAI is more than just a chatbot; it is a multimodal emotional companion designed to bridge the gap between human sentiment and artificial intelligence. Built with a focus on empathy, aesthetics, and mental well-being, KAI leverages computer vision and advanced language models to provide a sanctuary for self-reflection and connection.
KAI's architecture follows a real-time asynchronous hub model: a single Flask + SocketIO server receives camera frames and user prompts on separate event streams, so visual perception and conversational logic run concurrently instead of blocking each other.
graph TD
subgraph Client_Side [Frontend - Liquid Glass UI]
UI[Web Interface]
CAM[Camera Module]
MIC[Microphone/Text Input]
end
subgraph Backend_Server [Flask + SocketIO Hub]
SRV[Main Server]
EMO[Emotion Engine]
LLM[Logic & Empathy Engine]
TTS[Vocal Synthesis]
end
subgraph Intelligence_Layer [AI Models]
DF[DeepFace & OpenCV]
GM[Gemini 1.5 & OpenRouter]
GT[gTTS / pyttsx3]
end
subgraph Persistence [Data Layer]
DB[(SQLite3 - Diary)]
LOG[(CSV - Mood Logs)]
end
CAM -->|Frame Stream| SRV
SRV --> EMO
EMO --> DF
DF -->|Emotion Vector| SRV
MIC -->|User Prompt| SRV
SRV --> LLM
LLM --> GM
GM -->|Empathetic Response| SRV
SRV --> TTS
TTS --> GT
GT -->|Audio Stream| SRV
SRV --> UI
SRV -.-> DB
SRV -.-> LOG
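The concurrency idea behind the hub can be illustrated without Flask or SocketIO. The following is a minimal sketch that uses the standard library's `asyncio` as a stand-in for the real event loop; `analyze_frame`, `generate_reply`, and the shared `state` dictionary are hypothetical placeholders, not names from the KAI codebase.

```python
import asyncio

# Hypothetical stand-ins for the Emotion Engine and the Logic & Empathy Engine.
async def analyze_frame(frame_id: int) -> str:
    await asyncio.sleep(0.01)  # simulates model inference time
    return "happy" if frame_id % 2 == 0 else "neutral"

async def generate_reply(prompt: str, mood: str) -> str:
    await asyncio.sleep(0.03)  # simulates an LLM round trip
    return f"[{mood}] I hear you: {prompt}"

async def perception_loop(state: dict, frames: int) -> None:
    """Keep the shared mood up to date while other tasks run."""
    for frame_id in range(frames):
        state["mood"] = await analyze_frame(frame_id)
        state["frames_seen"] += 1

async def chat_once(state: dict, prompt: str) -> str:
    """Answer a prompt using whatever mood is current when the prompt arrives."""
    return await generate_reply(prompt, state["mood"])

async def run_hub() -> tuple[dict, str]:
    state = {"mood": "neutral", "frames_seen": 0}
    # Both coroutines are scheduled together; neither waits for the other to finish.
    _, reply = await asyncio.gather(
        perception_loop(state, frames=5),
        chat_once(state, "I've had a long day, Kai."),
    )
    return state, reply

if __name__ == "__main__":
    final_state, reply = asyncio.run(run_hub())
    print(final_state, reply)
```

The perception loop keeps processing frames while the reply is being generated, which is the property the hub diagram is meant to convey.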
Understanding how KAI perceives and reacts to you is key to its "soulful" experience.
sequenceDiagram
participant User
participant Frontend
participant EmotionEngine
participant ChatLogic
participant TTS
participant Database
User->>Frontend: Connects to Sanctuary
loop Real-time Perception
Frontend->>EmotionEngine: Stream Video Frame
EmotionEngine->>EmotionEngine: Analyze Facial Landmarks
EmotionEngine-->>Frontend: Update Mood Indicator
end
User->>Frontend: "I've had a long day, Kai."
Frontend->>ChatLogic: Message + [Weighted Emotion Context]
ChatLogic->>ChatLogic: Apply System Instructions (Empathy Layer)
ChatLogic-->>Database: Save interaction (SQLite)
ChatLogic->>ChatLogic: Process with LLM (OpenRouter)
ChatLogic-->>TTS: Convert text to soulful audio
TTS-->>Frontend: Play Response & Show Text
Frontend-->>User: "I'm here for you. Take a breath."
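The "Message + [Weighted Emotion Context]" step in the sequence above could be approached as follows. This is only a sketch under stated assumptions: the exponential `decay` weighting, the function names, and the prompt layout are illustrative choices, and emotion scores are assumed to be percentages keyed by emotion name.

```python
from collections import defaultdict

def weighted_emotion_context(readings, decay=0.8):
    """Blend recent emotion readings, weighting newer frames more heavily.

    readings: list of {emotion: score} dicts ordered oldest to newest.
    decay: hypothetical factor; each step back in time multiplies the weight by it.
    """
    totals = defaultdict(float)
    weight_sum = 0.0
    for age, scores in enumerate(reversed(readings)):
        weight = decay ** age  # age 0 is the newest reading
        weight_sum += weight
        for emotion, score in scores.items():
            totals[emotion] += weight * score
    if weight_sum == 0:
        return {}
    return {emotion: total / weight_sum for emotion, total in totals.items()}

def build_prompt(message, context, top_n=2):
    """Prefix the user's message with the strongest blended emotions."""
    ranked = sorted(context.items(), key=lambda item: item[1], reverse=True)[:top_n]
    summary = ", ".join(f"{name} ({score:.0f}%)" for name, score in ranked)
    return f"[Observed mood: {summary or 'unknown'}]\nUser: {message}"

if __name__ == "__main__":
    recent = [{"sad": 100.0}, {"sad": 50.0, "neutral": 50.0}]
    print(build_prompt("I've had a long day, Kai.", weighted_emotion_context(recent)))
```

Weighting recent frames more heavily keeps a single stray expression from dominating the context while still letting a genuine mood shift show up quickly.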
The entryway to your sanctuary. A minimalist, welcoming interface designed to transition the user from the chaos of the digital world into a calm, focused environment.
A personalized "Bento-style" dashboard that visualizes your emotional journey.
- Mood Spectrum: Distribution of your top emotions.
- Glow Gallery: A curated collection of captured moments of happiness (Faceography).
- Activity Sprout: Tracks your daily consistency (Streak) in self-reflection.
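The Mood Spectrum and Activity Sprout figures could be derived from logged data along these lines. This is a sketch, not the project's actual code: the function names are hypothetical, and the rule that a streak may end yesterday when today has no entry yet is an assumption.

```python
from collections import Counter
from datetime import date, timedelta

def mood_spectrum(emotions, top_n=3):
    """Return the most frequent emotions as (name, share of all readings) pairs."""
    counts = Counter(emotions)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(name, count / total) for name, count in counts.most_common(top_n)]

def current_streak(entry_dates, today):
    """Count consecutive days with an entry, ending today (or yesterday if today is empty)."""
    days = set(entry_dates)
    cursor = today if today in days else today - timedelta(days=1)
    streak = 0
    while cursor in days:
        streak += 1
        cursor -= timedelta(days=1)
    return streak

if __name__ == "__main__":
    print(mood_spectrum(["happy", "happy", "happy", "sad", "sad", "neutral"], top_n=2))
    print(current_streak([date(2024, 5, 8), date(2024, 5, 9)], today=date(2024, 5, 10)))
```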
The heart of the project. A dedicated chat interface where KAI uses your current visual mood to adjust its tone. KAI doesn't just read; KAI sees.
A persistent journaling system with mood-based templates. Whether you're feeling grateful or overwhelmed, the diary provides the right prompt to help you express yourself.
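A mood-templated diary backed by SQLite might look roughly like this. The table layout, the `TEMPLATES` prompts, and the helper names are all assumptions made for illustration; the project's real schema may differ.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical prompts; the real templates live in the project itself.
TEMPLATES = {
    "grateful": "What are three things you appreciated today?",
    "overwhelmed": "What is weighing on you most, and what is one small next step?",
}
DEFAULT_TEMPLATE = "How are you feeling right now?"

def prompt_for(mood):
    """Pick the writing prompt for a mood, falling back to a neutral one."""
    return TEMPLATES.get(mood.lower(), DEFAULT_TEMPLATE)

def open_diary(path=":memory:"):
    """Open (or create) the diary database; the schema here is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS diary (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               created_at TEXT NOT NULL,
               mood TEXT NOT NULL,
               prompt TEXT NOT NULL,
               body TEXT NOT NULL
           )"""
    )
    return conn

def add_entry(conn, mood, body):
    """Store one entry together with the prompt that was shown for its mood."""
    cur = conn.execute(
        "INSERT INTO diary (created_at, mood, prompt, body) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), mood, prompt_for(mood), body),
    )
    conn.commit()
    return cur.lastrowid

if __name__ == "__main__":
    diary = open_diary()
    print(prompt_for("overwhelmed"))
    print(add_entry(diary, "overwhelmed", "Too many deadlines this week."))
```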
KAI automatically captures moments when you smile or show genuine joy, storing them in your personal "Glow Gallery" to remind you of your best moments.
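The capture decision could be sketched as a threshold plus a cooldown, so that one long smile does not fill the gallery with near-identical frames. The `GlowCapture` class, the threshold of 85, and the 30-second cooldown are hypothetical; the input is assumed to be a dictionary of emotion percentages (0 to 100) keyed by emotion name, such as an emotion analyzer might produce.

```python
import time

class GlowCapture:
    """Decide when a frame is joyful enough to save, with a cooldown between saves."""

    def __init__(self, threshold=85.0, cooldown_seconds=30.0, clock=time.monotonic):
        self.threshold = threshold              # hypothetical minimum "happy" score (0-100)
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock                     # injectable so the logic can be tested
        self._last_capture = None

    def should_capture(self, emotion_scores):
        happy = emotion_scores.get("happy", 0.0)
        if happy < self.threshold:
            return False
        now = self._clock()
        if self._last_capture is not None and now - self._last_capture < self.cooldown_seconds:
            return False  # still cooling down from the previous capture
        self._last_capture = now
        return True

if __name__ == "__main__":
    gate = GlowCapture()
    print(gate.should_capture({"happy": 92.0, "neutral": 8.0}))
    print(gate.should_capture({"happy": 97.0, "neutral": 3.0}))
```

Saving the frame itself (for example, writing the image to disk) would happen only when `should_capture` returns `True`.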
| Layer | Technologies |
|---|---|
| Core Backend | Flask, Flask-SocketIO, Eventlet |
| Frontend | Vanilla CSS (Liquid Glass), JavaScript, Jinja2 |
| Intelligence | Gemini 1.5 Flash, OpenRouter (GPT-4o), Google GenAI |
| Vision | OpenCV, DeepFace, TensorFlow |
| Audio/Voice | pyttsx3, gTTS |
| Data | SQLite3, Pandas, CSV |
How KAI stands out in the real-world landscape of AI tools:
| Feature | Standard AI Chatbots | Mood Tracking Apps | KAI: The Companion |
|---|---|---|---|
| Sentiment Analysis | Text-only (Basic) | Manual Entry | Real-time Facial Perception |
| Empathy Level | Informational/Neutral | None | Adaptive Emotional Tone |
| Memory | Session-based | Static History | Persistent Emotional Growth |
| Interaction | Text only | Multiple Choice | Multimodal (Voice + Vision + Text) |
| UI Aesthetics | Utility-focused | Simple/Functional | Liquid Glass / Premium Design |
- Vision-Integrated Empathy: Unlike GPT or Claude, KAI uses your camera feed to detect if you are sad, happy, or angry before you even type a word, adjusting its response accordingly.
- Privacy-First Logging: Data is stored locally in SQLite and CSV, giving the user full control over their emotional history.
- The "Glow" Philosophy: KAI focuses on positive reinforcement through the Glow Gallery, turning AI from a tool into a mental health ally.
- Low-Latency Interactions: SocketIO keeps a persistent connection between browser and server, so mood updates and replies are pushed to the UI as soon as they are ready.
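Local CSV mood logging, as described under Privacy-First Logging, can be sketched with the standard library alone. The column names in `FIELDS` and the helper names are assumptions, not the project's actual log format.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["timestamp", "dominant_emotion", "confidence"]  # hypothetical column names

def log_mood(path, dominant_emotion, confidence):
    """Append one mood reading to a local CSV file, writing a header on first use."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "dominant_emotion": dominant_emotion,
            "confidence": f"{confidence:.1f}",
        })

def read_moods(path):
    """Load every logged reading back as a list of dictionaries."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Because the file never leaves the machine, the user can inspect, export, or delete it with any spreadsheet or text editor.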
- Python 3.9+
- Camera hardware
- Google Gemini API Key
git clone https://github.com/RutujaKumbhar17/KAI-The-Companion.git
cd KAI-The-Companion
pip install -r requirements.txt
Edit config.py and add your API credentials:
apikey = "YOUR_GEMINI_API_KEY"
model_name = "gemini-1.5-flash"
python app.py
Access the dashboard at http://127.0.0.1:5002
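If you would rather not keep the key in a source file, the two config.py settings could be read from the environment instead. This is an optional sketch, not how the project ships: the `GEMINI_API_KEY` and `KAI_MODEL_NAME` variable names and the `load_settings` helper are assumptions.

```python
import os

PLACEHOLDER_KEY = "YOUR_GEMINI_API_KEY"

def load_settings(env=os.environ):
    """Build the apikey/model_name settings, preferring environment variables."""
    apikey = env.get("GEMINI_API_KEY", PLACEHOLDER_KEY)        # variable name is an assumption
    model_name = env.get("KAI_MODEL_NAME", "gemini-1.5-flash")  # variable name is an assumption
    if apikey == PLACEHOLDER_KEY:
        raise RuntimeError("Set GEMINI_API_KEY or edit config.py before starting KAI.")
    return {"apikey": apikey, "model_name": model_name}
```

Failing fast on the placeholder key gives a clear error at startup rather than a confusing API failure on the first chat message.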
- Multi-User Profiles: Personalized emotional memory for different family members.
- Wearable Integration: Syncing heart rate data (e.g., Apple Watch) for deeper anxiety detection.
- VR Sanctuary: A fully immersive 3D environment for meditation alongside KAI.
- Global Mood Map: Anonymous, aggregated mood trends to visualize collective well-being.
- Project Link: https://github.com/RutujaKumbhar17/KAI-The-Companion
- Author: Rutuja Kumbhar
Made with ❤️ and ☕ to bring peace into the digital age.