🎥 Local Multimodal Video Recommender System

A fully local, privacy-preserving video recommender system and natural-language semantic search engine that understands video content using:

  • Computer Vision (BLIP image captioning + OCR)
  • Audio transcription (Whisper ASR)
  • LLM reasoning (LLaMA via Ollama)
  • Semantic search (ChromaDB + cosine similarity)

The system recommends videos based on what the user actually watches, and allows natural language queries such as:

"Find the videos where someone is cooking."

All processing runs locally — no external API calls.


✅ What it Does

For each video the user watches:

  1. Extracts representative frames (OpenCV)
  2. Generates captions + reads on-screen text (BLIP + OCR)
  3. Transcribes audio (Whisper)
  4. Summarizes the meaning using LLaMA (via Ollama)
  5. Generates an embedding from the meaning (Gemma Embedding model)
  6. Stores the embedding + user preference score in ChromaDB
  7. Recommends unseen videos based on similarity + watch-time score (steps 1–6 are sketched in code below)
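
A minimal sketch of steps 1–3, assuming the checkpoints Salesforce/blip-image-captioning-base and openai/whisper-small; the repo's actual models and frame-sampling interval may differ:

import cv2
import pytesseract
from PIL import Image
from transformers import pipeline

def extract_frames(video_path, every_n_seconds=5):
    """Step 1: grab one representative frame every few seconds (OpenCV)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = max(int(fps * every_n_seconds), 1)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            # OpenCV yields BGR arrays; BLIP and Tesseract expect RGB images.
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
        idx += 1
    cap.release()
    return frames

# Step 2: caption each frame with BLIP and read on-screen text with Tesseract.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
frames = extract_frames("video.mp4")
captions = [captioner(f)[0]["generated_text"] for f in frames]
ocr_text = [pytesseract.image_to_string(f) for f in frames]

# Step 3: transcribe the audio track with Whisper (ffmpeg decodes the file).
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small", chunk_length_s=30)
transcript = asr("video.mp4")["text"]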

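Steps 4–6 might then look like this with the ollama and chromadb Python clients (the model and collection names here are assumptions, not taken from the repo):

import ollama
import chromadb

# Step 4: distill captions + OCR + transcript into one "meaning" summary.
prompt = (
    "Summarize what this video is about.\n"
    f"Frame captions: {captions}\n"
    f"On-screen text: {ocr_text}\n"
    f"Transcript: {transcript}"
)
summary = ollama.generate(model="llama3", prompt=prompt)["response"]

# Step 5: embed the summary (EmbeddingGemma via Ollama).
embedding = ollama.embeddings(model="embeddinggemma", prompt=summary)["embedding"]

# Step 6: persist the embedding + a preference score derived from watch time.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("watched_videos")
collection.add(
    ids=["video.mp4"],
    embeddings=[embedding],
    metadatas=[{"summary": summary, "watch_score": 0.8}],
)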
Additionally, the system includes an interactive terminal menu, where users can:

  • 📥 Add videos by “watching” them
  • 🎯 Receive personalized recommendations
  • 🔍 Query watched videos using natural language (vector search over ChromaDB; see the sketch after this list)
  • 📊 View user statistics (average watch time, std. deviation, etc.)
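
A sketch of that natural-language query path, reusing the embedding model and ChromaDB collection from the sketches above (names are assumed):

def search_watched(query, n_results=5):
    # Embed the query with the same model used for the video summaries,
    # then run a vector search over the watched-videos collection.
    q_emb = ollama.embeddings(model="embeddinggemma", prompt=query)["embedding"]
    hits = collection.query(query_embeddings=[q_emb], n_results=n_results)
    return list(zip(hits["ids"][0], hits["distances"][0]))

print(search_watched("Find the videos where someone is cooking."))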

🧠 Tech Stack

Component                            Technology
Frame extraction                     OpenCV
Image captioning                     BLIP (HuggingFace)
OCR (detect text in video frames)    Tesseract
Audio → text                         Whisper (transformers pipeline)
Video meaning summarization          LLaMA via Ollama
Embeddings                           EmbeddingGemma (Ollama)
Vector database                      ChromaDB
Recommendation scoring               Cosine similarity + softmax + user scoring
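
One plausible reading of the scoring row above, with illustrative function names: similarities between a candidate video and each watched video are passed through a softmax, and the resulting weights are combined with the stored watch-time scores. The repo's exact weighting may differ.

import numpy as np

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_candidate(candidate_emb, watched_embs, watch_scores):
    # Similarity of the candidate to every watched video...
    sims = np.array([cosine(candidate_emb, w) for w in watched_embs])
    # ...turned into weights with a softmax...
    weights = np.exp(sims) / np.exp(sims).sum()
    # ...then combined with the user's watch-time scores.
    return float(weights @ np.asarray(watch_scores))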

🔧 Installation

Clone the repo:

git clone https://github.com/<your-user>/<repo-name>.git
cd <repo-name>
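
Then install the dependencies and pull the local models. The package and model names below are inferred from the tech stack table, not confirmed by the repo:

# Assumed dependency set; check the repo's requirements file for the
# authoritative list. The Tesseract binary itself must be installed
# separately (e.g. via your OS package manager).
pip install opencv-python pillow transformers torch pytesseract chromadb ollama

# Pull the local models (names assumed; the repo may pin different ones).
ollama pull llama3
ollama pull embeddinggemma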
