🎥 YouTube Video Q&A with RAG

Ask questions about YouTube videos using Retrieval-Augmented Generation (RAG) powered by Hugging Face models.

✨ Features

  • Smart Q&A: Ask questions about any YouTube video content
  • Multiple URL Formats: Supports full URLs or just video IDs
  • AI-Powered: Uses Mistral-7B and sentence transformers for accurate responses
  • User-Friendly: Clean Streamlit interface with real-time feedback
  • Flexible Configuration: Environment-based settings


🚀 Quick Start

1. Manual Setup

# Create environment
conda create -n rag_pipeline python=3.10 -y
conda activate rag_pipeline

# Install dependencies
conda install numpy scipy scikit-learn -y
pip install -r requirements.txt

2. Configure Environment

Create a .env file:

HUGGINGFACE_TOKEN=your_token_here

Get your token from your Hugging Face account settings (https://huggingface.co/settings/tokens).

3. Run Application

streamlit run rag_pipeline.py

🎯 How It Works

  1. Input: Paste a YouTube URL or video ID
  2. Processing: Extracts transcript and creates vector embeddings
  3. Query: Ask your question about the video content
  4. Answer: Get AI-powered responses based on the transcript
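
The steps above can be sketched roughly as follows. This is an illustrative outline, not the exact code in rag_pipeline.py: it assumes the youtube-transcript-api package for transcript extraction, langchain-community imports, and the Mistral-7B-Instruct-v0.2 model ID, any of which may differ in the actual app.

# Illustrative RAG sketch (not the exact code in rag_pipeline.py).
# Assumed packages: youtube-transcript-api, langchain, langchain-community,
# sentence-transformers, faiss-cpu, huggingface_hub; HUGGINGFACE_TOKEN must be set.
import os
from youtube_transcript_api import YouTubeTranscriptApi
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from huggingface_hub import InferenceClient

video_id = "dQw4w9WgXcQ"  # hypothetical video ID
question = "What are the main topics discussed in this video?"

# 1. Input: fetch the transcript for the video
segments = YouTubeTranscriptApi.get_transcript(video_id)
transcript = " ".join(seg["text"] for seg in segments)

# 2. Processing: chunk the transcript and index embeddings with FAISS
chunks = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100).split_text(transcript)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(chunks, embeddings)

# 3. Query: retrieve the most relevant chunks for the question
context = "\n".join(doc.page_content for doc in store.similarity_search(question, k=4))

# 4. Answer: ask Mistral-7B-Instruct via the Hugging Face Inference API
client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2",
                         token=os.getenv("HUGGINGFACE_TOKEN"))
prompt = f"Answer using only this transcript excerpt:\n{context}\n\nQuestion: {question}\nAnswer:"
print(client.text_generation(prompt, max_new_tokens=512))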

📋 Supported URL Formats

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/embed/VIDEO_ID
  • Just the video ID: VIDEO_ID
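
One way these inputs can be normalized is a small regex helper like the hypothetical extract_video_id below; the actual parsing logic in rag_pipeline.py may differ.

# Hypothetical helper: normalize any of the supported inputs to a bare video ID.
import re

def extract_video_id(url_or_id: str) -> str:
    patterns = [
        r"youtube\.com/watch\?v=([\w-]{11})",  # https://www.youtube.com/watch?v=VIDEO_ID
        r"youtu\.be/([\w-]{11})",              # https://youtu.be/VIDEO_ID
        r"youtube\.com/embed/([\w-]{11})",     # https://www.youtube.com/embed/VIDEO_ID
    ]
    for pattern in patterns:
        match = re.search(pattern, url_or_id)
        if match:
            return match.group(1)
    return url_or_id.strip()  # otherwise assume a bare 11-character video ID

print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))  # -> dQw4w9WgXcQ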

⚙️ Configuration

Customize settings in .env:

  • HUGGINGFACE_TOKEN: Your HF authentication token
  • CHUNK_SIZE: Text processing chunk size (default: 800)
  • RETRIEVAL_K: Number of relevant chunks to retrieve (default: 4)
  • MAX_NEW_TOKENS: Maximum response length (default: 512)
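
These settings might be loaded at startup roughly like this (a sketch assuming python-dotenv; variable names and defaults are as documented above):

# Sketch: load the documented settings with their defaults (assumes python-dotenv).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory
HUGGINGFACE_TOKEN = os.getenv("HUGGINGFACE_TOKEN")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "800"))
RETRIEVAL_K = int(os.getenv("RETRIEVAL_K", "4"))
MAX_NEW_TOKENS = int(os.getenv("MAX_NEW_TOKENS", "512"))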

🛠️ Technologies Used

  • Frontend: Streamlit
  • LLM: Mistral-7B-Instruct via Hugging Face
  • Embeddings: sentence-transformers/all-MiniLM-L6-v2
  • Vector Store: FAISS
  • Framework: LangChain
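
A rough sketch of how these pieces could be wired into the Streamlit frontend; the actual widget layout and labels in rag_pipeline.py may differ.

# Illustrative Streamlit frontend; the real layout in rag_pipeline.py may differ.
import streamlit as st

st.title("YouTube Video Q&A with RAG")
url = st.text_input("YouTube URL or video ID")
question = st.text_input("Your question about the video")

if st.button("Ask") and url and question:
    with st.spinner("Fetching transcript and building the index..."):
        answer = "(answer from the RAG pipeline goes here)"  # wire in the pipeline sketch from How It Works
    st.write(answer)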

🔧 Troubleshooting

Common Issues

  • No transcript available: Some videos don't have captions
  • Token errors: Ensure your HF token has proper permissions
  • Installation issues: Use the automated setup script
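
For the "no transcript" case, a check like the following is one way to fail early. It assumes the youtube-transcript-api package is used for extraction (an assumption; the README does not name the extraction library).

# Sketch: detect missing captions up front (assumes youtube-transcript-api).
from youtube_transcript_api import (
    YouTubeTranscriptApi,
    TranscriptsDisabled,
    NoTranscriptFound,
)

def has_transcript(video_id: str) -> bool:
    try:
        YouTubeTranscriptApi.get_transcript(video_id)
        return True
    except (TranscriptsDisabled, NoTranscriptFound):
        return False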

System Requirements

  • Python 3.10+
  • 4GB+ RAM recommended
  • Internet connection for model downloads

📝 Example Questions

  • "What are the main topics discussed in this video?"
  • "Can you summarize the key points?"
  • "What examples were given about [specific topic]?"
  • "Who is mentioned in the video?"

Note: This tool only works with videos that have available transcripts/captions.

About

An intelligent question-answering system built with Retrieval-Augmented Generation (RAG) that extracts YouTube video transcripts and answers natural language queries about video content. Implemented with the LangChain framework, the Mistral-7B LLM via the Hugging Face API, sentence-transformers for embeddings, and a FAISS vector database for semantic search.
