This project is an AI-powered tool designed to extract, summarize, and answer questions based on YouTube video transcripts. By leveraging advanced technologies like LangChain, FAISS, and Retrieval-Augmented Generation (RAG), the tool provides concise summaries and precise answers to user queries. The system is built with a user-friendly interface using Gradio, making it accessible to a wide range of users.
- Video Transcript Extraction: Automatically fetches transcripts from YouTube videos.
- Summarization: Generates concise summaries of video content.
- Question Answering (QA): Answers specific user queries based on the video transcript.
- RAG Architecture: Combines retrieval-based methods with generative AI for accurate and context-aware responses.
- FAISS Integration: Utilizes Facebook AI Similarity Search (FAISS) for efficient vector storage and similarity search.
- User-Friendly Interface: Built with Gradio for an interactive and intuitive user experience.
The tool uses the youtube-transcript-api to fetch transcripts from YouTube videos. It supports both manually created and auto-generated transcripts, prioritizing the former for better accuracy.
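A fetch along these lines is one way to sketch this step. The helper names are illustrative, and the `list_transcripts` / `find_manually_created_transcript` calls assume the class-method API of pre-1.0 youtube-transcript-api:

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(url: str) -> str:
    """Pull the video ID out of common YouTube URL forms."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")
    # Standard watch URLs keep the ID in the ?v= query parameter.
    return parse_qs(parsed.query)["v"][0]

def fetch_transcript(url: str, languages=("en",)) -> str:
    """Fetch a transcript, preferring manually created over auto-generated."""
    # Imported here so the pure URL helper above works without the dependency.
    from youtube_transcript_api import YouTubeTranscriptApi

    transcripts = YouTubeTranscriptApi.list_transcripts(extract_video_id(url))
    try:
        transcript = transcripts.find_manually_created_transcript(list(languages))
    except Exception:
        transcript = transcripts.find_generated_transcript(list(languages))
    return " ".join(entry["text"] for entry in transcript.fetch())
```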
The fetched transcript is processed and split into manageable chunks using the RecursiveCharacterTextSplitter from LangChain. This ensures that the text is optimized for embedding and retrieval.
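The idea behind the splitter's chunk_size/chunk_overlap parameters can be sketched in plain Python. This is a simplified stand-in, not the library's implementation:

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100):
    """Split text into overlapping windows. A simplified version of what
    RecursiveCharacterTextSplitter does; the real splitter also tries to
    break on paragraph and sentence boundaries before falling back to
    raw character counts."""
    chunks = []
    step = chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

In the project itself this corresponds to something like `RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_text(transcript)`.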
The processed text chunks are converted into embeddings using the GoogleGenerativeAIEmbeddings model. These embeddings are stored in a FAISS index, enabling efficient similarity search.
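What the FAISS index provides is nearest-neighbour search over those embedding vectors. A minimal cosine-similarity version in pure NumPy (standing in for a FAISS flat inner-product index over normalized vectors, not the actual FAISS code path) looks like:

```python
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize rows so that inner product equals cosine similarity,
    mirroring an inner-product index over normalized vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / norms

def search(index: np.ndarray, query: np.ndarray, k: int = 3):
    """Return indices of the k stored vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    scores = index @ q
    return np.argsort(scores)[::-1][:k]
```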
The system employs a RAG architecture to combine retrieval-based methods with generative AI. When a user asks a question:
- Relevant transcript chunks are retrieved from the FAISS index based on the query.
- The retrieved context is passed to a large language model (LLM) to generate a context-aware response.
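Put together, the QA step reduces to "retrieve, stuff into a prompt, generate". A minimal sketch, in which the prompt wording and the `retrieve`/`llm` callables are illustrative rather than the project's exact components:

```python
def answer_question(question: str, retrieve, llm, k: int = 3) -> str:
    """Minimal RAG loop: fetch the top-k relevant chunks, then ask the
    LLM to answer strictly from that context."""
    context_chunks = retrieve(question, k)
    prompt = (
        "Answer the question using only the transcript excerpts below.\n\n"
        + "\n\n".join(context_chunks)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```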
The tool uses a predefined prompt template and an LLM to generate concise summaries of the video content. This makes it easier for users to grasp the main points of lengthy videos.
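The exact template lives in the code; its general shape is roughly the following, with illustrative wording and `str.format` standing in for LangChain's PromptTemplate:

```python
# Illustrative template; the project's actual wording may differ.
SUMMARY_TEMPLATE = (
    "You are an assistant that summarizes YouTube videos.\n"
    "Summarize the following transcript in a few concise bullet points:\n\n"
    "{transcript}\n\nSummary:"
)

def build_summary_prompt(transcript: str) -> str:
    """Fill the template; the result is what gets sent to the LLM."""
    return SUMMARY_TEMPLATE.format(transcript=transcript)
```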
- LangChain: For building LLM-powered applications.
- FAISS: For vector storage and similarity search.
- Gradio: For creating an interactive user interface.
- Google Generative AI: For embeddings and language model capabilities.
- YouTube Transcript API: For fetching video transcripts.
- Clone the repository:
git clone https://github.com/your-repo/yt_summarizer.git
cd yt_summarizer
- Create a virtual environment and activate it:
python3 -m venv my_env
source my_env/bin/activate
- Install the required dependencies:
pip install -r requirements.txt
- Launch the application:
python ytbot_gemini.py
- Open the Gradio interface in your browser.
- Enter the YouTube video URL and choose to either summarize the video or ask a question about it.
- Build the Docker image (if not already built):
docker build -t yt_summarizer .
- Run the Docker container:
docker run -p 7860:7860 yt_summarizer
- Open your browser and navigate to http://localhost:7860.
Before running the application, ensure you create a .env file in the project root directory with the following content:
GOOGLE_API_KEY=your-google-api-key
Replace your-google-api-key with your actual Google API key for the Gemini model.
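The application can load this file with python-dotenv's `load_dotenv()`. A dependency-free sketch of what that does (a minimal version that ignores quoting and `export` syntax) is:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Read KEY=VALUE lines into os.environ, skipping blank lines and
    comments. A minimal stand-in for python-dotenv's load_dotenv(),
    which likewise does not override variables already set."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```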
- Summarization:
- Input: YouTube video URL.
- Output: A concise summary of the video content.
- Question Answering:
- Input: YouTube video URL and a specific question.
- Output: A detailed answer based on the video transcript.
YouTube Video --> Transcript Extraction --> Text Processing --> Embedding --> FAISS Index --> RAG --> Summary/Answer
- Saves time by automating transcript analysis.
- Provides accurate and context-aware answers to user queries.
- Makes video content more accessible and insightful.
- Support for multilingual transcripts and queries.
- Integration with other video platforms.
- Advanced analytics and visualization for video content.
