ChatPDF – RAG Chatbot (LangChain + Pinecone + Groq)

A Retrieval-Augmented Generation (RAG) chatbot that lets users upload PDFs, then chat, ask questions, and get summaries drawn directly from the content. Built with LangChain, Pinecone, Google Generative AI embeddings, and the Groq SDK, this project demonstrates how to combine LLMs with vector search for document question answering.

Features

The application has the following features:

Chat with your documents: Upload a PDF and ask any question about its content.
Document summaries: Automatically generate a short, insightful summary of uploaded PDFs.
RAG-based responses: Uses vector search (Pinecone) + Groq LLMs to provide contextually relevant answers.
Document isolation: Each PDF is indexed separately to ensure accurate, document-specific responses.
Automatic cleanup: Temporary uploaded files are automatically deleted after processing.
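
The automatic cleanup described above can be sketched with a `try`/`finally` block. This is a minimal illustration using only Node.js built-ins, not the project's actual code: the helper name `withTempUpload`, the `chatpdf-` directory prefix, and the `upload.pdf` file name are all hypothetical.

```typescript
import { promises as fs } from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper (not from the repository): writes an upload to a
// temporary file, runs `work` on it, and always removes the file afterwards,
// even if `work` throws.
async function withTempUpload<T>(
  data: Uint8Array,
  work: (filePath: string) => Promise<T>,
): Promise<T> {
  // A fresh directory per upload keeps concurrent uploads from colliding.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "chatpdf-"));
  const filePath = path.join(dir, "upload.pdf");
  await fs.writeFile(filePath, data);
  try {
    return await work(filePath);
  } finally {
    // force: true avoids an error if the directory is already gone.
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

The PDF parsing and indexing step would go inside the `work` callback, so the file never outlives the request that created it.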

Getting Started

Clone this repository to your local machine:

git clone https://github.com/harrismalik98/ChatPDF-RAG.git

Install the dependencies:
```
cd ChatPDF-RAG
npm install
```
Configure environment variables by creating a .env file in the project root:
```
GOOGLE_API_KEY=
PINECONE_API_KEY=
PINECONE_INDEX=
GROQ_API_KEY=
```
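
Because a blank key tends to surface only as an opaque API error at request time, it can help to check the four variables once at startup. The sketch below is an assumption about how that could look, not code from the repository; `loadConfig` and `AppConfig` are hypothetical names.

```typescript
// The four variable names come from the .env template above.
const REQUIRED_KEYS = [
  "GOOGLE_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_INDEX",
  "GROQ_API_KEY",
] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AppConfig = Record<RequiredKey, string>;

// Hypothetical helper: returns a typed config, or throws an error that
// lists every missing or blank variable at once.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In server-side code this would be called as `loadConfig(process.env)`; Next.js loads the `.env` file into `process.env` automatically.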
Start the development server:
```
npm run dev
```
Access the application in your web browser at http://localhost:3000

Technologies Used

ChatPDF is a full-stack web application built with the following tools and technologies:

Next.js 15: A powerful React framework for building fast, scalable, and SEO-friendly web applications with server-side rendering and API routes.
LangChain: A framework that simplifies building LLM-powered applications by managing document loading, chunking, and retrieval workflows.
Pinecone: A fast and scalable vector database for storing embeddings and performing efficient similarity searches in RAG pipelines.
Groq SDK: A high-performance SDK that powers the chatbot’s reasoning and response generation using large open-source LLMs.
Google Generative AI Embeddings: An advanced embedding model that transforms document text into vector representations for accurate context retrieval.
TailwindCSS: A utility-first CSS framework for creating beautiful and responsive user interfaces.
ShadcnUI: A beautifully designed and customizable component library used to enhance the aesthetics and user experience of the website.
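
To show how these pieces fit together in a RAG flow, here is a dependency-free sketch of chunking, embedding, per-document retrieval, and prompt assembly. It makes no claim about the project's real code: the word-count `embed` function is a toy stand-in for Google Generative AI embeddings, the in-memory `store` keyed by document id stands in for Pinecone (how the project actually isolates documents is not stated here), and the string returned by `buildPrompt` is what would be passed to a Groq-hosted model. All function names and the chunk sizes are hypothetical.

```typescript
type Vector = Map<string, number>;
type Chunk = { text: string; vector: Vector };

// Split text into overlapping windows of words (sizes are illustrative).
function chunkText(text: string, size = 40, overlap = 10): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const step = Math.max(1, size - overlap);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}

// Toy embedding: a term-frequency map. A real model returns dense vectors.
function embed(text: string): Vector {
  const vector: Vector = new Map();
  for (const token of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector.set(token, (vector.get(token) ?? 0) + 1);
  }
  return vector;
}

// Cosine similarity between two sparse vectors.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((weight, token) => {
    dot += weight * (b.get(token) ?? 0);
    normA += weight * weight;
  });
  b.forEach((weight) => {
    normB += weight * weight;
  });
  return normA > 0 && normB > 0 ? dot / Math.sqrt(normA * normB) : 0;
}

// One entry per document id, so a query never sees another document's chunks.
const store = new Map<string, Chunk[]>();

function indexDocument(docId: string, text: string): void {
  store.set(docId, chunkText(text).map((t) => ({ text: t, vector: embed(t) })));
}

// Return the topK chunks of one document, ranked by similarity to the question.
function retrieve(docId: string, question: string, topK = 2): string[] {
  const query = embed(question);
  return (store.get(docId) ?? [])
    .map((chunk) => ({ text: chunk.text, score: cosine(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((hit) => hit.text);
}

// Assemble the retrieved context and the question into a single prompt.
function buildPrompt(docId: string, question: string): string {
  const context = retrieve(docId, question).join("\n---\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
}
```

For example, after `indexDocument("report", text)`, calling `buildPrompt("report", "What is the main finding?")` yields a prompt containing only chunks from that one document.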
