Skip to content

OshankAgrawal/Document-chatbot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI-Powered PDF Chatbot (RAG-Based)

Project Overview

This project demonstrates the real-world practical implementation of Generative AI concepts by building an intelligent chatbot that can answer questions directly from uploaded PDF documents.

Using Large Language Models (LLMs), vector embeddings, and retrieval-based question answering, this chatbot allows users to upload a document and interact with it conversationally—without reading the entire content.

What makes this project powerful is its simplicity: the complete solution is built in around 50–60 lines of core code, yet delivers highly accurate and meaningful results.


Problem Statement

Reading large documents such as research papers, legal documents, constitutions, reports, personal documents, financial documents or study material is time-consuming and inefficient when only specific information is required.

There is a need for a system that:

  • Understands large documents

  • Retrieves only relevant content

  • Generates accurate answers in a user-friendly way


Solution

This project solves the problem by combining:

  • Document-based embeddings

  • Similarity search using vector databases

  • LLM-powered natural language responses

Users simply:

  1. Upload a PDF

  2. Ask a question

  3. Receive an accurate, context-aware answer


Core Concepts Used

Large Language Models (LLMs) for natural language understanding

Embeddings to convert text into numerical vectors

Retrieval-Augmented Generation (RAG) for precise answers

Prompt customization for better response control

Pre-trained models used off-the-shelf with minimal tuning


How It Works (Workflow)

  1. PDF is uploaded via Streamlit UI

  2. Text is extracted and split into chunks

  3. Chunks are converted into embeddings using HuggingFace models

  4. FAISS stores embeddings for fast similarity search

  5. Relevant chunks are retrieved based on the question

  6. Gemini LLM generates the final answer using retrieved context


Tech Stack

  • Python

  • Streamlit – Web interface

  • LangChain – LLM orchestration

  • HuggingFace Sentence Transformers – Embeddings

  • FAISS – Vector database

  • Google Gemini (Gemini-Pro) – Answer generation

  • PyPDF2 – PDF processing


Key Features

  • Upload and chat with any PDF

  • No need to read entire documents

  • Fast and accurate responses

  • Minimal code, maximum impact

  • Customizable prompts and parameters

  • Domain-independent (legal, education, finance, research, etc.)


Real-World Use Cases

  • Students querying textbooks or notes

  • Lawyers searching legal documents

  • Journalists analyzing reports

  • Researchers exploring papers

  • Organizations querying internal data

  • Educational institutions building learning assistants


Future Scope

  • Multi-document support

  • Chat history and memory

  • Source citation with answers

  • Domain-specific fine-tuning

  • Deployment for organizational use

  • Voice-based interaction


⚠ License & Usage Notice

This project is licensed under the MIT License – feel free to use, modify, and share with credit.


Author

Oshank Agarwal B.Tech – Artificial Intelligence & Data Science Personal Project


Final Note

This project reflects how modern Generative AI has transformed the way we interact with information. Tasks that once required reading multiple books or documents can now be accomplished by simply asking the right question.

This is not just a chatbot — it is a glimpse into the future of intelligent systems.

Releases

No releases published

Packages

 
 
 

Contributors

Languages