This project implements a Retrieval-Augmented Generation (RAG) chatbot that can answer questions about your uploaded documents using AI-powered search. It retrieves relevant information from a document-based knowledge base and generates context-aware responses using a Large Language Model (LLM). All processing is performed entirely in-memory for fast, real-time responses.
The chatbot supports PDF, DOCX, TXT, and CSV files as input.
- Upload multiple documents and perform AI-powered Q&A.
- In-memory vector storage using FAISS for fast document similarity search.
- LLM integration with NVIDIA NIM and Gemma-3n-e4b-it for context-aware responses.
- Strict prompting rules to reduce hallucination and keep answers grounded in the uploaded documents.
- Chat history support with the ability to clear it anytime.
- Supports PDFs, Word documents, text files, and CSVs.
- Python
- Streamlit (Web interface)
- LangChain (RAG pipeline)
- FAISS (In-memory vector store)
- NVIDIA NIM + Gemma-3n-e4b-it (LLM inference)
- HuggingFace Embeddings (sentence-transformers/all-MiniLM-L6-v2)
- Pandas (for CSV processing)
- python-docx, pypdf (document parsing)
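At the core of this stack is in-memory vector similarity search. The sketch below illustrates the idea with plain cosine similarity over tiny hand-made 3-dimensional vectors; the actual app uses FAISS with sentence-transformers embeddings, and the chunk texts and vectors here are made-up stand-ins.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product normalized by vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# (document chunk, embedding) pairs -- the embeddings are placeholders,
# not real model output.
index = [
    ("Invoices are due within 30 days.", [0.9, 0.1, 0.0]),
    ("The office is closed on public holidays.", [0.1, 0.8, 0.2]),
    ("Refunds are processed within 5 business days.", [0.7, 0.2, 0.3]),
]

def search(query_vector, k=1):
    # Rank every stored chunk by similarity to the query vector.
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

top = search([0.95, 0.05, 0.05], k=1)
```

FAISS does the same ranking, but with optimized index structures that stay fast as the number of chunks grows.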
1. Clone the repository: git clone
2. Install dependencies: pip install -r requirements.txt
3. Set your NVIDIA API key in a .env file: NVIDIA_API_KEY=your_api_key_here
4. Run the Streamlit app: streamlit run app.py
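A quick sanity check for step 3: fail fast if the key is missing instead of erroring mid-chat. This sketch uses a plain environment lookup so it is self-contained; in the app itself the .env file would typically be loaded first with python-dotenv.

```python
import os

def get_api_key():
    # Read the key set via .env (or the shell environment) and fail early
    # with a clear message if it is absent.
    key = os.environ.get("NVIDIA_API_KEY")
    if not key:
        raise RuntimeError("NVIDIA_API_KEY is not set; add it to your .env file.")
    return key

# Stand-in for the .env entry, just for this demonstration.
os.environ["NVIDIA_API_KEY"] = "your_api_key_here"
```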
-> Open the app in your browser.
-> Upload your documents (PDF, DOCX, TXT, CSV).
-> Type your question in the chat input.
-> The chatbot retrieves relevant content from the uploaded documents and generates answers.
-> Clear chat history using the Clear History button in the sidebar if needed.
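The retrieve-then-generate flow above can be sketched as follows. Keyword overlap stands in for the FAISS similarity search, and a formatted prompt string stands in for the call to the Gemma LLM; the chunk texts and the function names are illustrative, not the app's actual code.

```python
def retrieve(question, chunks):
    # Toy retrieval: pick the chunk sharing the most words with the question.
    q_words = set(question.lower().split())
    return max(chunks, key=lambda c: len(q_words & set(c.lower().split())))

def build_prompt(question, context):
    # Ground the model in the retrieved context and tell it not to guess.
    return (
        "Answer strictly from the context below. If the answer is not in the "
        "context, say you don't know.\n"
        f"Context: {context}\nQuestion: {question}"
    )

chunks = [
    "Annual leave requests must be submitted two weeks in advance.",
    "The cafeteria serves lunch between 12:00 and 14:00.",
]
question = "When does the cafeteria serve lunch?"
context = retrieve(question, chunks)
prompt = build_prompt(question, context)
```

The real pipeline differs only in the components plugged into each step: embeddings plus FAISS for `retrieve`, and an NVIDIA NIM chat call for the generation step.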
-> PDF: Extracts text from pages.
-> DOCX: Extracts text from paragraphs.
-> TXT: Reads plain text files.
-> CSV: Converts rows into readable text format for Q&A.
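As an example of the CSV step, rows can be flattened into one readable line each so the retriever can index them like prose. The app uses Pandas for this; the standard-library csv module keeps this sketch dependency-free, and the row format shown is an assumption, not the app's exact output.

```python
import csv
import io

def csv_to_text(raw_csv):
    # Turn each CSV row into a "column: value" sentence the retriever can match.
    reader = csv.DictReader(io.StringIO(raw_csv))
    lines = []
    for i, row in enumerate(reader, start=1):
        fields = ", ".join(f"{col}: {val}" for col, val in row.items())
        lines.append(f"Row {i} - {fields}")
    return "\n".join(lines)

sample = "name,role\nAda,Engineer\nGrace,Admiral"
text = csv_to_text(sample)
```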
-> Support for larger knowledge bases or multiple users.
-> Multi-modal input support (images, tables, etc.).
-> Enhanced document parsing for better accuracy.