
RAG Chatbot (Website / PDF / Text)

A Streamlit-based Retrieval-Augmented Generation (RAG) chatbot that answers questions from websites, PDFs, or raw text using embeddings, FAISS vector search, and a Groq-powered LLM.


Features

  • Website-based Q&A
  • PDF-based Q&A
  • Text-based Q&A
  • Chat-style UI with history
  • New Chat (session reset)
  • Apply button (runs only on click)
  • Reduced hallucinations (the LLM is instructed to answer only from the retrieved context)
  • Deployment-ready (Streamlit Cloud / AWS)
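
The context-only behaviour listed above usually comes down to how the prompt is assembled. The following is a minimal sketch of that idea, not the project's actual prompt: the function name `build_context_only_prompt` and the exact wording of the instruction are assumptions made for illustration.

```python
def build_context_only_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that tells the model to answer only from the given chunks.

    Hypothetical helper: the real prompt in ragchatbot.py may differ.
    """
    # Number each chunk so individual passages are easy to tell apart.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, reply \"I don't know.\"\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_context_only_prompt(
    "What is FAISS used for?",
    ["FAISS stores the chunk embeddings."],
)
print(prompt)
```

An instruction like this lowers the chance of unsupported answers but does not guarantee it, since the model can still ignore the instruction.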

How It Works (RAG Flow)

  1. User selects Website / PDF / Text
  2. Content is ingested and split into chunks
  3. Chunks are embedded and stored in FAISS
  4. User asks a question
  5. Relevant chunks are retrieved
  6. LLM generates an answer strictly from context
  7. Answer is shown in Streamlit UI
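
Steps 2 to 5 can be illustrated with a small, dependency-free sketch. It is a toy stand-in, not the project's code: word-count vectors replace the real embedding model, and a sorted cosine-similarity scan replaces the FAISS index. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 8, overlap: int = 0) -> list[str]:
    """Split text into chunks of `chunk_size` words, sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop once the remaining words are already covered by the previous chunk.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): lowercase word counts, not a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the `top_k` chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


document = (
    "Streamlit renders the chat interface in the browser. "
    "FAISS stores chunk vectors and finds nearest neighbours. "
    "Groq serves the model that writes the answer."
)
chunks = split_into_chunks(document, chunk_size=8, overlap=0)
top = retrieve("Which component stores chunk vectors?", chunks, top_k=1)
print(top[0])
```

In the real application, the retrieved chunks would then be passed to the LLM as context (step 6). A vector index such as FAISS matters once there are too many chunks for a linear scan like this one to stay fast.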

Project Structure

│
├── app.py                      # Streamlit UI (Frontend)
├── main.py                     # Backend Orchestrator
├── requirements.txt
├── README.md
├── .env
├── .gitignore
│
├── embeddings/                 # FAISS vector database (auto-created)
├── logs/                       # Application logs
├── uploaded_pdfs/              # Uploaded PDF files (optional use)
│
└── src/
    ├── __pycache__/
    │
    ├── components/
    │   ├── __init__.py
    │   └── ragchatbot.py       # RAG + Groq LLM logic
    │
    ├── datatransformer/
    │   ├── __init__.py
    │   ├── webdatatransfer.py  # Website text extraction
    │   ├── textdatatransfer.py # Text & PDF text splitting
    │   └── pdfdatatransfer.py  # (Optional PDF logic)
    │
    └── utils/
        ├── __init__.py
        ├── dataembedding.py    # Embedding creation
        └── dataingestion.py    # Website ingestion logic

Installation & Setup

Clone the Repository

git clone https://github.com/kumar-kiran-24/chatbot
cd chatbot

Install Dependencies

pip install -r requirements.txt

Run the App

streamlit run app.py

Environment File

Create a .env file in the project root and add your Groq API key:

GROQ_API_KEY=your_groq_api_key_here
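
One way to read this setting and fail early when it is missing is sketched below. This is an illustration under stated assumptions, not the project's code: `parse_env_lines` and `get_groq_api_key` are hypothetical helpers, and the README does not say how the app actually loads the file (libraries such as python-dotenv are a common choice).

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_api_key(env=os.environ):
    """Return GROQ_API_KEY from a mapping, or raise if it is unset or a placeholder."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or the environment.")
    return key


# Example with in-memory lines; with a real file, pass open(".env") instead.
settings = parse_env_lines(["# local secrets", "GROQ_API_KEY=abc123"])
api_key = get_groq_api_key(settings)
```

Since .env is listed in the project structure alongside .gitignore, keeping .env out of version control avoids committing the key.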

Live Demo

https://chatbot-cah6sgfpqmndtmydp7hya4.streamlit.app/