Medical AI Assistant is an interactive, AI-powered chatbot designed to answer medical queries using information extracted from a curated set of medical documents. It leverages state-of-the-art language models, document embeddings, and vector search to deliver accurate, context-aware responses through a user-friendly web interface.
- Ingests and indexes medical PDFs for domain-specific knowledge.
- Uses advanced embedding models for semantic search.
- Retrieves contextually relevant document chunks for each query.
- Integrates with large language models (LLMs) for answer generation.
- Provides citations and source references for transparency.
- Streamlit-based frontend for easy interaction.
- Clone the repository
git clone https://github.com/Sammy8617/Medical_Chatbot.git
cd Medical_Chatbot/Chatbot_architecture
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
- Add your HuggingFace API token to `.env`:
HF_TOKEN=your_huggingface_token
- Prepare data
- Place medical PDF files in the `Data/` directory.
- Run the application
streamlit run frontend.py
Main Components:
- Document Loader & Chunker: Loads PDFs and splits text into semantic chunks.
- Embedding Model: Generates vector embeddings (e.g., `sentence-transformers/all-MiniLM-L6-v2`).
- Vector Database (FAISS): Stores embeddings for fast similarity search.
- Retriever: Finds top relevant chunks for each query.
- LLM (e.g., Meta-LLaMA, GPT-2): Generates answers based on retrieved context.
- RetrievalQA Chain: Orchestrates retrieval and answer generation.
- Frontend: Streamlit app for user interaction.
- Document Ingestion:
- PDFs loaded from `Data/` using LangChain loaders.
- Text split into ~400-character chunks with overlap.
- Embedding & Indexing:
- Chunks embedded using HuggingFace models.
- Embeddings stored in a FAISS vector store (`vectorstore/db_faiss`).
- Query Processing:
- User submits a query via the frontend.
- Retriever finds top 5 relevant chunks.
- LLM answers using only retrieved context (custom prompt enforces this).
- Citations and source references included in the response (see the sketch after this list).
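A minimal sketch of a single query round-trip, assuming a `qa_chain` assembled as described in the detailed sections below (the variable names and the example question are illustrative, not taken from the repo):

```python
# Sketch of one query round-trip. Assumes `qa_chain` is a LangChain
# RetrievalQA chain built with return_source_documents=True, as described
# in the detailed sections below.
response = qa_chain.invoke({"query": "What are the symptoms of anemia?"})

print(response["result"])  # the generated answer

# Each retrieved chunk carries its source PDF in metadata, which the
# frontend can surface as citations.
for doc in response["source_documents"]:
    print(doc.metadata.get("source"), "-", doc.page_content[:80])
```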
- Built with Streamlit for rapid prototyping and deployment.
- Features:
- Chat interface for medical queries.
- Displays answers and source citations.
- Maintains session history.
- Environment setup: Install dependencies via `requirements.txt`.
- Data ingestion: Use the provided scripts to index new PDFs.
- Token management: Keep your HuggingFace API token secure in `.env`.
- Troubleshooting:
- Ensure your model supports `text-generation` for LLM tasks.
- Check permissions for private models.
- Use public models like `gpt2` for testing (see the sketch below).
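For a quick smoke test without a gated endpoint, a small public model can be swapped in for the LLM. A hedged sketch using the `transformers` pipeline wrapped for LangChain (model choice and parameters are illustrative; `gpt2` only verifies that the pipeline runs, it is not suitable for medical answers):

```python
# Swap in a small public model for local smoke tests.
# gpt2 is a placeholder: it verifies the pipeline runs end to end,
# not that answers are medically useful.
from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline

generator = pipeline("text-generation", model="gpt2", max_new_tokens=64)
llm = HuggingFacePipeline(pipeline=generator)

print(llm.invoke("FAISS is a library for"))
```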
⁂
The system architecture consists of the following key components:
- Document Loader and Chunker: Loads medical PDFs and splits the content into chunks.
- Embedding Model: Generates dense vector embeddings for each text chunk.
- Vector Database (FAISS): Stores the vector embeddings for similarity search.
- Retriever (FAISS): Searches the vector database for relevant chunks matching a user query.
- Large Language Model (LLM): Meta-LLaMA 3 8B model accessed via HuggingFace endpoint, responsible for generating answers.
- RetrievalQA Chain: Orchestrates retrieval of context and LLM response generation with a custom prompt.
- Frontend: Streamlit web app that manages user interaction, query submission, and displays results.
- The documents are loaded from a local `Data/` directory containing PDF files.
- PDF documents are ingested using `PyPDFLoader` and `DirectoryLoader`.
- Text is split into chunks of 400 characters with a 50-character overlap to maintain context across chunks.
- Each chunk is converted into embeddings using the pre-trained model `sentence-transformers/all-MiniLM-L6-v2`.
- The embeddings are stored in a FAISS vector store saved locally at `vectorstore/db_faiss`.
- The vector database supports fast similarity search at query time, retrieving relevant document chunks.
- Document Loading: Utilizes Langchain's PDF loaders for batch loading.
- Text Splitting: `RecursiveCharacterTextSplitter` splits documents into manageable semantic chunks.
- Embedding Model: Uses HuggingFace embeddings based on `sentence-transformers/all-MiniLM-L6-v2`, configured with normalized embeddings on CPU.
- This approach balances chunk size and overlap to preserve semantic coherence in retrieved results (see the sketch after this list).
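Put together, the ingestion and indexing step might look like the following sketch. Paths and parameters mirror the values documented above; the repo's actual script may differ in detail:

```python
# Sketch of the ingestion/indexing pipeline described above.
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Load every PDF in Data/ via PyPDFLoader.
loader = DirectoryLoader("Data/", glob="*.pdf", loader_cls=PyPDFLoader)
documents = loader.load()

# 400-character chunks with a 50-character overlap, as noted above.
splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# all-MiniLM-L6-v2 on CPU with normalized embeddings.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

# Build the FAISS index and persist it locally.
db = FAISS.from_documents(chunks, embeddings)
db.save_local("vectorstore/db_faiss")
```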
- The FAISS vector store acts as the retriever by performing approximate nearest neighbor search to find the top 5 most relevant document chunks for a given query.
- A `RetrievalQA` chain integrates the retriever with the LLM.
- A custom prompt template ensures the model:
- Only answers based on retrieved context.
- Provides clear and detailed responses.
- Includes citations linked to source documents.
- The retriever enforces strict grounding of answers to prevent hallucination or out-of-context responses (a chain-assembly sketch follows this list).
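A minimal sketch of wiring the retriever and chain together. The prompt wording here is illustrative, not the repo's actual template; `embeddings` matches the ingestion step and `llm` is configured as in the next section:

```python
# Sketch of the retrieval + QA chain wiring described above.
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain.chains import RetrievalQA

db = FAISS.load_local(
    "vectorstore/db_faiss", embeddings, allow_dangerous_deserialization=True
)
retriever = db.as_retriever(search_kwargs={"k": 5})  # top 5 chunks per query

# Illustrative prompt enforcing context-only answers.
prompt = PromptTemplate(
    template=(
        "Answer the question using ONLY the context below. If the context "
        "does not contain the answer, say you don't know.\n\n"
        "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    ),
    input_variables=["context", "question"],
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,  # surfaces chunks for citations
    chain_type_kwargs={"prompt": prompt},
)
```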
- The LLM is sourced from the HuggingFace model repo `meta-llama/Meta-Llama-3-8B` and accessed via API endpoint.
- Key LLM configuration:
- Temperature: 0.1 to 0.5 for controlled output variability.
- Max tokens: 512 to limit response length.
- The QA chain uses a prompt template requiring the model to answer medical questions strictly from provided context.
- The output consists of:
- A well-structured, clear answer paragraph.
- Explicit citations referencing the source documents used for answering (an endpoint configuration sketch follows this list).
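A configuration sketch for the endpoint. Parameter names follow `langchain_huggingface`; mapping "max tokens: 512" to `max_new_tokens` is an assumption about the repo's code:

```python
# Sketch of the LLM endpoint configuration described above.
# Assumes HF_TOKEN is set in .env / the environment.
import os
from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B",
    temperature=0.3,    # within the documented 0.1-0.5 range
    max_new_tokens=512, # assumed mapping of "max tokens: 512"
    huggingfacehub_api_token=os.environ["HF_TOKEN"],
)
```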
- Developed in Streamlit for rapid UI development and deployment.
- Maintains session state for chat history and current responses.
- UI flow:
- User inputs a medical query.
- The backend retrieves documents and generates an answer.
- The retrieved documents and their citations are displayed alongside the answer.
- The app validates retrieved documents for medical relevance before generating responses, improving answer accuracy and appropriateness (see the sketch below).
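A stripped-down sketch of this UI flow. Widget layout and variable names are illustrative; the repo's `frontend.py` is the authoritative version, and `qa_chain` is the chain from the previous sections:

```python
# Stripped-down sketch of the Streamlit UI flow described above.
import streamlit as st

st.title("Medical AI Assistant")

# Session state keeps the chat history across Streamlit reruns.
if "history" not in st.session_state:
    st.session_state.history = []

for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if query := st.chat_input("Ask a medical question"):
    st.chat_message("user").write(query)
    response = qa_chain.invoke({"query": query})
    answer = response["result"]
    # Collect the source PDFs of the retrieved chunks for citations.
    sources = {d.metadata.get("source", "?") for d in response["source_documents"]}
    st.chat_message("assistant").write(answer)
    st.caption("Sources: " + ", ".join(sorted(sources)))
    st.session_state.history += [("user", query), ("assistant", answer)]
```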
- Environment setup: Install Python dependencies via `requirements.txt` or pip.
- Running the project: Execute the Streamlit frontend script to launch the web interface.
- Data ingestion: Use the provided script for loading and indexing PDF documents.
- Token management: Keep HuggingFace API tokens secure and update `.env` as needed.