Medical AI Assistant is an interactive, AI-powered chatbot designed to answer medical queries using information extracted from a curated set of medical documents. It leverages state-of-the-art language models, document embeddings, and vector search to deliver accurate, context-aware responses through a user-friendly web interface.
- Ingests and indexes medical PDFs for domain-specific knowledge.
- Uses advanced embedding models for semantic search.
- Retrieves contextually relevant document chunks for each query.
- Integrates with large language models (LLMs) for answer generation.
- Provides citations and source references for transparency.
- Streamlit-based frontend for easy interaction.
- Clone the repository
git clone https://github.com/Sammy8617/Medical_Chatbot.git
cd Medical_Chatbot/Chatbot_architecture
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
- Add your HuggingFace API token to `.env`:
HF_TOKEN=your_huggingface_token
- Prepare data
- Place medical PDF files in the `Data/` directory.
- Run the application
streamlit run frontend.py
Main Components:
- Document Loader & Chunker: Loads PDFs and splits text into semantic chunks.
- Embedding Model: Generates vector embeddings (e.g., `sentence-transformers/all-MiniLM-L6-v2`).
- Vector Database (FAISS): Stores embeddings for fast similarity search.
- Retriever: Finds top relevant chunks for each query.
- LLM (e.g., Meta-LLaMA, GPT-2): Generates answers based on retrieved context.
- RetrievalQA Chain: Orchestrates retrieval and answer generation.
- Frontend: Streamlit app for user interaction.
- Document Ingestion:
- PDFs loaded from `Data/` using LangChain loaders.
- Text split into ~400-character chunks with overlap.
- Embedding & Indexing:
- Chunks embedded using HuggingFace models.
- Embeddings stored in a FAISS vector store (`vectorstore/db_faiss`).
- Query Processing:
- User submits a query via the frontend.
- Retriever finds top 5 relevant chunks.
- LLM answers using only retrieved context (custom prompt enforces this).
- Citations and source references included in the response (see the sketch after this list).
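A minimal sketch of a single query round-trip, assuming a `qa_chain` assembled as described in the detailed sections below (the variable names and the example question are illustrative, not taken from the repo):

```python
# Sketch of one query round-trip. Assumes `qa_chain` is a LangChain
# RetrievalQA chain built with return_source_documents=True, as described
# in the detailed sections below.
response = qa_chain.invoke({"query": "What are the symptoms of anemia?"})

print(response["result"])  # the generated answer

# Each retrieved chunk carries its source PDF in metadata, which the
# frontend can surface as citations.
for doc in response["source_documents"]:
    print(doc.metadata.get("source"), "-", doc.page_content[:80])
```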
- Built with Streamlit for rapid prototyping and deployment.
- Features:
- Chat interface for medical queries.
- Displays answers and source citations.
- Maintains session history.
- Environment setup: Install dependencies via `requirements.txt`.
- Data ingestion: Use the provided scripts to index new PDFs.
- Token management: Keep your HuggingFace API token secure in `.env`.
- Troubleshooting:
- Ensure your model supports `text-generation` for LLM tasks.
- Check permissions for private models.
- Use public models like `gpt2` for testing (see the sketch below).
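For a quick smoke test without a gated endpoint, a small public model can be swapped in for the LLM. A hedged sketch using the `transformers` pipeline wrapped for LangChain (model choice and parameters are illustrative; `gpt2` only verifies that the pipeline runs, it is not suitable for medical answers):

```python
# Swap in a small public model for local smoke tests.
# gpt2 is a placeholder: it verifies the pipeline runs end to end,
# not that answers are medically useful.
from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline

generator = pipeline("text-generation", model="gpt2", max_new_tokens=64)
llm = HuggingFacePipeline(pipeline=generator)

print(llm.invoke("FAISS is a library for"))
```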
⁂
The system architecture consists of the following key components:
- Document Loader and Chunker: Loads medical PDFs and splits the content into chunks.
- Embedding Model: Generates dense vector embeddings for each text chunk.
- Vector Database (FAISS): Stores the vector embeddings for similarity search.
- Retriever (FAISS): Searches the vector database for relevant chunks matching a user query.
- Large Language Model (LLM): Meta-LLaMA 3 8B model accessed via HuggingFace endpoint, responsible for generating answers.
- RetrievalQA Chain: Orchestrates retrieval of context and LLM response generation with a custom prompt.
- Frontend: Streamlit web app that manages user interaction, query submission, and displays results.
- The documents are loaded from a local `Data/` directory containing PDF files.
- PDF documents are ingested using `PyPDFLoader` and `DirectoryLoader`.
- Text is split into chunks of 400 characters with a 50-character overlap to maintain context across chunks.
- Each chunk is converted into embeddings using the pre-trained model `sentence-transformers/all-MiniLM-L6-v2`.
- The embeddings are stored in a FAISS vector store saved locally at `vectorstore/db_faiss`.
- The vector database supports fast similarity search at query time, retrieving relevant document chunks.
- Document Loading: Utilizes Langchain's PDF loaders for batch loading.
- Text Splitting: `RecursiveCharacterTextSplitter` splits documents into manageable semantic chunks.
- Embedding Model: Uses HuggingFace embeddings based on `sentence-transformers/all-MiniLM-L6-v2`, configured with normalized embeddings on CPU.
- This approach balances chunk size and overlap to preserve semantic coherence in retrieved results (see the sketch after this list).
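Put together, the ingestion and indexing step might look like the following sketch. Paths and parameters mirror the values documented above; the repo's actual script may differ in detail:

```python
# Sketch of the ingestion/indexing pipeline described above.
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Load every PDF in Data/ via PyPDFLoader.
loader = DirectoryLoader("Data/", glob="*.pdf", loader_cls=PyPDFLoader)
documents = loader.load()

# 400-character chunks with a 50-character overlap, as noted above.
splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# all-MiniLM-L6-v2 on CPU with normalized embeddings.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

# Build the FAISS index and persist it locally.
db = FAISS.from_documents(chunks, embeddings)
db.save_local("vectorstore/db_faiss")
```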
- The FAISS vector store acts as the retriever by performing approximate nearest neighbor search to find the top 5 most relevant document chunks for a given query.
- A `RetrievalQA` chain integrates the retriever with the LLM.
- A custom prompt template ensures the model:
- Only answers based on retrieved context.
- Provides clear and detailed responses.
- Includes citations linked to source documents.
- The retriever enforces strict grounding of answers to prevent hallucination or out-of-context responses (a chain-assembly sketch follows this list).
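A minimal sketch of wiring the retriever and chain together. The prompt wording here is illustrative, not the repo's actual template; `embeddings` matches the ingestion step and `llm` is configured as in the next section:

```python
# Sketch of the retrieval + QA chain wiring described above.
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain.chains import RetrievalQA

db = FAISS.load_local(
    "vectorstore/db_faiss", embeddings, allow_dangerous_deserialization=True
)
retriever = db.as_retriever(search_kwargs={"k": 5})  # top 5 chunks per query

# Illustrative prompt enforcing context-only answers.
prompt = PromptTemplate(
    template=(
        "Answer the question using ONLY the context below. If the context "
        "does not contain the answer, say you don't know.\n\n"
        "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    ),
    input_variables=["context", "question"],
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,  # surfaces chunks for citations
    chain_type_kwargs={"prompt": prompt},
)
```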
- The LLM is sourced from the HuggingFace model repo `meta-llama/Meta-Llama-3-8B` and accessed via API endpoint.
- Key LLM configuration:
- Temperature: 0.1 to 0.5 for controlled output variability.
- Max tokens: 512 to limit response length.
- The QA chain uses a prompt template requiring the model to answer medical questions strictly from provided context.
- The output consists of:
- A well-structured, clear answer paragraph.
- Explicit citations referencing the source documents used for answering (an endpoint configuration sketch follows this list).
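A configuration sketch for the endpoint. Parameter names follow `langchain_huggingface`; mapping "max tokens: 512" to `max_new_tokens` is an assumption about the repo's code:

```python
# Sketch of the LLM endpoint configuration described above.
# Assumes HF_TOKEN is set in .env / the environment.
import os
from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B",
    temperature=0.3,    # within the documented 0.1-0.5 range
    max_new_tokens=512, # assumed mapping of "max tokens: 512"
    huggingfacehub_api_token=os.environ["HF_TOKEN"],
)
```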
- Developed in Streamlit for rapid UI development and deployment.
- Maintains session state for chat history and current responses.
- UI flow:
- User inputs a medical query.
- The backend retrieves documents and generates an answer.
- The retrieved documents and their citations are displayed alongside the answer.
- The app validates retrieved documents for medical relevance before generating responses, improving answer accuracy and appropriateness (see the sketch below).
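A stripped-down sketch of this UI flow. Widget layout and variable names are illustrative; the repo's `frontend.py` is the authoritative version, and `qa_chain` is the chain from the previous sections:

```python
# Stripped-down sketch of the Streamlit UI flow described above.
import streamlit as st

st.title("Medical AI Assistant")

# Session state keeps the chat history across Streamlit reruns.
if "history" not in st.session_state:
    st.session_state.history = []

for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if query := st.chat_input("Ask a medical question"):
    st.chat_message("user").write(query)
    response = qa_chain.invoke({"query": query})
    answer = response["result"]
    # Collect the source PDFs of the retrieved chunks for citations.
    sources = {d.metadata.get("source", "?") for d in response["source_documents"]}
    st.chat_message("assistant").write(answer)
    st.caption("Sources: " + ", ".join(sorted(sources)))
    st.session_state.history += [("user", query), ("assistant", answer)]
```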
- Environment setup: Install Python dependencies via `requirements.txt` or pip.
- Running the project: Execute the Streamlit frontend script to launch the web interface.
- Data ingestion: Use the provided script for loading and indexing PDF documents.
- Token management: Keep HuggingFace API tokens secure and update `.env` as needed.