This is a Document Question Answering (DocQA) system built with Gradio and Qwen LLMs. It lets users upload documents (PDF, DOCX, TXT), index them semantically, and ask questions about their content. The system uses retrieval-augmented generation (RAG): embedding-based retrieval selects the relevant passages, and a generative LLM uses them to produce context-aware answers.
- Embedding-based semantic search with Qwen embeddings
- Question answering with Qwen/Qwen-7B-Chat
- Upload and parse .pdf, .docx, and .txt files
- Vector search with FAISS
- Fast, local inference via PyTorch with GPU/CPU support
- Gradio UI for simple interactive testing
git clone https://github.com/LSShrivathsan/RAP_DocQA.git
cd RAP_DocQA
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py
Gradio will display a URL to access the UI in your browser.
| Component | Technology |
|---|---|
| Embedding Model | Qwen/Qwen3-Embedding-0.6B |
| LLM | Qwen/Qwen-7B-Chat |
| Index | faiss.IndexFlatL2 |
| UI | gradio |
Document Upload: Users upload files through Gradio UI
Text Extraction: Based on file type using PyPDF2, python-docx, or basic decoding
Chunking: Text split into 256-token chunks
Embedding: Each chunk embedded with Qwen Embedding model
Indexing: FAISS used to build a searchable vector store
User asks a question
Top 3 relevant chunks retrieved
Prompt built with context
Qwen-7B-Chat generates a final answer
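The "prompt built with context" step can be sketched as follows. This is a minimal illustration, not the code in app.py: the function name `build_prompt` and the prompt wording are assumptions made for the example.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into a single prompt.

    Hypothetical helper: the template below is an assumption, not the
    project's actual prompt.
    """
    # Number each chunk so the model can tell the passages apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Example with two made-up retrieved chunks.
prompt = build_prompt(
    "What is FAISS?",
    ["FAISS is a vector search library.", "Gradio builds web UIs."],
)
print(prompt)
```

Ending the prompt with "Answer:" is one common way to cue the model to start generating the answer directly; a chat template could be used instead.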
Method: Naive chunking using 256 tokens per chunk
Rationale: Simple, fast, and compatible with Qwen embedding model
Future Improvement: Use overlap-based chunking (e.g., sliding window)
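To make the difference between naive and sliding-window chunking concrete, here is a small sketch. It operates on a plain list of tokens; the function name `chunk_tokens` and the whitespace split used as a stand-in tokenizer are assumptions for illustration, and the real system may count tokens with the model tokenizer instead.

```python
def chunk_tokens(tokens, size=256, overlap=0):
    """Split a token list into windows of `size` tokens.

    Consecutive windows share `overlap` tokens. overlap=0 gives the naive
    scheme; overlap>0 gives a sliding window.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Stand-in tokenizer (assumption): split on whitespace.
tokens = ("word " * 600).split()

naive = chunk_tokens(tokens, size=256)               # no shared tokens
sliding = chunk_tokens(tokens, size=256, overlap=64)  # 64 tokens shared

print([len(c) for c in naive])
print([len(c) for c in sliding])
```

With overlap, a sentence that straddles a chunk boundary appears whole in at least one chunk, at the cost of embedding some tokens twice.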
Vector Search: FAISS with IndexFlatL2
Embedding: Qwen3-Embedding-0.6B (fast and high-quality)
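Top-k: Retrieves the top 3 chunks by L2 (Euclidean) distance, which is what IndexFlatL2 computes; this matches a cosine-similarity ranking only if the embeddings are L2-normalized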
Context Window: Selected chunks combined into a prompt for answering
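The retrieval notes above mention both IndexFlatL2 and cosine similarity. The sketch below uses NumPy brute-force search as a stand-in for the FAISS index (an assumption made so the example runs without FAISS) to show that the two rankings agree once vectors are L2-normalized, since for unit vectors the squared distance equals 2 - 2·cos. The data is random and the helper names are hypothetical.

```python
import numpy as np


def top_k_l2(index_vectors, query, k=3):
    """Indices of the k nearest vectors by squared L2 distance."""
    dists = ((index_vectors - query) ** 2).sum(axis=1)
    return np.argsort(dists)[:k]


def top_k_cosine(index_vectors, query, k=3):
    """Indices of the k most similar vectors by cosine similarity."""
    norms = np.linalg.norm(index_vectors, axis=1) * np.linalg.norm(query)
    sims = (index_vectors @ query) / norms
    return np.argsort(-sims)[:k]


def l2_normalize(x):
    """Scale vectors to unit length along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


rng = np.random.default_rng(0)
vectors = rng.normal(size=(100, 16))  # made-up "chunk embeddings"
query = rng.normal(size=16)           # made-up "question embedding"

unit_vectors, unit_query = l2_normalize(vectors), l2_normalize(query)

# L2 search on normalized vectors vs. cosine search on the raw vectors.
same_ranking = np.array_equal(
    top_k_l2(unit_vectors, unit_query), top_k_cosine(vectors, query)
)
print(same_ranking)
```

In FAISS terms, this corresponds to normalizing embeddings (for example with `faiss.normalize_L2`) before adding them to `IndexFlatL2`, or using `IndexFlatIP` on normalized vectors.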
Embedding Model: Lightweight, suitable for CPU or GPU
Chat Model (Qwen-7B-Chat):
Requires GPU with 12 GB VRAM minimum
Memory Usage:
RAM: ~2–3GB for embedding + FAISS index
VRAM: ~10–14GB for model in float16
Inference Speed: 2–15 s
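The memory figures above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate under stated assumptions: a nominal 7e9 parameters for the chat model, 2 bytes per parameter in float16, and a 1024-dimensional float32 embedding (assumed for Qwen3-Embedding-0.6B). It counts weights and raw vectors only, not activations, KV cache, or framework overhead.

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed for model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3


def flat_index_mib(n_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Memory for a flat (uncompressed) float32 vector index, in MiB."""
    return n_vectors * dim * bytes_per_value / 1024**2


# Assumption: nominal 7e9 parameters stored as float16 (2 bytes each).
print(f"7B weights in float16: {weights_gib(7e9, 2):.1f} GiB")

# Assumption: 10,000 chunks, 1024-dimensional float32 embeddings.
print(f"Flat index, 10k chunks: {flat_index_mib(10_000, 1024):.1f} MiB")
```

Under these assumptions the weights alone land near the top of the quoted VRAM range, while the flat index stays small compared with the embedding model itself.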
| Format | Parser |
|---|---|
| .pdf | PyPDF2 |
| .docx | python-docx |
| .txt | UTF-8 decoding |
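The format-to-parser mapping above could be implemented as a small dispatch on the file extension. This is a sketch, not the project's code: the function name `extract_text` is hypothetical, and the PDF branch assumes a PyPDF2 release that provides `PdfReader` (2.x or later). The third-party imports are done lazily so that plain-text files work even when the PDF or DOCX libraries are not installed.

```python
from pathlib import Path


def extract_text(path: str) -> str:
    """Return the text content of a .pdf, .docx, or .txt file."""
    suffix = Path(path).suffix.lower()

    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # lazy import; assumes PyPDF2 >= 2.0

        reader = PdfReader(path)
        # extract_text() may return None or "" for image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if suffix == ".docx":
        import docx  # lazy import; module provided by python-docx

        return "\n".join(p.text for p in docx.Document(path).paragraphs)

    if suffix == ".txt":
        # Replace undecodable bytes instead of failing on bad input.
        return Path(path).read_bytes().decode("utf-8", errors="replace")

    raise ValueError(f"Unsupported file type: {suffix!r}")
```

Raising on unknown extensions keeps unsupported uploads from being indexed as empty documents.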