The General-Purpose Document Q&A System is an AI-powered tool that extracts content from diverse document formats and answers questions about it. It supports PDFs, DOCX, PPTX, XLSX, CSV, JSON, TXT, and images (PNG/JPG) via OCR, and combines semantic search, structured data handling, and external link crawling for comprehensive knowledge retrieval.
- 📄 Multi-format Document Processing: Handles PDFs, PowerPoint presentations, Word documents, images, CSV files, and more
- 🤖 Data Analytics: Answers SQL-style analytical queries over structured data, such as computing the mean, median, and mode, or generating projections
- 🔍 Advanced OCR: Extract text from images embedded within documents
- 🔎 Semantic Search: Find information based on meaning, not just keywords
- 📊 Table & Image Recognition: Process structured data and visual elements
- 📝 Source Attribution: Responses include references to source materials
1. Clone the repo

   ```bash
   git clone https://github.com/moyrsd/Axiom.git
   ```

2. Setup Frontend

   ```bash
   cd root/client
   npm install
   npm run dev
   ```

   Add your backend URL to the `.env` file (see `.env.example` for reference).

3. Setup Backend

   ```bash
   sudo apt install sqlite3
   sudo apt install poppler-utils
   cd root/server
   python3 -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

   Add your Google API key to the `.env` file (see `.env.example` for reference), then start the server:

   ```bash
   uvicorn main:app --reload
   ```

4. Use the application

   Go to http://localhost:3000/ to use the application. Upload any PDF file and ask questions.
root/
├── client/
│ ├── src/
│ │ ├── app/
│ │ │ ├── actions/
│ │ │ │ └── delete_tempfile.ts # Deletes temporary files (server-side)
│ │ │ └── page.tsx # Main page component
│ │ ├── components/
│ │ │ ├── Filesidebar.tsx # Sidebar for uploaded files
│ │ │ ├── MessageList.tsx # Displays message responses
│ │ │ └── PromptBox.tsx # User input box
│ │ ├── interface/
│ │ │ └── Interface.ts # TypeScript interfaces
│ │ └── pages/api/
│ │ ├── ask.ts # API for querying
│ │ └── upload.ts # API for uploads
└── server/
├── main.py # Server entry point
├── requirements.txt # Python dependencies
└── src/
├── config/
│ └── constant.py # Server constants
├── document_processing/
│ ├── image_parser.py # Image file processing
│ ├── pdf_parser.py # PDF file processing
│ ├── ppt_parser.py # PPT file processing
│ ├── structured_data_parser.py # Structured data processing
│ ├── text_parser.py # Text file processing
│ └── web_crawl.py # Web crawling
├── prompts/
│ ├── beautify_prompt.py # Beautifies user prompts
│ ├── dataprocessing_prompt.py # Prepares data processing prompts
│ └── ocr_prompt.py # OCR-related prompts
├── routers/
│ ├── ask.py # Routes for querying
│ ├── delete_tempfiles.py # Routes for temp file deletion
│ └── upload.py # Routes for file uploads
└── services/
├── convert_to_json.py # Converts to JSON
├── file_processor.py # File processing workflows
├── llm_calls.py # Interacts with LLMs
├── qa_chain.py # QA pipeline implementation
└── vector_store.py # Vector storage
- FastAPI: High-performance Python framework optimized for building APIs with automatic documentation
- LangChain: Framework for developing applications powered by language models
- ChromaDB: Vector database for efficient similarity search and metadata storage
- Gemini-2.0-Flash: Multimodal LLM for text generation and image understanding
- Vector Embeddings: Semantic representation of document content for intelligent retrieval
- PyMuPDF: Versatile library for parsing PDFs and extracting complex elements
- python-pptx: Advanced toolkit for PowerPoint presentation analysis
- BeautifulSoup & Requests: Web content extraction and formatting tools
- Next.js: React framework for building a responsive and dynamic user interface
When a user uploads a document, the following process occurs:
- The client sends the document to the `/upload` API endpoint, and a reference to the document is added to the user's sidebar via `Filesidebar.tsx`.
- The server's `upload.py` router receives the file and initiates processing.
- Based on the file type, the appropriate parser is selected (see the sketch after this list):
  - `pdf_parser.py` for PDF documents
  - `ppt_parser.py` for PowerPoint presentations
  - `text_parser.py` for plain text files
  - `image_parser.py` for images requiring OCR
  - `structured_data_parser.py` for data files (CSV, Excel, etc.)
  - `web_crawl.py` for processing web content
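The dispatch logic itself lives in the server code; the following is a minimal sketch of how such extension-based routing could look, assuming each parser module exposes a hypothetical `parse(path)` function (the real entry points in the repository may be named differently).

```python
# Hypothetical sketch of extension-based parser dispatch.
# The parser modules exist under src/document_processing/, but the
# `parse(path)` interface assumed here is illustrative only.
from pathlib import Path

from src.document_processing import (
    image_parser,
    pdf_parser,
    ppt_parser,
    structured_data_parser,
    text_parser,
)

PARSERS = {
    ".pdf": pdf_parser,
    ".pptx": ppt_parser,
    ".txt": text_parser,
    ".png": image_parser,
    ".jpg": image_parser,
    ".csv": structured_data_parser,
    ".xlsx": structured_data_parser,
    ".json": structured_data_parser,
}

def parse_document(path: str) -> str:
    """Route an uploaded file to the parser matching its extension."""
    suffix = Path(path).suffix.lower()
    parser = PARSERS.get(suffix)
    if parser is None:
        raise ValueError(f"Unsupported file type: {suffix}")
    return parser.parse(path)  # assumed interface; see each module for the real API
```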
`file_processor.py` orchestrates the document processing workflow:
- Text extraction
- Chunking content appropriately with source metadata
After processing:
- The extracted content is passed to `vector_store.py`, which generates embeddings and stores them in ChromaDB along with metadata.
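As a rough illustration of this step, here is a minimal sketch using LangChain's text splitter, the Google Generative AI embeddings, and Chroma. The chunk sizes, embedding model, and persistence directory are assumptions; the actual logic in `vector_store.py` may differ.

```python
# Minimal sketch of chunking extracted text and storing it in ChromaDB.
# Chunk size, embedding model, and persist directory are illustrative assumptions.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_chroma import Chroma

def index_document(text: str, source: str) -> Chroma:
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    # Attach source metadata to every chunk so answers can cite their origin.
    chunks = splitter.create_documents([text], metadatas=[{"source": source}])

    # Requires GOOGLE_API_KEY in the environment (set in the .env step above).
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    return Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory="chroma_db",  # assumed location
    )
```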
When a user submits a question:
- The query is sent from `PromptBox.tsx` to the `/ask` API endpoint.
- The `ask.py` router processes the request and directs it to the appropriate service.
- `qa_chain.py` implements the RAG pipeline (see the sketch after this list):
  - Converts the query to a vector representation.
  - Retrieves relevant document chunks from ChromaDB.
  - Formats the context and query for the LLM.
  - Processes the response to include source attribution.
- `beautify_prompt.py` refines the user's query for optimal retrieval.
- The response is returned to the client, where `MessageList.tsx` renders it to the user in Markdown format.
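For illustration, below is a minimal sketch of such a retrieval-and-generation step built on LangChain and Gemini. The prompt wording, retriever settings, and store location are assumptions rather than the repository's actual `qa_chain.py` implementation.

```python
# Minimal RAG sketch: retrieve relevant chunks from Chroma and ask Gemini.
# Prompt text, k, and persist directory are illustrative assumptions.
from langchain_chroma import Chroma
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

def answer_question(question: str) -> str:
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    store = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

    # Embed the query and fetch the most similar chunks with their metadata.
    docs = store.similarity_search(question, k=4)
    context = "\n\n".join(
        f"[source: {d.metadata.get('source', 'unknown')}]\n{d.page_content}"
        for d in docs
    )

    llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
    prompt = (
        "Answer the question using only the context below and cite the sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt).content
```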
- Temporary files on the server are managed through `delete_tempfiles.py`.
- Client-side cleanup is handled by the `delete_tempfile.ts` server action.
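The route below is a hypothetical FastAPI sketch of what such a cleanup endpoint could look like; the path, temp directory, and response shape are assumptions, not the actual contents of `delete_tempfiles.py`.

```python
# Hypothetical cleanup route; path and temp directory are assumptions.
from pathlib import Path

from fastapi import APIRouter, HTTPException

router = APIRouter()
TEMP_DIR = Path("temp_uploads")  # assumed location for temporary uploads

@router.delete("/temp-files/{filename}")
def delete_temp_file(filename: str) -> dict:
    """Remove a single temporary upload once the client no longer needs it."""
    target = (TEMP_DIR / filename).resolve()
    # Refuse paths that escape the temp directory (e.g. "../main.py").
    if TEMP_DIR.resolve() not in target.parents:
        raise HTTPException(status_code=400, detail="Invalid filename")
    if not target.exists():
        raise HTTPException(status_code=404, detail="File not found")
    target.unlink()
    return {"deleted": filename}
```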
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue. Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feat/AmazingFeature`)
- Commit your Changes (`git commit -m 'feat: adds some amazing feature'`)
- Push to the Branch (`git push origin feat/AmazingFeature`)
- Open a Pull Request

