This project implements a Conversational AI Assistant designed to help employees understand company policies by referencing a company-provided PDF document. It uses LangChain, FAISS, and HuggingFace embeddings for PDF content retrieval and a custom GroqLLM for dynamic query answering.
The assistant can process complex queries, maintain conversational context, and provide precise responses based on the given document.
- PDF Processing: Extract and clean content from PDF documents using PyPDFLoader.
- Conversational Context: Retain query history for follow-up and context-aware responses.
- Embeddings and Search: Use FAISS and HuggingFace Sentence Transformers for semantic search.
- Custom LLM Integration: Utilize GroqLLM API for natural language generation.
- Error Handling: Retry failed API calls on rate limits and network errors.
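
The retry behavior in the last item could look roughly like the sketch below. It is not the notebook's actual code: `TransientAPIError` and `call_with_retry` are hypothetical names, and exponential backoff with a little jitter is an assumed strategy, chosen because it is a common way to handle rate limits.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for a rate-limit or network error from the LLM API."""


def call_with_retry(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientAPIError, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles on each attempt; up to 10% jitter is added
            # so that concurrent clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 0.1 * delay))
```

Wrapping each request as `call_with_retry(lambda: send_request(prompt))`, where `send_request` is whatever function talks to the API, keeps the retry policy in one place. The `sleep` parameter exists only so the waiting can be swapped out in tests.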
- Open the project notebook in Google Colab.
- Upload your PDF file in the notebook when prompted.
- Follow the notebook instructions to run each cell sequentially.
- Ask questions about the uploaded PDF and get precise, context-aware answers.
No local installation or setup is required; you only need a Google account to access Colab.
- PDF Loading: Loads and splits PDF into manageable chunks for semantic search.
- Embeddings: Converts text into vector embeddings using HuggingFace models.
- Query Retrieval: Retrieves the most relevant chunks using FAISS.
- Custom LLM: Constructs query-specific prompts for GroqLLM to generate precise responses.
- Memory Management: Maintains a buffer of conversational history for dynamic interactions.
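
To make the chunk, embed, and retrieve steps above concrete, here is a dependency-free sketch. It is an illustration only: word-count vectors stand in for HuggingFace sentence embeddings, a brute-force cosine ranking stands in for the FAISS index, and the function names and chunk sizes are assumptions rather than the notebook's settings.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=12, overlap=4):
    """Split text into windows of chunk_size words that share `overlap` words."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a sentence-transformer vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute force, unlike a FAISS index)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The overlap between neighboring chunks is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk. A real embedding model also matches paraphrases, which plain word counts cannot do.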
Q: What is the notice period required when an employee resigns?
A: Employees are required to serve a 30-day notice period before resignation.
Q: Can the notice period be waived?
A: Yes, under certain circumstances, the notice period may be waived by mutual agreement.
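
The second question only makes sense in light of the first, which is why earlier turns are passed along with each new query. A minimal sketch of that idea follows, assuming a fixed-size buffer and a plain-text prompt; `ConversationBuffer`, `build_prompt`, and the prompt wording are hypothetical and not taken from the notebook.

```python
from collections import deque


class ConversationBuffer:
    """Keeps only the most recent question/answer turns."""

    def __init__(self, max_turns=5):
        # A deque with maxlen silently drops the oldest turn when full.
        self.turns = deque(maxlen=max_turns)

    def add(self, question, answer):
        self.turns.append((question, answer))

    def as_text(self):
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.turns)


def build_prompt(question, context_chunks, memory):
    """Combine retrieved excerpts, prior turns, and the new question into one prompt."""
    parts = [
        "Answer using only the policy excerpts below.",
        "Excerpts:\n" + "\n".join(f"- {c}" for c in context_chunks),
    ]
    history = memory.as_text()
    if history:
        parts.append("Conversation so far:\n" + history)
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
```

Capping the buffer keeps the prompt from growing without bound over a long session, at the cost of forgetting the earliest turns.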
- Multi-PDF Support: Enable the assistant to work with multiple PDFs simultaneously.
- Multilingual Support: Add capabilities to handle documents in various languages.
- Advanced Models: Explore integration with LayoutLMv3 for processing complex document layouts.
- OCR Integration: Improve text extraction for scanned documents.
- Customizable Output: Allow users to configure response formats and styles.
We welcome contributions! Here’s how you can get involved:
- Fork the repository.
- Create a feature branch (git checkout -b feature-new-feature).
- Submit a pull request with a detailed description of changes.
For major changes, please open an issue first to discuss your ideas.
Feel free to explore, customize, and deploy this project! For any queries, reach out via Issues or Discussions.