This project implements a sector-specific intelligent agent focused on Digital Transformation in Businesses.
The goal is to move beyond static question-answering and build a system that retrieves, reasons, and filters knowledge using a Retrieval-Augmented Generation (RAG) approach.
The agent evaluates whether a user question is within the scope of the sectoral knowledge base and retrieves the most relevant contextual information when applicable.
- Design a sectoral AI assistant for the field of Digital Transformation
- Use technical documents instead of fixed answers
- Filter out-of-domain questions
- Provide traceable and explainable retrieval results
This project follows the ReAct Agent design philosophy, focusing on reasoning over retrieved information.
Selected sector: Digital Transformation in Businesses
- Requires strategic, organizational, and technical knowledge
- Highly relevant to modern enterprises
- Suitable for semantic similarity and reasoning-based retrieval
- Data Type: Plain text document
- File: data/digital_transformation.txt
- Content: Analysis of digital transformation impacts on:
  - Business strategy
  - Organizational culture
  - Decision-making processes
  - Competitive advantage
  - Human factor and employee adaptation
The document is manually curated and used as a domain-specific knowledge base.
The system:
- Splits the document into fixed-size text chunks
- Converts chunks into vector embeddings
- Stores embeddings in memory
- Retrieves the most relevant chunk using cosine similarity
- Rejects out-of-domain questions using a similarity threshold
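The first step in the list above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `split_into_chunks` is hypothetical, and it assumes chunks are measured in whitespace-separated words with no overlap between neighbours.

```python
def split_into_chunks(text: str, chunk_size: int = 400) -> list[str]:
    """Split text into consecutive chunks of at most `chunk_size` words.

    Assumption: words are whitespace-separated tokens and chunks do not overlap.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    # Step through the word list in strides of chunk_size; the last chunk may be shorter.
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]
```

In practice the input text would be read from data/digital_transformation.txt, for example with `pathlib.Path(...).read_text(encoding="utf-8")`, before being passed to the function.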
- Chunk size: 400 words, intended to keep each chunk semantically coherent
- Embedding model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2, which supports Turkish
- Similarity threshold: 0.35, used to reduce hallucinated and irrelevant answers
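The retrieval and rejection logic implied by these settings might look like the sketch below. The names `embed_texts`, `cosine_similarity`, and `retrieve` are hypothetical and not taken from the project. The sketch assumes the sentence-transformers package is installed for the embedding step, and that a question is rejected when its best score falls below the threshold.

```python
import math
from typing import Optional, Sequence

MODEL_NAME = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
SIMILARITY_THRESHOLD = 0.35


def embed_texts(texts: list[str]) -> list[list[float]]:
    """Embed texts with the model named above (requires sentence-transformers)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    return model.encode(texts).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    threshold: float = SIMILARITY_THRESHOLD,
) -> tuple[Optional[str], float]:
    """Return (best_chunk, score), or (None, score) when the query is out of domain."""
    if not chunks:
        return None, 0.0
    scores = [cosine_similarity(query_vec, vec) for vec in chunk_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        # Best match is still too weak: treat the question as out of domain.
        return None, scores[best]
    return chunks[best], scores[best]
```

A typical call would embed the chunks once, keep the vectors in memory, embed each incoming question with the same model, and pass both to `retrieve`.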
Examples of in-domain questions:
- Digital transformation strategy risks
- Organizational resistance
- Impact on competitiveness
- Human factor in transformation
Examples of out-of-domain questions:
- French Revolution
- Quantum computing
Out-of-domain questions are explicitly rejected.
For each user query, the system displays the following information:
- User Question: The original question provided by the user.
- Retrieved Context: The most relevant document chunk retrieved using semantic similarity (if the question is in-scope).
- Similarity Score: A numerical score indicating how closely the question matches the retrieved document content.
- Out-of-Scope Warning: If the similarity score is below the defined threshold, the system notifies the user that the question is outside the document scope.
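One way to assemble the four fields listed above into a single text report is sketched here. The function name `format_result`, the exact label wording, and the two-decimal score format are illustrative assumptions, not the project's actual output format.

```python
from typing import Optional


def format_result(
    question: str,
    context: Optional[str],
    score: float,
    threshold: float = 0.35,
) -> str:
    """Build a plain-text report for one query (hypothetical layout)."""
    in_scope = context is not None and score >= threshold
    lines = [f"User Question: {question}"]
    if in_scope:
        # The retrieved chunk is shown only for in-scope questions.
        lines.append(f"Retrieved Context: {context}")
    lines.append(f"Similarity Score: {score:.2f}")
    if not in_scope:
        lines.append(
            "Out-of-Scope Warning: this question is outside the document scope."
        )
    return "\n".join(lines)
```

Showing the score in both cases keeps rejections traceable, since the user can see how far a question fell below the threshold.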
Contributions are welcome!
For any queries or collaborations, feel free to reach out!
🌐 GitHub: zeynepcol
👤 LinkedIn: zeynep-col
