Internal R&D Prototype | Engineering Team
This prototype explores whether a Retrieval-Augmented Generation (RAG) system can help reduce Support team workload by providing accurate, context-aware answers to common customer inquiries.
The system retrieves relevant information from our internal knowledge base (FAQs and escalation notes) and uses it to generate responses via an LLM.
🔬 Experimental - This is an exploratory prototype for internal evaluation.
⚠️ Note: This project contains `TODO` stubs that must be implemented before it will run. See Implementation Tasks below.
- Python 3.11+
- OpenAI API key
1. Open this repository in GitHub Codespaces (recommended) or clone it locally.

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set your OpenAI API key (choose one method):

   ```bash
   # Option A: Environment variable
   export OPENAI_API_KEY="your-api-key-here"

   # Option B: Create a .env or .env.local file
   echo 'OPENAI_API_KEY=your-api-key-here' > .env.local
   ```

4. Run the evaluation:

   ```bash
   python run_eval.py
   ```
```text
├── data/
│   ├── faq.json                 # Public FAQ entries
│   ├── escalation_notes.json    # Internal support escalation notes
│   └── eval_questions.json      # Evaluation question set with reference answers
├── src/
│   ├── config.py                # Configuration settings
│   ├── data_loader.py           # Document loading utilities
│   ├── llm_client.py            # OpenAI API wrapper
│   ├── embeddings.py            # Text embedding functions [EDITABLE]
│   ├── retrieval.py             # Document retrieval logic [EDITABLE]
│   └── rag_pipeline.py          # Main RAG pipeline [EDITABLE]
├── run_eval.py                  # Evaluation runner script
└── requirements.txt
```
The following modules contain `TODO` markers indicating where implementation is needed. An illustrative sketch follows each group below.
**src/embeddings.py**
- `embed_texts()`: Convert text strings to vector embeddings
- `embed_documents()`: Embed all documents in the knowledge base
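A minimal sketch of what these two functions could look like, assuming knowledge-base documents are dicts with a `text` field. It calls the `openai` SDK directly rather than the project's `llm_client.py` wrapper, whose interface may differ:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_texts(texts: list[str], model: str = "text-embedding-3-small") -> list[list[float]]:
    """Convert a batch of text strings into embedding vectors."""
    response = client.embeddings.create(model=model, input=texts)
    # The API returns one embedding per input string, in the same order.
    return [item.embedding for item in response.data]

def embed_documents(documents: list[dict]) -> list[list[float]]:
    """Embed every document in the knowledge base (assumes a 'text' field)."""
    return embed_texts([doc["text"] for doc in documents])
```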
**src/retrieval.py**
- `cosine_similarity()`: Compute similarity between query and documents
- `retrieve()`: Find and return the most relevant documents
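One possible shape for these stubs, using a pure-Python cosine similarity (no NumPy dependency assumed):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|); returns 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_embedding: list[float],
             doc_embeddings: list[list[float]],
             documents: list[dict],
             top_k: int = 3) -> list[dict]:
    """Return the top_k documents most similar to the query embedding."""
    scored = sorted(
        zip(documents, doc_embeddings),
        key=lambda pair: cosine_similarity(query_embedding, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in scored[:top_k]]
```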
**src/rag_pipeline.py**
- `build_context()`: Format retrieved documents into context
- `build_prompt()`: Create the LLM prompt with context and question
- `answer_question()`: Orchestrate the full RAG pipeline
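A sketch of how these pieces could fit together, reusing `embed_texts()`, `retrieve()`, and the `client` from the sketches above. The prompt wording and chat-completion call are illustrative, not the project's actual implementation:

```python
def build_context(documents: list[dict], max_chars: int = 4000) -> str:
    """Concatenate retrieved documents, truncated to the context budget."""
    context = "\n\n".join(doc["text"] for doc in documents)
    return context[:max_chars]

def build_prompt(context: str, question: str) -> str:
    """Combine retrieved context and the user question into one prompt."""
    return (
        "Answer the customer question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def answer_question(question: str,
                    documents: list[dict],
                    doc_embeddings: list[list[float]]) -> str:
    """Embed the query, retrieve top documents, and generate an answer."""
    query_embedding = embed_texts([question])[0]
    top_docs = retrieve(query_embedding, doc_embeddings, documents)
    prompt = build_prompt(build_context(top_docs), question)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```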
Edit `src/config.py` to adjust:

| Setting | Default | Description |
|---|---|---|
| `embedding_model` | `text-embedding-3-small` | OpenAI embedding model |
| `llm_model` | `gpt-4o-mini` | LLM for response generation |
| `top_k` | `3` | Number of documents to retrieve |
| `retrieval_strategy` | `cosine` | Similarity metric |
| `max_context_length` | `4000` | Maximum context characters |
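As an illustration, these settings could map onto a dataclass like the following; the real `src/config.py` may use plain module-level constants instead:

```python
from dataclasses import dataclass

@dataclass
class RAGConfig:
    embedding_model: str = "text-embedding-3-small"
    llm_model: str = "gpt-4o-mini"
    top_k: int = 3
    retrieval_strategy: str = "cosine"
    max_context_length: int = 4000
```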
The evaluation script (`run_eval.py`) tests the RAG pipeline against 10 predefined questions covering:
- Easy FAQ lookups
- Medium-complexity support scenarios
- Hard edge cases requiring internal knowledge
Reference answers are provided for qualitative comparison.
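For orientation, the evaluation loop might look roughly like the sketch below, reusing the functions from the earlier sketches. The JSON field names (`question`, `reference_answer`) are assumptions about the data files, not a confirmed schema:

```python
import json

# Load both knowledge-base files into one document list.
documents = []
for path in ("data/faq.json", "data/escalation_notes.json"):
    with open(path) as f:
        documents.extend(json.load(f))

doc_embeddings = embed_documents(documents)  # from the embeddings sketch

with open("data/eval_questions.json") as f:
    questions = json.load(f)

for item in questions:
    generated = answer_question(item["question"], documents, doc_embeddings)
    print(f"Q: {item['question']}")
    print(f"Generated: {generated}")
    print(f"Reference: {item['reference_answer']}\n")
```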
| Source | Count | Description |
|---|---|---|
| FAQ | 10 | Public customer-facing answers |
| Escalation Notes | 10 | Internal support procedures |
After implementing the core components:
- Run `python run_eval.py` to test the pipeline
- Review generated answers against reference answers
- Experiment with configuration changes
- Document findings for team review
AcmeCloud Engineering | Internal Use Only