This AI Trip Planner includes an optional Retrieval-Augmented Generation (RAG) feature that powers the local_agent with curated, real-world local experiences. RAG stays dormant until you opt in, making it perfect for learning step-by-step.
RAG (Retrieval-Augmented Generation) combines:
- Retrieval: Search a database for relevant information
- Augmentation: Add that information to the LLM's context
- Generation: LLM generates responses using both its knowledge and the retrieved data
This pattern is fundamental in production AI systems because it:
- Grounds responses in real, curated data
- Provides citations and sources
- Reduces hallucinations
- Allows updating knowledge without retraining models
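The retrieve → augment → generate loop can be sketched in a few lines. This is an illustrative toy (keyword overlap stands in for vector search, and a print stands in for the LLM call); none of these names come from this codebase:

```python
# Minimal RAG sketch: retrieve -> augment -> generate.
# All names are illustrative; the real app uses embeddings and an LLM.

GUIDES = [
    {"city": "Tokyo", "text": "Tsukiji outer-market food tour", "source": "https://www.tsukiji.or.jp"},
    {"city": "Prague", "text": "Dawn walk along the Royal Route", "source": "https://www.prague.eu/en"},
]

def retrieve(query: str, k: int = 1) -> list[dict]:
    """Step 1 (Retrieval): rank documents by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set(g["text"].lower().split())), g) for g in GUIDES]
    return [g for score, g in sorted(scored, key=lambda s: -s[0]) if score > 0][:k]

def augment(query: str, docs: list[dict]) -> str:
    """Step 2 (Augmentation): inject retrieved docs, with sources, into the prompt."""
    context = "\n".join(f"- {d['text']} (source: {d['source']})" for d in docs)
    return f"Context:\n{context}\n\nUser request: {query}"

# Step 3 (Generation) would pass the augmented prompt to an LLM; here we just show it.
prompt = augment("food tour in Tokyo", retrieve("food tour in Tokyo"))
print(prompt)
```

Because the sources travel with the retrieved text, the generated answer can cite them, which is exactly how RAG grounds responses.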
Copy backend/.env.example to backend/.env if you haven't already, then:

```bash
# Enable RAG feature
ENABLE_RAG=1

# Provide your OpenAI API key (needed for embeddings)
OPENAI_API_KEY=sk-...

# Optional: override the embeddings model (defaults to text-embedding-3-small)
OPENAI_EMBED_MODEL=text-embedding-3-small
```

Then start the server:

```bash
cd backend
uvicorn main:app --reload --port 8000
```

On startup, the app will:
- Load 540+ curated local experiences from backend/data/local_guides.json
- Create vector embeddings for semantic search
- Index them into an in-memory vector store
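The three startup steps reduce to "embed each description once and keep the vectors in memory." A toy version, with a deterministic hash-style embedder standing in for text-embedding-3-small (the function names are illustrative, not the app's actual code):

```python
import json
import math

def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embeddings API: deterministic, unit-length toy vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Stand-in for loading backend/data/local_guides.json
guides = json.loads('[{"city": "Tokyo", "description": "sushi tour"}]')

# The in-memory "vector store": one (vector, entry) pair per guide
index = [(fake_embed(g["description"]), g) for g in guides]
print(len(index))
```

Because the index lives in memory, it is rebuilt (and embeddings regenerated) on every restart, which is why adding entries only requires restarting the server.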
Make a request with specific interests:

```bash
curl -X POST http://localhost:8000/plan-trip \
  -H "Content-Type: application/json" \
  -d '{
    "destination": "Tokyo",
    "duration": "5 days",
    "interests": "food, anime, technology"
  }'
```

The local_agent will now retrieve the most relevant local experiences from the database and incorporate them into its recommendations!
```
User Request: "Tokyo with food, anime interests"
        ↓
1. Create query embedding (OpenAI text-embedding-3-small)
        ↓
2. Search vector store for top 3 similar experiences
        ↓
3. Retrieve: "Tsukiji Market tour", "Studio Ghibli Museum", "Akihabara gaming"
        ↓
4. Inject retrieved context into local_agent prompt
        ↓
5. LLM generates response using curated data + its knowledge
```
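Steps 2-3 are a similarity search: score every stored vector against the query vector and keep the top 3. A toy version with hand-made 2-D vectors (real vectors come from the embeddings API and have hundreds of dimensions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy store of (vector, experience name) pairs; vectors are made up for illustration.
store = [
    ([0.9, 0.1], "Tsukiji Market tour"),
    ([0.8, 0.3], "Studio Ghibli Museum"),
    ([0.1, 0.9], "Akihabara gaming"),
    ([0.0, 1.0], "Unrelated entry"),
]

query_vec = [1.0, 0.2]  # stand-in for the embedded user request
top3 = sorted(store, key=lambda item: -cosine(query_vec, item[0]))[:3]
print([name for _, name in top3])
```

The three highest-scoring entries become the retrieved context injected into the local_agent prompt in step 4.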
When RAG is disabled, the local_agent falls back to its original heuristic responses using the mock local_flavor, local_customs, and hidden_gems tools.
Located at backend/data/local_guides.json, this file contains 540+ curated experiences across 20 cities:

```json
[
  {
    "city": "Tokyo",
    "interests": ["food", "sushi"],
    "description": "Join a former Tsukiji auctioneer at Toyosu Market for tuna tastings...",
    "source": "https://www.tsukiji.or.jp"
  },
  {
    "city": "Prague",
    "interests": ["history", "architecture"],
    "description": "Join a historian for a dawn walk along the Royal Route...",
    "source": "https://www.prague.eu/en"
  }
]
```

Each entry includes:
- city: Destination name
- interests: List of relevant topics (food, art, history, etc.)
- description: Detailed experience description
- source: Citation URL for verification
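If you add your own entries, a quick sanity check against this schema can save a confusing startup failure. This validator is a hypothetical helper, not part of the repo:

```python
# Expected shape of a local_guides.json entry (mirrors the fields listed above).
REQUIRED_FIELDS = {"city": str, "interests": list, "description": str, "source": str}

def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems with an entry; an empty list means it's valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    if not entry.get("source", "").startswith("http"):
        problems.append("source should be a citation URL")
    return problems

entry = {
    "city": "Tokyo",
    "interests": ["food"],
    "description": "Sushi tour",
    "source": "https://example.com",
}
print(validate_entry(entry))  # []
```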
The RAG implementation demonstrates production-ready error handling. Each failure mode degrades gracefully:

- → Falls back to keyword matching (simple text search)
- → Falls back to keyword matching
- → Falls back to keyword matching, then to empty results
- → Returns empty results, and the local_agent uses its mock tools

This teaches students that production systems need multiple fallback layers!
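The layering can be sketched as nested try/except blocks. This is the structure of the idea, not the actual implementation; `embed` here is a hypothetical semantic-search callable:

```python
def keyword_search(query: str, docs: list[str]) -> list[str]:
    """Fallback layer: plain keyword matching, no embeddings or API key needed."""
    words = query.lower().split()
    return [d for d in docs if any(w in d.lower() for w in words)]

def retrieve_with_fallbacks(query: str, docs: list[str], embed=None) -> list[str]:
    """Try vector search first; degrade to keyword matching, then to nothing."""
    try:
        if embed is None:
            raise RuntimeError("embeddings unavailable (e.g. missing API key)")
        return embed(query)  # layer 1: semantic search
    except Exception:
        try:
            return keyword_search(query, docs)  # layer 2: keyword matching
        except Exception:
            return []  # layer 3: empty results -> agent falls back to mock tools

docs = ["Tsukiji food tour", "Prague castle walk"]
print(retrieve_with_fallbacks("food in Tokyo", docs))  # keyword-matching fallback
```

Each layer needs strictly fewer resources than the one above it, so some answer is always produced even with no API key at all.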
When RAG is enabled and Arize tracing is configured, you'll see:
- Embedding Spans: Shows the embedding model and token count
- Retrieval Spans: Shows the query and number of documents retrieved
- Retrieved Documents: The actual content passed to the LLM
- Similarity Scores: How well each document matched the query
- Metadata: City, interests, and source URLs for each result
This makes debugging RAG systems much easier!
Edit backend/data/local_guides.json:

```json
{
  "city": "Paris",
  "interests": ["food", "wine"],
  "description": "Wine tasting tour in Montmartre...",
  "source": "https://example.com"
}
```

Restart the server - embeddings will be regenerated automatically!
Try different models in .env:

```bash
# Smaller, faster (default)
OPENAI_EMBED_MODEL=text-embedding-3-small

# Larger, more accurate
OPENAI_EMBED_MODEL=text-embedding-3-large

# Legacy model
OPENAI_EMBED_MODEL=text-embedding-ada-002
```

In backend/main.py, modify the local_agent:
```python
# Retrieve more results
retrieved = GUIDE_RETRIEVER.retrieve(destination, interests, k=5)  # was k=3

# Use different search parameters
retriever = self._vectorstore.as_retriever(
    search_kwargs={"k": 10, "score_threshold": 0.7}
)
```

Enhance the retriever to filter by city or interests before searching:
```python
# In LocalGuideRetriever.retrieve()
docs = retriever.invoke(
    query,
    filter={"city": destination}  # Only search this city
)
```

Cause: ENABLE_RAG=1 but no OPENAI_API_KEY
Solution: Add your OpenAI API key to .env
Cause: City not in database OR interests don't match
Solution: Check local_guides.json for your destination, or add entries
Cause: Too many embedding requests during startup
Solution: The embeddings are cached in memory. Restart less frequently, or use a smaller dataset for development.
Cause: LangChain instrumentation not configured
Solution: Ensure LangChainInstrumentor().instrument() is called at startup
- LangChain Retrievers: https://python.langchain.com/docs/modules/data_connection/retrievers/
- OpenAI Embeddings: https://platform.openai.com/docs/guides/embeddings
- Vector Stores: https://python.langchain.com/docs/integrations/vectorstores/
- RAG Pattern: https://www.pinecone.io/learn/retrieval-augmented-generation/
To turn off RAG and return to the original behavior:
```bash
# In .env
ENABLE_RAG=0
```

Restart the server. The local_agent will use its original mock tools.
Next Steps: Try enabling RAG, make some test requests, and view the traces in Arize to see how retrieval augments the LLM's responses!