A simple Retrieval-Augmented Generation (RAG) application that integrates LangChain, OpenAI, and the Pinecone vector database, containerized with Docker.
- LangChain: Orchestrates the RAG pipeline
- OpenAI: Provides embeddings (text-embedding-ada-002) and LLM (GPT-3.5-turbo)
- Pinecone: Vector database for semantic search
- Docker: Easy deployment and environment consistency
```
.
├── src/
│   └── main.py        # Main application code
├── data/              # Data directory (for future use)
├── requirements.txt   # Python dependencies
├── Dockerfile         # Docker container configuration
├── docker-compose.yml # Docker Compose setup
├── .env.example       # Environment variables template
└── README.md          # This file
```
- Docker & Docker Compose installed on your system
- OpenAI API Key - Get it from OpenAI Platform
- Pinecone Account - Sign up at Pinecone
Navigate to the project directory:

```bash
cd "LangChain&RAG_demo"
```

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` and add your credentials:
```
OPENAI_API_KEY=sk-your-actual-openai-api-key
PINECONE_API_KEY=your-actual-pinecone-api-key
PINECONE_ENVIRONMENT=your-pinecone-environment
PINECONE_INDEX_NAME=langchain-demo
```

How to get these values:
- OpenAI API Key: Go to https://platform.openai.com/api-keys
- Pinecone API Key: Go to https://app.pinecone.io/ → API Keys
- Pinecone Environment: Found in your Pinecone console (e.g., us-east-1-aws)
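Inside the application, these variables are typically loaded at startup. A minimal sketch of how `src/main.py` might read them, assuming `python-dotenv` is listed in `requirements.txt` (the repo's actual loading code may differ):

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads key=value pairs from .env into the process environment

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]    # raises KeyError if missing
PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME", "langchain-demo")
```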
Build and run the application with Docker Compose:

```bash
docker-compose up --build
```

This will:
- Build the Docker image
- Install all dependencies
- Run the demo application
If you prefer to run locally without Docker:
```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the application
python src/main.py
```

The application demonstrates a complete RAG workflow:
- Document Ingestion: Ingests sample documents about AI topics
- Embedding Generation: Creates vector embeddings using OpenAI
- Vector Storage: Stores embeddings in Pinecone
- Semantic Search: Retrieves relevant documents for queries
- Answer Generation: Uses GPT-3.5-turbo to generate answers based on retrieved context
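Condensed into code, the workflow looks roughly like the sketch below. It assumes the `langchain-openai` and `langchain-pinecone` packages and an existing Pinecone index; the actual `src/main.py` wraps the same steps in a class and may use different package versions:

```python
import os

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

# Steps 1-3: ingest documents, embed them, and store the vectors in Pinecone
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
vectorstore = PineconeVectorStore.from_texts(
    texts=["Artificial Intelligence (AI) is the simulation of human intelligence..."],
    embedding=embeddings,
    index_name=os.environ["PINECONE_INDEX_NAME"],
)

# Steps 4-5: retrieve relevant chunks and generate an answer from that context
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)
print(qa.invoke({"query": "What is Artificial Intelligence?"})["result"])
```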
Example output:

```
Question: What is Artificial Intelligence?
Answer: Artificial Intelligence (AI) is the simulation of human intelligence
processes by machines, especially computer systems...
```
```
User Query
    ↓
LangChain QA Chain
    ↓
OpenAI Embeddings (text-embedding-ada-002)
    ↓
Pinecone Vector Search (semantic similarity)
    ↓
Retrieved Context Documents
    ↓
OpenAI LLM (gpt-3.5-turbo)
    ↓
Generated Answer
```
Edit `src/main.py` and modify the `sample_documents` list:

```python
sample_documents = [
    "Your first document text...",
    "Your second document text...",
    # Add more documents
]
```

Modify the `ChatOpenAI` initialization in `src/main.py`:
```python
self.llm = ChatOpenAI(
    temperature=0.7,
    model="gpt-4",  # or "gpt-4-turbo-preview"
    openai_api_key=self.openai_api_key
)
```

Modify the retriever configuration:
```python
retriever=self.vectorstore.as_retriever(
    search_kwargs={"k": 5}  # Retrieve the top 5 chunks instead of 3
)
```

Pinecone issues:
- Ensure your Pinecone API key is correct
- Check that you're using a valid environment/region
- Free-tier Pinecone accounts limit the number of indexes you can create
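If the index doesn't exist yet, you can create it manually. A sketch assuming the Pinecone v3+ SDK and a serverless index (older pod-based SDKs use the `PINECONE_ENVIRONMENT` value instead of a `ServerlessSpec`, and the demo may already create the index itself):

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone()  # reads PINECONE_API_KEY from the environment
pc.create_index(
    name="langchain-demo",
    dimension=1536,  # output dimension of text-embedding-ada-002
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
```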
OpenAI API issues:
- Verify that your API key is valid
- Check you have credits in your OpenAI account
- Ensure you're not hitting rate limits
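To isolate credential problems from the rest of the pipeline, you can run a quick standalone check. A sketch assuming the `openai` and `pinecone` packages, both of which read their API keys from the environment:

```python
from openai import OpenAI
from pinecone import Pinecone

# Each call raises an authentication error if the corresponding key is invalid
print(OpenAI().models.list().data[0].id)   # checks OPENAI_API_KEY
print(Pinecone().list_indexes().names())   # checks PINECONE_API_KEY
```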
Docker issues:
- Make sure Docker is running
- Try cleaning the Docker cache: `docker system prune -a`
- Check your internet connection
Cost notes:
- OpenAI: Charges for both embedding and LLM API calls
- Pinecone: Free tier available, paid plans for production
- Estimated cost for demo: ~$0.01-0.05 per run
Possible next steps:
- Load documents from files (PDF, TXT, etc.); see the sketch after this list
- Add a web interface with Streamlit or FastAPI
- Implement chat history and conversation memory
- Add document metadata filtering
- Deploy to cloud platforms (AWS, GCP, Azure)
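For the first item, LangChain's document loaders and text splitters would replace the hard-coded list. A sketch assuming `langchain-community` and `langchain-text-splitters` are installed; the file path is illustrative:

```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load a local file from the data/ directory (hypothetical path)
docs = TextLoader("data/example.txt").load()

# Split into overlapping chunks sized for embedding
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# chunks can then be passed to PineconeVectorStore.from_documents(...)
```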
MIT License - feel free to use this for learning and projects!