
Amazon DocumentDB RAG Deep Dive

An in-depth, guided notebook that builds a complete RAG (Retrieval-Augmented Generation) chatbot using Amazon DocumentDB as the single store for both source documents and vector embeddings. Unlike a typical quickstart, this notebook explains why each parameter matters and demonstrates the impact of changing them with targeted examples.

Most RAG tutorials show you the "happy path." This notebook goes deeper:

  • Single-store architecture - vectors and source text live in the same document, eliminating the multi-database sync problem
  • HNSW parameter tuning - measures recall vs. latency across different m, efConstruction, and efSearch values so you can see the tradeoffs
  • Chunk size and overlap experiments - demonstrates what happens when chunks split mid-sentence and how overlap preserves context
  • RAG vs. no-RAG comparison - side-by-side output showing what the LLM gets right (and wrong) without retrieval
  • Production resilience patterns - rate limiting, circuit breakers, and retry logic for Bedrock API calls
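The resilience patterns above follow a standard shape. As a rough sketch (the decorator names and parameters here are illustrative, not the notebook's actual code), a rate limiter and a retry-with-backoff wrapper for Bedrock API calls might look like this; a circuit breaker composes the same way:

```python
import functools
import random
import time

def rate_limited(calls_per_second):
    """Decorator that spaces out calls to stay under a simple rate limit."""
    min_interval = 1.0 / calls_per_second
    def decorator(fn):
        last_call = [0.0]  # mutable cell holding the last call timestamp
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            wait = min_interval - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
            last_call[0] = time.monotonic()
            return fn(*args, **kwargs)
        return wrapper
    return decorator

def with_retries(max_attempts=3, base_delay=0.5):
    """Decorator that retries on exception with exponential backoff plus jitter."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error
                    time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
        return wrapper
    return decorator
```

In practice these would wrap the function that calls `bedrock_runtime.invoke_model`, so throttling errors are retried and bursts are smoothed out before they hit the service quota.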

Check out the AWS Events YouTube channel for a walkthrough of this demo.


Architecture

Notebook Sections

| # | Section | Purpose |
|---|---------|---------|
|   | Required Configuration | Secret name, AWS region, connection mode |
| 1 | Install Dependencies | Required Python packages |
| 2 | Import Libraries | Core imports |
| 3 | Resilience Utilities | Rate limiter and circuit breaker decorators |
| 4 | Connect to DocumentDB | Secrets Manager credentials, HNSW index creation |
| 5 | Load and Chunk Documents | PDFs, blog posts, and doc pages → chunked text |
| 6 | Generate Embeddings and Insert | Titan embeddings → batch insert into DocumentDB |
| 7 | Single-Store vs. Multi-Store | Benchmarks the single-collection approach against a multi-collection pattern |
| 8 | HNSW Parameter Tuning | Recall vs. latency across index configurations |
| 9 | Configure LLM and Vector Store | Claude Haiku + DocumentDB vector store setup |
| 10 | Chunk Overlap - Why It Matters | Same document chunked with/without overlap, then queried |
| 11 | Prompt Template | System prompt for grounded Q&A |
| 12 | RAG vs No-RAG Comparison | Side-by-side LLM output with and without retrieval |
| 13 | Chat Configuration | History length and retrieval limits |
| 14 | Launch Chatbot | Gradio chat interface with caching and resilience |
| 15 | Cleanup | Close connections |
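Section 8's tuning experiments hinge on one metric: recall@k, the fraction of the true nearest neighbors that the approximate HNSW search actually returns. A minimal, self-contained sketch of that measurement (the simulated ANN result is illustrative; the notebook scores real index output):

```python
import numpy as np

def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k neighbors the ANN search actually returned."""
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)

def exact_top_k(query, vectors, k):
    """Brute-force cosine top-k: the ground truth HNSW results are scored against."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = v @ q
    return np.argsort(-sims)[:k].tolist()

rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 64))
query = rng.normal(size=64)

truth = exact_top_k(query, vectors, k=10)
# An index queried with a low efSearch might miss some true neighbors;
# simulate a result that got 8 of 10 right (-1/-2 are dummy wrong ids):
approx = truth[:8] + [-1, -2]
print(recall_at_k(truth, approx))  # 0.8
```

Raising `efSearch` (and, at build time, `m` and `efConstruction`) pushes recall toward 1.0 at the cost of higher query latency and index size, which is exactly the tradeoff curve the section plots.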

Prerequisites

Sample Data

This notebook ingests PDF documents as its knowledge base. You can use any PDFs, but to reproduce the demo as shown, download:

  1. Amazon DocumentDB Developer Guide (PDF)
  2. Data Modeling with Amazon DocumentDB (PDF)

Place the PDF files in the same directory as the notebook.
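Sections 5 and 10 chunk these PDFs with and without overlap. The core idea can be sketched in a few lines (a simplified character-based splitter, assumed for illustration; the notebook's own chunker may differ):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks. Each chunk repeats the
    last `overlap` characters of the previous one, so a sentence split at a
    chunk boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "DocumentDB stores vectors and source text in one document. " * 20
no_overlap = chunk_text(text, chunk_size=100, overlap=0)
with_overlap = chunk_text(text, chunk_size=100, overlap=25)
# Overlap produces more chunks for the same text, trading storage for context.
```

With `overlap=0`, a sentence cut mid-chunk exists in no chunk whole, and retrieval can return a fragment that starts mid-thought; the overlap buys back that context.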

Setup

  1. Install dependencies:

    pip install -r requirements.txt
  2. Download the Amazon DocumentDB TLS certificate:

    wget https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem
  3. Download the sample PDFs (see Sample Data above) into this directory.

  4. Open docdb-rag-deep-dive.ipynb and update the Required Configuration cell at the top:

    • secret_name — your Secrets Manager secret name
    • aws_region — your AWS region
    • is_bastion — set to 'y' if connecting through a bastion host
  5. Run the notebook cells sequentially.
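The configuration above feeds a Secrets Manager lookup in Section 4. Assuming the default DocumentDB secret shape (`username`, `password`, `host`, `port` keys; verify against your own secret), the connection URI built from it looks roughly like this. In the notebook the secret string would come from `boto3.client("secretsmanager").get_secret_value(SecretId=secret_name)`:

```python
import json
import urllib.parse

def build_docdb_uri(secret_string, tls_ca_file="global-bundle.pem"):
    """Assemble a DocumentDB connection URI from a Secrets Manager payload.

    Credentials are URL-encoded so special characters in the password don't
    break the URI. DocumentDB requires retryWrites=false, and TLS is verified
    against the RDS certificate bundle downloaded in step 2.
    """
    s = json.loads(secret_string)
    user = urllib.parse.quote_plus(s["username"])
    pwd = urllib.parse.quote_plus(s["password"])
    return (
        f"mongodb://{user}:{pwd}@{s['host']}:{s['port']}/"
        f"?tls=true&tlsCAFile={tls_ca_file}&retryWrites=false"
    )
```

The resulting string is what gets passed to `pymongo.MongoClient`; when `is_bastion` is `'y'`, the host would instead point at the locally forwarded tunnel port.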