Zendalona AI Chatbot Backend

An intelligent RAG-powered chatbot backend for accessibility solutions

About The Project

This project was developed as part of Google Summer of Code 2025 for Zendalona. It provides a production-ready REST API backend for an AI-powered chatbot that specializes in answering questions about accessibility products and solutions.

The system uses Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses by combining a vector database of curated knowledge with Google's Gemini LLM.

Key Highlights

Intelligent Responses: Uses RAG with LangChain and Google Gemini for accurate, contextual answers
Dual-Layer Caching: Optimized response times with ChromaDB permanent cache and MongoDB temporary cache
Real-time Streaming: Server-Sent Events (SSE) and WebSocket support for streaming responses
Dynamic Knowledge Base: Enrich the knowledge base through web crawling and PDF uploads
Admin Dashboard Ready: Comprehensive feedback management and cache curation system

Features

Chat System

Complete chat responses with source citations
Real-time response streaming via SSE
WebSocket support for React Native and mobile apps
Session management and tracking
Smart similarity-based caching with dynamic thresholds

Document Indexing

Async web crawling with configurable depth (1-5 levels)
PDF document upload and extraction
Automatic duplicate detection
Collection management (create, list, delete)
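
The depth limit and duplicate detection listed above can be pictured with a small, self-contained sketch. It is not the project's crawler (which is built on Crawl4AI). The in-memory `site` dictionary, the `crawl` function, and the choice of a SHA-256 hash of the page text to spot duplicates are all assumptions made for illustration.

```python
import hashlib
from collections import deque


def crawl(pages, start_url, max_depth=2, max_pages=50):
    """Breadth-first crawl over an in-memory site map.

    `pages` maps a URL to (text, [linked URLs]) and stands in for real HTTP fetches.
    """
    if not 1 <= max_depth <= 5:
        raise ValueError("max_depth must be between 1 and 5")

    seen_urls = {start_url}
    seen_hashes = set()
    indexed = []
    queue = deque([(start_url, 1)])  # the start page counts as depth 1

    while queue and len(indexed) < max_pages:
        url, depth = queue.popleft()
        if url not in pages:
            continue  # dead link, nothing to index
        text, links = pages[url]

        # Duplicate detection (assumed strategy): skip pages whose text was already indexed.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            indexed.append(url)

        # Only follow links while we are above the depth limit.
        if depth < max_depth:
            for link in links:
                if link not in seen_urls:
                    seen_urls.add(link)
                    queue.append((link, depth + 1))
    return indexed


site = {
    "/": ("Home", ["/a", "/b"]),
    "/a": ("Screen reader guide", ["/c"]),
    "/b": ("Screen reader guide", []),  # same content as /a
    "/c": ("Deep page", []),
}
print(crawl(site, "/", max_depth=2))
```

With `max_depth=2` the sketch indexes the start page and its direct links, and `/b` is skipped because its text matches `/a`.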

Caching Architecture

Layer	Storage	Purpose
Permanent	ChromaDB	Curated Q&A pairs for fast retrieval
Temporary	MongoDB	Auto-expiring (8-day TTL) responses for admin review
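
As a rough illustration of the 8-day expiry in the table above, the sketch below computes the TTL in seconds and checks whether an entry has outlived it. The `created_at` field name and the helper names are assumptions, not taken from the project's code.

```python
from datetime import datetime, timedelta, timezone

TEMP_CACHE_TTL = timedelta(days=8)
TTL_SECONDS = int(TEMP_CACHE_TTL.total_seconds())


def is_expired(created_at, now):
    """Return True once an entry is at least 8 days old."""
    return now - created_at >= TEMP_CACHE_TTL


# With MongoDB, expiry is usually delegated to a TTL index instead of application code.
# Assuming a `created_at` datetime field, that would look like:
#     collection.create_index("created_at", expireAfterSeconds=TTL_SECONDS)

created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(TTL_SECONDS)
print(is_expired(created, created + timedelta(days=7)))
print(is_expired(created, created + timedelta(days=8)))
```

MongoDB removes expired documents in a background task, so an entry may remain visible for a short time after its TTL has passed.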

Admin Features

User feedback collection and management
Promote responses from temp cache to permanent cache
CSV import/export for cache management
System health monitoring and debugging tools
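
The promotion and CSV features above could work along the following lines. This is a minimal sketch using plain dictionaries in place of MongoDB and ChromaDB. The `question` and `answer` column names and all function names are hypothetical, and the real CSV layout may differ.

```python
import csv
import io

FIELDS = ["question", "answer"]  # hypothetical column names


def export_cache(entries):
    """Serialize cache entries (a list of dicts) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def import_cache(csv_text):
    """Parse CSV text back into entries, skipping rows with an empty field."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        question = (row.get("question") or "").strip()
        answer = (row.get("answer") or "").strip()
        if question and answer:
            entries.append({"question": question, "answer": answer})
    return entries


def promote(temp_cache, permanent_cache, question):
    """Move one reviewed answer from the temporary to the permanent cache."""
    permanent_cache[question] = temp_cache.pop(question)


temp = {"What is a screen reader?": "Software that reads on-screen text aloud."}
permanent = {}
promote(temp, permanent, "What is a screen reader?")

rows = [{"question": q, "answer": a} for q, a in permanent.items()]
print(import_cache(export_cache(rows)) == rows)
```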

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Client Request                           │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                     FastAPI Application                         │
│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐  │
│  │  Chat   │ │Indexing │ │  Cache  │ │Feedback │ │ System  │  │
│  │ Router  │ │ Router  │ │ Router  │ │ Router  │ │ Router  │  │
│  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘  │
└───────┼──────────┼──────────┼──────────┼──────────┼──────────┘
        │          │          │          │          │
        ▼          ▼          ▼          ▼          ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Utility Layer                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │  LangChain   │  │   ChromaDB   │  │   MongoDB    │          │
│  │    Utils     │  │    Utils     │  │    Utils     │          │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
└─────────┼────────────────┼────────────────┼─────────────────────┘
          │                │                │
          ▼                ▼                ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  Google Gemini  │ │    ChromaDB     │ │     MongoDB     │
│      API        │ │  Vector Store   │ │    Database     │
└─────────────────┘ └─────────────────┘ └─────────────────┘

Query Processing Flow

User Query
    │
    ├─► Cache Check (ChromaDB similarity search)
    │       │
    │       ├─► Match Found → Return cached answer
    │       │
    │       └─► No Match
    │               │
    │               ▼
    │           Retrieve Documents (top-k similar)
    │               │
    │               ▼
    │           Generate Response (Gemini LLM)
    │               │
    │               ▼
    │           Save to Temp Cache
    │
    └─► Return Response to Client
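
The flow above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's implementation: `answer_query`, the toy bag-of-words `embed` function, the stub `retrieve` and `generate` callables, and the 0.9 cache threshold are all hypothetical stand-ins for ChromaDB similarity search, document retrieval, and the Gemini call.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def answer_query(query, embed, permanent_cache, temp_cache, retrieve, generate,
                 threshold=0.9):
    """Return (answer, source), where source is "cache" or "llm"."""
    query_vector = embed(query)

    # 1. Cache check: find the most similar curated question.
    best = max(permanent_cache,
               key=lambda entry: cosine(query_vector, entry["vector"]),
               default=None)
    if best is not None and cosine(query_vector, best["vector"]) >= threshold:
        return best["answer"], "cache"

    # 2. Cache miss: retrieve documents, generate, and save for admin review.
    documents = retrieve(query)
    answer = generate(query, documents)
    temp_cache.append({"question": query, "answer": answer})
    return answer, "llm"


VOCAB = ["screen", "reader", "braille", "install"]


def embed(text):
    """Toy embedding: word counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(word)) for word in VOCAB]


permanent = [{"vector": embed("install screen reader"), "answer": "Run the installer."}]
temp = []
retrieve = lambda query: ["doc about braille"]
generate = lambda query, docs: f"Generated from {len(docs)} document(s)"

print(answer_query("install screen reader", embed, permanent, temp, retrieve, generate))
print(answer_query("braille", embed, permanent, temp, retrieve, generate))
```

The first call hits the permanent cache. The second falls through to retrieval and generation and leaves one entry in the temporary cache.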

Tech Stack

Category	Technology
Framework	FastAPI 0.115+
LLM	Google Gemini (via google-genai)
RAG Framework	LangChain 0.3+
Vector Database	ChromaDB
Document Database	MongoDB (Motor async driver)
Web Crawling	Crawl4AI
PDF Processing	PyPDF2
Server	Uvicorn

Getting Started

Prerequisites

Python 3.13 or higher
MongoDB instance (local or cloud)
Google Gemini API key

Installation

Clone the repository

```
git clone https://github.com/zendalona/AI-AGENT-Zendalona.git
cd AI-AGENT-Zendalona
```

Create a virtual environment

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies
```
pip install -r requirements.txt
```

Configure environment variables

```
cp .env.example .env
```

Edit .env with your configuration:

```
gemini_api_key=your_gemini_api_key_here
mongodb_uri=mongodb://localhost:27017/
mongodb_database=zendalona
PORT=10000
```

Run the application
```
python main.py
```
Access the API documentation
- Swagger UI: http://localhost:10000/docs
- ReDoc: http://localhost:10000/redoc

Docker Deployment

```
# Build the image
docker build -t zendalona-chatbot .

# Run the container
docker run -d \
  -p 10000:8000 \
  -e GEMINI_API_KEY=your_key \
  -e MONGODB_URI=mongodb://host:27017/ \
  --name zendalona-bot \
  zendalona-chatbot
```

API Reference

Chat Endpoints

Method	Endpoint	Description
`POST`	`/chat`	Get a complete chat response
`POST`	`/chat/stream`	Stream response via SSE
`WS`	`/chat/ws/{session_id}`	WebSocket streaming
`POST`	`/chat/feedback`	Submit user feedback
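
A client consuming `/chat/stream` has to split the response into Server-Sent Events. The sketch below parses the standard SSE wire format (`data:` lines, blank-line event separators, `:` comment lines). It assumes nothing about the payload this API puts inside each event, and `iter_sse_data` is a hypothetical helper name.

```python
def iter_sse_data(lines):
    """Yield the data payload of each event from an iterable of decoded text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # one optional space after the colon is dropped
            buffer.append(value)
    # An event not terminated by a blank line is discarded, as the SSE format specifies.


stream = ["data: Hello\n", "\n", ": keep-alive\n", "data: wor\n", "data: ld\n", "\n"]
print(list(iter_sse_data(stream)))
```

Multiple `data:` lines within one event are joined with a newline, so the second event in the sample arrives as a single two-line string.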

Indexing Endpoints

Method	Endpoint	Description
`POST`	`/indexing/crawl`	Crawl a website
`POST`	`/indexing/upload-pdf`	Upload and index a PDF
`GET`	`/indexing/collections`	List all collections
`DELETE`	`/indexing/collections/{name}`	Delete a collection

Cache Management

Method	Endpoint	Description
`POST`	`/cache/add`	Add a cache entry
`PUT`	`/cache/update/{id}`	Update a cache entry
`DELETE`	`/cache/{id}`	Delete a cache entry
`GET`	`/cache/export`	Export cache as CSV
`POST`	`/cache/import`	Import cache from CSV

System Endpoints

Method	Endpoint	Description
`GET`	`/system/health`	Health check
`GET`	`/system/info`	System information

For complete API documentation, run the server and visit /docs (Swagger UI) or /redoc (ReDoc).

Project Structure

AI-AGENT-Zendalona/
├── main.py                 # Application entry point
├── config.py               # Configuration management
├── requirements.txt        # Python dependencies
├── Dockerfile              # Container configuration
├── .env.example            # Environment template
│
├── routers/                # API endpoint definitions
│   ├── chat.py             # Chat endpoints
│   ├── indexing.py         # Document indexing
│   ├── cache.py            # Cache management
│   ├── temp_cache.py       # Temporary cache
│   ├── feedback.py         # User feedback
│   ├── system.py           # System monitoring
│   └── auth.py             # Authentication
│
├── utils/                  # Core utilities
│   ├── langchain_utils.py  # RAG chain setup
│   ├── chroma_utils.py     # Vector DB operations
│   ├── cache_utils.py      # Cache operations
│   ├── mongo_utils.py      # MongoDB operations
│   └── models.py           # Pydantic schemas
│
├── crawler/                # Web crawling module
│   └── crawler.py          # AsyncWebCrawler
│
└── chroma_db/              # Vector database storage

Configuration

Variable	Description	Default
`gemini_api_key`	Google Gemini API key	Required
`mongodb_uri`	MongoDB connection string	`mongodb://localhost:27017/`
`mongodb_database`	Database name	`zendalona`
`chroma_db_path`	ChromaDB storage path	`./chroma_db`
`PORT`	Server port	`10000`
`crawler_depth`	Web crawling depth (1-5)	`2`
`crawler_max_pages`	Max pages to crawl	`50`
`retrieval_k`	Documents to retrieve	`6`
`retrieval_threshold`	Similarity threshold	`0.7`
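
To show how the defaults in the table above fit together, here is a minimal sketch that reads them from environment variables. It is not the project's `config.py`. The `Settings` dataclass and `load_settings` function are hypothetical, and matching variable names case-insensitively is an assumption made so that both `gemini_api_key` and `GEMINI_API_KEY` are accepted.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    mongodb_uri: str = "mongodb://localhost:27017/"
    mongodb_database: str = "zendalona"
    chroma_db_path: str = "./chroma_db"
    port: int = 10000
    crawler_depth: int = 2
    crawler_max_pages: int = 50
    retrieval_k: int = 6
    retrieval_threshold: float = 0.7


def load_settings(env=None):
    """Build Settings from a mapping of variables (defaults to os.environ)."""
    source = os.environ if env is None else env
    values = {name.lower(): value for name, value in source.items()}

    if not values.get("gemini_api_key"):
        raise ValueError("gemini_api_key is required")

    # Convert each provided value to the type declared on the dataclass field.
    kwargs = {}
    for field in fields(Settings):
        if field.name in values:
            kwargs[field.name] = field.type(values[field.name])

    settings = Settings(**kwargs)
    if not 1 <= settings.crawler_depth <= 5:
        raise ValueError("crawler_depth must be between 1 and 5")
    return settings


example = load_settings({"GEMINI_API_KEY": "your_key", "PORT": "8000"})
print(example.port, example.retrieval_k, example.retrieval_threshold)
```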

Contributing

Contributions are welcome! This project continues to be developed and improved.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request
