
Agentic RAG Customer Support Assistant


A minimal, production-structured single-agent customer support app using FastAPI + local Ollama (llama3).

What This App Does

  • Accepts natural-language support questions through a chat UI or API.
  • Uses agent logic to decide when to call tools.
  • Pulls structured order data from a mock database.
  • Returns natural-language responses with order status, shipping updates, and order totals.

Tech Stack

  • Backend: FastAPI (Python)
  • AI Runtime: Ollama (llama3, local)
  • Tooling: Python function tools (get_order_status, get_shipping_updates, get_order_total)
  • Memory: Lightweight in-process conversation memory (last 5 messages)
  • Frontend: Static HTML, CSS, and JavaScript
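The "last 5 messages" memory can be sketched with a bounded deque; this is a guess at the approach, and the repository's memory.py may differ:

```python
from collections import deque

# Keep only the most recent messages (MAX_MEMORY_MESSAGES defaults to 5).
MAX_MEMORY_MESSAGES = 5
memory = deque(maxlen=MAX_MEMORY_MESSAGES)

def remember(role: str, content: str) -> None:
    """Append a chat message; the oldest entry is dropped once full."""
    memory.append({"role": role, "content": content})

def recent_messages() -> list:
    """Return the remembered messages in chronological order."""
    return list(memory)
```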

Features

  • POST /chat API for user support queries
  • Single-agent orchestration with local Ollama model (llama3 by default)
  • Three tools:
    • get_order_status(order_id) returns status, location, and eta
    • get_shipping_updates(order_id) returns shipping and delay updates
    • get_order_total(order_id) returns total order cost (for example $89.98)
  • Fallback routing when tool-calling is not used
  • Automatic retry without the tool payload if the current llama3 tag does not support tool calling
  • In-memory mock order database for a single customer account
  • Split frontend pages:
    • Chat page at GET /
    • Orders page at GET /orders
  • Ollama health endpoint and header status chip (online, model_missing, offline)
  • Chat loading indicator (Model is thinking...)
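Ollama's /api/chat endpoint accepts OpenAI-style tool schemas. A plausible definition for get_order_status is shown below; the structure follows Ollama's documented "tools" format, but the description wording is an assumption rather than text from this repo:

```python
# One of the three tool schemas passed to Ollama's /api/chat endpoint.
# The structure follows Ollama's OpenAI-style "tools" format; the
# description wording is an assumption, not copied from the repo.
GET_ORDER_STATUS_TOOL = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up an order's status, current location, and ETA.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The customer's order ID, e.g. '123'.",
                },
            },
            "required": ["order_id"],
        },
    },
}
```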

Screenshots

Chat UI screenshot

Orders page screenshot

Quick Start

Prerequisites

  1. Python 3.10+
  2. Ollama installed and running locally
  3. llama3 pulled locally:

```shell
ollama pull llama3
ollama serve
```

Setup

```shell
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux
pip install -r requirements.txt
```

Environment Variables

The app reads configuration from system environment variables and a local .env file.

  • OLLAMA_BASE_URL default: http://localhost:11434
  • OLLAMA_MODEL default: llama3
  • MAX_MEMORY_MESSAGES default: 5
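Reading these with their defaults can be as simple as the sketch below. This is a guess at what config.py does; the actual .env loading likely happens separately (for example via python-dotenv) before these lookups run:

```python
import os

# Defaults mirror the values documented above; a real config.py would
# likely load the local .env file first (e.g. via python-dotenv).
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3")
MAX_MEMORY_MESSAGES = int(os.getenv("MAX_MEMORY_MESSAGES", "5"))
```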

Use .env.example as your reference template.

Run

```shell
uvicorn app.main:app --reload
```

Endpoints

  • API docs: http://127.0.0.1:8000/docs
  • Chat UI: http://127.0.0.1:8000/
  • Orders UI: http://127.0.0.1:8000/orders
  • Orders API: GET /api/orders
  • Ollama Status API: GET /api/status/ollama

POST /chat Example

Request:

```http
POST /chat
Content-Type: application/json

{
  "message": "Where is my order 123?"
}
```

Response:

```json
{
  "response": "Your order 123 is currently in transit in Kansas City, MO and is expected to arrive tomorrow by 8 PM."
}
```

GET /api/status/ollama Example

```json
{
  "connected": true,
  "model": "llama3",
  "model_available": true,
  "status": "ok",
  "detail": "Connected to Ollama and model 'llama3' is available."
}
```

Agent Flow

  1. User message is added to short-term memory.
  2. Agent sends system prompt + recent messages to Ollama with tool schemas.
  3. If Ollama reports tool support is unavailable, the agent retries without tools.
  4. If tools are requested, the agent executes them and sends results back to the model.
  5. If no tool call is made, fallback routing handles order, shipping, and pricing intents.
  6. Final assistant response is returned and stored in memory.
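Step 5's fallback routing can be sketched as keyword matching on the user message. This is an illustrative guess at the approach, and the keyword lists are assumptions; the repo's agent.py may use different heuristics:

```python
import re

def fallback_route(message: str):
    """Pick a tool name by keyword when the model made no tool call.

    Returns the name of the tool to run, or None if no intent matched.
    The keyword lists here are illustrative assumptions.
    """
    text = message.lower()
    if re.search(r"\b(total|price|cost|charge)\b", text):
        return "get_order_total"
    if re.search(r"\b(shipping|delay|delivery|carrier)\b", text):
        return "get_shipping_updates"
    if re.search(r"\b(status|where|track)\b", text):
        return "get_order_status"
    return None
```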

Project Structure

```text
/project
  /app
    __init__.py
    main.py
    agent.py
    mock_data.py
    tools.py
    memory.py
    config.py
    /static
      index.html
      orders.html
      chat.js
      orders.js
      status.js
      styles.css
  .env.example
  requirements.txt
  README.md
```
