A production-ready chat application template for Databricks with multi-user session management and a simple built-in UI.
This template provides enterprise-grade infrastructure out of the box:
- Multi-user session management with Lakebase persistence
- Environment-based configuration (no database config needed)
- Automated deployment to Databricks Apps with CLI tool
- Development scripts for local setup (
./start_app.sh,./stop_app.sh) - Lakebase integration with automatic schema creation
- Built-in chat UI - no frontend build required
- Clean wrapper around Databricks model serving endpoints (no LangChain complexity)
- Support for any chat model endpoint (Foundation Models or custom)
- Streaming responses ready
- Easy to customize and extend
- One-command setup for macOS (
./quickstart/setup.sh) - Automatic PostgreSQL setup for local development
- Comprehensive error handling and logging
- Type-safe API with Pydantic validation
- macOS (automated setup) or Linux/Windows (manual steps)
- Python 3.10+
- PostgreSQL 14+ (installed automatically on macOS)
- Databricks workspace with Apps enabled
# 1. Clone and enter the directory
cd databricks-chat-app
# 2. Copy environment file and configure
cp .env.example .env
# Edit .env with your DATABRICKS_HOST and DATABRICKS_TOKEN
# 3. Run automated setup
./quickstart/setup.sh
# 4. Start the application
./start_app.sh
# 5. Open http://localhost:8000
# 6. Stop when done
./stop_app.shThe setup script will:
- Install system dependencies (Homebrew, Python, PostgreSQL)
- Create Python virtual environment
- Initialize PostgreSQL database
# 1. Copy deployment configuration
cp config/deployment.example.yaml config/deployment.yaml
# 2. Edit deployment.yaml with your workspace details
# - Update workspace_path with your email
# - Update permissions with your email
# - Optionally change LLM_ENDPOINT
# 3. Deploy to development
./deploy.sh create --env development --profile my-profile
# Or update existing deployment
./deploy.sh update --env development --profile my-profile
# Validate before deploying
./deploy.sh create --env staging --profile my-profile --dry-runββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Built-in Chat UI (HTML/CSS/JS) β
β ββ Simple chat interface with session management β
ββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββ
β REST API
ββββββββββββββββββΌββββββββββββββββββββββββββββββββββββ
β Backend (FastAPI + Python) β
β ββ ChatModel: Direct model serving wrapper β
β ββ SessionManager: Multi-user session management β
ββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββ
β
ββββββββββββββββββΌββββββββββββββββββββββββββββββββββββ
β Databricks Platform β
β ββ Model Serving Endpoint (LLM) β
β ββ Lakebase (session persistence) β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Key Components:
- ChatModel (
src/services/chat_model.py): Clean wrapper for model serving - easy to customize - SessionManager (
src/api/services/session_manager.py): Database-backed session persistence - Settings (
src/core/settings.py): Environment-based configuration
databricks-chat-app/
βββ src/
β βββ api/ # FastAPI application
β β βββ routes/ # API endpoints
β β βββ schemas/ # Pydantic models
β β βββ services/ # Business logic (SessionManager)
β β βββ static/ # Built-in chat UI
β βββ core/ # Infrastructure (database, settings, clients)
β βββ database/models/ # SQLAlchemy ORM models (sessions only)
β βββ services/ # Chat model service
β βββ utils/ # Logging, error handling
βββ config/ # Deployment configuration
βββ scripts/ # Database initialization scripts
βββ quickstart/ # Setup automation
βββ db_app_deployment/ # Databricks deployment CLI
βββ tests/ # Unit and integration tests
All configuration is done via environment variables:
# Required for local development
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=dapi...your-token
# Database (local development)
DATABASE_URL=postgresql://localhost:5432/chat_template
# LLM Configuration
LLM_ENDPOINT=databricks-claude-sonnet-4-5 # Model serving endpoint
LLM_TEMPERATURE=0.7 # Sampling temperature (0.0-2.0)
LLM_MAX_TOKENS=2048 # Maximum response tokens
# Optional
ENVIRONMENT=development
LOG_LEVEL=DEBUGEdit config/deployment.yaml to configure:
- App name and workspace path
- Permissions (who can access the app)
- Environment variables for the deployed app
- Lakebase database settings
Set the LLM_ENDPOINT environment variable in your deployment config:
env_vars:
LLM_ENDPOINT: "your-custom-endpoint-name"Available Databricks Foundation Model endpoints:
databricks-claude-sonnet-4-5databricks-meta-llama-3-1-70b-instructdatabricks-meta-llama-3-1-405b-instructdatabricks-dbrx-instructdatabricks-mixtral-8x7b-instruct
Set the SYSTEM_PROMPT environment variable, or edit the default in src/core/settings.py.
Edit src/services/chat_model.py to add custom logic:
- Pre-processing user messages
- Post-processing model responses
- Adding RAG integration
- Implementing function calling
# src/services/chat_model.py
async def generate_with_context(self, messages, documents):
"""Generate response with RAG context."""
# Add retrieved documents to context
context_msg = {
"role": "system",
"content": f"Context: {documents}"
}
messages.insert(1, context_msg)
return await self.generate(messages)Edit src/api/static/index.html to customize the chat interface. It's a single HTML file with embedded CSS and JavaScript - no build step required.
This template uses a simple synchronous request/response pattern. For most chat interactions with fast models, this works fine. However, if you're using:
- Slower models (e.g., 405B parameter models)
- Complex prompts with long responses
- RAG with large context retrieval
...you may hit the reverse proxy timeout.
For long-running LLM calls, implement a polling pattern where:
- Client submits a job and gets back a
job_idimmediately - Server processes the job in a background worker
- Client polls for status until complete
Here's a minimal example:
import asyncio
import uuid
from typing import Any, Dict
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
app = FastAPI()
# In-memory store (use database in production)
job_queue: asyncio.Queue = asyncio.Queue()
jobs: Dict[str, Dict[str, Any]] = {}
class ChatJobRequest(BaseModel):
message: str
session_id: str
# Background worker processes jobs from queue
async def worker():
while True:
job_id, payload = await job_queue.get()
jobs[job_id]["status"] = "running"
try:
# Your long-running LLM call here
result = await call_llm(payload["message"])
jobs[job_id]["status"] = "done"
jobs[job_id]["result"] = result
except Exception as exc:
jobs[job_id]["status"] = "error"
jobs[job_id]["error"] = str(exc)
finally:
job_queue.task_done()
# Start worker on app startup
@app.on_event("startup")
async def startup():
app.state.worker_task = asyncio.create_task(worker())
@app.on_event("shutdown")
async def shutdown():
app.state.worker_task.cancel()
# Submit job - returns immediately with job_id
@app.post("/api/chat/async")
async def submit_chat_job(req: ChatJobRequest):
job_id = uuid.uuid4().hex
jobs[job_id] = {"status": "pending", "result": None, "error": None}
await job_queue.put((job_id, req.model_dump()))
return {"job_id": job_id, "status": "pending"}
# Poll for job status/result
@app.get("/api/chat/jobs/{job_id}")
async def get_job_status(job_id: str):
meta = jobs.get(job_id)
if not meta:
raise HTTPException(status_code=404, detail="Job not found")
return {"job_id": job_id, **meta}Frontend polling:
async function sendMessageAsync(message, sessionId) {
// 1. Submit job
const submitRes = await fetch('/api/chat/async', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({message, session_id: sessionId})
});
const {job_id} = await submitRes.json();
// 2. Poll until complete
while (true) {
await new Promise(r => setTimeout(r, 1000)); // Wait 1 second
const statusRes = await fetch(`/api/chat/jobs/${job_id}`);
const job = await statusRes.json();
if (job.status === 'done') {
return job.result;
} else if (job.status === 'error') {
throw new Error(job.error);
}
// else: still pending/running, continue polling
}
}Note: This template includes database models for job tracking (ChatRequest table) that you can use instead of in-memory storage. See src/api/services/session_manager.py for create_chat_request(), update_chat_request_status(), and get_chat_request() methods.
# Backend (from project root)
source .venv/bin/activate
uvicorn src.api.main:app --reload --port 8000
# Database
python scripts/init_database.py # Initialize database tables
psql -d chat_template # Connect to local database
# Tests
pytest # All tests
pytest tests/unit/ # Unit tests onlyThe built-in UI supports testing multi-user behavior:
- Open the app in multiple browser tabs
- Each tab gets a unique session ID (shown in header)
- Use "New Session" to start fresh in any tab
- Watch different conversations happen in parallel
- Sessions persist in the database
| Issue | Solution |
|---|---|
DATABRICKS_HOST not set |
Create .env file with Databricks credentials |
Database connection failed |
Run ./quickstart/setup_database.sh |
Port already in use |
Run ./stop_app.sh to kill existing processes |
| Deployment fails | Run with --dry-run to validate configuration |
| Chat hangs | Check LLM_ENDPOINT is valid in your workspace |
Backend:
- Python 3.10+, FastAPI, SQLAlchemy
- Databricks SDK
- PostgreSQL (local) / Lakebase (production)
Frontend:
- Built-in HTML/CSS/JS (no build required)
Infrastructure:
- Databricks Apps, Lakebase, Model Serving
MIT License - feel free to use this template for your projects!
- β¨ Customize the chat model for your use case
- π§ Add RAG, function calling, or other AI features
- π¨ Customize the UI to match your branding
- π Deploy to Databricks and share with your team
Happy building! π