Xana AI is an intelligent chatbot platform designed for shop-floor operators and technicians to interact with industrial machines. Built with Next.js (frontend) and NestJS (backend), it leverages RAG (Retrieval-Augmented Generation), vector embeddings, and LLM-powered conversational AI to provide contextual machine support, live data visualization, and alert monitoring.
- Conversational Chat Interface: Interactive chat UI with markdown support and syntax highlighting
- Multi-Asset Selection: Choose specific machines or query across all available assets
- Live Data Visualization: Real-time chart rendering using Chart.js for time-series metrics
- Alert Monitoring: Display machine alerts with severity, status, and timestamps
- Theme Support: Dark and light mode toggle for user preference
- Authentication: Token-based authentication integrated with the IFF (IndustryFusion) suite
- Responsive Design: Built with Tailwind CSS and Radix UI components
- RAG-Powered Query Service: Semantic search using Milvus vector database with BGE-M3 embeddings
- LLM Integration: Meta LLaMA 3.3 70B Instruct via the IONOS Cloud API, or Qwen2.5-14B-Instruct-fp16-ov served by an OpenVINO model server running on an Intel dGPU (e.g., Battlemage) or on CPU
- Intent Detection: Automatically detects chart and alert requests using structured LLM outputs (see the sketch after this list)
- Live Data Fetching: PostgreSQL TimescaleDB integration for historical machine metrics
- Alert Integration: Real-time alert retrieval from Alerta API
- Vector Store Management: MongoDB-based asset-to-vector-store mapping
- Security: JWT token handling with encryption/masking for sensitive data
- CORS & API Gateway: Configurable CORS and REST endpoints
XanaAI/
├── backend/                      # NestJS REST API
│   ├── src/
│   │   ├── endpoints/
│   │   │   ├── query/            # Main query service with RAG
│   │   │   ├── ionos-rest/       # LLM & embedding API client for IONOS
│   │   │   ├── opea-rest/        # LLM & embedding API via OpenVINO model server on Intel
│   │   │   ├── ollama-rest/      # LLM & embedding API client using Ollama on Intel
│   │   │   └── vector_mapping/   # Asset-to-vector store mapping
│   │   ├── data/jsonld/          # JSON-LD machine schemas
│   │   └── main.ts               # App entry (port 4050)
│   └── package.json
│
├── frontend/                     # Next.js application
│   ├── src/
│   │   ├── app/
│   │   │   └── page.tsx          # Main chat interface
│   │   ├── components/
│   │   │   ├── PromptBox.tsx     # User input component
│   │   │   └── AlertSummaryBlock.tsx
│   │   └── utility/tools.ts      # Helper functions
│   └── package.json
│
└── README.md
- Framework: NestJS (Node.js)
- LLM: Meta LLaMA 3.3 70B Instruct (via IONOS Cloud) or Qwen2.5-14B-Instruct-fp16-ov (via OpenVINO model server running on an Intel dGPU such as Battlemage, or on CPU)
- Embeddings: BAAI/bge-m3 (1024-dim vectors)
- Vector DB: Milvus (semantic search)
- Time-Series DB: PostgreSQL/TimescaleDB
- Alert System: Alerta API
- Metadata Store: MongoDB
- Authentication: JWT with JOSE encryption
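As a rough illustration of the JWT-with-JOSE approach, the sketch below encrypts and decrypts a token with the jose library; the claim names, algorithm choice, and key derivation are assumptions, and the backend's actual token handling may differ:

```ts
import { EncryptJWT, jwtDecrypt } from "jose";

// Derive a 256-bit symmetric key from SECRET_KEY (assumed; requires a key of at least 32 bytes).
const key = new TextEncoder().encode(process.env.SECRET_KEY).slice(0, 32);

// Issue an encrypted JWT carrying a (hypothetical) user id claim.
export async function issueToken(userId: string): Promise<string> {
  return new EncryptJWT({ sub: userId })
    .setProtectedHeader({ alg: "dir", enc: "A256GCM" })
    .setIssuedAt()
    .setExpirationTime("2h")
    .encrypt(key);
}

// Decrypt and validate a token, returning its claims.
export async function readToken(token: string) {
  const { payload } = await jwtDecrypt(token, key);
  return payload;
}
```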
- Framework: Next.js 15 (React 18)
- Styling: Tailwind CSS 4, Radix UI
- Charts: Chart.js, PrimeReact (see the example after this list)
- Markdown: react-markdown with remark-gfm
- HTTP Client: Axios
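To illustrate the charting stack, here is a minimal sketch of rendering a time-series line chart with PrimeReact's Chart wrapper around Chart.js; the metric name and data values are purely illustrative and do not reflect the backend's actual response format:

```tsx
import { Chart } from "primereact/chart";

// Illustrative time-series payload; the real chart data comes from the /query response.
const chartData = {
  labels: ["10:00", "10:05", "10:10", "10:15"],
  datasets: [
    {
      label: "Spindle temperature (°C)",
      data: [61.2, 62.8, 64.1, 63.5],
      fill: false,
      tension: 0.3,
    },
  ],
};

const chartOptions = {
  responsive: true,
  plugins: { legend: { position: "bottom" as const } },
  scales: { y: { beginAtZero: false } },
};

export function MetricChart() {
  return <Chart type="line" data={chartData} options={chartOptions} />;
}
```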
- Node.js 20+
- PostgreSQL (TimescaleDB)
- MongoDB
- Milvus vector database
- Alerta instance (optional, for alerts)
- Navigate to the backend directory
  cd backend
- Install dependencies
  npm install
- Configure environment variables (see .env.example)
  Create a .env file:
  # API Keys
  COMPLETIONS_API_KEY=your_ionos_api_key
  COMPLETIONS_API_URL=https://inference.de-txl.ionos.com
  # OPEA OVMS configuration (when LLM_PROVIDER="opea-ovms")
  OPEA_LLM_URL=http://localhost:8000/v3/chat/completions
  OPEA_LLM_MODEL=Qwen2.5-14B-Instruct-fp16-ov
  OPEA_CHAT_TIMEOUT=1800000  # 30 minutes
  # PostgreSQL (TimescaleDB)
  PGHOST=your_postgres_host
  PGPORT=5432
  PGPASSWORD=your_password
  PG_TABLE=entityhistory
  PGSSL=true
  # MongoDB
  MONGODB_URI=mongodb://localhost:27017
  MONGODB_DB=admin
  MONGODB_COL=vector_store_mappings
  # Milvus
  MILVUS_COLLECTION_NAME=custom_setup_6
  RAG_EMBED_DIM=1024
  # Alerta
  ALERTA_API_URL=https://alerta.example.com/api/alerts
  ALERTA_API_KEY=your_alerta_key
  # Security
  SECRET_KEY=your_jwt_secret
  MASK_SECRET=your_mask_secret
  REGISTRY_URL=https://registry.example.com
  # CORS
  CORS_ORIGIN=http://localhost:3050
- Start the development server
  npm run start:dev
  The backend runs on http://localhost:4050
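As a hedged sketch of how the backend might consume these variables through @nestjs/config (the provider name is hypothetical; the actual services may read process.env directly):

```ts
import { Injectable } from "@nestjs/common";
import { ConfigService } from "@nestjs/config";

// Hypothetical helper that centralizes access to the .env values listed above.
// Requires ConfigModule.forRoot() to be registered in the application module.
@Injectable()
export class TimescaleConfig {
  constructor(private readonly config: ConfigService) {}

  get connection() {
    return {
      host: this.config.get<string>("PGHOST"),
      port: Number(this.config.get("PGPORT") ?? 5432),
      password: this.config.get<string>("PGPASSWORD"),
      ssl: this.config.get("PGSSL") === "true",
      table: this.config.get<string>("PG_TABLE", "entityhistory"),
    };
  }
}
```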
- Navigate to the frontend directory
  cd frontend
- Install dependencies
  npm install
- Configure environment variables
  Create a .env.local file:
  NEXT_PUBLIC_API_BASE=http://localhost:4050
- Start the development server
  npm run dev
  The frontend runs on http://localhost:3050
In frontend/src/app/page.tsx (line ~70), change:
setLogin(false) → setLogin(true)
Backend Routes:
- POST /query - Main chat query with RAG
- GET /vector-mappings - List available assets
- POST /auth/get-indexed-db-data - Retrieve indexed user data
- POST /ai/chat - Direct LLM completion (for testing)
Frontend Flow:
- User authenticates via IFF token (URL param)
- Loads available machines from /vector-mappings
- Sends messages to /query with selected assets
- Displays LLM response, charts, and alerts
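A minimal sketch of this flow with Axios; the request and response field names are assumptions based on the steps above, not the exact API contract:

```ts
import axios from "axios";

const api = axios.create({ baseURL: process.env.NEXT_PUBLIC_API_BASE });

// Illustrative client-side flow; field names are assumptions, not the exact API contract.
async function askXana(question: string) {
  // 1. Load the available machines (asset-to-vector-store mappings).
  const { data: assets } = await api.get("/vector-mappings");

  // 2. Send the question for the selected asset(s) together with prior turns.
  const { data: reply } = await api.post("/query", {
    question,
    assets: [assets[0]],   // selected machine(s)
    history: [],           // prior conversation turns
  });

  // 3. The response may carry answer text plus optional chart/alert payloads.
  return reply;
}
```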
- User Query → Sent to backend with conversation history
- Intent Detection → LLM determines if chart/alert data is needed
- Vector Search → User question embedded → Milvus retrieves relevant docs
- Context Injection → Search results added to system prompt
- LLM Response → LLaMA generates answer using machine docs + context
- Live Data → If chart/alert intent detected, fetches from Postgres/Alerta
- Frontend Rendering → Displays text + charts + alerts
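To make the vector-search and context-injection steps concrete, here is a simplified TypeScript sketch; the embedding endpoint path, Milvus SDK parameters, collection field names, and the MILVUS_ADDRESS variable are assumptions, and the real query service is more involved:

```ts
import axios from "axios";
import { MilvusClient } from "@zilliz/milvus2-sdk-node";

// Simplified RAG retrieval + context injection; not the project's actual implementation.
const milvus = new MilvusClient({ address: process.env.MILVUS_ADDRESS ?? "localhost:19530" });

async function answerWithContext(question: string): Promise<string> {
  // 1. Embed the user question with BGE-M3 (1024-dim) via the configured embedding API.
  const embedRes = await axios.post(
    `${process.env.COMPLETIONS_API_URL}/v1/embeddings`, // assumed OpenAI-compatible path
    { model: "BAAI/bge-m3", input: question },
    { headers: { Authorization: `Bearer ${process.env.COMPLETIONS_API_KEY}` } }
  );
  const embedding: number[] = embedRes.data.data[0].embedding;

  // 2. Retrieve the most relevant documentation chunks from Milvus.
  const search = await milvus.search({
    collection_name: process.env.MILVUS_COLLECTION_NAME!,
    data: [embedding],          // parameter names vary by SDK version (older ones use `vectors`)
    limit: 5,
    output_fields: ["text"],    // assumed chunk-text field name
  });
  const context = search.results.map((hit: any) => hit.text).join("\n---\n");

  // 3. Inject the retrieved chunks into the system prompt and ask the LLM.
  const chat = await axios.post(
    `${process.env.COMPLETIONS_API_URL}/v1/chat/completions`,
    {
      model: "meta-llama/Llama-3.3-70B-Instruct",
      messages: [
        { role: "system", content: `Answer using only this machine documentation:\n${context}` },
        { role: "user", content: question },
      ],
    },
    { headers: { Authorization: `Bearer ${process.env.COMPLETIONS_API_KEY}` } }
  );
  return chat.data.choices[0].message.content;
}
```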
Backend:
npm run build
npm run start:prod

Frontend:
npm run build
npm run start

Dockerfiles are included in both backend/ and frontend/ directories.
This project is licensed under the terms specified in the LICENSE file.
Developed and maintained by IndustryFusion.
For issues or feature requests, please contact the development team.