alam-cloud/sentinel-ai

🛡️ Sentinel AI - Service Mesh Threat Defense

AI-Powered Security Monitoring for Istio & Tetrate Service Mesh

License: MIT | Node.js | Tetrate | Express | WebSocket

Automated threat detection, policy generation, and real-time mesh security visualization

Features | Quick Start | API Docs | Red Team Testing | Contributing



🎯 Overview

Sentinel AI is an AI-powered security monitoring platform built for Istio and Tetrate Service Mesh environments. It analyzes service mesh traffic logs in real time, uses Tetrate's AI Router (TARS) for threat classification, and automatically generates production-ready security policies to protect your microservices infrastructure.

The Problem It Solves

Modern microservice architectures face constant security challenges that traditional tools struggle to address:

  • 🔐 Credential stuffing and brute force attacks targeting authentication services
  • 🔄 Lateral movement between services, exploiting trust relationships
  • 📤 Data exfiltration attempts through high-volume egress traffic
  • 🚫 Unauthorized access to admin endpoints and internal APIs
  • ⚠️ Service degradation and exploitation of vulnerable workloads
  • 🌐 Dynamic, ephemeral workloads that make static security policies ineffective

Traditional security tools are designed for monolithic applications and struggle with:

  • The dynamic nature of service mesh traffic (pods come and go)
  • High-volume telemetry that overwhelms manual analysis
  • Complex attack patterns that span multiple services
  • Policy management that requires deep Istio/Tetrate expertise

How It Works

Sentinel AI bridges this gap through a 5-stage security pipeline:

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Ingest    │───▶│   Analyze   │───▶│   Detect    │───▶│  Visualize  │───▶│  Remediate  │
│             │    │             │    │             │    │             │    │             │
│ Istio logs  │    │ Tetrate AI  │    │ Threat      │    │ Mesh        │    │ Auto-gen    │
│ via REST    │    │ Router      │    │ Scoring     │    │ Topology    │    │ Policies    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
  1. Ingest: Accept Istio-style telemetry logs via REST API (POST /api/tetrate/ingest)
  2. Analyze: Each event is analyzed by Tetrate's AI Router (no direct OpenAI calls)
  3. Detect: Threats classified with MITRE ATT&CK taxonomy and confidence scoring (0-100%)
  4. Visualize: Live mesh topology updates with threat overlays (red nodes, dashed edges)
  5. Remediate: One-click generation of Istio AuthorizationPolicy YAML

Why Sentinel AI?

| Feature | Traditional Tools | Sentinel AI |
|---|---|---|
| Threat Detection | Rule-based, static patterns | AI-powered, adaptive classification |
| Policy Generation | Manual, requires expertise | Automated, AI-generated YAML |
| Mesh Awareness | Generic, not mesh-specific | Built for Istio/Tetrate |
| Real-Time | Batch processing, delays | WebSocket-powered live updates |
| Cost Efficiency | Fixed costs | Smart model routing (cheap for triage) |
| Integration | Standalone | Native Tetrate TARS integration |

✨ Key Features

🤖 AI-Powered Threat Detection

  • Tetrate TARS Integration: All AI analysis routed through Tetrate's AI Router (100% OpenAI-compatible, zero direct OpenAI calls)
  • Multi-Model Ready: Architecture supports cost-optimized model routing (cheap models for triage, expensive for critical)
  • MITRE ATT&CK Classification: Threat types include:
    • T1110 - Brute Force (credential stuffing is sub-technique T1110.004)
    • T1021.002 - Lateral Movement (SMB/Windows Admin Shares)
    • T1041 - Exfiltration Over C2 Channel
    • T1595 - Active Scanning
    • T1068 - Exploitation for Privilege Escalation
  • Confidence Scoring: 0-100% confidence levels with threshold-based auto-remediation (90%+ = auto-block)

🗺️ Live Service Mesh Topology

  • Dynamic Visualization: D3.js-powered interactive force-directed graph
  • Real-Time Updates: Topology evolves as you ingest logs (nodes/edges auto-discovered from workload names)
  • Threat Highlighting:
    • 🔴 Red nodes = Services under active threat
    • 🟢 Green nodes = Healthy services
    • 🟠 Orange nodes = Database/data stores
    • 🔵 Blue nodes = Gateway/edge services
  • Risky Traffic Indicators: Dashed orange lines = Non-mTLS or suspicious connections
  • Traffic Metrics: Node sizes reflect request volume

📊 Unified Security Console

  • Real-Time Dashboard: WebSocket-powered live updates (no page refresh needed)
  • Alert Management:
    • Manual alerts via form submission
    • Automated alerts from mesh log ingestion
    • Alert cards with severity badges (CRITICAL/HIGH/MEDIUM/LOW)
  • Threat Timeline: Chronological view of all detected threats with risk scores
  • Policy Management: View auto-applied and manual security policies
  • Activity Log: Live feed of all security events (success/warning/error)
  • Metrics Dashboard: Total alerts, high-severity count, anomaly score, policy count

🔧 Automated Policy Generation

  • One-Click YAML: Click the ⚙️ gear icon on any alert card
  • AI-Generated: Policies tailored to the specific threat type and context
  • Production-Ready: Valid Kubernetes YAML, ready to kubectl apply
  • Best Practices: Follows Istio security recommendations and patterns
  • Example Output: Generates AuthorizationPolicy, PeerAuthentication, or RequestAuthentication as appropriate

🧪 Red Team Testing

  • Built-In Simulator: "Simulate Attack Traffic" button for quick tests
  • PowerShell Red Team Script: Invoke-SentinelRedTeam.ps1 with 5 MITRE ATT&CK scenarios
  • Multiple Attack Vectors:
    • Credential stuffing (20 batches × 50 attempts)
    • Lateral movement (admin API calls)
    • Data exfil (high-volume egress)
    • Scanning (15 probe requests)
    • Privilege escalation (service abuse)

🏗️ Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                         Sentinel AI Architecture                         │
└─────────────────────────────────────────────────────────────────────────┘

    ┌──────────────────┐
    │   Istio Mesh     │
    │   (Telemetry)    │
    └────────┬─────────┘
             │
             │ POST /api/tetrate/ingest
             │ (Istio-style logs)
             ▼
    ┌─────────────────────────────────────────────────────────┐
    │              Sentinel AI Backend (Express.js)            │
    │  ┌───────────────────────────────────────────────────┐  │
    │  │  Log Ingestion & Normalization                      │  │
    │  │  • Validate & sanitize inputs                      │  │
    │  │  • Extract workload names, paths, codes             │  │
    │  └───────────────────────────────────────────────────┘  │
    │                          │                                │
    │                          ▼                                │
    │  ┌───────────────────────────────────────────────────┐  │
    │  │  AI Analysis Layer (Tetrate TARS)                 │  │
    │  │  • Threat classification                          │  │
    │  │  • Confidence scoring                             │  │
    │  │  • Policy recommendations                         │  │
    │  └───────────────────────────────────────────────────┘  │
    │                          │                                │
    │                          ▼                                │
    │  ┌───────────────────────────────────────────────────┐  │
    │  │  Threat Detection Engine                          │  │
    │  │  • MITRE ATT&CK mapping                           │  │
    │  │  • Auto-remediation logic                         │  │
    │  │  • Topology updates                               │  │
    │  └───────────────────────────────────────────────────┘  │
    └───────────────────────────┬──────────────────────────────┘
                                 │
                                 │ WebSocket /ws
                                 │ (Real-time events)
                                 ▼
    ┌─────────────────────────────────────────────────────────┐
    │           Unified Console (console.html + .js)          │
    │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
    │  │   Topology   │  │   Alerts     │  │   Policies  │  │
    │  │  (D3.js)     │  │   Manager    │  │   Manager   │  │
    │  └──────────────┘  └──────────────┘  └──────────────┘  │
    └─────────────────────────────────────────────────────────┘

Component Breakdown

Backend (server.js)

Core Services:

  • Express.js Server: REST API + WebSocket server
  • Log Ingestion: /api/tetrate/ingest - Accepts Istio-style logs, normalizes fields
  • Alert Analysis: /api/analyze - Manual alert submission with AI analysis
  • Policy Generation: /api/policies/recommend - AI-generated Istio YAML
  • Topology API: /api/topology - Live mesh state (nodes/edges)
  • WebSocket: /ws - Real-time event broadcasting to all connected clients

Data Stores (In-Memory):

  • alerts[] - Security alerts (max 100, FIFO)
  • policies[] - Applied security policies
  • topology - Live mesh topology (nodes + edges)
  • metrics - Aggregated metrics (total alerts, AI calls, errors)
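The capped, first-in-first-out behavior of alerts[] can be sketched as follows. This is a minimal illustration, not the code in server.js; the helper name pushAlert is hypothetical, and only the cap of 100 comes from the list above.

```javascript
// Hypothetical helper: keeps at most `max` alerts, evicting the oldest first.
const MAX_ALERTS = 100; // cap stated in the data store list

function pushAlert(alerts, alert, max = MAX_ALERTS) {
  alerts.push(alert);
  // Drop entries from the front until the array is back within the cap.
  while (alerts.length > max) {
    alerts.shift();
  }
  return alerts;
}

const demoAlerts = [];
for (let i = 1; i <= 105; i++) {
  pushAlert(demoAlerts, { id: `ALT-${i}` });
}
console.log(demoAlerts.length, demoAlerts[0].id); // prints: 100 ALT-6
```

Because the stores are in memory, a restart clears them; a persistent store would be needed to keep alert history.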

Security Features:

  • Rate limiting (per-IP, configurable)
  • Optional API key authentication
  • Input validation and sanitization
  • Fail-secure AI error handling

Frontend (console.html + console.js)

Unified Dashboard:

  • Left Sidebar: Brand, metrics (anomaly score, threat count, policy count), navigation
  • Center Panel: Service mesh topology visualization (D3.js Force Simulation)
  • Right Panel:
    • Alert analysis form
    • Live activity log
    • Existing alerts list
    • Active mesh threats
    • Auto-applied policies
    • Attack simulator button

Real-Time Updates:

  • WebSocket client with auto-reconnect
  • Live alert rendering (fade-in animations)
  • Topology graph updates on threat detection
  • Metrics updates without page refresh

AI Layer (Tetrate TARS)

Integration:

  • OpenAI SDK compatible client
  • Base URL: https://api.router.tetrate.ai/v1
  • Model: gpt-4 (configurable)
  • No direct OpenAI/Gemini/Anthropic calls

AI Capabilities:

  • Threat classification (MITRE ATT&CK)
  • Confidence scoring (0-100%)
  • Policy generation (Istio YAML)
  • Natural language recommendations
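
As a rough sketch of what an OpenAI-compatible call through the router might look like, the function below only builds the request and does not send it. The /chat/completions path, the prompt wording, and the function name buildAnalysisRequest are assumptions for illustration; only the base URL and the default model come from the list above.

```javascript
// Sketch only: builds (does not send) an OpenAI-style chat completion request.
// Assumptions: the /chat/completions path and the prompt text are illustrative.
const TETRATE_BASE_URL = "https://api.router.tetrate.ai/v1";

function buildAnalysisRequest(apiKey, logEvent, model = "gpt-4") {
  return {
    url: `${TETRATE_BASE_URL}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [
          {
            role: "system",
            content:
              "Classify this mesh event using MITRE ATT&CK. " +
              "Reply as JSON with threatType, confidence (0-100), recommendation.",
          },
          { role: "user", content: JSON.stringify(logEvent) },
        ],
      }),
    },
  };
}

const analysisReq = buildAnalysisRequest("sk-example", {
  source_workload: "attacker-pod",
  response_code: 401,
});
console.log(analysisReq.url);
```

The request could then be sent with fetch(analysisReq.url, analysisReq.options), or the same base URL could be passed to the openai SDK client.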

🛠️ Tech Stack

| Layer | Technology | Version | Purpose |
|---|---|---|---|
| Runtime | Node.js | 18+ | ES Modules support |
| Backend Framework | Express.js | 4.x | REST API server |
| WebSocket | ws | 8.x | Real-time communication |
| AI Provider | Tetrate TARS | Latest | OpenAI-compatible router |
| AI SDK | openai | 4.x | OpenAI SDK (used with Tetrate) |
| Frontend | Vanilla JavaScript | ES6+ | No framework dependencies |
| Visualization | D3.js | v7 | Force-directed graph |
| Security | dotenv | 16.x | Environment variable management |
| HTTP Client | node-fetch | 3.x | External API calls (if needed) |
| Testing | PowerShell | 5.1+ | Red team attack scripts |

Why This Stack?

  • Lightweight: No heavy frameworks, fast startup
  • Flexible: Easy to extend and customize
  • Production-Ready: Battle-tested libraries (Express, D3.js)
  • AI-Native: Built for Tetrate TARS from day one
  • Real-Time: WebSocket for instant updates

🚀 Quick Start

Prerequisites

  • Node.js 18+ installed (Download)
  • Tetrate TARS API Key (Get from Tetrate)
  • PowerShell 5.1+ (for red team scripts, Windows)
  • Modern Browser (Chrome, Firefox, Edge, Safari)

Installation

# Clone repository
git clone <your-repo-url>
cd sentinel-ai

# Install dependencies
npm install

# Copy environment template
cp .env.example .env

# Edit .env with your Tetrate API key (see Configuration below)

Configuration

Create a .env file in the project root:

# Tetrate TARS Configuration (REQUIRED)
TETRATE_API_KEY=sk-your-tetrate-api-key-here
TETRATE_BASE_URL=https://api.router.tetrate.ai/v1

# Optional: API key for external callers (leave empty to disable)
SENTINEL_API_KEY=your-strong-random-string-here

# Optional: Rate limiting (requests per minute per IP)
RATE_LIMIT_PER_MINUTE=60

# Optional: Server port (default: 3000)
PORT=3000

⚠️ Security Note: Never commit .env to git. Add it to .gitignore.
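
For illustration, the variables above could be read into a config object roughly like this. The function name loadConfig and the returned field names are hypothetical, and falling back to the documented base URL when TETRATE_BASE_URL is unset is an assumption; the defaults of 60 and 3000 come from the comments in the template.

```javascript
// Hypothetical loader: reads the documented variables from an env-like object.
function loadConfig(env) {
  if (!env.TETRATE_API_KEY) {
    throw new Error("TETRATE_API_KEY is required");
  }
  return {
    tetrateApiKey: env.TETRATE_API_KEY,
    // Assumption: fall back to the documented URL if the variable is unset.
    tetrateBaseUrl: env.TETRATE_BASE_URL || "https://api.router.tetrate.ai/v1",
    // An empty or missing key means API key authentication is disabled.
    sentinelApiKey: env.SENTINEL_API_KEY || null,
    rateLimitPerMinute: Number.parseInt(env.RATE_LIMIT_PER_MINUTE, 10) || 60,
    port: Number.parseInt(env.PORT, 10) || 3000,
  };
}

const demoConfig = loadConfig({ TETRATE_API_KEY: "sk-example", PORT: "8080" });
console.log(demoConfig.port, demoConfig.rateLimitPerMinute); // prints: 8080 60
```

In a real server the argument would be process.env after dotenv has loaded the .env file.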

Run

npm start

Expected Output:

🛡️  Sentinel AI (Tetrate Edition) on port 3000
📊  Dashboard: http://localhost:3000/console.html
🔌  WebSocket: ws://localhost:3000/ws
🤖  AI Provider: Tetrate TARS (https://api.router.tetrate.ai)
🔍 Testing Tetrate TARS connection...
✅ Tetrate TARS Connected: X models available

Access the Console

Open your browser to: http://localhost:3000/console.html

You should see:

  • ✅ Service mesh topology graph (sample data)
  • ✅ Alert analysis form
  • ✅ Empty alerts/threats/policies lists (ready for data)

📖 Usage Guide

1. Manual Alert Analysis

Use Case: You have a security event you want analyzed (e.g., suspicious log entry, SIEM alert).

Steps:

  1. Open http://localhost:3000/console.html
  2. In the "Analyze New Alert" section:
    • Alert ID: Enter identifier (e.g., ALT-001) or leave empty for auto-generation
    • Severity: Select LOW, MEDIUM, HIGH, or CRITICAL
    • Alert Content: Describe the security event (be specific)
  3. Click "🔍 Analyze with AI"
  4. Wait 2-5 seconds for AI analysis
  5. View results:
    • Threat Type: MITRE ATT&CK classification
    • Confidence: 0-100% score
    • Recommendation: Specific action to take
    • Indicators: Evidence that led to classification
  6. Generate Policy: Click ⚙️ gear icon on alert card to get Istio YAML

Example Alert Content:

Security event: Multiple failed login attempts detected for user "admin" from IP 203.0.113.42.

- Service: auth-service
- Endpoint: /api/v1/login
- Failures: 187 attempts in 5 minutes
- Status codes: 401 (invalid credentials)
- User-Agent: "curl/8.0.1"
- GeoIP: RU

Suspicious: volume, single IP, automated user agent.

2. Ingest Mesh Logs

Use Case: You're collecting Istio telemetry and want Sentinel to analyze it automatically.

Method 1: curl

curl -X POST http://localhost:3000/api/tetrate/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "logs": [
      {
        "source_workload": "attacker-pod",
        "destination_workload": "payment-service",
        "path": "/api/v1/admin/config",
        "response_code": 401,
        "request_method": "GET",
        "request_count": 1,
        "mtls": false,
        "destination_port": 80,
        "timestamp": "2026-02-17T19:00:00Z"
      }
    ],
    "cluster": "prod-mesh",
    "source": "istio-telemetry"
  }'

Method 2: PowerShell

$body = @{
    logs = @(
        @{
            source_workload = "attacker-pod"
            destination_workload = "payment-service"
            path = "/api/v1/admin/config"
            response_code = 401
            request_method = "GET"
            mtls = $false
            destination_port = 80
        }
    )
    cluster = "prod-mesh"
    source = "istio-telemetry"
} | ConvertTo-Json -Depth 5

Invoke-RestMethod -Uri "http://localhost:3000/api/tetrate/ingest" `
    -Method Post `
    -Body $body `
    -ContentType "application/json"

Method 3: Python

import requests

response = requests.post(
    "http://localhost:3000/api/tetrate/ingest",
    json={
        "logs": [{
            "source_workload": "attacker-pod",
            "destination_workload": "payment-service",
            "path": "/api/v1/admin/config",
            "response_code": 401,
            "request_method": "GET",
            "mtls": False,
            "destination_port": 80
        }],
        "cluster": "prod-mesh",
        "source": "istio-telemetry"
    }
)
print(response.json())

What Happens:

  1. Logs are validated and normalized
  2. Each log is analyzed by Tetrate AI
  3. High-confidence threats (>75%) create alerts
  4. Mesh topology updated (new nodes/edges added)
  5. WebSocket broadcasts new-threat events
  6. Auto-remediation triggers for critical threats (90%+ confidence)
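
The two confidence thresholds in the steps above can be expressed as a small decision function. This is a sketch under the stated thresholds (alerts above 75%, auto-remediation at 90% and over); the function name decideActions and the shape of its return value are hypothetical.

```javascript
// Thresholds taken from the steps above; everything else is illustrative.
const ALERT_THRESHOLD = 75;       // alerts are created above this confidence
const AUTO_REMEDIATE_THRESHOLD = 90; // auto-remediation at or above this

function decideActions(confidence) {
  return {
    createAlert: confidence > ALERT_THRESHOLD,
    autoRemediate: confidence >= AUTO_REMEDIATE_THRESHOLD,
  };
}

console.log(decideActions(92)); // alert plus auto-remediation
console.log(decideActions(80)); // alert only
console.log(decideActions(60)); // neither
```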

3. Red Team Testing

Use Case: Test Sentinel's detection capabilities with realistic attack scenarios.

Option A: Built-In Simulator

Click the "Simulate Attack Traffic" button in the console (sends one test attack).

Option B: PowerShell Red Team Script

cd scripts

# Run all attack scenarios
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario All

# Run specific scenario
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario CredentialStuffing
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario LateralMovement
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario DataExfil
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario Scanning
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario PrivEsc

Scenarios Explained:

| Scenario | MITRE ATT&CK | What It Does | Expected Detection |
|---|---|---|---|
| CredentialStuffing | T1110 | 20 batches × 50 login attempts to auth-service | Credential Stuffing, 90% confidence |
| LateralMovement | T1021.002 | reporting-service → admin APIs (/api/admin/users/export) | Lateral Movement, 85% confidence |
| DataExfil | T1041 | High-volume egress (5000 requests) without mTLS | Data Exfiltration, 90% confidence |
| Scanning | T1595 | 15 probe requests with 404s to random paths | Active Scanning, 75% confidence |
| PrivEsc | T1068 | 10 × 500 errors to payment-service | Service Abuse, 85% confidence |

After Running:

  • Check Active Mesh Threats panel (should show 5+ threats)
  • Check Existing Alerts (should show multiple alerts)
  • Check Auto-Applied Policies (critical threats auto-blocked)
  • Check Mesh Topology (threatened services turn red)

4. Generate Security Policies

Use Case: You want Istio YAML to mitigate a detected threat.

Steps:

  1. Find an alert in the "Existing Alerts" list
  2. Click the ⚙️ gear icon on the alert card
  3. Wait 3-5 seconds for AI to generate policy
  4. Review the YAML in the popup
  5. Copy the YAML to a file (e.g., policy.yaml)
  6. Apply to your cluster: kubectl apply -f policy.yaml

Example Generated Policy:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: block-credential-stuffing
  namespace: default
spec:
  selector:
    matchLabels:
      app: auth-service
  action: DENY
  rules:
  - from:
    - source:
        ipBlocks: ["10.0.1.42/32"]
    to:
    - operation:
        paths: ["/api/v1/login"]
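
Sentinel produces this YAML through the AI layer, so the exact output will vary. Purely to show the shape of the policy, here is a static template that renders the same structure; the function buildDenyPolicy and its parameter names are hypothetical and are not part of Sentinel.

```javascript
// Illustrative template only: Sentinel's real policies are AI-generated.
function buildDenyPolicy({ name, namespace = "default", app, ipBlock, path }) {
  return [
    "apiVersion: security.istio.io/v1beta1",
    "kind: AuthorizationPolicy",
    "metadata:",
    `  name: ${name}`,
    `  namespace: ${namespace}`,
    "spec:",
    "  selector:",
    "    matchLabels:",
    `      app: ${app}`,
    "  action: DENY",
    "  rules:",
    "  - from:",
    "    - source:",
    `        ipBlocks: ["${ipBlock}"]`,
    "    to:",
    "    - operation:",
    `        paths: ["${path}"]`,
  ].join("\n");
}

const demoPolicy = buildDenyPolicy({
  name: "block-credential-stuffing",
  app: "auth-service",
  ipBlock: "10.0.1.42/32",
  path: "/api/v1/login",
});
console.log(demoPolicy);
```

Whatever the source of the YAML, review it before applying it, since a DENY rule with the wrong selector can block legitimate traffic.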

5. Monitor Live Topology

Use Case: Visualize your service mesh and see threats in real-time.

Features:

  • Drag nodes to reposition
  • Hover over nodes to see details
  • Color coding: Red = threatened, Green = healthy, Orange = database, Blue = gateway
  • Dashed lines: Non-mTLS or risky connections
  • Node size: Reflects traffic volume

Topology Updates:

  • New workloads appear automatically when logs reference them
  • Traffic counters increment as logs are ingested
  • Threat highlighting updates when threats are detected
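
The update rules above can be sketched as a function that folds one log entry into the topology. This is an assumption-laden illustration, not the server's code: the name applyLogToTopology is hypothetical, new nodes default to type "service", and traffic is counted on both the source and destination nodes.

```javascript
// Hypothetical sketch: adds unseen workloads and edges, increments traffic.
function applyLogToTopology(topology, log) {
  const count = log.request_count ?? 1;
  for (const id of [log.source_workload, log.destination_workload]) {
    let node = topology.nodes.find((n) => n.id === id);
    if (!node) {
      // Assumption: newly discovered workloads start as plain services.
      node = { id, type: "service", traffic: 0 };
      topology.nodes.push(node);
    }
    node.traffic += count;
  }
  const edgeExists = topology.edges.some(
    (e) => e.source === log.source_workload && e.target === log.destination_workload
  );
  if (!edgeExists) {
    topology.edges.push({
      source: log.source_workload,
      target: log.destination_workload,
      mtls: log.mtls ?? true,
    });
  }
  return topology;
}

const demoTopology = { nodes: [], edges: [] };
applyLogToTopology(demoTopology, {
  source_workload: "frontend",
  destination_workload: "api-gateway",
  request_count: 3,
});
applyLogToTopology(demoTopology, {
  source_workload: "frontend",
  destination_workload: "api-gateway",
});
console.log(
  demoTopology.nodes.length,
  demoTopology.edges.length,
  demoTopology.nodes[0].traffic
); // prints: 2 1 4
```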

🔌 API Reference

Base URL

All endpoints are relative to: http://localhost:3000 (or your configured PORT)

Authentication

Optional: Set SENTINEL_API_KEY in .env and include header:

x-api-key: your-api-key-here

If SENTINEL_API_KEY is not set, authentication is disabled (dev mode).
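
The optional check described above amounts to the following logic. The function name isAuthorized is hypothetical; the sketch assumes header names arrive lowercased, as they do in Node.js HTTP servers.

```javascript
// Sketch of the optional API key check; not the server's actual middleware.
function isAuthorized(headers, configuredKey) {
  // No key configured means authentication is disabled (dev mode).
  if (!configuredKey) return true;
  return headers["x-api-key"] === configuredKey;
}

console.log(isAuthorized({}, ""));                           // true
console.log(isAuthorized({ "x-api-key": "abc" }, "abc"));    // true
console.log(isAuthorized({ "x-api-key": "wrong" }, "abc"));  // false
```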

Rate Limiting

  • Default: 60 requests per minute per IP
  • Configurable: Set RATE_LIMIT_PER_MINUTE in .env
  • Applies to: /api/analyze, /api/tetrate/ingest, /api/policies/recommend
  • Response: 429 Too Many Requests if exceeded
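
The README does not say which algorithm the limiter uses. As one possible reading, here is a minimal fixed-window limiter keyed by IP; the function createRateLimiter and the fixed-window choice are assumptions, and only the default of 60 requests per minute comes from the list above.

```javascript
// Assumed design: a fixed one-minute window per IP, held in memory.
function createRateLimiter(limitPerMinute = 60, windowMs = 60000) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip, now = Date.now()) {
    const current = windows.get(ip);
    if (!current || now - current.start >= windowMs) {
      // First request from this IP, or the previous window has expired.
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= limitPerMinute;
  };
}

const allowRequest = createRateLimiter(2);
console.log(allowRequest("203.0.113.42", 0));     // true
console.log(allowRequest("203.0.113.42", 10));    // true
console.log(allowRequest("203.0.113.42", 20));    // false, would map to 429
console.log(allowRequest("203.0.113.42", 60000)); // true, new window
```

A caller would respond with 429 Too Many Requests whenever the function returns false.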

POST /api/tetrate/ingest

Ingest Istio-style mesh logs for automated threat detection.

Endpoint: POST /api/tetrate/ingest

Headers:

Content-Type: application/json
x-api-key: <optional>

Request Body:

{
  "logs": [
    {
      "source_workload": "string (required)",
      "destination_workload": "string (required)",
      "path": "string (default: '/')",
      "response_code": "number (required)",
      "request_method": "string (default: 'GET')",
      "mtls": "boolean (default: true)",
      "destination_port": "number (default: 443)",
      "request_count": "number (default: 1)",
      "source_namespace": "string (optional)",
      "destination_namespace": "string (optional)",
      "protocol": "string (optional)",
      "timestamp": "string ISO8601 (optional)"
    }
  ],
  "cluster": "string (optional)",
  "source": "string (optional)"
}
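
The required fields and defaults in the schema above could be applied with a normalizer along these lines. The function name normalizeLog and its error messages are hypothetical; the server's own validation may differ.

```javascript
// Hypothetical normalizer applying the documented defaults to one log entry.
function normalizeLog(raw) {
  for (const field of ["source_workload", "destination_workload"]) {
    if (typeof raw[field] !== "string" || raw[field].trim() === "") {
      throw new Error(`${field} is required`);
    }
  }
  if (typeof raw.response_code !== "number") {
    throw new Error("response_code is required");
  }
  return {
    ...raw,
    path: raw.path ?? "/",
    request_method: raw.request_method ?? "GET",
    mtls: raw.mtls ?? true,
    destination_port: raw.destination_port ?? 443,
    request_count: raw.request_count ?? 1,
  };
}

const demoLog = normalizeLog({
  source_workload: "frontend",
  destination_workload: "api-gateway",
  response_code: 200,
});
console.log(demoLog.path, demoLog.request_method, demoLog.destination_port);
// prints: / GET 443
```

Note that the nullish operator keeps an explicit mtls: false, which matters because non-mTLS traffic is one of the signals the detector relies on.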

Response (200 OK):

{
  "ingested": 1,
  "threatsDetected": 1,
  "source": "istio-telemetry"
}

Error Responses:

  • 400 Bad Request: Invalid logs format
  • 429 Too Many Requests: Rate limit exceeded
  • 500 Internal Server Error: Server error (check logs)

Example:

curl -X POST http://localhost:3000/api/tetrate/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "logs": [{
      "source_workload": "frontend",
      "destination_workload": "api-gateway",
      "path": "/api/v1/users",
      "response_code": 200,
      "request_method": "GET",
      "mtls": true,
      "destination_port": 443
    }],
    "cluster": "prod-mesh",
    "source": "istio-telemetry"
  }'

POST /api/analyze

Manually submit an alert for AI analysis.

Endpoint: POST /api/analyze

Headers:

Content-Type: application/json

Request Body:

{
  "alertId": "string (optional, auto-generated if omitted)",
  "severity": "LOW|MEDIUM|HIGH|CRITICAL (required)",
  "content": "string (required, min 10 chars)"
}
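
A request body can be checked against these rules before it is sent. The sketch below is illustrative: validateAlertRequest is a hypothetical name, and the ALT-<timestamp> ID format is inferred from the sample response rather than documented.

```javascript
// Hypothetical validator mirroring the documented field rules.
const SEVERITIES = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

function validateAlertRequest(body) {
  const errors = [];
  if (!SEVERITIES.includes(body.severity)) {
    errors.push("severity must be one of " + SEVERITIES.join(", "));
  }
  if (typeof body.content !== "string" || body.content.trim().length < 10) {
    errors.push("content must be at least 10 characters");
  }
  return {
    valid: errors.length === 0,
    errors,
    // Assumption: auto-generated IDs follow the ALT-<epoch ms> pattern.
    alertId: body.alertId || `ALT-${Date.now()}`,
  };
}

console.log(validateAlertRequest({ severity: "HIGH", content: "Multiple failed logins" }).valid); // true
console.log(validateAlertRequest({ severity: "URGENT", content: "short" }).errors.length);        // 2
```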

Response (200 OK):

{
  "id": "ALT-1771355256312",
  "severity": "HIGH",
  "content": "Suspicious login from IP 192.168.1.100...",
  "analysis": {
    "threatType": "Credential Stuffing",
    "confidence": 85,
    "recommendation": "Block source IP immediately",
    "indicators": [
      "Multiple 401 responses",
      "High request volume",
      "Automated user agent"
    ],
    "provider": "Tetrate TARS",
    "model": "gpt-4-0613",
    "timestamp": "2026-02-17T19:00:00.000Z"
  },
  "timestamp": "7:00:00 PM",
  "createdAt": 1771355256312
}

Error Responses:

  • 400 Bad Request: Missing required fields or invalid severity
  • 429 Too Many Requests: Rate limit exceeded
  • 500 Internal Server Error: AI analysis failed

Example:

curl -X POST http://localhost:3000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "alertId": "ALT-001",
    "severity": "HIGH",
    "content": "Multiple failed login attempts from IP 203.0.113.42 to auth-service"
  }'

POST /api/policies/recommend

Generate Istio security policy YAML for a given alert.

Endpoint: POST /api/policies/recommend

Headers:

Content-Type: application/json

Request Body:

{
  "alertId": "string (required)"
}

Response (200 OK):

{
  "alertId": "ALT-001",
  "policy": "apiVersion: security.istio.io/v1beta1\nkind: AuthorizationPolicy\nmetadata:\n  name: block-threat-alt-001\n  namespace: default\nspec:\n  selector:\n    matchLabels:\n      app: payment-service\n  action: DENY\n  rules:\n  - from:\n    - source:\n        principals: [\"cluster.local/ns/default/sa/attacker-pod\"]\n    to:\n    - operation:\n        paths: [\"/api/v1/admin/config\"]",
  "model": "gpt-4-0613"
}

Error Responses:

  • 400 Bad Request: Missing alertId
  • 404 Not Found: Alert not found
  • 429 Too Many Requests: Rate limit exceeded
  • 500 Internal Server Error: Policy generation failed

Example:

curl -X POST http://localhost:3000/api/policies/recommend \
  -H "Content-Type: application/json" \
  -d '{
    "alertId": "ALT-001"
  }'

GET /api/topology

Get current live mesh topology state.

Endpoint: GET /api/topology

Response (200 OK):

{
  "nodes": [
    {
      "id": "frontend",
      "type": "service",
      "namespace": "web",
      "traffic": 4500
    },
    {
      "id": "api-gateway",
      "type": "gateway",
      "namespace": "istio-system",
      "traffic": 8900
    },
    {
      "id": "payment-service",
      "type": "critical",
      "namespace": "backend",
      "traffic": 1200
    }
  ],
  "edges": [
    {
      "source": "frontend",
      "target": "api-gateway",
      "protocol": "HTTP",
      "mtls": true
    },
    {
      "source": "api-gateway",
      "target": "payment-service",
      "protocol": "HTTPS",
      "mtls": true
    },
    {
      "source": "payment-service",
      "target": "database",
      "protocol": "TCP",
      "mtls": false
    }
  ]
}

Node Types:

  • service: Normal microservice
  • gateway: API gateway or ingress
  • database: Database or data store
  • critical: Service under threat (auto-set)

Edge Properties:

  • source: Source workload ID
  • target: Destination workload ID
  • protocol: HTTP, HTTPS, TCP, etc.
  • mtls: Boolean indicating mTLS status
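
To show how a client might turn these fields into the colors and line styles described in the topology section, here is a small mapping. The function names nodeColor and edgeStyle are hypothetical, and the console's actual rendering code may use different values.

```javascript
// Illustrative mapping from API fields to the documented visual conventions.
function nodeColor(node) {
  switch (node.type) {
    case "critical":
      return "red"; // service under active threat
    case "database":
      return "orange";
    case "gateway":
      return "blue";
    default:
      return "green"; // healthy service
  }
}

function edgeStyle(edge) {
  // Non-mTLS connections are drawn as dashed lines.
  return edge.mtls ? "solid" : "dashed";
}

console.log(nodeColor({ id: "payment-service", type: "critical" })); // red
console.log(edgeStyle({ source: "payment-service", target: "database", mtls: false })); // dashed
```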

GET /api/alerts

Get all active alerts.

Endpoint: GET /api/alerts

Response (200 OK):

[
  {
    "id": "ALT-001",
    "severity": "HIGH",
    "content": "...",
    "analysis": { ... },
    "timestamp": "7:00:00 PM",
    "createdAt": 1771355256312
  }
]

DELETE /api/alerts/:id

Delete a specific alert.

Endpoint: DELETE /api/alerts/:id

Response (200 OK):

{
  "success": true
}

WebSocket Event: alert-deleted broadcast to all clients


DELETE /api/alerts

Clear all alerts.

Endpoint: DELETE /api/alerts

Response (200 OK):

{
  "success": true
}

WebSocket Event: alerts-cleared broadcast to all clients


GET /api/metrics

Get aggregated metrics.

Endpoint: GET /api/metrics

Response (200 OK):

{
  "totalAlerts": 15,
  "highSeverity": 3,
  "lastActivity": "7:00:00 PM",
  "aiCalls": 42,
  "aiErrors": 0
}

GET /api/health

Health check endpoint.

Endpoint: GET /api/health

Response (200 OK):

{
  "status": "healthy",
  "provider": "Tetrate TARS",
  "clients": 2,
  "uptime": 3600.5,
  "aiCalls": 42,
  "aiErrors": 0
}

WebSocket /ws

Real-time event stream for live dashboard updates.

Connection:

const ws = new WebSocket('ws://localhost:3000/ws');

Message Format:

All messages are JSON strings.

Message Types:

init

Sent immediately after connection. Contains initial state.

{
  "type": "init",
  "alerts": [ ... ],
  "metrics": {
    "totalAlerts": 15,
    "highSeverity": 3,
    "lastActivity": "7:00:00 PM",
    "aiCalls": 42,
    "aiErrors": 0
  }
}

new-alert

Broadcast when a new alert is created.

{
  "type": "new-alert",
  "alert": {
    "id": "ALT-001",
    "severity": "HIGH",
    "content": "...",
    "analysis": { ... }
  }
}

new-threat

Broadcast when a mesh threat is detected.

{
  "type": "new-threat",
  "data": {
    "id": "MESH-1771357866278-oajs",
    "source": "attacker-pod",
    "destination": "payment-service",
    "type": "Unauthorized Access Attempt",
    "riskScore": 90,
    "evidence": "Pattern match detected",
    "recommendation": "Block source IP immediately",
    "autoRemediate": true
  },
  "metrics": {
    "anomalyScore": 45,
    "blockedRequests": 0,
    "mtlsViolations": 1
  }
}

policy-applied

Broadcast when a policy is auto-applied or manually created.

{
  "type": "policy-applied",
  "policy": {
    "id": "POL-1771357866278-xyz",
    "service": "payment-service",
    "action": "block",
    "reason": "Auto-blocked due to Unauthorized Access Attempt (confidence 90%)",
    "createdAt": "2026-02-17T19:00:00.000Z",
    "source": "auto-remediation"
  }
}

topology-update

Broadcast when mesh topology changes (new nodes/edges).

{
  "type": "topology-update",
  "topology": {
    "nodes": [ ... ],
    "edges": [ ... ]
  }
}

metrics

Broadcast when metrics update.

{
  "type": "metrics",
  "metrics": {
    "totalAlerts": 16,
    "highSeverity": 4,
    "lastActivity": "7:05:00 PM",
    "aiCalls": 43,
    "aiErrors": 0
  }
}

alert-deleted

Broadcast when an alert is deleted.

{
  "type": "alert-deleted",
  "id": "ALT-001"
}

alerts-cleared

Broadcast when all alerts are cleared.

{
  "type": "alerts-cleared"
}

Reconnection:

The client automatically reconnects on disconnect (max 5 attempts).
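
A custom client can handle the message types above with a single reducer. This is a sketch, not the code in console.js: the function applyMessage and the state shape are hypothetical, and new-threat and policy-applied messages are passed through unchanged for the caller to handle.

```javascript
// Hypothetical reducer: folds one WebSocket message into client state.
function applyMessage(state, raw) {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case "init":
      return { ...state, alerts: msg.alerts, metrics: msg.metrics };
    case "new-alert":
      return { ...state, alerts: [msg.alert, ...state.alerts] };
    case "alert-deleted":
      return { ...state, alerts: state.alerts.filter((a) => a.id !== msg.id) };
    case "alerts-cleared":
      return { ...state, alerts: [] };
    case "metrics":
      return { ...state, metrics: msg.metrics };
    case "topology-update":
      return { ...state, topology: msg.topology };
    default:
      // new-threat, policy-applied, and unknown types are left to the caller.
      return state;
  }
}

let clientState = { alerts: [], metrics: {}, topology: { nodes: [], edges: [] } };
clientState = applyMessage(clientState, JSON.stringify({
  type: "init", alerts: [{ id: "ALT-001" }], metrics: { totalAlerts: 1 },
}));
clientState = applyMessage(clientState, JSON.stringify({
  type: "new-alert", alert: { id: "ALT-002" },
}));
clientState = applyMessage(clientState, JSON.stringify({
  type: "alert-deleted", id: "ALT-001",
}));
console.log(clientState.alerts.map((a) => a.id)); // prints: [ 'ALT-002' ]
```

In a browser this would be wired up as ws.onmessage = (event) => { clientState = applyMessage(clientState, event.data); }.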


🧪 Red Team Testing

Overview

Sentinel AI includes comprehensive red team testing tools to validate detection capabilities. All scenarios are based on MITRE ATT&CK techniques and generate realistic attack traffic.

Built-In Simulator

Location: Unified Console → "Simulate Attack Traffic" button

What It Does:

  • Sends one test attack log to /api/tetrate/ingest
  • Simulates attacker-pod → payment-service unauthorized access
  • Response code: 401 (unauthorized)
  • Non-mTLS connection

Use Case: Quick smoke test to verify Sentinel is working.

PowerShell Red Team Script

Location: scripts/Invoke-SentinelRedTeam.ps1

Prerequisites:

  • PowerShell 5.1+ (Windows)
  • Sentinel AI server running

Usage:

# Navigate to scripts directory
cd scripts

# Run all scenarios
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario All

# Run specific scenario
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario CredentialStuffing
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario LateralMovement
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario DataExfil
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario Scanning
.\Invoke-SentinelRedTeam.ps1 -BaseUrl "http://localhost:3000" -Scenario PrivEsc

Scenarios:

1. Credential Stuffing (T1110)

MITRE ATT&CK: T1110 - Brute Force

What It Sends:

  • 20 log entries
  • Each entry: 50 login attempts (request_count: 50)
  • Source: 10.0.1.42 (IP address)
  • Destination: auth-service
  • Path: /api/v1/login
  • Response code: 401 (unauthorized)
  • Non-mTLS connection

Expected Detection:

  • Threat Type: Credential Stuffing
  • Confidence: 85-90%
  • Recommendation: "Block source IP immediately"

Why It Works:

  • High volume of failed authentication attempts
  • Single source IP
  • Pattern matches credential stuffing behavior
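
For teams not on Windows, the same payload can be approximated in JavaScript. This sketch is not a port of Invoke-SentinelRedTeam.ps1: the function name is hypothetical, placing the IP in source_workload and using POST as the method are assumptions, and the cluster and source values are borrowed from the ingest examples.

```javascript
// Approximation of the credential stuffing scenario as an ingest payload.
function buildCredentialStuffingLogs(batches = 20, attemptsPerBatch = 50) {
  return Array.from({ length: batches }, () => ({
    source_workload: "10.0.1.42", // assumption: the IP is sent as the source
    destination_workload: "auth-service",
    path: "/api/v1/login",
    response_code: 401,
    request_method: "POST", // assumption: login attempts use POST
    request_count: attemptsPerBatch,
    mtls: false,
  }));
}

const stuffingPayload = {
  logs: buildCredentialStuffingLogs(),
  cluster: "prod-mesh",
  source: "istio-telemetry",
};
const totalAttempts = stuffingPayload.logs.reduce((sum, l) => sum + l.request_count, 0);
console.log(stuffingPayload.logs.length, totalAttempts); // prints: 20 1000
```

The payload can then be posted to /api/tetrate/ingest in the same way as the curl example in the Usage Guide.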

2. Lateral Movement (T1021.002)

MITRE ATT&CK: T1021.002 - SMB/Windows Admin Shares

What It Sends:

  • 2 log entries
  • Source: reporting-service (internal service)
  • Destinations:
    • user-service → /api/admin/users/export
    • payment-service → /api/internal/config
  • Response code: 200 (success)
  • mTLS: enabled

Expected Detection:

  • Threat Type: Lateral Movement
  • Confidence: 85-90%
  • Recommendation: "Investigate service-to-service admin access"

Why It Works:

  • Internal service accessing admin endpoints
  • Unusual access pattern (reporting-service shouldn't access admin APIs)
  • Potential data exfiltration vector

3. Data Exfiltration (T1041)

MITRE ATT&CK: T1041 - Exfiltration Over C2 Channel

What It Sends:

  • 1 log entry
  • Source: database
  • Destination: external-proxy
  • Path: /upload/bulk
  • Response code: 200 (success)
  • Request count: 5000 (high volume)
  • Non-mTLS connection

Expected Detection:

  • Threat Type: Data Exfiltration
  • Confidence: 90%+
  • Recommendation: "Block egress to external-proxy immediately"

Why It Works:

  • High-volume egress traffic
  • No mTLS (security violation)
  • Database → external destination (suspicious)

4. Active Scanning (T1595)

MITRE ATT&CK: T1595 - Active Scanning

What It Sends:

  • 15 log entries
  • Source: scanner-pod
  • Destination: api-gateway
  • Paths: Random paths like /api/v1/probe/12345, /api/v1/probe/67890, etc.
  • Response code: 404 (not found)
  • mTLS: enabled

Expected Detection:

  • Threat Type: Active Scanning or Reconnaissance
  • Confidence: 75-85%
  • Recommendation: "Rate limit scanner-pod or investigate"

Why It Works:

  • Many 404 responses (probing for endpoints)
  • Single source, multiple random paths
  • Pattern matches scanning behavior

5. Privilege Escalation (T1068)

MITRE ATT&CK: T1068 - Exploitation for Privilege Escalation

What It Sends:

  • 10 log entries
  • Source: frontend
  • Destination: payment-service (critical service)
  • Path: /api/v1/orders
  • Response code: 500 (server error)
  • mTLS: enabled

Expected Detection:

  • Threat Type: Service Abuse or Privilege Escalation
  • Confidence: 85-90%
  • Recommendation: "Investigate payment-service errors immediately"

Why It Works:

  • Multiple 500 errors (service degradation)
  • Critical service (payment-service)
  • Potential exploitation attempt

Manual Alert Test

The script also sends one direct alert to /api/analyze:

Content:

Red team test: suspicious PowerShell execution from workstation DESKTOP-XX to jump server. Process: powershell.exe -EncodedCommand. Parent: cmd.exe.

Severity: HIGH

Expected Detection:

  • Threat Type: Command and Control or Execution
  • Confidence: 80-90%

Expected Results

After running -Scenario All, you should see:

  1. Active Mesh Threats: 5+ threats listed
  2. Existing Alerts: 5+ alerts (one per scenario + manual alert)
  3. Auto-Applied Policies: 2-3 policies (for 90%+ confidence threats)
  4. Mesh Topology:
    • New nodes: scanner-pod, external-proxy, reporting-service
    • Red nodes: auth-service, payment-service (under threat)
    • Dashed edges: Non-mTLS connections highlighted

Customizing Scenarios

Edit scripts/Invoke-SentinelRedTeam.ps1 to:

  • Change workload names
  • Adjust request counts
  • Modify response codes
  • Add new scenarios

📸 Screenshots & Demo

Unified Console

The unified console provides a comprehensive view of your service mesh security:

Left Sidebar:

  • Branding and navigation
  • Mesh Anomaly Score: 0-100 (higher = more threats)
  • Active Threats: Count of detected threats
  • Policies Active: Count of applied policies

Center Panel:

  • Service Mesh Topology: Interactive D3.js graph
    • Drag nodes to reposition
    • Color-coded by service type and threat status
    • Dashed lines indicate risky connections
  • Status Pill: "Mesh Healthy" / "Warning" / "Critical"
  • Connection Status: WebSocket connection indicator

Right Panel:

  • Analyze New Alert: Form for manual alert submission
  • Live Activity: Real-time event log
  • Existing Alerts: List of all alerts with AI analysis
  • Active Mesh Threats: Detected threats from log ingestion
  • Auto-Applied Policies: Security policies generated automatically

Threat Detection Example

Scenario: Credential stuffing attack detected

Console Shows:

  • Alert Card:
    • ID: MESH-1771357866278-oajs
    • Severity: HIGH
    • Threat: Credential Stuffing
    • Confidence: 90%
    • Recommendation: "Block source IP immediately"
  • Mesh Topology:
    • auth-service node turns red
    • Edge from 10.0.1.42 → auth-service shown as dashed orange
  • Active Mesh Threats:
    • Threat card with risk score 90%
    • Source: 10.0.1.42 → Destination: auth-service
  • Auto-Applied Policies:
    • Policy: BLOCK on auth-service
    • Reason: "Auto-blocked due to Credential Stuffing (confidence 90%)"
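
The step from a detected threat to an auto-applied policy can be sketched as a threshold check. The 90% cutoff and the reason string come from this README; the shape of the threat and policy objects (`type`, `confidence`, `destination`, `action`, `target`) is assumed for the example and may not match the real data model.

```javascript
// Auto-apply cutoff, per "90%+ confidence threats" in this README.
const AUTO_APPLY_THRESHOLD = 90;

// Returns a BLOCK policy for high-confidence threats, or null otherwise.
function autoPolicyFor(threat) {
  if (threat.confidence < AUTO_APPLY_THRESHOLD) return null;
  return {
    action: "BLOCK",
    target: threat.destination,
    reason: `Auto-blocked due to ${threat.type} (confidence ${threat.confidence}%)`,
  };
}

const stuffing = {
  type: "Credential Stuffing",
  confidence: 90,
  source: "10.0.1.42",
  destination: "auth-service",
};
console.log(autoPolicyFor(stuffing));
```

Threats below the cutoff return `null` here, which corresponds to leaving them for manual policy generation in the console.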

🔒 Security Considerations

Production-Ready Checklist

✅ Implemented

  • Secrets Management: Environment variables (.env), no hard-coded keys
  • Rate Limiting: Per-IP limits on AI-heavy endpoints (60 req/min default, configurable)
  • Input Validation: Logs normalized and validated before processing
  • AI Routing: All AI calls via Tetrate TARS (no direct OpenAI/Gemini/Anthropic)
  • Fail-Secure: AI failures trigger high-severity alerts requiring manual review
  • Body Size Limits: 1MB max request body size
  • Error Handling: Graceful degradation on errors
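
To make the Fail-Secure item concrete, here is a minimal sketch of the pattern: when the AI call fails, the alert is escalated rather than dropped. The function name, the `needsManualReview` flag, and the `callAi` parameter are hypothetical and stand in for whatever the server actually uses.

```javascript
// Wraps an AI call so that any failure yields a HIGH-severity alert
// flagged for manual review instead of a silently missing analysis.
async function analyzeFailSecure(alert, callAi) {
  try {
    const analysis = await callAi(alert);
    return { ...alert, analysis, needsManualReview: false };
  } catch (err) {
    return {
      ...alert,
      severity: "HIGH",
      analysis: null,
      needsManualReview: true,
      error: String(err && err.message ? err.message : err),
    };
  }
}

// Simulated outage of the AI router.
analyzeFailSecure({ id: "A1", severity: "LOW" }, async () => {
  throw new Error("AI Router Unavailable");
}).then((result) => console.log(result));
```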

⚠️ Not Yet Implemented (Recommended for Production)

  • Authentication/Authorization: API key auth exists but disabled by default
  • Persistent Storage: Alerts/policies in-memory only (lost on restart)
  • TLS Termination: Use reverse proxy/ingress (nginx, Istio Gateway)
  • Audit Logging: Log to external system (ELK, Splunk)
  • Multi-Tenant Isolation: Single-tenant only
  • Database Integration: PostgreSQL/MongoDB for persistence
  • RBAC: Role-based access control
  • Encryption at Rest: If adding database

Security Recommendations

1. Deploy Behind Istio Ingress

Use Istio's own security features:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: sentinel-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
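    tls:
      mode: SIMPLE
      credentialName: sentinel-tls  # assumed name of a Kubernetes TLS secret holding the certificate and key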
    hosts:
    - sentinel.yourdomain.com

2. Enable API Key Authentication

Set SENTINEL_API_KEY in production:

SENTINEL_API_KEY=your-strong-random-string-min-32-chars

Then include header in requests:

curl -H "x-api-key: your-strong-random-string-min-32-chars" \
  http://localhost:3000/api/analyze
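
For readers curious what a check on the `x-api-key` header might look like server-side, the following is a sketch of an Express-style middleware. It is an assumption about the shape of such a check, not the project's implementation; `requireApiKey` is a hypothetical name, and treating an unset key as "auth disabled" mirrors the "disabled by default" note in the checklist above.

```javascript
const crypto = require("node:crypto");

// Builds a (req, res, next) middleware that compares the x-api-key header
// against the expected key using a constant-time comparison.
function requireApiKey(expectedKey) {
  const expected = Buffer.from(expectedKey ?? "", "utf8");
  return function apiKeyMiddleware(req, res, next) {
    // No key configured: authentication is treated as disabled.
    if (expected.length === 0) return next();
    const provided = Buffer.from(String(req.headers["x-api-key"] ?? ""), "utf8");
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    const ok =
      provided.length === expected.length &&
      crypto.timingSafeEqual(provided, expected);
    if (!ok) {
      res.statusCode = 401;
      return res.end(JSON.stringify({ error: "Invalid or missing API key" }));
    }
    return next();
  };
}
```

In an Express app this would typically be mounted with something like `app.use("/api", requireApiKey(process.env.SENTINEL_API_KEY))`.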

3. Use Kubernetes Secrets

Store Tetrate API key as K8s secret:

kubectl create secret generic sentinel-secrets \
  --from-literal=tetrate-api-key='sk-your-key'

Reference in deployment:

env:
- name: TETRATE_API_KEY
  valueFrom:
    secretKeyRef:
      name: sentinel-secrets
      key: tetrate-api-key

4. Monitor Costs

Track Tetrate TARS token usage:

  • Check GET /api/metrics → aiCalls field
  • Monitor Tetrate dashboard for usage
  • Consider cost-optimized model routing (cheap models for triage)

5. Network Security

  • Firewall Rules: Only allow ingress from trusted sources
  • VPN/Private Network: Deploy in private network, access via VPN
  • mTLS: Use Istio mTLS for service-to-service communication

6. Data Retention

Currently in-memory only. For production:

  • Add PostgreSQL/MongoDB for persistence
  • Implement data retention policies (e.g., delete alerts older than 90 days)
  • Archive critical alerts to long-term storage
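
A retention policy like the one suggested above could be sketched as a pure function that sorts alerts into "keep", "archive", and "delete" buckets. The `createdAt` and `severity` field names are assumptions about the alert shape, and the function is independent of whichever database is eventually chosen.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Splits alerts by age: recent ones are kept, expired CRITICAL ones are
// archived, and all other expired alerts are marked for deletion.
function applyRetention(alerts, { maxAgeDays = 90, now = Date.now() } = {}) {
  const cutoff = now - maxAgeDays * DAY_MS;
  const result = { kept: [], archived: [], deleted: [] };
  for (const alert of alerts) {
    const createdAt = new Date(alert.createdAt).getTime();
    if (createdAt >= cutoff) result.kept.push(alert);
    else if (alert.severity === "CRITICAL") result.archived.push(alert);
    else result.deleted.push(alert);
  }
  return result;
}

const retentionDemo = applyRetention(
  [
    { id: "a", severity: "HIGH", createdAt: "2026-01-15T00:00:00Z" },
    { id: "b", severity: "CRITICAL", createdAt: "2025-10-01T00:00:00Z" },
    { id: "c", severity: "LOW", createdAt: "2025-10-01T00:00:00Z" },
  ],
  { now: Date.parse("2026-02-01T00:00:00Z") }
);
console.log(retentionDemo);
```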

⚡ Performance & Scaling

Current Limitations

  • In-Memory Storage: Alerts/policies lost on restart
  • Single Instance: No horizontal scaling support
  • No Caching: Every AI call hits Tetrate TARS
  • No Queue: Synchronous processing (blocks on AI calls)

Performance Metrics

Tested On:

  • Node.js 18+ on Windows/Linux
  • 8GB RAM, 4 CPU cores

Observed Performance:

  • Alert Analysis: 2-5 seconds per alert (AI call latency)
  • Log Ingestion: ~100 logs/second (single-threaded)
  • WebSocket Latency: <100ms for event broadcasting
  • Topology Rendering: Smooth 60fps with <50 nodes

Scaling Recommendations

Short-Term (Current Architecture)

  1. Increase Rate Limits: Adjust RATE_LIMIT_PER_MINUTE if needed
  2. Batch Processing: Send multiple logs in one /api/tetrate/ingest call
  3. Async Processing: Consider background jobs for AI calls (future)
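
The batching recommendation can be illustrated with a small client-side helper that splits a large log array into fixed-size chunks, each of which would be sent in one `/api/tetrate/ingest` call. The batch size of 50 is an arbitrary assumption; in practice it should be chosen so each request stays under the 1MB body limit.

```javascript
// Splits logs into consecutive batches of at most batchSize entries.
function toBatches(logs, batchSize = 50) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < logs.length; i += batchSize) {
    batches.push(logs.slice(i, i + batchSize));
  }
  return batches;
}

const demoLogs = Array.from({ length: 120 }, (_, i) => ({ seq: i }));
const demoBatches = toBatches(demoLogs);
console.log(demoBatches.map((b) => b.length));
```

Each batch would then be posted as one request body; the exact payload shape expected by the endpoint is documented in the API section and is not assumed here.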

Long-Term (Production Architecture)

  1. Database: PostgreSQL for persistence and querying
  2. Message Queue: Redis/RabbitMQ for async AI processing
  3. Horizontal Scaling: Multiple instances behind load balancer
  4. Caching: Redis cache for AI responses (similar threats)
  5. CDN: Serve static assets (console.html, console.js) via CDN

Cost Optimization

Tetrate TARS Model Selection:

Currently uses gpt-4 for all requests. For cost optimization:

  1. Triage with Cheap Model: Use gpt-4-turbo or claude-haiku for LOW/MEDIUM severity
  2. Escalate to Expensive: Use gpt-4 only for HIGH/CRITICAL or low-confidence cases
  3. Caching: Cache AI responses for similar threats (future)
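
The triage-then-escalate idea could be expressed as a small routing function. This is a sketch of the proposal, not current behavior (the project currently sends everything to gpt-4). The `minConfidence` value of 70 is an assumption, since the README does not define what counts as low confidence.

```javascript
// Picks the expensive model for HIGH/CRITICAL alerts or when an earlier
// triage pass returned low confidence; otherwise uses the cheap model.
function pickModel(
  { severity, confidence = null },
  { cheapModel = "gpt-4-turbo", strongModel = "gpt-4", minConfidence = 70 } = {}
) {
  const highStakes = severity === "HIGH" || severity === "CRITICAL";
  const lowConfidence = confidence !== null && confidence < minConfidence;
  return highStakes || lowConfidence ? strongModel : cheapModel;
}

console.log(pickModel({ severity: "LOW" }));
console.log(pickModel({ severity: "CRITICAL" }));
console.log(pickModel({ severity: "MEDIUM", confidence: 55 }));
```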

Example Cost Savings:

| Approach | Cost per 1000 Alerts | Savings |
| --- | --- | --- |
| Always gpt-4 | ~$50 | Baseline |
| Cheap model (80%) + gpt-4 (20%) | ~$18 | 64% savings |
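
The arithmetic behind these figures can be checked with a short calculation. The cheap-model cost of about $10 per 1000 alerts is not stated in the README; it is the value implied by the two rows (0.8 × cheap + 0.2 × $50 = $18), so treat it as an inferred assumption.

```javascript
// Blended cost per 1000 alerts when a share of traffic goes to a cheaper model.
function blendedCost(strongCost, cheapCost, cheapShare) {
  return cheapShare * cheapCost + (1 - cheapShare) * strongCost;
}

const baselineCost = 50; // ~$50 per 1000 alerts with gpt-4 only (from the table)
const impliedCheapCost = 10; // inferred, not stated in the README
const mixedCost = blendedCost(baselineCost, impliedCheapCost, 0.8);
const savingsPct = Math.round((1 - mixedCost / baselineCost) * 100);

console.log(`blended: ~$${mixedCost.toFixed(2)}, savings: ${savingsPct}%`);
```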

🔧 Troubleshooting

Common Issues

1. "Tetrate TARS Connection Failed"

Symptoms:

  • Console shows "AI Router Unavailable"
  • Alerts return fallback responses

Solutions:

  • ✅ Check .env file has correct TETRATE_API_KEY
  • ✅ Verify API key is valid (test at Tetrate dashboard)
  • ✅ Check network connectivity to https://api.router.tetrate.ai
  • ✅ Verify TETRATE_BASE_URL is correct

Test Connection:

curl https://api.router.tetrate.ai/v1/models \
  -H "Authorization: Bearer $TETRATE_API_KEY"

2. "Port 3000 Already in Use"

Symptoms:

Error: listen EADDRINUSE: address already in use :::3000

Solutions:

Option A: Kill existing process

# Find process using port 3000
netstat -ano | findstr :3000

# Kill it (replace PID with actual process ID)
taskkill /PID <PID> /F

Option B: Use different port

$env:PORT=3001
npm start

Then access: http://localhost:3001/console.html

3. "Failed to Generate Policy"

Symptoms:

  • Clicking ⚙️ shows "Failed to generate policy"
  • Console shows error

Solutions:

  • ✅ Check alert exists (refresh page, check alert ID)
  • ✅ Verify Tetrate TARS connection is working
  • ✅ Check server logs for detailed error
  • ✅ Ensure alert has analysis field (AI must have analyzed it first)

4. WebSocket Disconnects Frequently

Symptoms:

  • Console shows "Disconnected" frequently
  • Alerts not updating in real-time

Solutions:

  • ✅ Check network stability
  • ✅ Verify firewall allows WebSocket connections
  • ✅ Check server logs for WebSocket errors
  • ✅ If a reverse proxy sits between the browser and the server, increase its idle timeout (proxies often close idle WebSocket connections)
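
If disconnects persist, a common client-side mitigation is to reconnect with exponential backoff and jitter. The helper below only computes the delay; whether the bundled console already reconnects this way is not stated here, so this is a general sketch with assumed defaults (`baseMs`, `maxMs`).

```javascript
// Delay before reconnect attempt N: doubles each attempt up to maxMs,
// with the upper half of the interval randomized to avoid reconnect storms.
function reconnectDelay(
  attempt,
  { baseMs = 500, maxMs = 30000, jitter = Math.random } = {}
) {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.round(capped / 2 + jitter() * (capped / 2));
}

for (let attempt = 0; attempt < 5; attempt += 1) {
  console.log(`attempt ${attempt}: wait up to ${reconnectDelay(attempt, { jitter: () => 1 })} ms`);
}
```

In a browser client, the `onclose` handler of the WebSocket would call `setTimeout(connect, reconnectDelay(attempt))` and reset `attempt` to zero once a connection opens.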

5. Topology Not Updating

Symptoms:

  • Graph shows sample data only
  • New workloads not appearing

Solutions:

  • ✅ Verify logs include source_workload and destination_workload
  • ✅ Check WebSocket connection is active
  • ✅ Refresh page to reload topology from /api/topology
  • ✅ Check browser console for JavaScript errors

6. Rate Limit Errors

Symptoms:

429 Too Many Requests
Rate limit exceeded. Please slow down.

Solutions:

  • ✅ Wait 1 minute before retrying
  • ✅ Increase RATE_LIMIT_PER_MINUTE in .env
  • ✅ Batch requests (send multiple logs in one call)
  • ✅ Use different IP address (if testing from multiple sources)
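
To show why waiting a minute clears the error, here is a minimal fixed-window, per-IP limiter. It is a sketch under the assumption of a simple fixed window; the server's actual algorithm may differ. The default of 60 requests per minute matches the documented default for `RATE_LIMIT_PER_MINUTE`.

```javascript
// Creates an allow(ip) function that permits up to `limit` calls per IP
// within each window of `windowMs` milliseconds.
function createRateLimiter({ limit = 60, windowMs = 60000, now = Date.now } = {}) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip) {
    const t = now();
    const current = windows.get(ip);
    if (!current || t - current.start >= windowMs) {
      windows.set(ip, { start: t, count: 1 }); // new window for this IP
      return true;
    }
    current.count += 1;
    return current.count <= limit; // false corresponds to a 429 response
  };
}

// Demo with a fake clock and a small limit.
let fakeNow = 0;
const allowDemo = createRateLimiter({ limit: 3, now: () => fakeNow });
console.log([1, 2, 3, 4].map(() => allowDemo("10.0.1.42")));
```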

Debug Mode

Enable verbose logging:

# Set environment variable
$env:DEBUG=1
npm start

Or add to .env:

DEBUG=1

Getting Help

  1. Check Logs: Server console shows detailed error messages
  2. Browser Console: Open DevTools (F12) → Console tab
  3. Network Tab: Check API requests/responses
  4. GitHub Issues: Open an issue with:
    • Error message
    • Steps to reproduce
    • Server logs
    • Browser console errors

🗺️ Roadmap

Phase 1: Core Features ✅ (Completed)

  • AI-powered threat detection via Tetrate TARS
  • Unified console (alerts + mesh topology)
  • Automated policy generation
  • Live topology updates from logs
  • Red team testing scripts
  • WebSocket real-time updates
  • Rate limiting and input validation

Phase 2: Production Hardening (In Progress)

  • Persistent Storage: PostgreSQL integration for alerts/policies
  • Authentication: OIDC/JWT support for multi-user access
  • Multi-Model Routing: Cost-optimized model selection
  • Kubernetes Operator: Deploy via K8s operator
  • Prometheus Metrics: Export metrics for monitoring
  • Health Checks: Liveness/readiness probes
  • Config Management: YAML-based configuration
  • Logging: Structured logging (Winston/Pino)

Phase 3: Advanced Features

  • Tetrate Service Bridge (TSB) Integration: Native TSB telemetry
  • Custom Detection Rules: YAML-based rule engine
  • Policy Enforcement: Auto-apply policies to cluster (kubectl apply)
  • Historical Analysis: Threat trends over time
  • Multi-Cluster Support: Aggregate threats across clusters
  • Alert Correlation: Group related threats
  • Threat Intelligence: Integration with threat feeds
  • Compliance Reporting: SOC 2, PCI-DSS reports

Phase 4: Enterprise

  • RBAC: Role-based access control
  • Multi-Tenancy: Isolated workspaces per tenant
  • SIEM Integration: Splunk, ELK, QRadar connectors
  • Custom AI Models: Fine-tune models for your environment
  • SLA Monitoring: Uptime, response time tracking
  • Advanced Analytics: ML-based anomaly detection
  • API Gateway: GraphQL API option
  • Mobile App: iOS/Android companion app

Future Ideas

  • Threat Hunting: Interactive query builder for threat hunting
  • Playbooks: Automated response playbooks (SOAR-like)
  • Threat Simulation: Generate attack scenarios for training
  • Compliance Automation: Auto-generate compliance reports
  • Integration Marketplace: Pre-built integrations (Slack, PagerDuty, etc.)

🤝 Contributing

We welcome contributions! Here's how to get started.

Areas We Need Help

  1. Additional Attack Scenarios: More MITRE ATT&CK techniques
  2. UI Enhancements: Better visualization, filtering, search
  3. Backend Improvements: Database integration, caching, performance
  4. Documentation: Tutorials, video demos, blog posts
  5. Testing: Unit tests, integration tests, E2E tests
  6. Localization: Multi-language support
  7. Accessibility: WCAG compliance improvements

Development Setup

# Clone repository
git clone <your-repo-url>
cd sentinel-ai

# Install dependencies
npm install

# Copy environment template
cp .env.example .env

# Edit .env with your Tetrate API key
# (Get from https://tetrate.io)

# Start development server
npm start

# Open browser
# http://localhost:3000/console.html

Code Style

  • JavaScript: ES6+ modules, async/await preferred
  • Naming: camelCase for variables, PascalCase for classes
  • Comments: JSDoc for functions
  • Formatting: Prettier (if configured)

Submitting Changes

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Pull Request Guidelines

  • Description: Clear description of changes
  • Testing: How you tested the changes
  • Screenshots: If UI changes
  • Breaking Changes: Document any breaking changes

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License Summary:

  • ✅ Commercial use allowed
  • ✅ Modification allowed
  • ✅ Distribution allowed
  • ✅ Private use allowed
  • ⚠️ License and copyright notice required

🙏 Acknowledgments

Core Technologies

  • Tetrate - For the AI Router (TARS) and service mesh expertise
  • Istio - For the service mesh foundation and security policies
  • D3.js - For the powerful visualization library
  • OpenAI - For the API design that Tetrate TARS implements
  • Express.js - For the robust web framework
  • Node.js - For the JavaScript runtime

Inspiration

  • Kiali - Service mesh observability UI
  • Tetrate Service Bridge - Enterprise service mesh platform
  • Falco - Runtime security monitoring
  • MITRE ATT&CK - Threat classification framework

Community

  • DevSecOps community for feedback and ideas
  • Early adopters and beta testers
  • Open source contributors

📞 Support & Resources

Getting Help

Tetrate Support

For issues with Tetrate TARS API:

Community

  • Discord: (if you create one)
  • Twitter: @SentinelAI (if you create one)
  • Blog: (if you create one)

📊 Project Status

Current Version: 1.0.0

Status: ✅ Production-Ready (with caveats - see Security Considerations)

Last Updated: February 2026

Maintainer: [Your Name/Organization]


🛡️ Sentinel AI - Protecting Your Service Mesh, One Threat at a Time

Built with ❤️ for the DevSecOps community



Star ⭐ this repo if you find it useful!
