
AI Guard

Behavioral multi-turn injection detection for LLM applications.

AI Guard models adversarial escalation across user turns instead of scanning a single prompt blob. It detects escalation drift, oscillation patterns, threshold gaming attempts, and computes a composite behavioral risk score (0–100).

Designed For

  • LLM SaaS backends
  • Prompt-based AI tools
  • CI security enforcement
  • Developers building AI workflows

Why AI Guard?

Most prompt injection tools scan a single prompt blob.

AI Guard models:

  • User-turn escalation behavior
  • Drift progression over time
  • Oscillation patterns
  • Late detection risk
  • Threshold gaming attempts
  • Composite adversarial risk scoring

It is behavioral, not just pattern-based.


Core Capabilities

  • Transcript-aware analysis
  • User-turn behavioral modeling
  • Drift detection
  • Oscillation detection
  • Threshold gaming detection
  • Composite behavioral risk score (0–100)
  • CI threshold enforcement

Installation

Build from Source

git clone https://github.com/CODESHUB-O/ai-guard.git
cd ai-guard
cargo build --release

The binary will be located at:

target/release/ai_guard

Quick Start

Analyze a transcript:

./target/release/ai_guard multiturn examples/sample.txt --threshold 95

If the composite risk score is greater than or equal to the threshold,
the process exits with code 1 (CI failure).
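
Because the pass/fail signal is carried in the exit code, the gate can be scripted directly. A minimal shell sketch, using only the subcommand and flag shown above:

# Run the analysis; ai_guard exits 1 when the composite risk
# score is greater than or equal to --threshold.
./target/release/ai_guard multiturn examples/sample.txt --threshold 95
status=$?

if [ "$status" -ne 0 ]; then
  echo "ai_guard flagged the transcript (exit code $status)" >&2
  exit "$status"
fi

echo "Transcript passed the behavioral risk gate"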


Example Output

AI Guard — Behavioral Risk Analysis
------------------------------------
User Turns           : 4
Peak Severity        : Critical
Final Severity       : Critical
Drift Detected       : true
Partial Suppression  : false
Oscillation Detected : false
Late Detection       : false
Threshold Gaming     : false
Composite Risk Score : 72
Risk Level           : High
CI Threshold         : 95

JSON Output (CI Friendly)

./target/release/ai_guard multiturn transcript.txt --threshold 95 --json

Example:

{
  "composite_risk": 72,
  "risk_level": "High"
}
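
In CI, the JSON report can be parsed with standard tools such as jq. A small sketch, assuming only the two fields shown above are present:

# Capture the report even when the threshold gate fails (exit code 1)
./target/release/ai_guard multiturn transcript.txt --threshold 95 --json > report.json || true

# Extract the documented fields with jq
jq -r '"composite_risk=\(.composite_risk) risk_level=\(.risk_level)"' report.json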

GitHub Action Example

name: AI Guard Scan

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  ai-guard:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo build --release
      - run: ./target/release/ai_guard multiturn examples/sample.txt --threshold 95

This fails CI if the composite risk score meets or exceeds the defined threshold.
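
To keep the JSON report available even when the gate fails, the final step can capture the output before propagating the exit code. A sketch of a replacement for the last run step (report.json is a name chosen here, not produced by the tool):

      - run: |
          ./target/release/ai_guard multiturn examples/sample.txt --threshold 95 --json > report.json || status=$?
          cat report.json
          exit ${status:-0}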


Composite Risk Model (0–100)

The risk score is computed from:

  • Peak severity weight
  • Detection delay
  • Escalation velocity
  • Drift presence
  • Suppression patterns
  • Oscillation detection
  • Threshold gaming behavior

This enables behavioral adversarial modeling rather than static pattern checks.


Use Cases

  • Secure LLM SaaS backends
  • Validate prompt pipelines in CI
  • Detect adversarial multi-turn escalation
  • Monitor injection risk during development
  • Guard AI tool execution workflows

Roadmap

Upcoming Pro Features:

  • Repository-wide batch scanning
  • Risk trend comparison across commits
  • Historical baseline tracking
  • Advanced reporting export
  • Custom scoring configuration

License

MIT License


Philosophy

AI Guard focuses on behavioral modeling rather than surface-level string matching.

The goal is to detect adversarial progression patterns — not just isolated keywords.

Security is not static. Behavior matters.
