Guardrails for LLMs: detect and block hallucinated tool calls to improve safety and reliability.
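The entry above describes blocking hallucinated tool calls, i.e. calls to tools the model invented rather than ones that were registered. A minimal, generic Go sketch of that idea, assuming a hypothetical guardToolCall helper and tool registry (these names are illustrative and are not the API of the repository listed here): the proposed tool name is checked against the set of registered tools before anything executes.

```go
package main

import (
	"errors"
	"fmt"
)

// toolRegistry lists the tools the agent is actually allowed to call.
// Hypothetical example; real guardrail libraries typically also validate
// argument schemas, not just tool names.
var toolRegistry = map[string]bool{
	"search_docs": true,
	"get_weather": true,
}

// guardToolCall rejects tool calls whose name is not registered,
// the simplest form of hallucinated-tool-call detection.
func guardToolCall(name string, args map[string]string) error {
	if !toolRegistry[name] {
		return errors.New("blocked: model requested unknown tool " + name)
	}
	return nil
}

func main() {
	// A hallucinated call: "delete_database" was never registered.
	if err := guardToolCall("delete_database", nil); err != nil {
		fmt.Println(err)
	}
	// A registered call passes the gate and may be executed.
	if err := guardToolCall("search_docs", map[string]string{"query": "agent safety"}); err == nil {
		fmt.Println("allowed: search_docs")
	}
}
```

The key design choice is to fail closed: anything not explicitly registered is blocked before it reaches an executor.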
Human-in-the-loop execution for LLM agents
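The entry above names the human-in-the-loop pattern: a proposed agent action is held until a person confirms it. A minimal sketch in Go, assuming a hypothetical approve prompt on stdin (not any listed repository's API); anything other than an explicit "y" rejects the action.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// approve pauses a proposed agent action and asks a human to confirm it.
// Generic illustration of the human-in-the-loop pattern only.
func approve(action string) bool {
	fmt.Printf("Agent wants to run: %q  approve? [y/N] ", action)
	reader := bufio.NewReader(os.Stdin)
	line, err := reader.ReadString('\n')
	if err != nil {
		return false // fail closed: no input means no execution
	}
	return strings.TrimSpace(strings.ToLower(line)) == "y"
}

func main() {
	if approve("rm -rf ./build") {
		fmt.Println("executing action")
	} else {
		fmt.Println("action rejected by human reviewer")
	}
}
```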
🛡️ Safe AI Agents through Action Classifier
Runtime detector for reward hacking and misalignment in LLM agents (89.7% F1 on 5,391 trajectories).
Safety-first agentic toolkit: 10 packages for collapse detection, governance, and reproducible runs.
An open-source engineering blueprint for defining and designing the core capabilities, boundaries, and ethics of any AI agent.
A2A version of Agent Action Guard: Safe AI Agents through Action Classifier
Energy-based legality-gating SDK for AI reasoning. Predicts, repairs, and audits collapse before it happens; reduces hallucinations and provides numeric audit logs.
PULSE: deterministic release gates for AI safety