2nd Year Computer Science Student at University College London (UCL).
Highlights
- Pro
Pinned
-
emergent-misaligned-agents
Public. In this repository, we explore the notion of emergent misalignment in the context of tool-augmented large language models. Tested models are finetuned on partially incorrect datasets to induce misa…
Python
-
Protocol66
Public. Protocol 66 is an automated red-teaming harness for AI agents. It inspects an agent’s configuration, synthesizes adversarial prompts tailored to its tools and permissions, simulates responses via C…
TypeScript
-
multi-step-agent-rl-infra
Public. RL training infrastructure for multi-step web agents. Generates tasks at controllable difficulty (planning horizon), validates them with an oracle, and evaluates the efficiency-accuracy tradeoff.
Python


