This document describes the security model of AgentEval.jl and its implications.
AgentEval uses STDIO transport exclusively. This is a deliberate design choice for security:
| Risk | HTTP/TCP Servers | STDIO (AgentEval) |
|---|---|---|
| Network access | Exposed on port | Not applicable |
| Remote attacks | Possible | Not possible |
| Port scanning | Discoverable | Not discoverable |
| Multi-user access | Other users can connect | Process-isolated |
| Firewall bypass | May be needed | Not needed |
- The MCP client (Claude Code) spawns AgentEval as a subprocess
- Communication happens via stdin/stdout pipes
- No network sockets are opened
- The Julia process inherits the user's permissions
- When the Claude session ends, the Julia process terminates
This architecture means:
- No network attack surface - There's no port to scan or exploit
- No authentication needed - Only the parent process holds the pipes connected to the subprocess
- Automatic cleanup - Process dies when session ends
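
The pipe-only model described above can be illustrated with a minimal shell sketch. The `toy_server` function is a hypothetical stand-in for the spawned Julia subprocess and does not reflect AgentEval's actual protocol handling. It reads one request per line from stdin and writes one reply per line to stdout, so the only way to reach it is through the pipe its parent created.

```shell
# Hypothetical stand-in for the spawned subprocess: reads requests from
# stdin, writes replies to stdout, and never opens a socket.
toy_server() {
  while IFS= read -r request; do
    printf 'reply: %s\n' "$request"
  done
}

# The "client" talks to the server only through a pipe it created itself.
response=$(printf '%s\n' 'ping' | toy_server)
printf '%s\n' "$response"
```

When the writing end of the pipe closes, `read` fails and the loop exits, which mirrors the automatic cleanup noted above: the server has nothing left to serve once its parent goes away.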
AgentEval evaluates arbitrary Julia code. This is by design - it's the core functionality for AI agent workflows.
- TTFX (time to first X) overhead - Avoided by keeping a persistent session
- Process isolation - Each Claude session gets its own Julia process
- Output capture - Stdout/stderr are captured and returned, not leaked
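
Output capture can be pictured with a small shell sketch. This is an illustration of the idea only, not how AgentEval implements capture inside Julia; `run_captured` and the `captured_*` variables are hypothetical names introduced here.

```shell
# Run a command and keep its stdout, stderr, and exit status in variables
# instead of letting them reach the terminal.
run_captured() {
  err_file=$(mktemp)
  captured_out=$("$@" 2>"$err_file")   # stdout goes into the variable
  captured_status=$?                   # exit status of the command
  captured_err=$(cat "$err_file")      # stderr is read back from the temp file
  rm -f "$err_file"
}

run_captured sh -c 'echo result; echo warning >&2'
printf 'stdout=[%s] stderr=[%s] status=%s\n' \
  "$captured_out" "$captured_err" "$captured_status"
```

Both streams end up in data that can be returned to the caller, rather than being mixed into whatever channel carries the protocol messages.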
| Risk | Status | Mitigation |
|---|---|---|
| Malicious code execution | Not protected | Review AI-generated code |
| File system access | Full access | Use sandboxed environment |
| Network access from Julia | Full access | Use network policies |
| Resource exhaustion | No limits | Monitor resource usage |
| Environment variable access | Full access | Sanitize environment |
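
As one way to act on the "Sanitize environment" mitigation in the table above, the sketch below starts a child process from an empty environment and passes through only an explicit allowlist. The `run_clean` wrapper and the `DEMO_SECRET_TOKEN` variable are hypothetical names used for illustration. Which variables Julia needs in a given setup is an assumption to verify locally.

```shell
# Run a command with a minimal, explicitly chosen environment.
# `env -i` starts from an empty environment; only the variables listed
# afterwards are passed to the child.
run_clean() {
  env -i PATH="$PATH" HOME="$HOME" "$@"
}

# Demo: a secret in the parent's environment is not visible to the child.
export DEMO_SECRET_TOKEN="hunter2"   # hypothetical variable for illustration
leaked=$(run_clean sh -c 'printf "%s" "${DEMO_SECRET_TOKEN:-}"')
printf 'child sees: [%s]\n' "$leaked"
```

The same wrapper could be placed in front of the `julia` command that the MCP client launches. It limits what evaluated code can read from the environment, but it does nothing about file system or network access.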
AgentEval trusts the MCP client to send reasonable code. In practice:
- Claude Code decides what code to run - AgentEval executes it
- The user reviews AI suggestions - Claude shows code before running
- Permissions flow from user - AgentEval runs with user's permissions
This is similar to running `julia -e "..."` manually: the code runs with your permissions.
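
The permission inheritance described here is ordinary operating system behavior and can be checked with any child process. The sketch below uses `sh` as a stand-in for `julia`, which is an assumption made so that the example runs without Julia installed.

```shell
# A child process runs as the same user as the process that spawned it.
parent_uid=$(id -u)
child_uid=$(sh -c 'id -u')
printf 'parent uid=%s child uid=%s\n' "$parent_uid" "$child_uid"
```

Because the two IDs match, anything the invoking user can read, write, or delete is also within reach of the spawned process.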
MCPRepl.jl uses HTTP transport on port 3000:
- Risk: Any process on the machine can connect
- Risk: Network attacks possible if port is exposed
- Consequence: The package could not be registered in Julia General due to security concerns
The kahliburke fork adds security features:
- API key authentication
- IP allowlisting
- Security modes (strict/relaxed/lax)
However, it still opens a network port.
AgentEval avoids these issues by not opening any network port:
MCPRepl.jl:
[Internet] → [Firewall] → [Port 3000] → [Julia REPL]
↑
Attack surface exists
AgentEval:
[Claude Code] ⟷ [stdin/stdout] ⟷ [Julia]
↑
No network attack surface
AgentEval is designed for local development workflows:
# This is the intended use case
claude mcp add julia-eval -- julia --project=/path/to/AgentEval.jl ...

If you need to run AgentEval in a shared or production environment:
- Use containers - Run Julia in a Docker container with limited permissions
- Use seccomp/AppArmor - Restrict system calls available to Julia
- Monitor execution - Log all code executed via AgentEval
- Set resource limits - Use ulimit or cgroups to limit CPU/memory
- Sanitize environment - Remove sensitive environment variables
- Don't expose AgentEval to the network - It's designed for local use
- Don't run with elevated permissions - Use a regular user account
- Don't store secrets in environment variables - Julia can read them
- Don't rely on AgentEval for sandboxing - It executes arbitrary code
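
The resource-limit recommendation above can be sketched with `ulimit`. The `run_limited` wrapper is a hypothetical helper, and the 10-second value is illustrative only: a real Julia session would need a far larger CPU budget. The limit is set in a subshell so that it applies to the child alone.

```shell
# Run a command with a CPU-time cap. The limit is set inside a subshell,
# so it applies to the child only and leaves the calling shell untouched.
run_limited() {
  cpu_seconds=$1
  shift
  (
    ulimit -t "$cpu_seconds"   # maximum CPU time in seconds
    exec "$@"
  )
}

# Demo: the child reports the limit it inherited.
limit_seen=$(run_limited 10 sh -c 'ulimit -t')
printf 'child CPU limit: %s seconds\n' "$limit_seen"
```

Support for other limits, such as memory, differs between shells and platforms. For stricter or per-group limits, cgroups or container runtime settings are the more robust option.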
If you discover a security vulnerability in AgentEval, please:
- Do not open a public issue
- Contact the maintainers privately
- Provide details about the vulnerability
- Allow time for a fix before public disclosure
This software is provided "AS IS" without warranties. It executes arbitrary code with user permissions. Use at your own risk.
See LICENSE for the full Apache 2.0 license terms.