Autonomous AI agent powered by gpt-oss via local vLLM. Full shell access in Docker containers.
Minimize agent code and fancy orchestration. Give the LLM an environment where it has the power to solve problems on its own; gpt-oss models interleave thinking and tool calling well enough.
- Rust backend (Axum) with TOTP authentication
- openai-harmony for proper Harmony format encoding/parsing
- Docker containers per conversation with mounted workspaces
- Polling UI - no websockets, just simple HTTP
- Fire-and-forget - POST returns 201, work happens in background
- Svelte front-end - a simple Svelte frontend that should be easy to customize
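The fire-and-forget flow can be sketched from a client's point of view. This is a hypothetical sketch only: the route paths (`/api/conversations/...`) and JSON field names are assumptions for illustration, not the server's documented API.

```python
import json
import urllib.request

BASE = "http://localhost:7778"  # port taken from the nginx proxy_pass example below


def message_url(base: str, conversation_id: str) -> str:
    """Build the (hypothetical) message endpoint for a conversation."""
    return f"{base}/api/conversations/{conversation_id}/messages"


def send_message(conversation_id: str, text: str) -> int:
    """POST a message; the server answers 201 immediately and does the work in the background."""
    req = urllib.request.Request(
        message_url(BASE, conversation_id),
        data=json.dumps({"content": text}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # expected: 201, long before the agent finishes
```

The UI then simply repeats a GET on the conversation until the agent's final message shows up; no websocket connection is ever opened.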
- Rust 1.75+
- Docker
- vLLM running with gpt-oss model
- node
- Make sure vLLM is running:
# Your existing vLLM serve command for gpt-oss-120b
- Build and run:
cd agent
cd frontend
npm i
npm run build
cd ..
cargo build --release
# Set environment variables
export VLLM_URL="http://localhost:8000"
export MODEL="gpt-oss-120b" # or whatever your model is named
export DATA_DIR="./data"
export WORKSPACE_ROOT="./workspaces"
# Run
./target/release/agent
- First visit: scan the QR code with an authenticator app (Google Authenticator, Authy, etc.)
- Enter the 6-digit code to authenticate
Add to your nginx config:
server {
listen 443 ssl;
server_name agent.yourdomain.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://localhost:7778;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
- You send a message
- Server creates a Docker container for the conversation (if new)
- Message goes to gpt-oss via vLLM with Harmony encoding
- Agent can execute shell commands via the `execute()` tool
- Loop continues until the agent gives a final response (no tool call)
- UI polls for updates
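The loop above can be sketched as follows. `call_model` and `execute` stand in for the real vLLM/Harmony round-trip and the Docker-backed `execute()` tool; the message shapes are illustrative, not the actual Harmony wire format.

```python
def run_turn(messages, call_model, execute):
    """Drive one agent turn: keep calling the model, running any requested
    shell command in the container, until the model answers with no tool call."""
    while True:
        reply = call_model(messages)
        messages.append(reply)
        command = reply.get("tool_call")
        if command is None:
            return reply["content"]  # final response: loop ends, UI sees it on next poll
        messages.append({"role": "tool", "content": execute(command)})


# Stubbed usage: the "model" asks for one command, then answers.
replies = iter([
    {"role": "assistant", "tool_call": "ls /workspace", "content": ""},
    {"role": "assistant", "tool_call": None, "content": "done"},
])
result = run_turn([], lambda msgs: next(replies), lambda cmd: "out.txt")
```

The real server runs this loop in a background task, which is why the POST can return 201 immediately.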
Each conversation gets its own workspace:
- Container path: `/workspace`
- Host path: `$WORKSPACE_ROOT/<conversation-id>/`
Files created by the agent persist on your host.
| Variable | Default | Description |
|---|---|---|
| VLLM_URL | http://localhost:8000 | vLLM server URL |
| MODEL | gpt-oss-120b | Model name in vLLM |
| DATA_DIR | ./data | Where auth.json and conversations.db live |
| WORKSPACE_ROOT | ./workspaces | Where agent workspace directories are created |
# Run with debug logging
RUST_LOG=agent=debug cargo run
# Watch for changes
cargo watch -x run
- TOTP is the only authentication (no passwords)
- Sessions are in-memory (restart = re-authenticate)
- Docker containers have network access by default
- Containers run as root inside (sandboxed, but be aware)
Copyright (c) 2025 Patrick De La Garza. AGPL v3, see LICENSE.
- At home, this runs alongside a local vLLM deployment and is only reachable through my VPS. Make it more flexible for others by allowing vLLM API-key configuration.
- Allow configurable container network restrictions