xinhuagu/AceClaw

AceClaw

A self-learning agent harness for long-running work

CI · Java 21 · GraalVM · Gradle 8.14

AceClaw exists because long-running tasks demand learning.

When an agent runs for minutes or hours, context is not enough. It must absorb experience while it works, reuse what succeeds, and govern what it learns so it does not become noisy or unsafe. The goal is to make an agent behave more like an experienced engineering system over time.

An agent harness is the orchestration layer that turns LLMs into persistent, self-correcting workers — the loop that reasons, acts, observes, recovers, and remembers. Most harnesses treat each session as a blank slate. AceClaw doesn't. It is built for long-running execution, where repeated failures, recoveries, tool sequences, and user corrections must become reusable knowledge instead of disappearing at the end of the session.

[Figure: AceClaw Self-Learning Daemon Architecture]

Read the design philosophy: why Java, why no AI framework, and what drives the architecture.

AceClaw is a persistent JVM daemon built for workflows that run for hours, not seconds. Pure Java 21, zero network attack surface, built from scratch around one idea:

Memory helps an agent remember. Self-learning helps an agent improve.

That is the spirit of AceClaw, and it drives four key differentiators:

  1. Plan → Execute → Replan — Most agent harnesses use a flat ReAct loop (think → act → observe, one step at a time). AceClaw generates an explicit task plan before execution, runs it step by step with per-step iteration budgets, and replans inline when steps fail. Plans are streamed to the user in real time. This gives AceClaw a structural advantage in long-running tasks — the agent has a visible roadmap instead of hoping the model stays on track turn by turn.
  2. Self-Learning — Zero-cost heuristic detectors and session-end retrospectives turn agent behavior into durable learning signals. The agent evolves its own strategies without extra LLM calls in the hot path.
  3. Security — UDS-only communication, sealed 4-level permissions, HMAC-signed memory
  4. Long-Term Memory — 8-tier hierarchy, hybrid search, automated consolidation

What makes this architecture different:

  • Daemon-first, not CLI-first — The JVM daemon persists across sessions. No cold start, no re-parsing config, no re-loading memory. The CLI is a thin JSON-RPC client over Unix Domain Socket.
  • Behavior-centric, not memory-centric — Most agent memory systems store facts. AceClaw observes behavior — error-recovery sequences, tool usage patterns, user corrections — and distills them into typed, confidence-scored insights. The agent doesn't just remember what happened; it learns how it should act differently next time.
  • Closed feedback loop — Detectors emit typed insights → insights accumulate confidence across sessions → high-confidence insights get persisted → persisted memory is injected back into the next run. Repeated corrections auto-promote from auto-memory (Tier 6) to workspace rules (Tier 3).
  • Everything is sealed: Insight (5 permits), PermissionDecision (3 permits), MemoryTier (8 permits), StreamEvent, and ContentBlock are all sealed types, so the compiler enforces exhaustive handling everywhere. Adding a new variant is a compile error until all switches are updated.
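
As an illustration of the sealed-type point above, here is a minimal Java 21 sketch. The five permitted type names are taken from this README; the record fields and the `InsightFormatter` helper are hypothetical and are not AceClaw's actual definitions.

```java
// Sketch only: field names are assumptions, not AceClaw's real record layouts.
sealed interface Insight
        permits ErrorInsight, SuccessInsight, PatternInsight, RecoveryRecipe, FailureInsight {}

record ErrorInsight(String tool, String error) implements Insight {}
record SuccessInsight(String sequence) implements Insight {}
record PatternInsight(String description) implements Insight {}
record RecoveryRecipe(String error, String fix) implements Insight {}
record FailureInsight(String signal) implements Insight {}

final class InsightFormatter {
    // No default branch: the compiler checks that every permitted type is handled.
    // Adding a sixth permitted type makes this switch fail to compile until updated.
    static String describe(Insight insight) {
        return switch (insight) {
            case ErrorInsight e -> "error in " + e.tool() + ": " + e.error();
            case SuccessInsight s -> "success: " + s.sequence();
            case PatternInsight p -> "pattern: " + p.description();
            case RecoveryRecipe r -> "recipe: " + r.error() + " -> " + r.fix();
            case FailureInsight f -> "failure: " + f.signal();
        };
    }
}
```

Leaving out the `default` branch is the design choice that matters here: with a `default`, a new variant would compile silently and fall through to generic handling.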

Plan → Execute → Replan

Supported by research: Plan-Then-Execute (CHI 2025)

Most AI coding agents (Claude Code, OpenClaw, Codex CLI) rely on a flat ReAct loop — the model reasons and acts one step at a time. While effective for short tasks, this approach offers no explicit plan visibility and no structured failure recovery for long-running work.

AceClaw takes a fundamentally different approach: it layers an explicit planning pipeline on top of ReAct. Each individual step is still executed by the same ReAct loop (reason → act → observe), which remains the best mechanism for single-step tool use. The difference is that AceClaw wraps those steps in a higher-order plan that provides direction, budget control, and structured recovery — something a flat ReAct loop cannot do on its own.

Task → Complexity Estimator → Plan Generation (LLM) → Sequential Execution → Inline Replan
                                     │                        │                      │
                                     ▼                        ▼                      ▼
                              Structured JSON plan     Per-step iteration     On failure: executor
                              streamed to user         budgets                retries with fallback
                                                                              prompt or skips step

| Component | What it does |
| --- | --- |
| ComplexityEstimator | Scores task complexity; only triggers planning above a configurable threshold |
| LLMTaskPlanner | Generates a structured JSON plan with ordered, named steps |
| SequentialPlanExecutor | Executes steps one by one with per-step iteration budgets, fallback support, and cancellation between steps |

Why this matters for long tasks:

  • Visibility — The user sees "Step 3/7: Refactor authentication module" in real time, not a stream of opaque tool calls.
  • Structured recovery — When step N fails, the executor retries with a fallback prompt that includes the failure reason and remaining plan context.
  • Budget control — Each step has its own iteration budget, preventing any single step from consuming the entire session.
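
The three properties above can be sketched in a few lines of Java. This is a simplified illustration, not the real SequentialPlanExecutor: `PlanStep`, `StepResult`, `StepOutcome`, `StepRunner`, and `SequentialPlanSketch` are hypothetical names, and `StepRunner` stands in for the ReAct loop that executes a single step within an iteration budget.

```java
import java.util.ArrayList;
import java.util.List;

// All types below are hypothetical stand-ins for illustration.
record PlanStep(String name, String prompt, int iterationBudget) {}
record StepResult(boolean success, String failureReason) {}
enum StepOutcome { COMPLETED, RECOVERED, SKIPPED }

@FunctionalInterface
interface StepRunner {
    // Assumed contract: run one ReAct loop for at most maxIterations iterations.
    StepResult run(String prompt, int maxIterations);
}

final class SequentialPlanSketch {
    static List<StepOutcome> execute(List<PlanStep> plan, StepRunner runner) {
        List<StepOutcome> outcomes = new ArrayList<>();
        for (int i = 0; i < plan.size(); i++) {
            PlanStep step = plan.get(i);
            // Visibility: report progress as "Step i/n: name".
            System.out.printf("Step %d/%d: %s%n", i + 1, plan.size(), step.name());

            // Budget control: each attempt is capped by the step's own budget.
            StepResult first = runner.run(step.prompt(), step.iterationBudget());
            if (first.success()) {
                outcomes.add(StepOutcome.COMPLETED);
                continue;
            }

            // Structured recovery: retry once with the failure reason and
            // the names of the remaining steps added to the prompt.
            List<String> remaining = plan.subList(i + 1, plan.size())
                    .stream().map(PlanStep::name).toList();
            String fallback = step.prompt()
                    + "\nPrevious attempt failed: " + first.failureReason()
                    + "\nRemaining steps: " + remaining;
            StepResult retry = runner.run(fallback, step.iterationBudget());
            outcomes.add(retry.success() ? StepOutcome.RECOVERED : StepOutcome.SKIPPED);
        }
        return outcomes;
    }
}
```

A single retry followed by a skip is an assumption made to keep the sketch short; the actual retry policy may differ.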

Planned (not yet implemented): Crash-safe plan checkpointing to disk, cross-session plan resumption, and wall-clock per-step budgets.

Security First

AceClaw defends across five dimensions:

  • Zero network surface — Daemon communicates only via Unix Domain Socket. No HTTP, no REST, no WebSocket.
  • Sealed permissions — 4-level hierarchy (READ/WRITE/EXECUTE/DANGEROUS) modeled as a sealed interface with compiler-enforced exhaustiveness. Sub-agents receive filtered tool registries to prevent privilege escalation.
  • Signed memory — Every persisted memory entry is HMAC-SHA256 signed with constant-time verification. Tampered entries are rejected on load.
  • Content boundaries — System prompt budget (150K char cap), tool result truncation (30K cap), and 8-tier priority ordering ensure human-authored content always outranks agent-generated memory.
  • Data protection — POSIX 600 on signing keys, SHA-256 hashed workspace paths, size governance with automatic consolidation.
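
The signed-memory idea can be shown with the JDK's standard crypto classes. This is a minimal sketch under assumptions: `MemorySigner` is a hypothetical class name, the payload is treated as a plain string, and key storage (the POSIX 600 signing key mentioned above) is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; not AceClaw's actual signing code.
final class MemorySigner {
    private final SecretKeySpec key;

    MemorySigner(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // Computes the HMAC-SHA256 tag (32 bytes) over the UTF-8 payload.
    byte[] sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }

    // MessageDigest.isEqual compares in constant time for equal-length inputs,
    // so the check does not leak how many leading bytes matched.
    boolean verify(String payload, byte[] signature) {
        return MessageDigest.isEqual(sign(payload), signature);
    }
}
```

An entry whose stored signature fails `verify` would be rejected on load; comparing with `Arrays.equals` instead would reintroduce a timing side channel.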

See the Security Details for the full breakdown.

Self-Learning

AceClaw learns from its own behavior — no LLM calls required. Every tool execution, error recovery, and user correction is analyzed by heuristic detectors that produce type-safe insights.

  • Automatic pattern detection: ErrorDetector matches tool failures to subsequent retries. PatternDetector identifies repeated sequences, error-correction pairs, and user preferences. SessionEndExtractor captures corrections and strategies via regex-based passes at session close.
  • Cross-session accumulation — Insights start at 0.4 confidence and gain +0.2 per recurrence. Only patterns reaching 0.7 confidence are persisted.
  • Strategy evolution — Errors become ErrorInsights, recurring sequences become SuccessInsights, unresolved errors become anti-patterns, and underperforming skills are refined or rolled back. A closed feedback loop: detect → persist → recall → refine.
  • Type-safe insight hierarchy: Insight is a sealed interface (ErrorInsight | SuccessInsight | PatternInsight | RecoveryRecipe | FailureInsight). The compiler enforces exhaustive handling.
  • Strategy refinement: StrategyRefiner generates anti-patterns from persistent failures, strengthens user preferences from repeated corrections, and rolls back underperforming strategies. SelfImprovementEngine orchestrates the full pipeline as an async post-turn hook.
  • Baseline evaluation — Continuous-learning KPIs and collection workflow are documented in docs/continuous-learning-plan.md with report templates and sample output.
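
The cross-session accumulation rule above (start at 0.4, add 0.2 per recurrence, persist at 0.7) works out as follows. `ConfidenceTracker` and its string keys are hypothetical, and the cap at 1.0 is an assumption this README does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker illustrating the stated numbers only.
final class ConfidenceTracker {
    static final double INITIAL = 0.4;
    static final double STEP = 0.2;
    static final double PERSIST_THRESHOLD = 0.7;

    private final Map<String, Double> confidence = new HashMap<>();

    // First sighting stores INITIAL; each recurrence adds STEP (capped at 1.0, an assumption).
    double observe(String insightKey) {
        return confidence.merge(insightKey, INITIAL,
                (old, ignored) -> Math.min(1.0, old + STEP));
    }

    boolean shouldPersist(String insightKey) {
        return confidence.getOrDefault(insightKey, 0.0) >= PERSIST_THRESHOLD;
    }
}
```

With these numbers an insight goes 0.4, then roughly 0.6, then roughly 0.8, so it crosses the threshold on its second recurrence, that is, the third time it is observed.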

See Self-Learning Pipeline for the full architecture.

Long-Term Memory

8-tier persistent memory hierarchy with HMAC-SHA256 signing, hybrid TF-IDF search, and 3-pass consolidation:

T1: Soul (identity)  →  T2: Managed Policy (enterprise)  →  T3: Workspace (ACECLAW.md)
T4: User Memory      →  T5: Local Memory (gitignored)     →  T6: Auto-Memory (JSONL+HMAC)
T7: Markdown Memory  →  T8: Daily Journal

  • HMAC-SHA256 integrity: Every entry is signed. Mutable fields are excluded from the signed payload, so reads don't invalidate signatures.
  • 23 memory categories — From CODEBASE_INSIGHT and ERROR_RECOVERY to RECOVERY_RECIPE and FAILURE_SIGNAL.
  • 3-pass consolidation — Dedup, similarity merge (>80% threshold), age prune (90 days, zero access). Triggered by the learning maintenance scheduler after session-close extraction and indexing.
  • Workspace isolation — SHA-256 hashed paths under ~/.aceclaw/workspaces/. No cross-project leakage.
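
The 3-pass consolidation described above can be sketched as follows. Only the pass order, the 80% threshold, and the 90-day zero-access rule come from this README. `MemoryEntry`, `Consolidator`, the merge rule (keep the first entry, sum access counts), and the use of Jaccard word overlap as the similarity measure are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical entry shape; the real entries carry more fields.
record MemoryEntry(String text, Instant createdAt, int accessCount) {}

final class Consolidator {
    static final double SIMILARITY_THRESHOLD = 0.80;
    static final Duration MAX_AGE = Duration.ofDays(90);

    static List<MemoryEntry> consolidate(List<MemoryEntry> entries, Instant now) {
        // Pass 1: dedup on normalized text; the first occurrence wins.
        Map<String, MemoryEntry> unique = new LinkedHashMap<>();
        for (MemoryEntry e : entries) {
            unique.putIfAbsent(normalize(e.text()), e);
        }

        // Pass 2: fold near-duplicates (similarity > 80%) into the earlier entry.
        List<MemoryEntry> merged = new ArrayList<>();
        for (MemoryEntry candidate : unique.values()) {
            boolean absorbed = false;
            for (int i = 0; i < merged.size() && !absorbed; i++) {
                MemoryEntry kept = merged.get(i);
                if (jaccard(kept.text(), candidate.text()) > SIMILARITY_THRESHOLD) {
                    merged.set(i, new MemoryEntry(kept.text(), kept.createdAt(),
                            kept.accessCount() + candidate.accessCount()));
                    absorbed = true;
                }
            }
            if (!absorbed) merged.add(candidate);
        }

        // Pass 3: prune entries older than 90 days that were never accessed.
        Instant cutoff = now.minus(MAX_AGE);
        merged.removeIf(e -> e.accessCount() == 0 && e.createdAt().isBefore(cutoff));
        return merged;
    }

    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
    static double jaccard(String a, String b) {
        Set<String> wa = new HashSet<>(Arrays.asList(normalize(a).split(" ")));
        Set<String> wb = new HashSet<>(Arrays.asList(normalize(b).split(" ")));
        Set<String> union = new HashSet<>(wa);
        union.addAll(wb);
        Set<String> intersection = new HashSet<>(wa);
        intersection.retainAll(wb);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }
}
```

Running the cheap exact-match pass first shrinks the input to the quadratic similarity pass, and pruning last means access counts merged in pass 2 can save an old entry from deletion.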

See Memory System Design for the full architecture.

Context Engineering

AceClaw actively manages what goes into the context window to keep long-running sessions effective:

User query → RequestFocus (symbol/file/plan extraction)
                ↓
System prompt → ContextAssemblyPlan (8-tier budget, priority ranking)
                ↓
Conversation  → Request-time pruning (transient, non-destructive)
                ↓
                → Context compaction (3-phase: prune → summarize → memory flush)
                ↓
Candidates    → CandidateStore (DRAFT → PROMOTED → IN_USE → ARCHIVED)

| Component | What it does |
| --- | --- |
| SystemPromptBudget | Enforces 150K total char cap and 20K per-tier cap; truncates lowest-priority tiers first (70% head / 20% tail / 10% marker) |
| ContextAssemblyPlan | Assembles the 8-tier memory hierarchy into a single system prompt, applying budget and priority ordering |
| RequestFocus | Extracts symbols, file paths, and plan signals from each user query to boost relevant context sections |
| MessageCompactor.pruneForRequest() | Produces a transient pruned copy of conversation for the LLM request without mutating session history |
| ContextEstimator | Tracks token usage from API responses; triggers 3-phase compaction at 85% of effective context window |
| CandidateStore | Manages memory candidate lifecycle (draft → promoted → in-use → archived) with exponential decay scoring |
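
One plausible reading of the 70% head / 20% tail / 10% marker split used by SystemPromptBudget is sketched below. The interpretation (percentages of the cap, with the marker squeezed into the remaining 10%), the marker text, and the `TierTruncator` name are assumptions, not the actual implementation.

```java
// Hypothetical helper showing a head/tail truncation with a marker in between.
final class TierTruncator {
    static final String MARKER = "\n[... truncated ...]\n";

    // Returns text unchanged if it fits; otherwise keeps the first 70% of the cap,
    // the last 20% of the cap, and a marker that fits in the remaining 10%.
    static String truncate(String text, int cap) {
        if (text.length() <= cap) {
            return text;
        }
        int head = cap * 70 / 100;
        int tail = cap * 20 / 100;
        int markerBudget = cap - head - tail;
        String marker = MARKER.length() <= markerBudget
                ? MARKER
                : MARKER.substring(0, markerBudget);
        return text.substring(0, head)
                + marker
                + text.substring(text.length() - tail);
    }
}
```

Keeping both ends reflects a common heuristic: the start of a section usually carries its framing, and the end carries the most recent material.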

Observability — The /context CLI command calls context.inspect over JSON-RPC and displays: system prompt share percentage, per-section char/token counts, inclusion reasons, active file paths, and injected candidate IDs.
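
For readers curious what a JSON-RPC call over a Unix Domain Socket looks like from Java, here is a minimal client sketch using the JDK's built-in UDS channels. Only the method name `context.inspect` comes from this README. The `UdsJsonRpcClient` class, the socket path passed by the caller, and the newline-delimited framing are assumptions; the daemon's actual wire format may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical client; not the AceClaw CLI's real transport code.
final class UdsJsonRpcClient {
    // Builds a parameterless JSON-RPC 2.0 request.
    // The method name must not contain characters that need JSON escaping.
    static String request(int id, String method) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id + ",\"method\":\"" + method + "\"}";
    }

    // Sends one request line and reads one response line (assumed framing).
    static String call(Path socketPath, String requestJson) throws IOException {
        UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketPath);
        try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX)) {
            channel.connect(address);

            ByteBuffer out = ByteBuffer.wrap(
                    (requestJson + "\n").getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }

            ByteArrayOutputStream response = new ByteArrayOutputStream();
            ByteBuffer in = ByteBuffer.allocate(4096);
            while (channel.read(in) != -1) {
                in.flip();
                while (in.hasRemaining()) {
                    byte b = in.get();
                    if (b == '\n') {
                        return response.toString(StandardCharsets.UTF_8);
                    }
                    response.write(b);
                }
                in.clear();
            }
            // Peer closed the connection before sending a newline.
            return response.toString(StandardCharsets.UTF_8);
        }
    }
}
```

A call would look like `UdsJsonRpcClient.call(socketPath, UdsJsonRpcClient.request(1, "context.inspect"))`, where `socketPath` points at the daemon's socket file. Because the transport is a filesystem socket, access is governed by file permissions rather than by network exposure.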

See Context Engineering for the full architecture.

Quick Start

One-Line Install

curl -fsSL https://raw.githubusercontent.com/xinhuagu/AceClaw/main/install.sh | sh

Downloads the latest pre-built release, extracts it to ~/.aceclaw/, and adds the commands to your PATH. Requires only a Java 21 runtime (no build tools).

Configure & Run

export ANTHROPIC_API_KEY="sk-ant-api03-..."
aceclaw                # Start AceClaw (auto-starts daemon)

Or use OAuth (auto-discovered from Claude CLI credentials):

claude                 # Login via Claude CLI first
aceclaw                # Token auto-refreshes from Keychain

Commands

All commands are installed by install.sh. Every command that accepts [provider] switches the LLM backend for that session.

| Command | What it does |
| --- | --- |
| aceclaw | Start AceClaw TUI (auto-starts daemon if not running) |
| aceclaw-tui [provider] | Open another TUI window; never restarts daemon, safe for multi-session |
| aceclaw-restart [provider] | Stop daemon + restart with fresh build (warns if sessions active) |
| aceclaw-update | Update to latest release (refuses if sessions active) |

Supported providers: anthropic (default), copilot, openai, openai-codex, ollama, groq, together, mistral

Daemon Management

The daemon is a persistent JVM process that runs in the background. It auto-starts when you run aceclaw, but can be managed directly:

aceclaw daemon start              # Start daemon in background
aceclaw daemon start -p copilot   # Start background daemon with provider override
aceclaw daemon start --foreground # Start daemon in foreground (for debugging)
aceclaw daemon stop     # Gracefully stop daemon
aceclaw daemon status   # Show health, version, model, active sessions

Switching Providers

Pass the provider name as an argument to any launch command:

# Release install (symlinked commands)
aceclaw-restart copilot       # Restart daemon with GitHub Copilot
aceclaw-tui ollama            # Open TUI against local Ollama (no daemon restart)
aceclaw-restart anthropic     # Switch back to Anthropic Claude

# Or via environment variable (works with any command)
ACECLAW_PROVIDER=groq aceclaw

Provider Authentication

# Anthropic — API key or OAuth
export ANTHROPIC_API_KEY="sk-ant-api03-..."     # API key in env
# Or add to ~/.aceclaw/config.json: {"apiKey": "sk-ant-api03-..."}
# Or login via Claude CLI for OAuth token auto-refresh

# GitHub Copilot — uses your existing subscription
aceclaw-restart copilot                         # No extra key needed

# OpenAI / OpenAI Codex
export OPENAI_API_KEY="sk-..."
aceclaw-restart openai
# Or OAuth for Codex:
aceclaw models auth login --provider openai-codex
aceclaw-restart openai-codex

# Ollama (local, offline, no key needed)
aceclaw-restart ollama

# Groq / Together / Mistral
export OPENAI_API_KEY="gsk_..."                 # Provider-specific key
aceclaw-restart groq

See Provider Configuration for full setup details.

Build from Source (Developers)

git clone https://github.com/xinhuagu/AceClaw.git && cd AceClaw
./gradlew clean build && ./gradlew :aceclaw-cli:installDist
./aceclaw-cli/build/install/aceclaw-cli/bin/aceclaw-cli

Development scripts (from git checkout only — same provider argument support):

| Script | What it does |
| --- | --- |
| ./dev.sh [provider] | Rebuild + restart daemon + auto-benchmark on feature branches |
| ./restart.sh [provider] | Rebuild + restart daemon (no benchmarks, fastest restart) |
| ./tui.sh [provider] | Open TUI window (no restart, no rebuild if binary exists) |

./dev.sh                    # Default: anthropic, with benchmarks on feature branches
./dev.sh --no-bench copilot # Copilot, skip benchmarks
./restart.sh ollama         # Quick restart with Ollama
./tui.sh                    # Attach to running daemon

See Multi-Session Model for details on running multiple TUI windows.

Platform Support

| Platform | Status | IPC | CI Gate |
| --- | --- | --- | --- |
| Linux | Fully supported | AF_UNIX | pre-merge-check: full test suite (required) |
| macOS | Fully supported | AF_UNIX | platform-smoke: build + cross-platform tests (required) |
| Windows 10 1803+ | Experimental | AF_UNIX (JEP 380) | platform-smoke: build + cross-platform tests (required) |

All three platform checks are required for merging to main. Windows requires Java 21 runtime and Windows 10 version 1803 or later (for AF_UNIX socket support). See Windows UDS Spike for technical details.

Tech Stack

Java 21 (preview features) · Gradle 8.14 · Picocli 4.7.6 · JLine3 3.27.1 · Jackson 2.18.2 · GraalVM Native Image · JUnit 5

License

Apache License 2.0
