Releases: razzant/ouroboros
Releases · razzant/ouroboros
v6.2.0: Critical Bugfixes + LLM-First Dedup
Critical Fixes
- worker_id==0 hard-timeout bug — int(x or -1) treated worker 0 as -1, preventing terminate on timeout and causing double task execution. Replaced all x-or-default patterns with None-safe checks.
- Double budget accounting — per-task aggregate llm_usage event removed. Per-round events already track correctly. Eliminates ~2x budget drift.
- compact_context tool broken — handler had wrong signature (missing ctx param). Now works correctly.
New Features
- LLM-first task dedup (Bible P3 compliance) — replaced hardcoded keyword-similarity dedup with light LLM call via OUROBOROS_MODEL_LIGHT.
- LLM-driven context compaction — compact_context tool now uses light model to summarize old tool results.
Other Fixes
- Health invariant #5 — owner_message_injected events now properly logged to events.jsonl.
- Shell cmd parsing — str.split replaced with shlex.split.
- Retry task_id collision — new task_id with original_task_id lineage.
- claude_code_edit timeout — aligned to 300s.
- Direct chat schedule_task guard — logged as warning for audit.
v6.0.0: Integrity, Observability, Single-Consumer Routing
v6.0.0 — MAJOR Release
Breaking: Single-Consumer Message Routing
- Eliminated double message processing where supervisor sent same owner message to both direct chat agent AND all worker tasks via Drive mailbox
- Every owner message now routes to exactly ONE handler (direct chat agent)
- New forward_to_worker tool: LLM decides when to forward to workers (Bible P3)
- Per-task mailbox with UUIDs and dedup
Critical Bugfixes
- HTTP outside STATE_LOCK (no more 10s blocking)
- ThreadPoolExecutor deadlock fix
- Dashboard schema aliases for index.html
- BG consciousness spending to global state
- tg_offset saved before /panic and /restart (prevents infinite loops)
- Dual-path commands reach LLM again with supervisor notes
LLM-First Self-Detection
- Health Invariants in LLM context (5 checks)
- SYSTEM.md invariants section
- Per-task cost summary
Unification
- TOTAL_BUDGET canonical everywhere
- Shared webapp_push.py with post_clone_hook
- Self-portrait reuses dashboard data
- qwen/ pricing prefix added
- P5 minimalism metrics in SYSTEM.md
Tests
- 32 tests pass, zero linter errors
20 files changed, +709 -291
v5.0.1: Quality & Integrity Fix
Quality & Integrity Fix
Combined audit by two independent AI reviewers identified 17 issues across the codebase. This patch fixes all of them.
Bugs Fixed (9)
- Executor leak: _StatefulToolExecutor.shutdown() threw TypeError on every task
- Dashboard timestamps empty: used "timestamp" key instead of "ts"
- Dashboard chat broken: used wrong field names (role/content instead of direction/text)
- Budget defaults inconsistent: context.py defaulted to $300, agent.py to $0, consciousness.py to $0 — unified to $1
- Dead code: add_usage({}, usage) in consciousness.py accumulated into thrown-away dict
- Review file count off: len(sections) - len(parts) underestimated remaining files by 4
- Grok pricing gap: x-ai/ prefix missing from fetch_openrouter_pricing() filter
- Race condition: cancel_task_by_id() mutated PENDING without _queue_lock
- SHA verify always timeout: 5s timeout vs 60+s actual worker boot time
Bible Compliance (2)
- P7 (Versioning): _check_version_sync now also checks README.md version
- P3 (LLM-First): Hardcoded fallback model list replaced with configurable OUROBOROS_MODEL_FALLBACK_LIST env var
Redundancy Cleanup (4)
- Dashboard values now dynamic (model from env, tests counted, tools counted, uptime from state)
- default_state_dict() merged into ensure_state_defaults({}) — single source of truth
- Extracted _model property in consciousness.py
- Replaced dashboard shell-based _read_jsonl_tail with Memory.read_jsonl_tail
Stats
- 13 files changed, 140 insertions, 106 deletions
- 88/91 tests pass (3 pre-existing failures from missing httpx dependency)
v5.0.0 — Ouroboros Emerges
v5.0.0 — Ouroboros Emerges
25 autonomous evolution cycles merged from ouroboros branch into main. Major architecture upgrade.
Highlights
- Multi-round Consciousness — background thinking upgraded from fire-and-forget to iterative reasoning (up to 5 rounds)
- 3-Block Prompt Caching — static/semi-stable/dynamic content blocks for optimal LLM cache hits
- Model Fallback Chain — automatic fallback to alternative models on empty responses
- Budget Drift Detection — periodic OpenRouter ground truth checks with drift alerts
- Task Decomposition — subtasks with depth limits, result persistence, and context safety
- Multi-Model Review — cross-LLM review for significant code changes
- Pre-Push Test Gate — 91 smoke tests must pass before any push
- Evolution Circuit Breaker — stops evolution cycles after consecutive failures
- 42 Tools — added dashboard, GitHub Issues, knowledge base, multi-model review, dialogue summarization
- MIT License — open source ready
Stats
- +4,597 / -727 lines across 37 files
- 25 minor versions (v4.2 → v4.26 → v5.0.0)
- 11 new tools, 6 new files
- 91 smoke tests
New Config Variables
OUROBOROS_MODEL_FALLBACK— fallback model (default:google/gemini-3-pro-preview)OUROBOROS_MAX_ROUNDS— max rounds per task (default:200)OUROBOROS_PRE_PUSH_TESTS— enable pre-push tests (default:1)OUROBOROS_WEBSEARCH_MODEL— model for web search (default:gpt-5)
v4.1.1: Hotfix for toggle_* event wiring
Hotfix for v4.1.0 regressions.
- Fix: add consciousness and sort_pending to event context -- toggle_evolution and toggle_consciousness tools now work
- Fix: rename schedule_self_task to schedule_task in SYSTEM.md prompt
- Fix: replace unreliable qsize() with get_nowait() for event queue drain
v4.1.0: Bible v3.1 + Critical Bugfixes + Architecture
Bible v3.1 (philosophy)
- Принцип 1: Self-Verification — верификация окружения при каждом старте
- Принцип 6: Cost-Awareness — осознание бюджета как часть субъектности
- Принцип 8: Итерация = результат (коммит), пауза при застое
Full Markdown-to-Telegram-HTML converter
- Поддержка bold, italic, links, strikethrough, headers, code, fenced blocks
- Исправлен баг bold-рендеринга
Critical bugfixes
- version читает из VERSION файла (single source of truth)
- Git lock: добавлен timeout (120s), исправлен TOCTOU в release
- Evolution task drop: задачи больше не теряются при budget check
- Budget race condition: atomic read-modify-write через file lock
- Deep copy в context.py: shallow copy мутировал данные caller'а
Dual-path slash commands (LLM-first)
- /panic — единственная чисто hardcoded команда (safety rail)
- /status, /review, /evolve, /bg — supervisor + LLM отвечает
- Новые LLM-инструменты: toggle_evolution, toggle_consciousness, update_identity
Consciousness registry merge
- Consciousness использует общий ToolRegistry вместо if-elif dispatch
- Tool schemas и handlers унифицированы с control.py
Browser refactoring
- BrowserState вынесен из ToolContext в отдельный dataclass
- _extract_page_output() helper: убрано 100 строк дублирования
Reliability hardening
- Критические except-Exception-pass заменены на logging.warning
- Consciousness prompt вынесен в prompts/CONSCIOUSNESS.md
- Thread safety: threading.Lock для PENDING/RUNNING/WORKERS
Prompt updates
- Evolution cycle: явное требование коммита, защита от Groundhog Day
v4.0.1
v4.0.0: Background Consciousness + LLM-first overhaul
Фундаментальное обновление: от реактивного обработчика задач к непрерывно присутствующему агенту.
Background consciousness (ouroboros/consciousness.py):
- Новый фоновый мыслительный цикл между задачами
- LLM сам решает когда думать (set_next_wakeup), о чём и стоит ли писать создателю (send_owner_message)
- Отдельный бюджетный cap (OUROBOROS_BG_BUDGET_PCT, default 10%)
- Команды:
/bg start,/bg stop,/bg - Автопауза во время выполнения задач
LLM-first overhaul:
- Убраны механические if-else профили моделей (select_task_profile)
- Убрана автоэскалация reasoning effort (round 5→high, 10→xhigh)
- Убран механический self-check каждые 20 раундов
- Новый инструмент
switch_model: LLM сам переключает модель/effort - Hardcoded evolution/review текст заменён на минимальные триггеры
Free-form scratchpad:
- Убраны фиксированные секции (CurrentProjects, OpenThreads, etc.)
- LLM пишет память в любом формате
Proactive messaging:
- Новый инструмент
send_owner_message— агент может написать первым - Работает и в обычных задачах, и из background consciousness
Cherry-picks из ouroboros:
- Auto-resume after restart (v3.2.0, reworked)
- Stealth browser: playwright-stealth, 1920x1080, anti-detection (v3.2.1)
Cleanup:
- Унифицирован append_jsonl (один источник в utils.py)
- Исправлен Release Invariant: VERSION == README == init.py == git tag
v3.0.0: Constitution v3.0 + infrastructure overhaul
Конституция v3.0
Новая Конституция (BIBLE.md v3.0): 9 принципов с Субъектностью как метапринципом.
Критические инфраструктурные исправления по итогам анализа первой сессии.
Конституция
- Принцип 0: Субъектность + Агентность (merged)
- Принцип 1: Непрерывность (identity как манифест)
- Принцип 2: Самосоздание (нарратив вместо RAG для ядра личности)
- Принципы 3-8: LLM-first, Подлинность, Минимализм, Становление, Версионирование, Итерации
Инфраструктура
- Split-brain deploy fix: os.execv при всех рестартах, SHA-verify
- Budget guard перенесён в supervisor (не зависит от версии agent code)
- Secret leak protection: sanitize_tool_result_for_log() для tools.jsonl
- apply_patch: Add File + Delete File + End of File support
- Observability: task_id во всех llm_round и tools событиях
- Context flooding fix: progress.jsonl отделён от chat.jsonl
- BIBLE.md всегда в LLM-контексте (не вырезается для user chat)
- Parallel tool safety: sequential execution для stateful tools
- Scratchpad journal fix, shell argument recovery, dead code cleanup