Open
Conversation
Apply all approved changes from spec analysis review: - Fix API naming: delegate() is async, delegate_sync() wrapper (§5.10, §5.8) - Add ValidationResult dataclass (§7.3) - Align spawn/delegate terminology (§5.7) - Clarify SkillManager.register() accepts Skill/SkillInfo/Path (§5.2) - Remove unused permission-mode field from markdown config (§5.6) - Document hooks field as reserved, not v1-ready (§7.3) - Add reference loading API specification (§5.1) - Rename overloaded context → execution_mode (§7.3) - Clarify thread safety with asyncio.Lock (§5.2) - Specify skill deactivation via deactivate() method (§5.3) - Explain skills-ref tool with fallback strategy (§3.2, §6.4, §9.1) - Fix PrivateAttr syntax for _tools field (§7.3) - Add Release Plan section (§11) Generated analysis reports: .analysis.md, .analysis.html Spec bumped to v1.1
…h integration Add comprehensive skills and subagents subsystems to mamba-agents framework: Skills System: - SKILL.md file format with YAML frontmatter and two-tier loading (metadata, full) - Skill discovery from project/user/custom directories with priority levels - Skill registry with sync/async register/deregister/get/list/has methods - Skill validation with frontmatter parsing and trust level enforcement - Skill invocation with argument substitution ($ARGUMENTS, $N syntax) - SkillManager facade composing all skill subsystem components - SkillTestHarness testing utility for skill validation without LLM Subagents System: - Markdown-based subagent configuration in .mamba/agents/ directory - Agent spawner with context isolation, model inheritance, tool allowlist/denylist - Sync and async delegation with DelegationHandle for fire-and-forget execution - No-nesting enforcement to prevent sub-subagent creation - SubagentManager facade with config lifecycle and token aggregation - Per-subagent token usage tracking integrated with parent agent Integration: - Bi-directional skills-subagents wiring - Skill pre-loading into subagent system prompts - Skill tools registered with spawned subagents - Skill fork mode (context: fork) triggers subagent delegation - Circular reference detection for skill-subagent cycles Agent Facade Extensions: - Constructor params: skills, skill_dirs, subagents - Skill methods: register_skill, get_skill, list_skills, invoke_skill - Subagent methods: delegate, delegate_sync, delegate_async, register_subagent, list_subagents - Lazy initialization of SkillManager and SubagentManager Testing: - 2391 tests across 28 tasks (0 failures, 0 retries) - 92%+ coverage for skills/ and subagents/ modules - Unit, integration, and cross-cutting test scenarios - Skills-subagents integration tests with nested workflows Documentation: - Updated CLAUDE.md with architecture, file locations, design patterns - Updated CHANGELOG.md with feature entries - Updated root package __init__.py with new public exports Project Artifacts: - 76 files changed, 19549 insertions
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
…findings - Update codebase-analysis.md to reflect 7 subsystems (now includes SkillManager, SubagentManager) - Add new fragility points: async workaround in skills/integration.py, lazy property side effects - Refine architecture overview with updated file locations and critical files table - Add 7 actionable recommendations for hardening message_utils.py, TokenCounter handling, observability tests
…tal subsystems Add 10 new pages covering Skills and Subagents subsystems: - User guides: SKILL.md format, discovery, trust model, invocation, delegation patterns - API references: SkillManager/SubagentManager facades, configuration, error handling - Progressive disclosure architecture, fork execution mode, token tracking Update 8 existing pages: - Agent Basics: add Skills and Subagents sections with examples - Architecture: update module tree and mermaid diagrams - API Index: add Skills/Subagents imports and module table - Configuration: add SkillConfig reference section - Agent facade methods: add skill/subagent methods to mkdocstrings Add codebase analysis report: skills-subagents architecture, integration, challenges. All new pages marked experimental with warning admonitions. Nav updated with Skills and Subagents sections.
…t.run() Add comprehensive documentation explaining that skills are a content-preparation system separate from the pydantic-ai tool system. Skills cannot be invoked autonomously by the model during agent.run() in the current version. Changes: - Update Invocation Sources table to clarify MODEL as reserved infrastructure - Add "Skills and agent.run()" section with current behavior and workarounds - Document two practical patterns: prompt injection and manual tool bridging - Update CLAUDE.md Known Fragility Points and Skills System notes - Clarify that InvocationSource.MODEL and Skill._tools are unreachable Addresses implementation gap identified in skills subsystem analysis.
Add two comprehensive documents for v2 planning: - internal/reports/skills-subagents-v2-analysis.md: Deep analysis of v1 implementation identifying critical integration gaps, dead code, and fragile patterns (async bridging, tool resolution on private internals, missing wiring between subsystems) - internal/specs/skills-and-subagents-SPEC-v2.md: v2 specification addressing v1 issues with improved integration, explicit requirements, and implementation patterns These documents inform v2 feature prioritization and architecture decisions.
…n and async-first design ## Overview Complete implementation of the Skills & Subagents v2 specification across 13 parallel execution waves. Resolves 5 of 9 known fragility points through architectural decoupling, async-first design, and explicit initialization patterns. ## Major Changes ### Core Architecture (Task #18) - Remove bidirectional SkillManager ↔ SubagentManager dependency - Agent facade becomes sole mediator via skills/integration.py - SubagentManager accepts skill_registry instead of skill_manager - SkillManager.subagent_manager property removed entirely ### Public APIs (Task #17) - Add UsageTracker.record_subagent_usage(name, usage) public method - Encapsulates private _subagent_totals mutation logic - Enable clean subagent usage tracking without internal access ### Async-First Design (Tasks #22, #23, #25) - Make invoke_skill() async def, add invoke_skill_sync() wrapper for sync contexts - Redesign activate_with_fork() as fully async (removes ThreadPoolExecutor/asyncio.run) - Get skill function callback for circular detection without bidirectional references ### Explicit Initialization (Task #20) - Replace lazy property pattern with init_skills()/init_subagents() methods - Properties raise AttributeError when not initialized (with guidance) - Add has_skill_manager/has_subagent_manager checks without side effects - Double-init is idempotent, subagents blocked from init ### Tool System (Tasks #19, #24, #25) - Move ReAct final_answer registration from __init__ to run() with try/finally cleanup - Register invoke_skill pydantic-ai tool when skills initialized - Update tool description dynamically when skills registered/deregistered - Add deregister_skill() method with tool cleanup ### Regression Tests & Coverage (Tasks #28, #29) - 6 regression tests covering all resolved fragility points (13 test methods) - Add 26 tests to ReAct workflow bringing coverage 58% → 99% - Verify all 8 changed modules >= 80% coverage (avg 95%) - All modules now meet >= 80% threshold ### Documentation (Task #29) - Remove 5 resolved fragility points from CLAUDE.md - Update 1 partially resolved (SubagentManager/UsageTracker) - Keep 3 unresolved (message_utils, TokenCounter, pydantic-ai version) - Update Implementation Notes with v2 patterns ## Test Results - **Total tests**: 2,494 passing (172 new) - **New test files**: test_agent_invoke_skill_tool.py (31 tests), test_regression_fragility_points.py (13 tests) - **Modified test files**: 13 files updated for new API signatures - **Coverage**: All 8 target modules >= 80% (max 99% in subagents/manager.py and workflow.py) - **Zero failures, zero retries across all 13 tasks** ## Files Modified **Core (6 files)** - src/mamba_agents/agent/core.py - Explicit init, async invoke_skill, tool registration - src/mamba_agents/skills/integration.py - Async activate_with_fork, sole mediator - src/mamba_agents/skills/manager.py - Remove subagent_manager property - src/mamba_agents/subagents/manager.py - Accept skill_registry - src/mamba_agents/tokens/tracker.py - Add record_subagent_usage() - src/mamba_agents/workflows/react/workflow.py - Tool save/restore with try/finally **Tests (13 files)** - tests/unit/test_agent_invoke_skill_tool.py (NEW) - 31 tests for invoke_skill tool - tests/unit/test_regression_fragility_points.py (NEW) - 13 regression tests - tests/unit/test_workflows/test_react_workflow.py - 26 new tests for coverage - Plus 10 other test files updated for new API signatures **Docs (2 files)** - CLAUDE.md - Fragility points and implementation notes - docs/user-guide/subagents.md - Minor updates ## Wave Execution Summary | Wave | Tasks | Status | Duration | |------|-------|--------|----------| | 1 | #17,#18,#19 | 3/3 PASS | 12m 48s (longest) | | 2 | #20,#22,#27 | 3/3 PASS | 11m 30s | | 3 | #21,#23 | 2/2 PASS (verification only) | 1m 52s | | 4 | #24,#26 | 2/2 PASS | 5m 59s | | 5 | #25,#28 | 2/2 PASS | 2m 16s | | 6 | #29 | 1/1 PASS | 6m 32s | ## Fragility Points Resolved 1. **ReAct tool mutation** ✓ - Final answer registered in run() with cleanup 2. **Circular initialization** ✓ - Mediator pattern eliminates dependency 3. **Async workaround** ✓ - Native async throughout 4. **Lazy init side effects** ✓ - Explicit init_skills()/init_subagents() 5. **Skills not in agent.run()** ✓ - Invoke_skill tool registered, model can call Remaining fragility points (unchanged): - Message utils type-name matching (not in scope) - Duplicate TokenCounter in CompactionStrategy (not in scope) - pydantic-ai version sensitivity (ongoing concern) See CLAUDE.md for full fragility point analysis.
…de readability; add code-reviewer skill documentation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.