
Add autonomous maintenance daemon with source monitoring and triage dispatch #504

@maxine-at-forecast

Description


Summary

Build an autonomous maintenance daemon for crosslink that monitors external sources (GitHub issues, CI workflows, dependency releases, internal dev chatter, crosslink issues) and dispatches low-hanging-fruit tasks on its own — with a human filter layer for anything beyond the trivial.

This extends the spirit of the /maintain slash command (which runs a one-shot audit pass) into a persistent, event-driven system that continuously triages and acts.

Motivation

The /maintain skill already demonstrates the value of structured codebase maintenance (dependency audit, lint health, dead code, issue hygiene). But it's manual — someone has to remember to run it. Meanwhile, maintenance signals arrive continuously from many sources:

  • GitHub issues: bug reports, feature requests, community feedback
  • CI/CD: flaky tests, build failures, security alerts, Dependabot PRs
  • Dependency ecosystem: new releases, yanked crates, security advisories
  • Crosslink issues: stale issues, orphaned subissues, unlabeled work
  • Dev chatter: Slack/Discord mentions, PR review comments

Today these pile up and get batch-processed during periodic human triage. A daemon that watches these sources and autonomously handles the unambiguous cases (while flagging the rest for human review) would dramatically reduce maintenance toil.

Architecture Questions

Should this be a separate crate?

crosslink is already substantial (~102k lines, 153 files). The daemon would add:

  • Source adapters (GitHub API polling, webhook receivers, RSS/Atom for releases)
  • A rule engine for triage policies
  • Agent dispatch logic (spawning kickoff agents for approved tasks)
  • State management (what's been seen, what's in-flight, what's waiting for human review)
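Concretely, these pieces could hang together roughly as follows — a minimal Rust sketch in which every type and function name (`Event`, `Triage`, `SourceAdapter`, `triage`) is hypothetical, not an existing crosslink API:

```rust
/// An event observed from an external source (GH issue, CI run, advisory, ...).
#[derive(Debug, Clone)]
pub struct Event {
    pub source: String,  // e.g. "github", "ci", "deps"
    pub kind: String,    // e.g. "label_added", "build_failed"
    pub payload: String, // label name, advisory ID, ...
}

/// What the triage layer decided to do with an event.
#[derive(Debug, PartialEq)]
pub enum Triage {
    Dispatch { task: String },     // unambiguous: spawn a kickoff agent
    NeedsHuman { reason: String }, // flag for the human review queue
    Ignore,
}

/// A pluggable source adapter: each source (GitHub, CI, deps, chat)
/// implements this and feeds events into the daemon loop.
pub trait SourceAdapter {
    fn name(&self) -> &str;
    fn poll(&mut self) -> Vec<Event>;
}

/// Minimal hardcoded triage: only the label-based case auto-dispatches;
/// other GitHub events go to a human, everything else is ignored for now.
pub fn triage(event: &Event) -> Triage {
    match (event.source.as_str(), event.kind.as_str()) {
        ("github", "label_added") if event.payload == "agent-todo: replicate" => {
            Triage::Dispatch { task: "kickoff-replicate".into() }
        }
        ("github", _) => Triage::NeedsHuman { reason: "non-trivial GitHub event".into() },
        _ => Triage::Ignore,
    }
}

fn main() {
    let ev = Event {
        source: "github".into(),
        kind: "label_added".into(),
        payload: "agent-todo: replicate".into(),
    };
    assert_eq!(triage(&ev), Triage::Dispatch { task: "kickoff-replicate".into() });
}
```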

Options:

  1. Separate crate in the workspace (crosslink-maintain or crosslink-sentinel): clean boundary, own dependency tree, can evolve independently. Imports crosslink as a library for DB/issue/kickoff APIs.
  2. Module within crosslink: simpler build, shared types directly, but adds to the monolith.
  3. Separate binary that uses crosslink as a library: strongest separation, but requires stabilizing crosslink's internal APIs as a public interface.

Leaning toward (1) — workspace crate with a clear API boundary.

Level of generality

The daemon needs a plugin/adapter architecture for sources and actions. Key question: how generic should the rule engine be?

  • Minimal: hardcoded source→action mappings (e.g., "GH issue with agent-todo: replicate → spawn replication agent"). Fast to build, easy to reason about.
  • Policy-based: declarative rules in a config file (e.g., YAML/TOML policies like "when source=github AND label=agent-todo:replicate, dispatch=kickoff-replicate"). More flexible, but more complex.
  • Full workflow engine: arbitrary DAGs of condition→action steps. Maximum power, maximum complexity.

Suggest starting minimal with an eye toward policy-based as V2.
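For the policy-based V2, rules could become plain data that a config loader populates from YAML/TOML — a sketch assuming a hypothetical `Rule` type and first-match-wins semantics:

```rust
/// A declarative triage rule; `None` fields are wildcards.
#[derive(Debug)]
struct Rule {
    source: Option<&'static str>,
    label: Option<&'static str>,
    dispatch: &'static str, // action name understood by the dispatcher
}

/// First-match-wins over an ordered rule list.
fn first_match<'a>(rules: &'a [Rule], source: &str, label: &str) -> Option<&'a str> {
    rules
        .iter()
        .find(|r| {
            r.source.map_or(true, |s| s == source) && r.label.map_or(true, |l| l == label)
        })
        .map(|r| r.dispatch)
}

fn main() {
    // "when source=github AND label=agent-todo:replicate, dispatch=kickoff-replicate"
    let rules = [
        Rule {
            source: Some("github"),
            label: Some("agent-todo:replicate"),
            dispatch: "kickoff-replicate",
        },
        // Everything else from GitHub lands in the human review queue.
        Rule { source: Some("github"), label: None, dispatch: "human-review" },
    ];
    assert_eq!(
        first_match(&rules, "github", "agent-todo:replicate"),
        Some("kickoff-replicate")
    );
    assert_eq!(first_match(&rules, "github", "bug"), Some("human-review"));
    assert_eq!(first_match(&rules, "ci", "flaky"), None);
}
```

Because the minimal option's hardcoded mappings and these data-driven rules share the same shape (conditions in, action name out), starting minimal doesn't paint us into a corner.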

Concrete Test Case: agent-todo: replicate Dispatch

Current manual flow

  1. User files a GH issue describing a bug or behavior
  2. Human triages and adds agent-todo: replicate label as a signal that this is safe for an agent to attempt
  3. (Today: nothing happens automatically — someone has to notice the label and act)

Proposed automated flow

  1. Daemon polls GH issues (or receives webhook) for label changes
  2. Detects agent-todo: replicate was added to GH issue #N
  3. Creates a crosslink issue linked to GH #N
  4. Spawns a kickoff agent scoped to: "Reproduce the bug described in GH #N, write a failing test, and report findings"
  5. Agent runs in a sandboxed worktree with limited scope (read-only except for test files)
  6. On completion, daemon posts results back to the GH issue as a comment
  7. Human reviews the reproduction and decides next steps
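The state-management question from above suggests tracking each dispatch as an explicit state machine, so the daemon can persist what's in-flight across restarts. A minimal sketch mirroring steps 1-7 (all names hypothetical):

```rust
/// One replicate dispatch, tracked as an explicit state machine.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DispatchState {
    LabelDetected, // steps 1-2: poll/webhook saw `agent-todo: replicate`
    IssueLinked,   // step 3: crosslink issue created and linked to GH #N
    AgentRunning,  // steps 4-5: kickoff agent running in sandboxed worktree
    Reported,      // step 6: results posted back to the GH issue
    AwaitingHuman, // step 7: human reviews the reproduction
}

fn advance(s: DispatchState) -> DispatchState {
    use DispatchState::*;
    match s {
        LabelDetected => IssueLinked,
        IssueLinked => AgentRunning,
        AgentRunning => Reported,
        Reported => AwaitingHuman,
        AwaitingHuman => AwaitingHuman, // terminal until a human acts
    }
}

fn main() {
    let mut s = DispatchState::LabelDetected;
    for _ in 0..4 {
        s = advance(s);
    }
    assert_eq!(s, DispatchState::AwaitingHuman);
}
```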

This is a good first target because:

  • The human filter layer (the agent-todo: label) is already in place
  • The task is well-scoped (write a failing test, not fix the bug)
  • Success/failure is mechanically verifiable (test fails = reproduced)
  • Low blast radius (only touches test files)

Meta-Cognitive Flows for Software Maintenance

Beyond the concrete dispatch case, we should think about what maintenance patterns the daemon should eventually support:

Reactive flows (respond to external events)

  • Bug triage: GH issue → classify severity → attempt reproduction → report
  • Dependency alert: advisory published → check if affected → file issue or auto-patch
  • CI failure: build breaks on main → bisect → file issue with root cause
  • Stale PR: PR open >N days with no activity → ping author or close

Proactive flows (periodic sweeps)

  • Lint drift: run clippy/eslint weekly → fix new warnings → PR
  • Test coverage: measure coverage delta → identify uncovered new code → write tests
  • Doc freshness: check README/CHANGELOG against recent commits → flag stale sections
  • Issue hygiene: close stale issues, merge duplicates, add missing labels
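Proactive flows are essentially interval timers. A std-only sketch of how the daemon's scheduler tick might decide which sweeps are due (names and intervals are illustrative, not existing crosslink config):

```rust
use std::time::{Duration, Instant};

/// A periodic maintenance sweep; names and intervals are illustrative.
struct Sweep {
    name: &'static str,
    interval: Duration,
    last_run: Option<Instant>,
}

impl Sweep {
    fn due(&self, now: Instant) -> bool {
        match self.last_run {
            None => true, // never run: due immediately
            Some(t) => now.duration_since(t) >= self.interval,
        }
    }
}

/// The daemon's scheduler tick: collect whatever is due right now.
fn due_sweeps(sweeps: &[Sweep], now: Instant) -> Vec<&'static str> {
    sweeps.iter().filter(|s| s.due(now)).map(|s| s.name).collect()
}

fn main() {
    let now = Instant::now();
    let sweeps = [
        Sweep { name: "lint-drift", interval: Duration::from_secs(7 * 86_400), last_run: None },
        Sweep { name: "issue-hygiene", interval: Duration::from_secs(86_400), last_run: Some(now) },
    ];
    // Only the never-run sweep is due right now.
    assert_eq!(due_sweeps(&sweeps, now), vec!["lint-drift"]);
}
```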

Learning flows (improve over time)

  • Track dispatch outcomes: which agent-todo tasks succeeded vs failed?
  • Tune thresholds: adjust what counts as "low-hanging fruit" based on historical success rate
  • Surface patterns: "tests in module X break 3x more often than average" → file structural issue
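Outcome tracking could start as simple per-task success counts; a sketch with a hypothetical `Outcomes` type, where the "low-hanging fruit" threshold is then tuned against accumulated data:

```rust
use std::collections::HashMap;

/// Running success/failure counts per task kind, e.g. "kickoff-replicate".
#[derive(Default)]
struct Outcomes {
    counts: HashMap<String, (u32, u32)>, // (successes, failures)
}

impl Outcomes {
    fn record(&mut self, task: &str, success: bool) {
        let e = self.counts.entry(task.to_string()).or_insert((0, 0));
        if success { e.0 += 1 } else { e.1 += 1 }
    }

    /// Historical success rate, or None if the task has never been dispatched.
    fn success_rate(&self, task: &str) -> Option<f64> {
        self.counts.get(task).map(|&(s, f)| f64::from(s) / f64::from(s + f))
    }

    /// "Low-hanging fruit" = succeeds at least `threshold` of the time.
    fn is_low_hanging(&self, task: &str, threshold: f64) -> bool {
        self.success_rate(task).map_or(false, |r| r >= threshold)
    }
}

fn main() {
    let mut o = Outcomes::default();
    o.record("kickoff-replicate", true);
    o.record("kickoff-replicate", true);
    o.record("kickoff-replicate", false);
    // 2/3 ≈ 0.67: above a 0.6 bar, below 0.8.
    assert!(o.is_low_hanging("kickoff-replicate", 0.6));
    assert!(!o.is_low_hanging("kickoff-replicate", 0.8));
}
```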

Relationship to Existing Code

  • daemon.rs: Current daemon is a simple flush-interval loop for hydration. The maintenance daemon would either extend this or run as a separate process.
  • /maintain skill (resources/claude/commands/maintain.md): One-shot audit. The daemon's proactive flows are essentially /maintain sections running on a schedule.
  • crosslink kickoff: The dispatch mechanism — daemon spawns agents via kickoff for approved tasks.
  • GH #23 (Future: kickoff workflow expansions & VDD-IAR integration): Related future work on multi-agent coordination and remote execution.
  • GH #498 (Epic: Codebase maintainability audit — structural refactors for long-term health): Structural refactors that would make the codebase easier for the daemon to maintain.

Open Questions

  • What should the daemon be called? (crosslink sentinel, crosslink patrol, crosslink maintain daemon, etc.)
  • Should source adapters be compiled-in or dynamically loaded (WASM plugins, shared libs)?
  • How should the human filter layer work beyond labels? (Slack approval buttons? TUI approval queue? Web dashboard?)
  • What's the trust model for daemon-spawned agents? Should they have reduced permissions by default?
  • How does this interact with the existing crosslink trust system?
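On the trust-model question, one possible shape is a reduced-by-default capability set attached to each daemon-spawned agent, widened only by explicit human approval. Purely illustrative — not an existing crosslink mechanism:

```rust
/// A reduced-by-default capability set for daemon-spawned agents.
/// Purely illustrative — not an existing crosslink mechanism.
#[derive(Debug, Clone, PartialEq)]
struct AgentCaps {
    write_globs: Vec<String>, // paths the agent may modify
    network: bool,            // may reach beyond the worktree?
    can_push: bool,           // may push branches / open PRs?
}

/// Default for the replicate flow: read-only except test files.
fn replicate_default() -> AgentCaps {
    AgentCaps {
        write_globs: vec!["tests/**".to_string()],
        network: false,
        can_push: false,
    }
}

fn main() {
    let caps = replicate_default();
    assert!(!caps.network && !caps.can_push);
    assert_eq!(caps.write_globs, vec!["tests/**".to_string()]);
}
```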

Suggested Milestones

  1. V0: Architecture design doc + separate crate skeleton + GH webhook adapter
  2. V1: agent-todo: replicate end-to-end flow (poll → dispatch → report)
  3. V2: Policy engine + 2-3 more source adapters (CI, dependencies)
  4. V3: Proactive flows (scheduled maintenance sweeps)
  5. V4: Learning flows (outcome tracking, threshold tuning)

Metadata

Labels: brainstorm (Thinking about things that could be cool—in a noncommittal way!), enhancement (New feature or request), feature (New feature), ✨future✨ (Items that are not on the immediate docket but stashed for a later point)
