Update debug-agent-tests skill: log review as verification on every run #3

@krisrowe

Description

Problem

The debug-agent-tests skill treats debug logs as a troubleshooting tool for failures. It should treat them as a verification tool for every run.

The primary verification mechanism will be attribution metadata and scan input reporting in the agent's structured output (#2 covers adding per-finding source fields and a scan_inputs block). Tests will assert programmatically on why the agent flagged something and what inputs it used. But log review remains essential as a secondary check: structured output reports what the agent says it did, logs show what it actually did. They catch things the output can't reveal — the agent reading the wrong PERSON.md, escaping the test repo scope, the parent agent leaking context, or excessive/misdirected tool usage.

See CONTRIBUTING.md "Subagent containment principle" for the containment model: findings MUST include matched values (the parent needs them to fix the issue and is already exposed to them via the repo), but scan targets (the full universe of values checked from PERSON.md) MUST NOT appear in output — those may include values the parent has never seen.
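The containment rule lends itself to a mechanical check. A minimal sketch — all identifiers, values, and the output shape below are illustrative assumptions, not the skill's real schema:

```python
def leaked_scan_targets(output_text, scan_targets, matched_values):
    """Hypothetical containment check: matched values may appear in agent
    output (the parent is already exposed to them via the repo), but
    unmatched scan targets from PERSON.md must never leak into output."""
    return [t for t in scan_targets
            if t in output_text and t not in matched_values]

# Illustrative values only.
output = "Finding: phone 555-0100 in README.md (also checked jane@example.com)"
leaks = leaked_scan_targets(output, ["555-0100", "jane@example.com"], ["555-0100"])
print(leaks)  # the unmatched email leaked into output: ['jane@example.com']
```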

Why passing tests need verification

The privacy-guard agent runs as a subagent. Multiple sources of information can influence its behavior beyond what the test controls:

  • Parent agent context leakage: The parent agent or user session may inject context gathered from the broader environment — the real PERSON.md, prior conversation history, iterative debugging across runs. A test for "sparse config" that passes because the parent filled in the gaps is not testing the agent.
  • Real config bleed-through: The agent might read ~/.config/ai-common/PERSON.md from the test machine instead of the fixture. A "no PERSON.md" test could pass while the agent quietly used the real one.
  • Scope escape: The agent might read files outside the temp repo, call gh against real remotes, access sibling directories, or discover OS environment context the test didn't plant.
  • Excessive or misdirected tool usage: The agent might make 50 tool calls when 5 would suffice, scan directories it shouldn't know about, or perform redundant work that catches the right value by accident.
  • Right value, wrong reason: The agent might flag a planted value because it matched a substring of something else, appeared in the agent's own error output, or was found through a reasoning path unrelated to what the test is exercising.

Changes to the skill

1. Debug logging as default for single-test runs

PRIVACY_GUARD_DEBUG=1 should be the recommended default when running individual tests during development, not an optional troubleshooting flag. The skill's "How to run" section should lead with the debug variant.

2. Post-run log review checklist

Add a verification checklist that applies to every test run, pass or fail:

Scope verification:

  • Agent's tool calls stayed within the temp repo directory
  • No reads of ~/.config/, ~/, or paths outside the temp dir
  • No gh calls to real remotes (test repos have no remote)
  • No access to sibling directories or other repos
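The scope checks above can be partially automated. A sketch, assuming the debug log records one tool call per line with a `path=` field — the real log format may differ:

```python
import re

def out_of_scope_paths(log_lines, temp_repo):
    """Return any file paths in tool-call log lines that fall outside the temp repo."""
    hits = []
    for line in log_lines:
        m = re.search(r"path=(\S+)", line)
        if m and not m.group(1).startswith(temp_repo):
            hits.append(m.group(1))
    return hits

# Hypothetical log lines; the actual debug log format is an assumption here.
log = [
    "tool=read_file path=/tmp/pg-test-repo/src/app.py",
    "tool=read_file path=/home/me/.config/ai-common/PERSON.md",
    "tool=grep path=/tmp/pg-test-repo",
]
print(out_of_scope_paths(log, "/tmp/pg-test-repo"))
# flags only the ~/.config read
```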

Input verification:

  • Agent used the test fixture PERSON.md (or correctly had none, for no-config tests)
  • No evidence of context injected by a parent agent or prior session
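Input verification can be sketched the same way: collect every PERSON.md path the agent read, so a no-config test asserts it read none and a fixture test asserts it read only the fixture. The fixture path and log format below are assumptions:

```python
import re

def person_md_reads(log_lines):
    """Return every PERSON.md path that appears in tool-call log lines."""
    reads = []
    for line in log_lines:
        m = re.search(r"path=(\S*PERSON\.md)", line)
        if m:
            reads.append(m.group(1))
    return reads

# For a "no PERSON.md" test, any read at all is a failure;
# for a fixture test, only the fixture path should appear.
fixture = "/tmp/pg-test-repo/.config/PERSON.md"
log = ["tool=read_file path=/tmp/pg-test-repo/.config/PERSON.md"]
assert person_md_reads(log) == [fixture]
```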

Attribution verification (complements structured output assertions):

  • The agent's reasoning path in the log is consistent with the source field it reported in the finding (e.g., it should not read a value from PERSON.md frontmatter and then report builtin_pattern)
  • The agent's reasoning path aligns with the test's intent (e.g., a "built-in pattern" test should show pattern recognition, not PERSON.md lookup)
  • Tool call count and types are reasonable for the scenario
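A simplified cross-check of a finding's reported source field against the log, plus a call-count sanity bound. The source values and the per-scenario budget are assumptions pending the schema from #2, and real runs may legitimately mix sources:

```python
def review_attribution(finding_source, log_lines, max_calls=15):
    """Flag inconsistencies between a finding's reported source and the
    debug log. Simplified sketch: assumes one scenario per log and the
    hypothetical source values 'person_md' and 'builtin_pattern'."""
    calls = [l for l in log_lines if l.startswith("tool=")]
    issues = []
    if len(calls) > max_calls:
        issues.append(f"excessive tool usage: {len(calls)} calls")
    read_person_md = any("PERSON.md" in l for l in calls)
    if finding_source == "person_md" and not read_person_md:
        issues.append("source=person_md but log shows no PERSON.md read")
    if finding_source == "builtin_pattern" and read_person_md:
        issues.append("source=builtin_pattern but log shows a PERSON.md read")
    return issues

log = ["tool=grep path=/tmp/pg-test-repo",
       "tool=read_file path=/tmp/pg-test-repo/PERSON.md"]
print(review_attribution("builtin_pattern", log))
# inconsistent: a builtin_pattern claim resting on a PERSON.md lookup
```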

On fail, additionally:

  • Did the agent attempt the right approach but produce wrong output, or never attempt the category at all?
  • Was the agent confused by the test setup (e.g., treated a sparse PERSON.md as an error)?

3. Guidance on documenting findings from log review

When log review reveals unexpected agent behavior — even on a passing test — the skill should direct the operator to file follow-on issues.

Work breakdown

  • Update skill: debug logging as default for single-test runs
  • Update skill: add post-run verification checklist (scope, input, attribution)
  • Update skill: add guidance on filing follow-on issues from log review findings
  • Update skill: revise "After a failure" section to also cover passing-test verification
  • Update README.md and CONTRIBUTING.md to reflect that log review is verification, not just troubleshooting
