The Problem
In a real session tonight (aic-director-of-ai-cockpit session #8), an agent used research.md to file 20+ findings across multiple projects. Every one was graded LOW, and no primary source material was read for any of them. The agent:
- Asked Gemini a question, filed the summary as a finding
- Wrote one-sentence claims with no source verification
- Used research.md as a clipboard, not a research tool
- Never fetched a URL, never read a primary document, never hashed content
- Filed findings in seconds that should have taken hours
The evidence grade system (UNVERIFIED → LOW → REASONED → CONFIRMED) is supposed to gate quality, but LOW requires nothing. The tool accepts 'Gemini said so' as sufficient. There is no enforcement of actual research.
The Impact
- 20+ findings exist that are just AI-summarized opinions
- None of them have been validated against primary sources
- Decisions were almost made based on these hollow findings
- The user caught it and called it 'hot garbage'
Connection to Existing Work
PROPOSAL-evidence-gathering-gates.md already identifies this problem. This issue provides a concrete failure case that proves the proposal is needed.
What Should Change
- LOW should require at least one fetched URL — not just a claim that a URL exists
- The tool should refuse findings with no web research — force the researcher to search before recording
- UNVERIFIED should be the only grade available without source material, and UNVERIFIED findings should be flagged as needs-investigation rather than filed
- Consider removing LOW entirely — force researchers to earn at least REASONED (requires content_hash)
The tool should make shallow research HARD and deep research EASY. Right now it is the opposite.
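A minimal sketch of how the proposed gate could behave, to make the rules above concrete. Everything here is hypothetical: `Source`, `Finding`, `max_supported_grade`, `gate`, and the `fetched` flag are illustrative names, not the actual research.md schema. The only detail taken from this issue is that REASONED requires a content_hash. CONFIRMED is not modelled because this issue does not say what it requires, so the sketch rejects it.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Grade(IntEnum):
    UNVERIFIED = 0
    LOW = 1
    REASONED = 2
    CONFIRMED = 3


@dataclass
class Source:
    url: str
    fetched: bool = False               # assumption: the tool records whether it fetched the URL
    content_hash: Optional[str] = None  # assumption: hash of the fetched content


@dataclass
class Finding:
    claim: str
    grade: Grade
    sources: List[Source] = field(default_factory=list)


def max_supported_grade(sources: List[Source]) -> Grade:
    """Return the highest grade the recorded evidence can justify in this sketch."""
    if any(s.fetched and s.content_hash for s in sources):
        return Grade.REASONED
    if any(s.fetched for s in sources):
        return Grade.LOW
    # A URL that was only claimed, never fetched, counts for nothing.
    return Grade.UNVERIFIED


def gate(finding: Finding) -> str:
    """Reject over-graded findings; route UNVERIFIED ones to investigation."""
    supported = max_supported_grade(finding.sources)
    if finding.grade > supported:
        raise ValueError(
            f"grade {finding.grade.name} is not supported by the evidence; "
            f"highest supported grade is {supported.name}"
        )
    if finding.grade == Grade.UNVERIFIED:
        return "needs-investigation"
    return "filed"
```

Under these assumptions, a finding like `Finding("Gemini said so", Grade.LOW)` with no fetched source raises `ValueError`, while the same claim submitted as UNVERIFIED is returned as `needs-investigation` instead of being filed.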
Session Reference
aic-director-of-ai-cockpit session #8, 2026-03-30/31. ASMP research projects 000-006.