docs/design/agentic-infrastructure.md
QueryMode covers Query. Source→Preprocess and Context Management are open gaps.
**Source→Preprocess**: Google indexes everything, but its quality signals have been gamed for 20 years. AI-generated, SEO-optimized content scores high on PageRank but carries no information. Gemini inherits this — brilliant model, garbage source.
**Context Management**: RAG is one-shot retrieval (embed → top-k → stuff). It misses structure, can't follow references, and offers no progressive refinement. Hybrid search (BM25 + vector) just doubles down on the wrong model. Special formats (TOON, etc.) optimize the container but not the selection.
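The one-shot shape is easy to see in code. A minimal sketch — a toy bag-of-words "embedding" stands in for a real model, and `embed`/`retrieve` are illustrative names, not a QueryMode API:

```python
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def similarity(a: Counter, b: Counter) -> float:
    # Crude score: count of overlapping tokens.
    return sum((a & b).values())

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(docs, key=lambda d: similarity(q, embed(d)), reverse=True)
    return ranked[:k]  # one shot: no refinement, no follow-up step

docs = [
    "QueryMode is a data engine built from composable operators",
    "RAG stuffs retrieved chunks into the context window",
    "Agents navigate code with grep and file paths",
]
# The top-k result is stuffed into the prompt as-is. If it's wrong, the
# only recourse is to re-run the whole pipeline with different knobs.
context = "\n".join(retrieve("composable data operators", docs))
```

However the pieces are swapped (real embeddings, hybrid scoring), the shape stays: one retrieval, then hope.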
## Core insight
The main reason code navigation works for agents is that **you can iterate on it**. You can't iterate on RAG.
Agents handle huge codebases because code is addressable (file paths, function names), navigable (grep, imports), and DRY (define once, reference everywhere). The agent never loads everything — it navigates to what it needs via Glob → Grep → Read → refine.
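That loop can be sketched in a few lines — `glob`/`grep`/`read` here are hypothetical helpers over an in-memory file table, not a real agent API:

```python
import fnmatch

# Toy stand-in for a repository: path -> contents.
FILES = {
    "src/engine/filter.py": "def filter_rows(rows, pred): ...",
    "src/engine/dedup.py": "def dedup_rows(rows): ...",
    "docs/design.md": "operators compose into pipelines",
}

def glob(pattern: str) -> list[str]:
    # Step 1: find candidate files. Note fnmatch's '*' matches across '/'.
    return [p for p in FILES if fnmatch.fnmatch(p, pattern)]

def grep(paths: list[str], needle: str) -> list[str]:
    # Step 2: narrow candidates by content.
    return [p for p in paths if needle in FILES[p]]

def read(path: str) -> str:
    # Step 3: load only what survived the narrowing.
    return FILES[path]

candidates = glob("src/*.py")        # broad
hits = grep(candidates, "dedup")     # narrower; each step is observable
body = read(hits[0])                 # load one file, never everything
```

Each intermediate result is inspectable, so the agent (or a test) can check it and decide whether another narrowing pass is needed.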
RAG is a black box: embed → retrieve → stuff → hope. If the result is wrong, you can't:
- Diff what changed between two retrievals
- Version-control your retrieval strategy
- Write a test that asserts "this query should return these documents"
- Gradually refine the result by narrowing the search
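The last two points become concrete once retrieval is ordinary code. A hedged sketch, with a hypothetical deterministic `search` function standing in for a real strategy:

```python
# A retrieval regression test — the thing a black-box RAG pipeline
# can't express. The strategy here is trivially simple (substring
# match), but the point is that it lives in version control: changing
# it shows up in a diff, and this test asserts its behavior.
def search(query: str, corpus: dict[str, str]) -> list[str]:
    return sorted(doc_id for doc_id, text in corpus.items() if query in text)

CORPUS = {
    "design.md": "operators compose into pipelines",
    "memory.md": "agent memory lives in flat markdown files",
}

def test_query_returns_expected_documents():
    # "This query should return these documents" as an ordinary assertion.
    assert search("markdown", CORPUS) == ["memory.md"]

test_query_returns_expected_documents()
```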
Code navigation is iterative: Glob → Grep → Read → refine → Read more. Each step is observable, testable, version-controllable. The agent sees what it got, decides if it's enough, and adjusts.
### Industry standard for agent memory
No graphs, no query systems: just text files in the context window. This works at small scale (<200 lines, a single repo) but breaks beyond it — memory grows unbounded, drifts from reality, and offers no selective retrieval. Nobody has solved this.
QueryMode's composable operator model already solves columnar data problems at scale. Documents are rows. Quality, freshness, embedding, topic — these are columns. Filter, dedup, aggregate, project — these are operators.
QueryMode solves the **Source → Preprocess** layer: crawl → dedup → quality score → store. This is a data problem, and QueryMode is a data engine.
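A minimal sketch of that pipeline as composable operators over document "rows" — the `Doc` type, operator names, and scoring heuristic are illustrative, not QueryMode's actual API:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    url: str
    text: str
    quality: float = 0.0      # a "column" filled in by an operator

def dedup(rows: list[Doc]) -> list[Doc]:
    # Drop rows whose text is an exact duplicate of an earlier row.
    seen, out = set(), []
    for d in rows:
        if d.text not in seen:
            seen.add(d.text)
            out.append(d)
    return out

def quality_score(rows: list[Doc]) -> list[Doc]:
    # Stand-in scorer: longer text scores higher, capped at 1.0.
    for d in rows:
        d.quality = min(len(d.text) / 100, 1.0)
    return rows

def filter_quality(rows: list[Doc], threshold: float) -> list[Doc]:
    return [d for d in rows if d.quality >= threshold]

crawled = [
    Doc("a", "original analysis of operator pipelines " * 3),
    Doc("b", "original analysis of operator pipelines " * 3),  # duplicate
    Doc("c", "spam"),                                          # low quality
]
# crawl → dedup → quality score → filter; only Doc "a" survives.
store = filter_quality(quality_score(dedup(crawled)), threshold=0.5)
```

Each stage takes rows and returns rows, so operators compose in any order and each intermediate result can be asserted on in a test.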
The **Context Management** layer is a different problem. The operators below (TokenBudget, SectionExtract, RedundancyFilter) are useful for document retrieval. The memory problem — how an agent accumulates and retrieves knowledge across sessions — remains unsolved industry-wide (see above). The industry standard is flat markdown files, and nothing has proven better at the scale of single-repo agent workflows.
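As an illustration, one of those named operators might look like the sketch below — a whitespace word count stands in for a real tokenizer, and this is an assumed shape, not the actual TokenBudget implementation:

```python
def token_budget(rows: list[str], budget: int) -> list[str]:
    # Keep rows in priority order until the approximate token budget
    # is exhausted; stop at the first overflow to preserve ordering.
    kept, used = [], 0
    for text in rows:
        cost = len(text.split())   # crude token estimate
        if used + cost > budget:
            break
        kept.append(text)
        used += cost
    return kept

chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
selected = token_budget(chunks, budget=5)  # keeps the first two chunks
```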
QueryMode handles document ingestion and retrieval. Memory uses markdown files — same as every other shipping tool.