
Commit 25b7d39

docs: add technical depth to Why QueryMode, address arch criticisms
- Clarify thundering herd vs flexibility as separate problems in point 1
- Add memory blockquote after collect() example
- Add "How it actually works" section: storage layer, operator pipeline, governance model
- Add closing paragraph on composable optimizer
1 parent 7a27484 commit 25b7d39

File tree

1 file changed (+15, -15 lines)


docs/src/content/docs/why-querymode.mdx

Lines changed: 15 additions & 15 deletions
@@ -3,7 +3,7 @@ title: Why QueryMode
 description: Agents are the new users. They need dynamic pipelines, not pre-built ETL.
 ---

-1. **Agents are becoming the majority of internet traffic.** They serve different owners across different parts of the world, but share the same training data and independently reach the same conclusions. When thousands of agents hit the same endpoints at the same millisecond, the result is thundering herds that look like a DDoS — except every request is legitimate. That's not a DDoS attack. That's just Tuesday. Data must live at the edge to survive this.
+1. **Agents are becoming the majority of internet traffic.** They serve different owners across different parts of the world, but share the same training data and independently reach the same conclusions. When thousands of agents hit the same endpoints at the same millisecond, the result is thundering herds that look like a DDoS — except every request is legitimate. That's not an attack. That's just Tuesday. When 10,000 agents ask the same question, regional Query DOs serve the cached result — that's a CDN problem, not a query engine problem. When they ask *different* questions, you need composable pipelines that didn't exist five minutes ago. QueryMode handles both.

 2. **Agents need live data.** Decisions based on outdated training data lead to bad outcomes. Training data can't keep up with the speed the world produces information. Agents make API calls for live data — lots of them. That data needs to live at the edge, close to where agents run.

@@ -52,39 +52,39 @@ const attribution = computeAttribution(result.rows, retention.retainedUsers)

 Three analyses on one result set. No SQL string construction, no JSON parsing, no round-trips. The intermediate results are live objects in memory — you inspect them, branch on them, and feed them into the next stage.

-> **What about memory?** `collect()` doesn't load a raw 50GB file into a V8 isolate. By the time data reaches `collect()`, it has already passed through the operator pipeline — filter pushdown skipped irrelevant pages using min/max stats, aggregation reduced millions of rows to group summaries, and projection dropped unused columns. What lands in memory is the *result*, not the dataset. For the rare case where the result itself is large, operators are memory-bounded (default 32MB) and [spill to R2](/operators#memory-bounded-with-r2-spill) when exceeded.
+> **What about memory?** `collect()` doesn't load a 50GB file into a V8 isolate. Filter pushdown already skipped irrelevant pages via min/max stats, aggregation already reduced rows to group summaries, projection already dropped unused columns. What lands in memory is the *result*, not the dataset. Operators are memory-bounded (default 32MB) and [spill to R2](/operators#memory-bounded-with-r2-spill) when they exceed budget.
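The spill path in that blockquote follows a simple pattern: buffer rows against a byte budget, flush to external storage when the budget is exceeded. A minimal TypeScript sketch of that pattern (the `SpillBuffer` and `SpillSink` names are illustrative, not QueryMode's actual API):

```typescript
type Row = Record<string, unknown>;

// Stand-in for the spill target, e.g. a PUT to an R2 spill object.
interface SpillSink {
  write(rows: Row[]): Promise<void>;
}

class SpillBuffer {
  private rows: Row[] = [];
  private bytes = 0;
  spills = 0;

  constructor(
    private sink: SpillSink,
    private budgetBytes = 32 * 1024 * 1024, // 32MB default, as in the doc
  ) {}

  async push(row: Row): Promise<void> {
    this.rows.push(row);
    this.bytes += JSON.stringify(row).length; // crude in-memory size estimate
    if (this.bytes > this.budgetBytes) {
      await this.sink.write(this.rows); // spill the buffered partition, reset
      this.rows = [];
      this.bytes = 0;
      this.spills++;
    }
  }
}
```

A real sort or join would later stream spilled partitions back (Grace hash partitioning in the doc's case); the sketch shows only the budget check.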
 ## How it actually works under the hood

 ### Where the data lives

-Data sits in **R2 object storage** as columnar files (Parquet, Lance, Iceberg, CSV, JSON, Arrow). It does not get replicated to 300 edge nodes. Instead, QueryMode caches **metadata at the edge** — table footers (~4KB each) in regional Query DOs — and reads data pages from R2 via coalesced HTTP range requests (~10ms per read).
+Data sits in **R2** as columnar files (Parquet, Lance, Iceberg, CSV, JSON, Arrow). Nothing gets replicated to 300 edge nodes. Regional Query DOs cache table footers (~4KB each) and read data pages from R2 via coalesced HTTP range requests (~10ms).

-"Data at the edge" means: metadata cached locally, data fetched on demand from R2 with free egress. Not replicated databases.
+"Data at the edge" means metadata cached locally, pages fetched on demand with free egress. Not replicated databases.

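Coalescing the range reads mentioned above is a small algorithm: sort the requested byte ranges, then merge neighbors that fall within a gap threshold. A sketch (the `coalesceRanges` helper and its gap parameter are assumptions for illustration, not the engine's real implementation):

```typescript
interface ByteRange {
  start: number;
  end: number; // exclusive
}

// Merge adjacent or near-adjacent page reads into fewer HTTP range requests.
function coalesceRanges(ranges: ByteRange[], maxGap = 0): ByteRange[] {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const merged: ByteRange[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last && r.start - last.end <= maxGap) {
      last.end = Math.max(last.end, r.end); // extend the previous request
    } else {
      merged.push({ ...r }); // start a new request
    }
  }
  return merged;
}

// Two adjacent 4KB pages plus one distant page:
const requests = coalesceRanges([
  { start: 0, end: 4096 },
  { start: 4096, end: 8192 },
  { start: 1048576, end: 1052672 },
]);
// → 2 coalesced requests instead of 3
```

Fewer round-trips matters when each R2 read costs ~10ms; a wider `maxGap` trades a little over-read for fewer requests.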
 ### The operators ARE the optimizer

-QueryMode doesn't throw away query optimization — it makes it composable. Every query runs through a pull-based [operator pipeline](/operators):
+Every query runs through a pull-based [operator pipeline](/operators):

 ```
 ScanOperator → FilterOperator → AggregateOperator → TopKOperator → ProjectOperator
 ```

-This pipeline does the same work a traditional optimizer does:
+These operators do real query optimization work:
-- **Page-level skip** — min/max stats prune pages before reading them
-- **Predicate pushdown** — filters evaluate inside the WASM engine, not in JavaScript
-- **SIMD vectorized decode** — Zig WASM engine processes columns with SIMD instructions
+- **Page-level skip** — min/max stats prune pages before reading
+- **Predicate pushdown** — filters run inside the WASM engine, not JavaScript
+- **SIMD vectorized decode** — Zig WASM processes columns with SIMD instructions
 - **Coalesced I/O** — adjacent page reads merge into single range requests
-- **Prefetch** — fetches page N+1 while decoding page N (up to 8 in-flight)
-- **Partial aggregation** — Fragment DOs aggregate locally, Query DO merges results
-- **Memory-bounded spill** — sort and join operators spill to R2 via Grace hash partitioning when they exceed their budget
+- **Prefetch** — fetch page N+1 while decoding page N (up to 8 in-flight)
+- **Partial aggregation** — Fragment DOs aggregate locally, Query DO merges
+- **Memory-bounded spill** — sort and join spill to R2 via Grace hash partitioning when they exceed budget

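The partial-aggregation bullet works because some aggregates compose: `(count, sum)` pairs merge exactly across fragments, while locally computed averages would not. A hedged sketch of that merge step (function names are illustrative, not the Fragment DO interface):

```typescript
// Each fragment ships a mergeable partial, not a finished answer.
interface Partial {
  count: number;
  sum: number;
}

// What a Fragment DO might compute over its local pages.
function aggregateLocal(values: number[]): Partial {
  return { count: values.length, sum: values.reduce((a, v) => a + v, 0) };
}

// What the Query DO does with the partials it collects.
function mergePartials(parts: Partial[]): Partial {
  return parts.reduce(
    (acc, p) => ({ count: acc.count + p.count, sum: acc.sum + p.sum }),
    { count: 0, sum: 0 },
  );
}

// The final aggregate is derived only after the merge.
const finalAvg = (p: Partial) => (p.count === 0 ? 0 : p.sum / p.count);
```

Merging `avg(fragment A)` with `avg(fragment B)` directly would weight fragments equally regardless of row count; merging `(count, sum)` keeps the result exact.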
-The difference from a traditional optimizer: you can see every stage, swap implementations, inject custom logic between operators, and control the memory budget. The query plan isn't a black box — it's your code.
+The difference: you can see every stage, swap implementations, inject custom logic between operators, and control the memory budget. The plan isn't a black box.

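"Inject custom logic between operators" can be illustrated with a toy pull-based pipeline, where each stage wraps the one below it. This is a sketch under assumed interfaces, not QueryMode's actual operator API:

```typescript
type Row = Record<string, number>;

// Each operator pulls from its input one row at a time.
interface Operator {
  next(): Row | null;
}

class ScanOperator implements Operator {
  private i = 0;
  constructor(private rows: Row[]) {}
  next(): Row | null {
    return this.i < this.rows.length ? this.rows[this.i++] : null;
  }
}

class FilterOperator implements Operator {
  constructor(private input: Operator, private pred: (r: Row) => boolean) {}
  next(): Row | null {
    // Keep pulling until a row passes the predicate or input is exhausted.
    for (let r = this.input.next(); r !== null; r = this.input.next()) {
      if (this.pred(r)) return r;
    }
    return null;
  }
}

// A custom stage slotted between stock operators: the composable part.
class ScoreOperator implements Operator {
  constructor(private input: Operator, private score: (r: Row) => number) {}
  next(): Row | null {
    const r = this.input.next();
    return r === null ? null : { ...r, score: this.score(r) };
  }
}

const pipe = new ScoreOperator(
  new FilterOperator(
    new ScanOperator([{ amount: 5 }, { amount: 20 }, { amount: 30 }]),
    (r) => r.amount > 10,
  ),
  (r) => r.amount * 2,
);
```

Nothing materializes until a consumer pulls, and any stage (including yours) can sit anywhere in the stack.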
 ### Governance

-Dynamic pipelines don't mean ungoverned access. Access control is at the **table and column level**, not the pipeline level. The agent composes freely — but only over data it's authorized to touch. `MasterDO` owns table metadata and controls which tables exist. The agent can't query what isn't registered.
+"The pipeline doesn't exist until the agent creates it" sounds terrifying if you're a CISO. But the pipeline is just operator composition — it doesn't grant access to anything. `MasterDO` owns table metadata. The agent can only query tables that are registered, and only columns that are exposed. Row-level and column-level access control happens before the pipeline runs, not inside it.

 The transformation is dynamic. The authorization is not.

@@ -102,4 +102,4 @@ Both test suites also include multi-step analyses that would be awkward with the

 QueryMode doesn't eliminate transformation. It moves it from a pre-built schedule to query time. The agent decides what to query, how to transform it, and what to do with the result — all in the same code, same process. If the data is well-structured, the agent queries it directly. If it's not, the agent builds the transformation on the spot. Either way, no one had to anticipate the question in advance.

-It also doesn't eliminate the query optimizer. It replaces a fixed one with a composable one. The operators do the same work — filter pushdown, vectorized decode, memory-bounded spill — but you assemble them, you control the budget, and you can put an [ML scoring function between pipeline stages](/operators#compose-operators-directly) if you want to.
+It doesn't eliminate the query optimizer either. The operators do filter pushdown, vectorized decode, memory-bounded spill — but you assemble them, you control the budget, and you can put an [ML scoring function between pipeline stages](/operators#compose-operators-directly) if you want to.
