docs: fix memory budget in why-querymode, add fragment skip to architecture, fix shape() call chain

teamchong · teamchong · commit 07ea3e557183 · 2026-03-18T01:06:19.000-04:00
- why-querymode.mdx: memory budget default 32MB → 256MB (same bug as performance page)
- architecture.mdx: filter pushdown chain was missing fragment-level skip between
  partition catalog and page-level skip — now shows all 4 stages
- dataframe-api.mdx: shape() calls describe(), not explain() directly
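
For reviewers, a minimal sketch of the four-stage chain that architecture.mdx now shows. All type and function names here are hypothetical and do not come from the QueryMode codebase. A plain loop stands in for the WASM SIMD filter. The sketch skips on `max <= threshold`, which is the tightest safe bound for a strict `gt`; the docs text writes `max < 25`.

```typescript
// Hypothetical types for illustration only; not QueryMode's real API.
interface MinMax { min: number; max: number }
interface Page { stats: MinMax; values: number[] }
interface Fragment { partition: string; pages: Page[] }

// A fragment's range is the union of its pages' min/max ranges.
function fragmentStats(f: Fragment): MinMax {
  return {
    min: Math.min(...f.pages.map(p => p.stats.min)),
    max: Math.max(...f.pages.map(p => p.stats.max)),
  };
}

// Evaluate `value > threshold`, pruning at each of the four stages in order.
function filterGt(
  fragments: Fragment[],
  partitions: Set<string>,
  threshold: number,
): number[] {
  const out: number[] = [];
  for (const f of fragments) {
    // 1. Partition catalog prune: skip fragments outside the wanted partitions.
    if (!partitions.has(f.partition)) continue;
    // 2. Fragment-level skip: no page in this fragment can match.
    if (fragmentStats(f).max <= threshold) continue;
    for (const p of f.pages) {
      // 3. Page-level skip: this page cannot match.
      if (p.stats.max <= threshold) continue;
      // 4. Row filter over surviving pages (stand-in for the WASM SIMD pass).
      for (const v of p.values) {
        if (v > threshold) out.push(v);
      }
    }
  }
  return out;
}
```

Each stage reads only metadata until stage 4, so page values are touched only for pages that survive every earlier check.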
diff --git a/docs/src/content/docs/architecture.mdx b/docs/src/content/docs/architecture.mdx
@@ -125,9 +125,10 @@ Filters are pushed as deep as possible — from the DataFrame/SQL layer down to
 ```
 DataFrame .filter("age", "gt", 25)
   → QueryDescriptor.filters: [{ column: "age", op: "gt", value: 25 }]
-    → Partition catalog prune (skip fragments where age max < 25)
-      → Page-level skip (skip pages where page max < 25)
-        → WASM SIMD filter (process matching pages in one pass)
+    → Partition catalog prune (skip fragments by partition key)
+      → Fragment-level skip (skip fragments where age max < 25 across all pages)
+        → Page-level skip (skip pages where page max < 25)
+          → WASM SIMD filter (process matching pages in one pass)
 ```
 
 ### Filter operations
diff --git a/docs/src/content/docs/dataframe-api.mdx b/docs/src/content/docs/dataframe-api.mdx
@@ -203,7 +203,7 @@ const json = await df.toJSON({ pretty: true })
 const csv = await df.toCSV({ delimiter: "\t" })
 ```
 
-These are all sugar over existing methods — `shape()` calls `explain()`, `valueCounts()` uses `groupBy().aggregate()`, `fillNull()` and `cast()` use `computed()`.
+These are all sugar over existing methods — `shape()` calls `describe()`, `valueCounts()` uses `groupBy().aggregate()`, `fillNull()` and `cast()` use `computed()`.
 
 ## Introspection
 
diff --git a/docs/src/content/docs/why-querymode.mdx b/docs/src/content/docs/why-querymode.mdx
@@ -46,7 +46,7 @@ const attribution = computeAttribution(result.rows, retention.retainedUsers)
 
 Three analyses on one result set. No SQL string construction, no JSON parsing, no round-trips. The intermediate results are live objects in memory — you inspect them, branch on them, and feed them into the next stage.
 
-> **What about memory?** `collect()` doesn't load a 50GB file into a V8 isolate. Filter pushdown already skipped irrelevant pages via min/max stats, aggregation already reduced rows to group summaries, projection already dropped unused columns. What lands in memory is the *result*, not the dataset. Operators are memory-bounded (default 32MB) and [spill to R2](/querymode/operators/#memory-bounded-with-r2-spill) when they exceed budget.
+> **What about memory?** `collect()` doesn't load a 50GB file into a V8 isolate. Filter pushdown already skipped irrelevant pages via min/max stats, aggregation already reduced rows to group summaries, projection already dropped unused columns. What lands in memory is the *result*, not the dataset. Operators are memory-bounded (default 256MB) and [spill to R2](/querymode/operators/#memory-bounded-with-r2-spill) when they exceed budget.
 
 ## Edge-native: survive the thundering herd