Commit f589bf9

docs: add nextColumnar() implementation guide for custom operators
New section in operators.mdx with working example (DoublingOperator), explains pipeline ColumnarBatch shape (Map + selection vector), links to columnar-format page for QMCB distinction, and guidance on when to implement vs skip.
1 parent c7fade4 commit f589bf9

1 file changed: +42 -0 lines changed

docs/src/content/docs/operators.mdx

Lines changed: 42 additions & 0 deletions
@@ -108,6 +108,48 @@ For numeric columns, the `WasmAggregateOperator` uses Zig SIMD vector instructio

The DataFrame API automatically selects WASM aggregates when available.

## Columnar fast path (nextColumnar)

Operators can optionally implement `nextColumnar()` alongside `next()` for zero-copy performance. When the pipeline detects a columnar-capable operator chain, it uses `nextColumnar()` to pass column data directly without creating `Row[]` objects.

```typescript
import type { Operator, RowBatch, ColumnarBatch, DecodedValue } from "querymode"

class DoublingOperator implements Operator {
  constructor(private source: Operator, private column: string) {}

  // Standard path — always required
  async next(): Promise<RowBatch | null> {
    const batch = await this.source.next()
    if (!batch) return null
    for (const row of batch) row[this.column] = (row[this.column] as number) * 2
    return batch
  }

  // Columnar path — optional, avoids Row[] creation
  async nextColumnar(): Promise<ColumnarBatch | null> {
    if (!this.source.nextColumnar) return null
    const batch = await this.source.nextColumnar()
    if (!batch) return null
    const col = batch.columns.get(this.column)
    if (col) {
      for (let i = 0; i < col.length; i++) {
        col[i] = ((col[i] as number) ?? 0) * 2
      }
    }
    return batch
  }

  async close() { await this.source.close() }
}
```

The pipeline `ColumnarBatch` (from `operators.ts`) uses `Map<string, DecodedValue[]>` for columns and an optional `selection?: Uint32Array` for post-filter row indices. This is distinct from the QMCB wire-format `ColumnarBatch` documented in [Columnar Format](/columnar-format/) — see that page for the distinction.
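
To make the batch shape described above concrete, here is a minimal sketch of reading one column through the optional selection vector. The `DecodedValue` union and `ColumnarBatch` interface below are local stand-ins written from the description in this section, not the definitions exported by `querymode`, and `selectedValues` is a hypothetical helper. It assumes that a present `selection` lists the indices of the rows that survived filtering, and that an absent `selection` means every row is live.

```typescript
// Local stand-ins for illustration only. The real types come from "querymode"
// and may carry more fields or a wider DecodedValue union.
type DecodedValue = number | string | boolean | null

interface ColumnarBatch {
  columns: Map<string, DecodedValue[]>
  selection?: Uint32Array
}

// Hypothetical helper: returns the values of one column for the live rows,
// following the selection vector when one is present.
function selectedValues(batch: ColumnarBatch, name: string): DecodedValue[] {
  const col = batch.columns.get(name)
  if (!col) return []
  // No selection vector: every row is live, so copy the whole column.
  if (!batch.selection) return col.slice()
  // Selection vector: pick only the listed row indices, in order.
  return Array.from(batch.selection, (i) => col[i] ?? null)
}

const batch: ColumnarBatch = {
  columns: new Map<string, DecodedValue[]>([["price", [10, 20, 30, 40]]]),
  selection: new Uint32Array([1, 3]),
}

const live = selectedValues(batch, "price")
// live is [20, 40]
```

The column array itself is never reordered or shrunk here; the selection vector is the only thing that records which rows a filter kept.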

When to implement `nextColumnar()`:
- **Do**: numeric transforms on large datasets (avoids `Row[]` allocation)
- **Skip**: operators that reshape data (joins, aggregates) or need random access across rows
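
The `Row[]` allocation that the **Do** bullet refers to can be pictured with a rough sketch of what a row-based consumer would have to do with a columnar batch. The types are again local stand-ins rather than the `querymode` exports, `Row` is assumed to be a plain record keyed by column name, and `toRows` is a hypothetical function, not something the pipeline is known to expose.

```typescript
// Local stand-ins for illustration only (assumed shapes, not the library's).
type DecodedValue = number | string | boolean | null
type Row = Record<string, DecodedValue>

interface ColumnarBatch {
  columns: Map<string, DecodedValue[]>
  selection?: Uint32Array
}

// Hypothetical: materialize one object per live row. This is the per-row
// allocation that a columnar operator chain does not need to perform.
function toRows(batch: ColumnarBatch): Row[] {
  // Use the longest column as the row count when no selection is present.
  let length = 0
  batch.columns.forEach((col) => { length = Math.max(length, col.length) })

  const indices: number[] = batch.selection
    ? Array.from(batch.selection)
    : Array.from({ length }, (_, k) => k)

  return indices.map((i) => {
    const row: Row = {}
    batch.columns.forEach((col, name) => { row[name] = col[i] ?? null })
    return row
  })
}

const rows = toRows({
  columns: new Map<string, DecodedValue[]>([
    ["id", [1, 2, 3]],
    ["qty", [5, 6, 7]],
  ]),
  selection: new Uint32Array([0, 2]),
})
// rows is [{ id: 1, qty: 5 }, { id: 3, qty: 7 }]
```

A transform such as doubling one numeric column touches a single array in the columnar form, whereas the row form creates one object per row before any work is done.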

## Why composable operators matter

Traditional engines give you a fixed query language. You can't put a window function before a join, run custom logic between pipeline stages, or swap the sort implementation.
