| 1 | +--- |
| 2 | +title: Lazy Evaluation |
| 3 | +description: Execution model — when code runs, eager vs lazy, streaming iteration. |
| 4 | +--- |
| 5 | + |
| 6 | +## Execution model |
| 7 | + |
| 8 | +DataFrame methods like `.filter()`, `.sort()`, and `.limit()` do not execute anything. Each one builds a `QueryDescriptor`, a plain object describing what to do. Execution happens only when you call a **terminal method**. |
| 9 | + |
| 10 | +```typescript |
| 11 | +// Nothing executes here — just builds a descriptor |
| 12 | +const query = qm.table("events") |
| 13 | + .filter("status", "eq", "active") |
| 14 | + .sort("created_at", "desc") |
| 15 | + .limit(100) |
| 16 | + |
| 17 | +// Execution happens HERE |
| 18 | +const result = await query.collect() |
| 19 | +``` |
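The deferred-execution pattern described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `LazyQuery`, the descriptor fields, and the counting in-memory executor are all hypothetical names chosen for the example.

```typescript
// Hypothetical sketch: LazyQuery, the descriptor shape, and the executor
// are illustrative assumptions, not the library's real internals.
type Row = Record<string, string | number>
type Op = "eq" | "gt" | "lt"

interface QueryDescriptor {
  table: string
  filters: { column: string; op: Op; value: string | number }[]
  sort?: { column: string; direction: "asc" | "desc" }
  limit?: number
}

type Executor = (descriptor: QueryDescriptor) => Promise<Row[]>

class LazyQuery {
  private readonly descriptor: QueryDescriptor
  private readonly execute: Executor

  constructor(descriptor: QueryDescriptor, execute: Executor) {
    this.descriptor = descriptor
    this.execute = execute
  }

  // Builder methods copy and extend the descriptor; no work happens here.
  filter(column: string, op: Op, value: string | number): LazyQuery {
    const filters = [...this.descriptor.filters, { column, op, value }]
    return new LazyQuery({ ...this.descriptor, filters }, this.execute)
  }

  sort(column: string, direction: "asc" | "desc"): LazyQuery {
    return new LazyQuery({ ...this.descriptor, sort: { column, direction } }, this.execute)
  }

  limit(n: number): LazyQuery {
    return new LazyQuery({ ...this.descriptor, limit: n }, this.execute)
  }

  // Terminal method: the only place the executor is invoked.
  collect(): Promise<Row[]> {
    return this.execute(this.descriptor)
  }
}

// Counting executor so the deferral is observable.
let executions = 0
const executor: Executor = async (d) => {
  executions += 1
  return [{ table: d.table, filters: d.filters.length, limit: d.limit ?? -1 }]
}

// Chaining builds a descriptor only; `executions` is still 0 at this point.
const query = new LazyQuery({ table: "events", filters: [] }, executor)
  .filter("status", "eq", "active")
  .sort("created_at", "desc")
  .limit(100)
```

Because each builder call returns a new object, a partially built query can be reused as a base for several variants without one affecting the other.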
| 20 | + |
| 21 | +## Terminal methods |
| 22 | + |
| 23 | +| Method | What it does | When to use | |
| 24 | +|--------|-------------|-------------| |
| 25 | +| `.collect()` | Execute and return all matching rows | Default — most queries | |
| 26 | +| `.exec()` | Alias for `.collect()` | Same thing | |
| 27 | +| `.first()` | Return first matching row or null | Existence check or single lookup | |
| 28 | +| `.count()` | Return row count without materializing rows | Counting without data transfer | |
| 29 | +| `.exists()` | Return true if any row matches | Cheapest existence check | |
| 30 | +| `.lazy()` | Return a `LazyResultHandle` for paging | Large results, on-demand pages | |
| 31 | +| `.cursor()` | Return `AsyncIterable<Row[]>` for streaming | Process rows in batches without loading all | |
| 32 | +| `.explain()` | Return query plan without executing | Debugging, inspect pruning | |
| 33 | + |
| 34 | +## Lazy result handle |
| 35 | + |
| 36 | +`.lazy()` returns a `LazyResultHandle` that executes pages on demand: |
| 37 | + |
| 38 | +```typescript |
| 39 | +const handle = await qm.table("events") |
| 40 | + .filter("status", "eq", "active") |
| 41 | + .sort("created_at", "desc") |
| 42 | + .lazy() |
| 43 | + |
| 44 | +// Fetch page 0 (rows 0-99); arguments are (offset, limit) |
| 45 | +const page0 = await handle.page(0, 100) |
| 46 | + |
| 47 | +// Fetch page 3 (rows 300-399); the first argument is the row offset, not the page index |
| 48 | +const page3 = await handle.page(300, 100) |
| 49 | + |
| 50 | +// Fetch a single row |
| 51 | +const row42 = await handle.row(42) |
| 52 | + |
| 53 | +// Full materialization if needed |
| 54 | +const all = await handle.collect() |
| 55 | +``` |
| 56 | + |
| 57 | +Each `.page()` call is a separate query execution with `offset` and `limit`. No state is held between pages — the handle re-executes the query each time. This means: |
| 58 | +- Pages can be fetched in any order |
| 59 | +- No memory accumulates between pages |
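| 60 | +- Sorted results are consistent across pages only if the data doesn't change between calls; concurrent writes can shift rows so that pages overlap or skip rows |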
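The stateless behavior listed above can be illustrated with a small stand-in. This is a sketch under assumptions: `makeLazyHandle` and the `RunQuery` signature are hypothetical, and an in-memory array plays the role of the engine.

```typescript
type Row = { id: number }

// Hypothetical executor signature: runs the full query with an offset and a limit.
type RunQuery = (offset: number, limit: number) => Promise<Row[]>

// Hypothetical handle factory: every method re-executes the query, nothing is cached.
function makeLazyHandle(run: RunQuery) {
  return {
    page: (offset: number, limit: number): Promise<Row[]> => run(offset, limit),
    row: async (index: number): Promise<Row | null> => {
      const rows = await run(index, 1)
      return rows[0] ?? null
    },
  }
}

// In-memory stand-in for the engine: 1,000 rows already in sorted order.
const data: Row[] = Array.from({ length: 1000 }, (_, i) => ({ id: i }))
let runs = 0
const handle = makeLazyHandle(async (offset, limit) => {
  runs += 1
  return data.slice(offset, offset + limit)
})
```

Since the handle holds no cursor position, fetching rows 300-399 before rows 0-99 works the same as the reverse order, and each call costs one query execution.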
| 61 | + |
| 62 | +## Streaming iteration |
| 63 | + |
| 64 | +`.stream()` on a lazy handle yields batches of up to the requested size until the result set is exhausted: |
| 65 | + |
| 66 | +```typescript |
| 67 | +const handle = await qm.table("events").lazy() |
| 68 | + |
| 69 | +for await (const batch of handle.stream(500)) { |
| 70 | + // batch is Row[] with up to 500 rows |
| 71 | + process(batch) |
| 72 | + // Break early to stop fetching |
| 73 | + if (done) break |
| 74 | +} |
| 75 | +``` |
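One plausible way to build such a stream on top of offset/limit paging is an async generator. The sketch below is an assumption about the shape of the technique, not the library's code: `streamBatches` and `fetchPage` are hypothetical names.

```typescript
type Row = { id: number }
type FetchPage = (offset: number, limit: number) => Promise<Row[]>

// Hypothetical stream(): page forward until an empty or short batch arrives.
async function* streamBatches(fetchPage: FetchPage, batchSize: number): AsyncGenerator<Row[]> {
  for (let offset = 0; ; offset += batchSize) {
    const batch = await fetchPage(offset, batchSize)
    if (batch.length === 0) return
    yield batch
    if (batch.length < batchSize) return // last, partial batch
  }
}

// In-memory stand-in for the engine, with a counter to observe fetches.
const rows: Row[] = Array.from({ length: 1200 }, (_, i) => ({ id: i }))
let fetches = 0
const fetchPage: FetchPage = async (offset, limit) => {
  fetches += 1
  return rows.slice(offset, offset + limit)
}

// Consume the stream, optionally stopping after a number of batches.
async function consume(stopAfter: number): Promise<number[]> {
  const sizes: number[] = []
  for await (const batch of streamBatches(fetchPage, 500)) {
    sizes.push(batch.length)
    if (sizes.length === stopAfter) break // ends the generator; no further fetches
  }
  return sizes
}
```

Breaking out of `for await` calls the generator's `return()`, which is why an early `break` stops fetching: the next page is only requested when the loop asks for another batch.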
| 76 | + |
| 77 | +## Cursor |
| 78 | + |
| 79 | +`.cursor()` is similar but works directly on the DataFrame without an intermediate handle: |
| 80 | + |
| 81 | +```typescript |
| 82 | +for await (const batch of qm.table("events").cursor({ batchSize: 1000 })) { |
| 83 | + await processBatch(batch) |
| 84 | +} |
| 85 | +``` |
| 86 | + |
| 87 | +`.cursor()` requires an executor with cursor support (e.g., edge mode) and throws if none is available. |
| 88 | + |
| 89 | +## Keyset pagination |
| 90 | + |
| 91 | +For large sorted datasets, offset-based pagination gets slower as offset grows (the engine must skip N rows). Keyset pagination uses the last seen value to start the next page: |
| 92 | + |
| 93 | +```typescript |
| 94 | +// First page |
| 95 | +const page1 = await qm.table("events") |
| 96 | + .sort("id", "asc") |
| 97 | + .limit(50) |
| 98 | + .collect() |
| 99 | + |
| 100 | +// Next page — starts after the last id |
| 101 | +const lastId = page1.rows[page1.rows.length - 1].id |
| 102 | +const page2 = await qm.table("events") |
| 103 | + .sort("id", "asc") |
| 104 | + .after(lastId) |
| 105 | + .limit(50) |
| 106 | + .collect() |
| 107 | +``` |
| 108 | + |
| 109 | +`.after(value)` translates to a `gt` filter on the sort column, which benefits from page-level skip, so the cost of fetching a page does not grow with its depth. |
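The keyset loop above generalizes to walking an entire table. The following sketch models `.after(lastId)` as a plain `gt` filter over an in-memory array; `keysetPage` and `walkAllPages` are hypothetical helpers, and the array scan demonstrates only the paging logic, not the engine's page-level skip.

```typescript
type EventRow = { id: number }

// In-memory stand-in for a table sorted by id (ids are sparse on purpose).
const allEvents: EventRow[] = Array.from({ length: 230 }, (_, i) => ({ id: i * 3 + 1 }))

// Hypothetical equivalent of .sort("id", "asc").after(lastId).limit(n):
// a `gt` filter on the sort column followed by a limit.
function keysetPage(lastId: number | null, limit: number): EventRow[] {
  const remaining = lastId === null ? allEvents : allEvents.filter((e) => e.id > lastId)
  return remaining.slice(0, limit)
}

// Walk every page by carrying the last seen id forward.
function walkAllPages(limit: number): EventRow[][] {
  const pages: EventRow[][] = []
  let lastId: number | null = null
  while (true) {
    const page = keysetPage(lastId, limit)
    if (page.length === 0) break
    pages.push(page)
    lastId = page[page.length - 1]!.id
  }
  return pages
}

// 230 rows at 50 per page gives four full pages and one page of 30.
const pages = walkAllPages(50)
```

A strict `gt` filter assumes the sort column is unique. If several rows share the same sort value, the rows tied with the last seen value would be skipped, so a unique column such as `id` is the safer key.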