Optimize rewrite performance #115
Conversation
Pull request overview
This PR optimizes SpjNormalForm::rewrite_from and related normalization paths to significantly reduce allocation and traversal overhead, especially for wide schemas with many projected columns.
Changes:
- Cache `output_schema.columns()` once in `rewrite_from` and reuse it inside the output expression loop to avoid repeated `Vec` allocations.
- Introduce fast paths for column handling: `Predicate::normalize_column`, a column-specialized branch in `Predicate::normalize_expr`, and `SpjNormalForm::find_output_column` plus a `Column`-only branch in `rewrite_from`.
- Extend the test suite in `normal_form.rs` with additional async tests targeting predicate normalization and wide-table rewrite behavior.
src/rewrite/normal_form.rs
    // Fast path: if it's a simple Column, avoid full transform traversal
    if let Expr::Column(ref c) = e {
        return Expr::Column(self.normalize_column(c));
    }
If it's a simple column, it should be fast even when it goes through the transform, imo. I'm curious why this reduces the cost.
Thanks @xudong963 for the review.
Good question! The overhead isn't from the transform logic itself, but from setting up the transform() machinery: it creates closures, iterators, and Transformed wrapper objects even for leaf nodes.
For 41 columns × 5-7 MVs × every query, these small costs add up. The fast path is just a HashMap lookup + clone.
I also added comments to the PR in the latest commit.
And I believe most of the improvement comes from the other change, which caches the repeated columns() calls.
Approved now, sorry for the delay.
Optimize `rewrite_from` performance for wide tables
Summary
This PR optimizes `SpjNormalForm::rewrite_from` by reducing redundant allocations and avoiding expensive tree traversals for simple column expressions. Benchmarks show 44-75% improvement depending on column count.
Problem
Flamegraph analysis revealed two bottlenecks in `rewrite_from`:
- Repeated `columns()` calls: `DFSchema::columns()` creates a new `Vec` on each call. The original code called it inside the loop, causing O(n²) allocations for n columns.
- Unnecessary tree transforms: Every output expression went through `normalize_expr` + `rewrite`, even simple `Column` expressions that only need an O(1) HashMap lookup.
Solution
1. Cache `columns()` result
2. Fast path for Column expressions
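The fast path in the second item can be sketched with toy types. This is a minimal illustration, not the PR's actual code: `Column`, `Expr`, the `representative` map, and `normalize_tree` are hypothetical stand-ins for the DataFusion types and the `transform()`-based path, and the toy tree walk does not reproduce the closure and wrapper overhead the review thread describes. It only shows that a bare column can be answered by a map lookup and that both paths agree.

```rust
use std::collections::HashMap;

// Toy stand-ins for the real Column and Expr types (illustrative only).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Column(String);

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(Column),
    Add(Box<Expr>, Box<Expr>),
}

// Hypothetical predicate state: each column maps to the representative
// of its equivalence class.
struct Predicate {
    representative: HashMap<Column, Column>,
}

impl Predicate {
    // O(1) lookup plus a clone; unknown columns normalize to themselves.
    fn normalize_column(&self, c: &Column) -> Column {
        self.representative.get(c).cloned().unwrap_or_else(|| c.clone())
    }

    // General path: walk the whole tree and rebuild every node.
    fn normalize_tree(&self, e: &Expr) -> Expr {
        match e {
            Expr::Column(c) => Expr::Column(self.normalize_column(c)),
            Expr::Add(l, r) => Expr::Add(
                Box::new(self.normalize_tree(l)),
                Box::new(self.normalize_tree(r)),
            ),
        }
    }

    // Entry point with the fast path: a bare column skips the traversal.
    fn normalize_expr(&self, e: &Expr) -> Expr {
        if let Expr::Column(c) = e {
            return Expr::Column(self.normalize_column(c));
        }
        self.normalize_tree(e)
    }
}

fn main() {
    let mut representative = HashMap::new();
    representative.insert(Column("b".into()), Column("a".into()));
    let p = Predicate { representative };

    let col = Expr::Column(Column("b".into()));
    // The fast path must return the same result as the general path.
    assert_eq!(p.normalize_expr(&col), p.normalize_tree(&col));
    println!("{:?}", p.normalize_expr(&col));
}
```

The property worth preserving in any such shortcut is the equality checked in `main`: the fast path is only a valid optimization if it returns exactly what the full traversal would.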
Benchmark Results
The improvement scales with column count because we're reducing O(n²) → O(n) complexity.
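To make the O(n²) → O(n) claim concrete, here is a small std-only sketch that counts how many column entries get cloned with and without caching. The `Schema` type and its `cloned` counter are hypothetical; the only behavior borrowed from the PR is that `columns()` builds a fresh `Vec` on every call. The width of 41 is taken from the column count mentioned in the review thread.

```rust
use std::cell::Cell;

// Toy schema: `columns()` allocates a fresh Vec on each call.
struct Schema {
    names: Vec<String>,
    cloned: Cell<usize>, // total column names cloned so far
}

impl Schema {
    fn new(n: usize) -> Self {
        Schema {
            names: (0..n).map(|i| format!("c{}", i)).collect(),
            cloned: Cell::new(0),
        }
    }

    fn columns(&self) -> Vec<String> {
        self.cloned.set(self.cloned.get() + self.names.len());
        self.names.clone()
    }
}

// Before: a fresh Vec is built for every output expression.
fn uncached(schema: &Schema) -> usize {
    let mut total_len = 0;
    for i in 0..schema.names.len() {
        total_len += schema.columns()[i].len();
    }
    total_len
}

// After: the Vec is built once and reused inside the loop.
fn cached(schema: &Schema) -> usize {
    let columns = schema.columns();
    let mut total_len = 0;
    for col in &columns {
        total_len += col.len();
    }
    total_len
}

fn main() {
    let a = Schema::new(41);
    let b = Schema::new(41);
    // Both variants compute the same result.
    assert_eq!(uncached(&a), cached(&b));
    println!("uncached clones: {}", a.cloned.get()); // 41 * 41 = 1681
    println!("cached clones: {}", b.cloned.get()); // 41
}
```

For a 41-column schema the uncached loop clones 1681 entries against 41 for the cached one, which is why the gap widens as tables get wider.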
Changes
- `rewrite_from`: Cache `output_schema.columns()` outside loop
- `rewrite_from`: Add fast path for `Expr::Column` variants
- `find_output_column`: New helper method for column lookup
- `normalize_column`: New O(1) column normalization method
- `normalize_expr`: Add fast path for simple Column expressions
Testing
- `test_normalize_column_fast_path` - verifies equivalence class normalization
- `test_rewrite_from_with_many_columns` - verifies wide table handling