Optimize PipeTable parsing: O(n²) → O(n) for 3.7x–85x speedup, enables 10K+ row tables #922
Summary
This PR fundamentally rearchitects the PipeTableParser to use a flat sibling structure instead of a deeply nested tree, reducing time complexity from O(n²) to O(n) for large tables.
The Problem
How the Old Parser Worked
The original parser allowed pipe delimiters to nest content as children. For a simple table like:
The inline tree structure was deeply nested:
Depth = O(n) where n = number of cells
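The nesting described above can be pictured with a small toy model. This is a hedged sketch in Python, not Markdig's C# code: the `Node` class, `parse_nested`, and `max_depth` are hypothetical names invented for illustration, and the only behaviour modelled is "every pipe stays open, so whatever follows becomes its child".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for an inline node; not a Markdig type.
    kind: str                      # "pipe" or "text"
    text: str = ""
    children: list[Node] = field(default_factory=list)


def parse_nested(row: str) -> list[Node]:
    """Toy model of the old behaviour: each pipe remains open, so all
    later content is appended beneath it."""
    root: list[Node] = []
    current = root                 # list that receives the next node
    for i, part in enumerate(row.split("|")):
        if i > 0:
            pipe = Node("pipe")
            current.append(pipe)
            current = pipe.children    # descend: later nodes nest here
        if part.strip():
            current.append(Node("text", part.strip()))
    return root


def max_depth(nodes: list[Node]) -> int:
    """Deepest nesting level, computed iteratively to avoid recursion limits."""
    deepest = 0
    stack = [(n, 1) for n in nodes]
    while stack:
        node, level = stack.pop()
        deepest = max(deepest, level)
        stack.extend((child, level + 1) for child in node.children)
    return deepest


if __name__ == "__main__":
    print(max_depth(parse_nested("| a | b | c |")))      # one level per pipe
    print(max_depth(parse_nested("|" + "x|" * 99)))      # 100 pipes
```

In this model the depth equals the number of pipes, which is the O(n) depth the description refers to.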
Why This Was Problematic
O(n²) Cell Boundary Detection: To find cell boundaries, the parser walked up the parent chain from each delimiter. With n delimiters nested n-deep, this required O(n²) operations.
Stack Overflow on Large Tables: .NET's default stack depth limit caused tables with 1000+ rows to crash with DepthLimitExceededException.
Quadratic Time Scaling:
Large Tables Simply Failed: 5000+ row tables couldn't be parsed at all.
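To see where the quadratic cost comes from, here is a minimal cost model, assuming delimiter i sits i levels below the root and the parser climbs the parent chain from each delimiter. The `Delim` class and `count_parent_hops` are hypothetical helpers written for this illustration; they only count pointer hops and do not reproduce the real parser.

```python
from __future__ import annotations


class Delim:
    """Hypothetical delimiter node that only knows its parent."""

    def __init__(self, parent: Delim | None) -> None:
        self.parent = parent


def count_parent_hops(num_delimiters: int) -> int:
    """Build a chain where each delimiter is the child of the previous
    one, then count the hops needed to reach the root from every node."""
    chain: list[Delim] = []
    parent: Delim | None = None
    for _ in range(num_delimiters):
        parent = Delim(parent)
        chain.append(parent)

    hops = 0
    for delim in chain:
        node = delim
        while node.parent is not None:   # climb until the root is reached
            node = node.parent
            hops += 1
    return hops


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, count_parent_hops(n))   # grows as n * (n - 1) / 2
```

Doubling the number of delimiters roughly quadruples the hop count in this model, which matches the O(n²) behaviour described above.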
The Solution
Flat Sibling Structure
By setting IsClosed = true on PipeTableDelimiterInline, subsequent content becomes siblings rather than children:
The same table now produces a flat structure:
Depth = O(1) constant
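The effect of closing the delimiter can be sketched with the same kind of toy model. This is an illustrative Python sketch under the assumption that a closed node never receives children; `Inline`, `parse_row`, and the `close_pipes` switch are hypothetical names, and the `is_closed` field only mimics the idea behind the IsClosed flag mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Inline:
    # Hypothetical stand-in for an inline node; not a Markdig type.
    kind: str
    text: str = ""
    is_closed: bool = False
    children: list[Inline] = field(default_factory=list)


def parse_row(row: str, close_pipes: bool) -> Inline:
    """Append nodes to the innermost open container. A pipe only
    becomes that container when it is left open."""
    root = Inline("root")
    open_container = root
    for i, part in enumerate(row.split("|")):
        if i > 0:
            pipe = Inline("pipe", is_closed=close_pipes)
            open_container.children.append(pipe)
            if not pipe.is_closed:
                open_container = pipe      # old behaviour: nest under the pipe
        if part.strip():
            open_container.children.append(Inline("text", part.strip()))
    return root


if __name__ == "__main__":
    flat = parse_row("| a | b | c |", close_pipes=True)
    print([child.kind for child in flat.children])
    nested = parse_row("| a | b | c |", close_pipes=False)
    print(len(nested.children))            # everything hangs off the first pipe
```

With `close_pipes=True` every pipe and text node is a direct child of the root, so the depth no longer depends on the number of cells.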
Cell Boundary Detection
Finding cell content is now a simple sibling walk:
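A sibling walk of this kind might look like the following sketch. It assumes the flat structure is available as a simple sequence of `(kind, text)` pairs; `flat_siblings` and `split_cells` are hypothetical helpers, not the parser's actual methods.

```python
def flat_siblings(row: str) -> list[tuple[str, str]]:
    """Tokenize a row into a flat list of ("pipe", "|") and ("text", ...) pairs."""
    siblings: list[tuple[str, str]] = []
    for i, part in enumerate(row.split("|")):
        if i > 0:
            siblings.append(("pipe", "|"))
        if part:
            siblings.append(("text", part))
    return siblings


def split_cells(siblings: list[tuple[str, str]]) -> list[str]:
    """Single left-to-right pass: text accumulates until the next pipe,
    which closes the current cell."""
    cells: list[str] = []
    current: list[str] = []
    seen_pipe = False
    for kind, text in siblings:
        if kind == "pipe":
            # A leading pipe opens the row and closes no cell.
            if seen_pipe or current:
                cells.append("".join(current).strip())
            current = []
            seen_pipe = True
        else:
            current.append(text)
    if current:                      # row without a trailing pipe
        cells.append("".join(current).strip())
    return cells


if __name__ == "__main__":
    print(split_cells(flat_siblings("| a | b | c |")))
    print(split_cells(flat_siblings("a | b")))
```

Each sibling is visited exactly once, so the work is proportional to the number of inlines in the row.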
Handling Nested Pipes
Pipes can still end up nested inside unmatched emphasis:
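As a rough illustration of this case, the sketch below models an unmatched emphasis delimiter that has captured a pipe as its child, together with one possible way to lift such pipes back to the root. The `Node` class and `promote_nested_pipes` function are hypothetical, and the lifting rule (move the pipe and everything after it to just behind the container) is an assumption; the source does not show the real logic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for an inline node; not a Markdig type.
    kind: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def promote_nested_pipes(container: Node) -> None:
    """Lift any pipe found inside a child, plus the nodes that follow it,
    so that they become siblings placed right after that child."""
    i = 0
    while i < len(container.children):
        child = container.children[i]
        promote_nested_pipes(child)            # lift deeper pipes first
        first_pipe = next(
            (j for j, node in enumerate(child.children) if node.kind == "pipe"),
            None,
        )
        if first_pipe is not None:
            tail = child.children[first_pipe:]
            del child.children[first_pipe:]
            container.children[i + 1:i + 1] = tail   # splice after the child
        i += 1


if __name__ == "__main__":
    # Models a row such as "| *a | b |" where "*" never finds a closing match.
    emphasis = Node("emphasis_delimiter", "*", [
        Node("text", "a"), Node("pipe"), Node("text", "b"), Node("pipe"),
    ])
    root = Node("root", children=[Node("pipe"), emphasis])
    promote_nested_pipes(root)
    print([node.kind for node in root.children])
```

After the call, every pipe is a direct child of the root again, so a sibling walk over the root can see all cell boundaries.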
The PromoteNestedPipesToRootLevel method detects and promotes these:
Benchmarks
Baseline Results (Before)
❌ = Failed with depth limit exceeded
Current Results (After)
Performance Improvement
Memory Improvement
Scaling Verification (Linear)
Time per row is nearly constant, confirming O(n) complexity.
Breaking Changes
None. The output AST is identical; only the internal parsing strategy changed.
Test Results
All 3,595 existing tests pass.