feat: add CTE (WITH clause) support to SQL executor #320

Open
DidelotK wants to merge 3 commits into eddiethedean:main from DidelotK:feat/add-cte-support

@DidelotK
Contributor

Summary

Add support for Common Table Expressions (CTEs) in SQL queries, enabling queries like:

WITH filtered_data AS (
    SELECT id, value FROM source_table WHERE value > 75
)
SELECT * FROM filtered_data

Changes

  • Parser (parser.py):

    • Added WITH detection in _detect_query_type() (before UNION check)
    • Added routing for WITH in _parse_components()
    • Added _parse_with_query() method that parses CTE definitions using balanced parenthesis counting
  • Executor (executor.py):

    • Added WITH case in execute() method
    • Added _execute_with() method that executes CTEs and stores results in temp views
    • Modified table loading to check temp views before session.table() lookup
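As an illustration of the balanced parenthesis counting mentioned for the parser, here is a minimal sketch. The function name `split_with_query` and its return shape are hypothetical and are not taken from `parser.py`; the actual `_parse_with_query()` may differ. The sketch also assumes no parentheses appear inside string literals.

```python
import re
from typing import List, Tuple

# Hypothetical helper for illustration; not the PR's _parse_with_query().
_CTE_HEAD = re.compile(r"\s*(\w+)\s+AS\s*\(", re.IGNORECASE)


def split_with_query(sql: str) -> Tuple[List[Tuple[str, str]], str]:
    """Split 'WITH a AS (...), b AS (...) <main>' into CTE pairs and the main query."""
    match = re.match(r"\s*WITH\b", sql, re.IGNORECASE)  # case-insensitive keyword
    if match is None:
        raise ValueError("not a WITH query")
    pos = match.end()
    ctes: List[Tuple[str, str]] = []
    while True:
        head = _CTE_HEAD.match(sql, pos)
        if head is None:
            raise ValueError(f"expected '<name> AS (' at offset {pos}")
        start = head.end()  # first character after the opening parenthesis
        depth, i = 1, start
        # Walk forward until the parenthesis opened by 'AS (' is closed.
        # Assumption: no parentheses inside string literals.
        while i < len(sql) and depth > 0:
            if sql[i] == "(":
                depth += 1
            elif sql[i] == ")":
                depth -= 1
            i += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses in CTE definition")
        ctes.append((head.group(1), sql[start:i - 1].strip()))
        # A comma means another CTE follows; anything else starts the main query.
        rest = sql[i:].lstrip()
        if rest.startswith(","):
            pos = len(sql) - len(rest) + 1
        else:
            return ctes, rest.strip()
```

Counting depth rather than searching for the first `)` is what lets a CTE body contain function calls or subqueries such as `max(x)` without ending the definition early.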

Features Supported

  • Single CTE: WITH cte AS (...) SELECT * FROM cte
  • Multiple CTEs: WITH cte1 AS (...), cte2 AS (...) SELECT * FROM cte2
  • Chained CTEs (CTE referencing another CTE)
  • CTEs with filtering, aggregation, and column selection
  • Case-insensitive WITH keyword
  • Proper cleanup of temp views after execution
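The chained-CTE and cleanup behaviour listed above can be sketched roughly as follows. `ToySession`, `load_table`, and `execute_with` are hypothetical stand-ins written for this illustration and do not reflect sparkless's real session or executor API; queries are modelled as plain Python callables instead of parsed SQL.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[dict]


class ToySession:
    """Hypothetical stand-in for a session; not sparkless's real API."""

    def __init__(self, tables: Dict[str, Rows]) -> None:
        self.tables = tables
        self.temp_views: Dict[str, Rows] = {}

    def load_table(self, name: str) -> Rows:
        # Temp views (CTE results) are checked before the regular table lookup.
        if name in self.temp_views:
            return self.temp_views[name]
        return self.tables[name]


Query = Callable[[ToySession], Rows]


def execute_with(session: ToySession, ctes: List[Tuple[str, Query]], main: Query) -> Rows:
    saved = dict(session.temp_views)  # snapshot so cleanup restores prior state
    try:
        for name, query in ctes:
            # Run in declaration order so later CTEs can read earlier ones.
            session.temp_views[name] = query(session)
        return main(session)
    finally:
        # Cleanup runs even if a CTE or the main query raises.
        session.temp_views.clear()
        session.temp_views.update(saved)


session = ToySession({"source_table": [{"id": 1, "value": 50}, {"id": 2, "value": 90}]})
ctes = [
    ("filtered_data", lambda s: [r for r in s.load_table("source_table") if r["value"] > 75]),
    ("ids", lambda s: [{"id": r["id"]} for r in s.load_table("filtered_data")]),
]
result = execute_with(session, ctes, lambda s: s.load_table("ids"))
```

Putting the cleanup in a `finally` block is one way to guarantee that temp views do not leak into later queries when execution fails partway through.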

Test Plan

  • 11 new unit tests covering all CTE scenarios
  • All 73 session tests pass
  • Tested: simple CTE, multiple CTEs, CTE chains, filtering, aggregation, case insensitivity, whitespace handling, parser detection

Known Limitations

  • Pre-existing sparkless limitations apply (arithmetic in SELECT, complex AND conditions in WHERE)
  • These are not regressions from this PR

🤖 Generated with Claude Code

Add support for Common Table Expressions (CTEs) in SQL queries, enabling
queries like:
  WITH cte AS (SELECT * FROM table WHERE x > 0)
  SELECT * FROM cte

Implementation:
- Parser: detect WITH queries, parse CTE definitions with balanced parenthesis
- Executor: execute CTEs and store in temp views, then execute main query
- Table loading: check temp views before session.table() lookup

Features:
- Single and multiple CTEs in one query
- Chained CTEs (CTE referencing another CTE)
- CTEs with filtering, aggregation, and column selection
- Case-insensitive WITH keyword
- Proper cleanup of temp views after execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@DidelotK marked this pull request as draft on January 22, 2026 at 21:48
DidelotK and others added 2 commits on January 22, 2026 at 22:55
- Remove unused `pos` variable in parser.py
- Remove unused imports (cast, pytest, DataFrame) in test_sql_cte.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>