Skip to content

Conversation

@pyramation
Copy link
Collaborator

@pyramation pyramation commented Jan 6, 2026

Summary

This PR fixes three bugs in the plpgsql-deparser that were causing invalid SQL output:

  1. PERFORM SELECT fix: The PL/pgSQL parser stores PERFORM statements internally as "SELECT ...", so the deparser was outputting invalid PERFORM SELECT ... syntax. Now strips the leading SELECT keyword.

  2. INTO clause depth-aware scanner: The old regex-based approach for inserting INTO clauses would incorrectly insert INTO inside subqueries. Replaced with a depth-aware scanner (~150 lines) that tracks parentheses, quotes, comments, and dollar quotes to find the correct insertion point only at depth 0.

  3. Record field qualification: The deparser was returning just fieldname instead of the full qualified reference like new.is_active. Now uses recparentno to look up the parent record and construct the full reference.

Updates since last revision

  • Whitespace normalization fix: The parser strips INTO <target> from the stored query but leaves the original whitespace behind. This was causing large gaps in output like SELECT x INTO y FROM z. Now normalizes leading whitespace after INTO insertion to a single space.

  • Round-trip AST testing: Upgraded all 17 test cases to use round-trip testing (parse → deparse → reparse → compare cleaned ASTs). Added normalizeQueryWhitespace to cleanPlpgsqlTree to handle indentation differences in embedded SQL queries while preserving content inside string literals and dollar-quoted strings.

Review & Testing Checklist for Human

  • Verify depth-aware scanner edge cases: The findIntoInsertionPoint function (lines 1090-1243) is complex state machine logic. Test with: nested subqueries, CTEs, UNION, dollar-quoted strings containing "FROM", comments containing keywords, escaped quotes
  • Test generated SQL against PostgreSQL: Round-trip tests verify AST equality but don't validate the SQL is actually executable. Run a few test cases against a real PostgreSQL instance
  • Review normalizeQueryWhitespace function: This function (~100 lines in test-utils/index.ts) normalizes whitespace for AST comparison. Verify it doesn't mask real semantic differences by being too aggressive
  • Verify record field qualification in triggers: Test with OLD/NEW references in various trigger scenarios

Recommended test plan:

  1. Run the new snapshot tests locally
  2. Take a few of the snapshot outputs and execute them as PL/pgSQL function bodies in a real PostgreSQL database to verify syntax validity
  3. Create a trigger function using OLD.field and NEW.field references and verify it parses/deparses correctly

Notes

This PR upstreams patches 1-3 from constructive-db PR #229. Patch 4 (RETURN statement handling) is being handled separately as it requires design discussion about passing return-type context.

Link to Devin run: https://app.devin.ai/sessions/8e1c971e9b194cd9a7dda034c89bd74b
Requested by: Dan Lynch (@pyramation)

- Fix PERFORM SELECT: Strip leading SELECT keyword since PERFORM replaces it
- Fix INTO clause insertion: Use depth-aware scanner to avoid inserting INTO inside subqueries
- Fix recfield qualification: Use recparentno to construct full qualified reference (e.g., new.is_active)

Add comprehensive snapshot tests for all three fixes.
@devin-ai-integration
Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

The parser strips 'INTO <target>' from the query but leaves whitespace behind.
This fix normalizes the leading whitespace after the insertion point to avoid
large gaps like 'SELECT x INTO y                    FROM z'.
Upgrade deparser-fixes tests to use round-trip testing infrastructure:
- parse -> deparse -> reparse -> compare cleaned ASTs
- Add normalizeQueryWhitespace to cleanPlpgsqlTree to handle
  indentation differences in embedded SQL queries
- All 17 test cases now verify AST equality after round-trip
@pyramation pyramation merged commit f278def into main Jan 6, 2026
14 checks passed
devin-ai-integration bot pushed a commit that referenced this pull request Jan 6, 2026
… PR #229

Add 14 new PL/pgSQL fixtures that exercise the deparser fixes:
- PERFORM statement (parser stores as SELECT, deparser strips SELECT)
- INTO clause with record field target (recfield qualification)
- INTO clause with subquery (depth-aware scanner)
- SETOF function with RETURN QUERY and bare RETURN
- SETOF function with RETURN NEXT
- Void function with bare RETURN
- Scalar function with RETURN NULL
- OUT parameter function with bare RETURN
- RETURNS TABLE function with RETURN QUERY
- Trigger functions with complex logic
- Procedure (implicit void return)

These fixtures are inspired by real-world functions from constructive-db
that required the deparser fixes in PR #266.

Regenerated generated.json: 190 valid statements (up from 176)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants