Skip to content

Conversation

@ilxqx
Copy link
Contributor

@ilxqx ilxqx commented Dec 19, 2025

  • Changed DML parsers (INSERT, UPDATE, DELETE, MERGE) to use isIdentifier() instead of isType(TokenTypeIdentifier) for table/column name parsing
  • Changed DDL parsers (CREATE, DROP, TRUNCATE, REFRESH) similarly
  • Added comprehensive tests for double-quoted identifiers in all DML/DDL statements

The existing isIdentifier() method correctly handles both regular identifiers and double-quoted strings (TokenTypeDoubleQuotedString), but the DML/DDL parsers were bypassing this by checking for TokenTypeIdentifier directly.

This fix ensures ANSI SQL double-quoted identifiers work consistently across all statement types, matching the existing behavior in SELECT.

Fixes #200

Description

Brief description of what this PR does.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Performance improvement
  • Documentation update
  • Code refactoring
  • Test improvement

Changes Made

  • List the specific changes made
  • Include any relevant implementation details
  • Mention any new dependencies added

Testing

  • Unit tests pass (go test ./...)
  • Race condition tests pass (go test -race ./...)
  • Benchmarks run successfully (go test -bench=. ./...)
  • Added new tests for new functionality
  • All existing tests still pass

Test Coverage

  • Current coverage: XX%
  • New coverage: XX%

Performance Impact

  • No performance impact
  • Performance improvement (include benchmark results)
  • Minor performance degradation (justified by feature benefit)

Benchmark Results (if applicable)

// Include relevant benchmark comparisons

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Related Issues

Closes #(issue number)

Screenshots (if applicable)

Add any relevant screenshots here.

Additional Notes

Any additional information that reviewers should know.

- Changed DML parsers (INSERT, UPDATE, DELETE, MERGE) to use isIdentifier()
  instead of isType(TokenTypeIdentifier) for table/column name parsing
- Changed DDL parsers (CREATE, DROP, TRUNCATE, REFRESH) similarly
- Added comprehensive tests for double-quoted identifiers in all DML/DDL statements

The existing isIdentifier() method correctly handles both regular identifiers
and double-quoted strings (TokenTypeDoubleQuotedString), but the DML/DDL
parsers were bypassing this by checking for TokenTypeIdentifier directly.

This fix ensures ANSI SQL double-quoted identifiers work consistently
across all statement types, matching the existing behavior in SELECT.

Fixes ajitpratap0#200
@ajitpratap0
Copy link
Owner

@claude review

@claude
Copy link

claude bot commented Jan 4, 2026

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

1 similar comment
@claude
Copy link

claude bot commented Jan 4, 2026

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

@ajitpratap0
Copy link
Owner

PR Review: Double-Quoted Identifier Support

Thanks for this fix! The core approach is correct - using isIdentifier() instead of isType(TokenTypeIdentifier) is the right solution. However, I've identified some gaps that should be addressed for complete ANSI SQL quoted identifier support.


🔴 High Priority: Missing Changes in cte.go

The PR doesn't update CTE parsing, so double-quoted CTE names and columns will still fail:

-- These will fail with current PR:
WITH "reserved-word" AS (SELECT 1) SELECT * FROM "reserved-word";
WITH cte ("column") AS (SELECT 1) SELECT * FROM cte;

Please update these locations in pkg/sql/parser/cte.go:

  • Line 112: CTE name parsing - change isType(models.TokenTypeIdentifier) to isIdentifier()
  • Line 124: CTE column list parsing - same change

🟡 Medium Priority: Missing Changes in select.go

Constraint/reference parsing still uses isType(TokenTypeIdentifier). Please review and update if appropriate:

  • Line 171: REFERENCES table name
  • Line 183: REFERENCES column list
  • Line 329: FOREIGN KEY table reference
  • Line 401: Constraint column list

If these are intentionally not supporting quoted identifiers, please document why.


🟡 Medium Priority: Missing Test Coverage

The test file is a good start but has significant gaps:

Missing Tests Code Changes
MERGE statements 6 locations modified, 0 tests
REFRESH MATERIALIZED VIEW Modified but not tested
CREATE MATERIALIZED VIEW Modified but not tested
ON CONFLICT with quoted identifiers Modified but not tested

Please add tests for:

// MERGE with quoted identifiers
`MERGE INTO "target" t USING "source" s ON t.id = s.id WHEN MATCHED THEN UPDATE SET "col" = s.val`

// REFRESH MATERIALIZED VIEW
`REFRESH MATERIALIZED VIEW "my-view"`

// ON CONFLICT
`INSERT INTO "users" ("id") VALUES (1) ON CONFLICT ("id") DO UPDATE SET "name" = EXCLUDED."name"`

🟡 Test Quality Improvements

The tests should follow GoSQLX patterns more closely:

  1. Add AST release - Currently no ast.ReleaseAST() calls (potential memory leak):
tree, err := parser.Parse(tokens)
if err != nil {
    t.Fatalf(...)
}
defer ast.ReleaseAST(tree)  // Add this
  1. Use NewParser() - Replace &Parser{} with proper pooling:
parser := NewParser()
defer parser.Release()
  1. Add AST validation - Verify the quoted identifier is correctly stored:
// After parsing INSERT INTO "users" ...
if insertStmt.TableName != "users" {  // or `"users"` depending on how it's stored
    t.Errorf("expected table name 'users', got '%s'", insertStmt.TableName)
}
  1. Add edge case tests:
// Spaces in identifier
`SELECT * FROM "table name"`

// Reserved word as identifier  
`SELECT "select", "from" FROM users`

// Schema-qualified with quotes
`SELECT * FROM "schema"."table"`

Summary

Category Status
dml.go changes ✅ Correct
ddl.go changes ✅ Correct
cte.go ❌ Missing updates
select.go ⚠️ Review needed
Test coverage ⚠️ Gaps in MERGE, REFRESH, ON CONFLICT
Test quality ⚠️ Missing AST release, validation

Please address the cte.go changes and add MERGE tests at minimum before merging. The other items can be follow-up improvements if needed.

Address PR review feedback:

1. cte.go: Use isIdentifier() for CTE name and column parsing
   - Line 112: CTE name now supports double-quoted identifiers
   - Line 124: CTE column list now supports double-quoted identifiers

2. select.go: Use isIdentifier() for constraint/reference parsing
   - Line 171: REFERENCES table name
   - Line 183: REFERENCES column list
   - Line 329: FOREIGN KEY table reference
   - Line 401: Constraint column list (parseConstraintColumnList)

3. Improved test quality in double_quoted_identifier_test.go:
   - Use NewParser()/Release() instead of &Parser{}
   - Add ast.ReleaseAST() to prevent memory leaks
   - Add new test cases for CTE, MERGE, REFRESH MATERIALIZED VIEW,
     ON CONFLICT, and edge cases
   - Fix token converter to handle asterisk (*) correctly

All tests pass including race detection.
@ajitpratap0 ajitpratap0 merged commit 42708a7 into ajitpratap0:main Jan 11, 2026
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Double-quoted identifiers not supported in non-SELECT statements (UPDATE/INSERT/DELETE/DROP/TRUNCATE/CREATE)

3 participants