forked from delta-io/delta
Integrate server-side planning into DeltaCatalog with feature flag #5
Open
murali-db wants to merge 3 commits into server-side-planning-3-dsv2-table-impl from server-side-planning-4-catalog-integration
murali-db added a commit that referenced this pull request Nov 2, 2025

This PR adds comprehensive integration tests that validate the entire server-side planning stack from DeltaCatalog through to data reading.

Test Coverage:
- Full stack integration: DeltaCatalog → ServerSidePlannedTable → Client → Data
- SELECT query execution through server-side planning path
- Aggregation queries (SUM, COUNT, GROUP BY)
- Verification that normal path is unaffected when feature disabled

Test Strategy:
1. Enable DeltaCatalog as Spark catalog
2. Create Parquet tables with test data
3. Enable forceServerSidePlanning flag
4. Configure ServerSidePlanningTestClientFactory
5. Execute queries and verify results
6. Verify scan plan discovered files

Test Cases:
- E2E full stack integration with SELECT query
- E2E aggregation query (SUM, COUNT, GROUP BY)
- Normal path verification (feature disabled)

Assertions:
- Query results are correct
- Files are discovered via server-side planning
- Aggregations produce correct values
- Normal table loading works when feature disabled

This completes the test pyramid:
- PR #1: Test infrastructure (REST server)
- PR #2: Client unit tests
- PR #3: DSv2 Table unit and integration tests
- PR #4: DeltaCatalog integration (no new tests - minimal change)
- PR #5: Full stack E2E integration tests (this PR)

All functionality is now fully tested from unit to integration level.
5b5e717 to fd30578
1fb4a24 to e51a537
murali-db added a commit that referenced this pull request Nov 3, 2025 (same commit message as the Nov 2 commit)
fd30578 to 3abb6ec
e51a537 to 169526b
murali-db added a commit that referenced this pull request Nov 3, 2025 (same commit message as the Nov 2 commit)
3abb6ec to d9a5b07
169526b to 2db6ed4
murali-db added a commit that referenced this pull request Nov 3, 2025 (same commit message as the Nov 2 commit)
d9a5b07 to 755e14b
2db6ed4 to a624bdf
murali-db added a commit that referenced this pull request Nov 3, 2025 (same commit message as the Nov 2 commit)
755e14b to 673a081
a624bdf to 1a7895e
murali-db added a commit that referenced this pull request Nov 3, 2025 (same commit message as the Nov 2 commit)
673a081 to 91c4414
1a7895e to bc47741
91c4414 to 9d514e9
bc47741 to 7191318
9d514e9 to 6a25ba6
7191318 to 0463f26
murali-db added a commit that referenced this pull request Nov 4, 2025

Changes made:
- Rename IcebergServerSidePlanningClient to IcebergRESTCatalogPlanningClient
- Rename IcebergServerSidePlanningClientFactory to IcebergRESTCatalogPlanningClientFactory
- Remove schema field from ScanPlan (not in Iceberg REST API spec)
- Remove partitionData field from ScanFile and add validation
- Rename parameter 'namespace' to 'database' throughout
- Remove FORCE_SERVER_SIDE_PLANNING config (moved to PR #5)
- Simplify ServerSidePlanningTestClient (remove cloning/config logic)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
6a25ba6 to 819df83
0463f26 to 1ee15f2
819df83 to 47e6fa9
Adds catalog integration to enable server-side planning:
- DeltaCatalog now checks for spark.delta.serverSidePlanning.enabled config
- When enabled, returns ServerSidePlannedTable instead of DeltaTableV2
- Uses ServerSidePlanningClientFactory to create appropriate client
- Supports both Iceberg REST and mock/test implementations

This completes the plumbing from catalog through DSv2 to client.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
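The flag-gated dispatch described in this commit message can be modeled roughly as follows. This is a minimal Python sketch, not the Scala code in the PR: the dataclasses, the `CLIENT_BUILDERS` registry, and the `load_table` signature are hypothetical stand-ins, and only the config key and the class names are taken from the commit messages on this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Config key as quoted in the commit message; everything else here is illustrative.
FLAG_KEY = "spark.delta.serverSidePlanning.enabled"


@dataclass(frozen=True)
class DeltaTableV2:
    """Hypothetical stand-in for the normal Delta table."""
    name: str


@dataclass(frozen=True)
class ServerSidePlannedTable:
    """Hypothetical stand-in for the server-side planned table."""
    name: str
    client: str  # label of the planning client backing this table


# Stand-in for ServerSidePlanningClientFactory: maps a client kind to a builder.
# The keys "iceberg-rest" and "test" are assumptions made for this sketch.
CLIENT_BUILDERS: Dict[str, Callable[[], str]] = {
    "iceberg-rest": lambda: "IcebergRESTCatalogPlanningClient",
    "test": lambda: "ServerSidePlanningTestClient",
}


def load_table(name: str, conf: Dict[str, str], client_kind: str = "iceberg-rest"):
    """Return the server-side planned table only when the flag is set to true."""
    enabled = conf.get(FLAG_KEY, "false").lower() == "true"
    if not enabled:
        return DeltaTableV2(name)  # normal path, unchanged when the flag is off
    return ServerSidePlannedTable(name, CLIENT_BUILDERS[client_kind]())
```

Keeping the flag check at the top of the function means the normal path is untouched when the feature is disabled, which is the property the commit message emphasizes.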
Update DeltaCatalog.loadTable() to use buildForCatalog instead of createClient, extracting the catalog name from the identifier's namespace. This enables proper catalog-specific configuration reading from spark.sql.catalog.<catalogName>.uri and token.

For fully qualified identifiers like catalog.database.table, the catalog name is extracted from the first element of the namespace. Otherwise, defaults to "spark_catalog".

Also adds ENABLE_SERVER_SIDE_PLANNING config to DeltaSQLConf (renamed from FORCE_SERVER_SIDE_PLANNING to better reflect its purpose).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
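The catalog-name rule in this commit message can be illustrated with a small Python model. It assumes the namespace is a list of name parts and that a length greater than one means the first part is the catalog, matching the `namespace().length > 1` check mentioned in a later commit on this page. The function names and the example URI are hypothetical; the config key pattern is the one quoted above.

```python
from typing import Dict, List, Optional

DEFAULT_CATALOG = "spark_catalog"


def extract_catalog_name(namespace: List[str]) -> str:
    """First namespace element for fully qualified identifiers, else the default."""
    return namespace[0] if len(namespace) > 1 else DEFAULT_CATALOG


def catalog_settings(conf: Dict[str, str], catalog_name: str) -> Dict[str, Optional[str]]:
    """Read the per-catalog uri and token from spark.sql.catalog.<catalogName>.*"""
    prefix = f"spark.sql.catalog.{catalog_name}."
    return {"uri": conf.get(prefix + "uri"), "token": conf.get(prefix + "token")}


# Usage: a fully qualified identifier resolves to its own catalog's settings.
conf = {"spark.sql.catalog.my_catalog.uri": "https://example.invalid/api"}
name = extract_catalog_name(["my_catalog", "db"])
settings = catalog_settings(conf, name)
```

With a database-only namespace such as `["db"]`, or an empty one, the model falls back to `spark_catalog`, so the settings are looked up under `spark.sql.catalog.spark_catalog.*`.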
1ee15f2 to ca46cc0
murali-db added a commit that referenced this pull request Nov 11, 2025

Addresses code review issues #5, #7, #9, and #18:

1. Issue #7: Expand Hadoop configuration limitation documentation
   - Clarify production impact of using sessionState.newHadoopConf()
   - Provide concrete examples of what won't work
   - Document workaround for per-query credentials
   - Link to architectural decision about DeltaLog dependency
2. Issue #18: Document DeltaLog dependency avoidance
   - Add comprehensive class-level documentation for ServerSidePlannedTable
   - Explain format independence, lightweight design, and clean architecture
   - Document trade-offs and alternative approaches
3. Issue #9: Improve catalog name extraction documentation
   - Add detailed examples for all identifier formats
   - Explain edge cases (fully qualified vs database-only vs table-only)
   - Clarify why we check namespace().length > 1
4. Issue #5: Add TODO for hasCredentials() test coverage
   - Document what test would be helpful to add
   - Suggest implementation approaches (reflection vs custom catalog)
   - Note challenge of testing without real Unity Catalog

All tests pass. No functional changes, only documentation improvements.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Stack:
This PR wires ServerSidePlannedTable into DeltaCatalog's loadTable() method, enabling the fallback path when Unity Catalog tables lack credentials.
Changes to DeltaCatalog:
Logic Flow:
Feature Flag:
Credential Detection:
Safety:
This is the minimal integration point - actual usage requires setting the force flag or having UC tables without credentials.
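Read together with the summary above, the condition for taking the server-side planning path appears to be "force flag set, or a Unity Catalog table without credentials". The Python sketch below models that reading only. `TableInfo` and `use_server_side_planning` are hypothetical names, and the real check in DeltaCatalog may differ in its details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical summary of what the catalog knows about a table."""
    is_unity_catalog: bool
    has_credentials: bool


def use_server_side_planning(force_flag: bool, table: TableInfo) -> bool:
    """True when the flag forces the path, or a UC table lacks credentials."""
    if force_flag:
        return True
    return table.is_unity_catalog and not table.has_credentials
```

Under this model a table that is not in Unity Catalog, or a UC table that has credentials, keeps loading through the normal path unless the flag is set.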
Which Delta project/connector is this regarding?
Description
How was this patch tested?
Does this PR introduce any user-facing changes?