feat(distributed): Distributed PPL query engine with refactored task scheduler #3
Merged
vamsimanohar merged 3 commits into main on Feb 25, 2026
…eline

Implement a distributed MPP query engine for PPL that executes queries across multiple OpenSearch nodes in parallel using direct Lucene access.

Key components:
- DistributedExecutionEngine: routes queries between legacy and distributed paths
- DistributedQueryPlanner: converts Calcite RelNode trees to multi-stage plans
- DistributedTaskScheduler: coordinates operator pipeline across cluster nodes
- TransportExecuteDistributedTaskAction: executes pipelines on data nodes
- LuceneScanOperator/LimitOperator: direct Lucene _source reads per shard
- Coordinator-side Calcite execution for complex operations (stats, eval, joins)
- Hash join support with parallel distributed table scans
- Filter pushdown, sort, rename, and limit in operator pipeline
- Phase 5A core operator framework (Page, Pipeline, ComputeStage, StagedPlan)
- Explain API showing distributed plan stages via _plugins/_ppl/_explain
- Architecture documentation with class hierarchy and execution plan details
- Comprehensive test coverage including integration tests

Architecture: two execution paths controlled by plugins.ppl.distributed.enabled
- Legacy (off): existing Calcite-based OpenSearchExecutionEngine
- Distributed (on): operator pipeline with no fallback
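The hash join named in the component list follows the usual build/probe shape. The block below is a minimal sketch of that technique only, not the PR's implementation: the class name `HashJoinSketch`, the `Object[]` row representation, and the inner-join and NULL-key semantics are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative build/probe hash join (hypothetical; not the PR's executor). */
public class HashJoinSketch {

  /** Inner-joins two row lists on the given key column indexes. */
  public static List<Object[]> innerJoin(
      List<Object[]> build, int buildKey, List<Object[]> probe, int probeKey) {
    // Build phase: index one side (ideally the smaller one) by its join key.
    Map<Object, List<Object[]>> table = new HashMap<>();
    for (Object[] row : build) {
      if (row[buildKey] != null) { // assumption: NULL keys never match
        table.computeIfAbsent(row[buildKey], k -> new ArrayList<>()).add(row);
      }
    }

    // Probe phase: stream the other side and emit one output row per match.
    List<Object[]> out = new ArrayList<>();
    for (Object[] row : probe) {
      List<Object[]> matches = row[probeKey] == null ? null : table.get(row[probeKey]);
      if (matches == null) {
        continue;
      }
      for (Object[] match : matches) {
        // Output layout: build-side columns first, then probe-side columns.
        Object[] joined = new Object[match.length + row.length];
        System.arraycopy(match, 0, joined, 0, match.length);
        System.arraycopy(row, 0, joined, match.length, row.length);
        out.add(joined);
      }
    }
    return out;
  }
}
```

In a distributed setting the two inputs would typically come from table scans that run in parallel, with the join itself running once both sides have been collected; the sketch covers only the join step.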
- Rename Split → DataUnit (abstract class), SplitSource → DataUnitSource, SplitAssignment → DataUnitAssignment
- Add Block interface (columnar, Arrow-aligned)
- Add PlanFragmenter, FragmentationContext, SubPlan for automatic stage creation
- Add OutputBuffer for exchange back-pressure
- Add execution lifecycle: QueryExecution, StageExecution, TaskExecution
- Add planFragment field to ComputeStage for query pushdown
- Extend Page with getBlock() and getRetainedSizeBytes() defaults
- Create OpenSearchDataUnit (index + shard, not remotely accessible)
- Delete H1 types: DistributedPhysicalPlan, ExecutionStage, WorkUnit, DataPartition, DistributedQueryPlanner, DistributedPlanAnalyzer, RelNodeAnalysis, PartitionDiscovery
- Delete execution code: DistributedTaskScheduler, HashJoinExecutor, InMemoryScannableTable, QueryResponseBuilder, TemporalValueNormalizer, RelNodeAnalyzer, FieldMapping, JoinInfo, SortKey, OpenSearchPartitionDiscovery
- Gut DistributedExecutionEngine to routing shell (throws when enabled)
- Simplify OpenSearchPluginModule constructor
- Default PPL_DISTRIBUTED_ENABLED to false
- Remove assumeFalse(isDistributedEnabled()) from integ tests
- Update architecture documentation
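A routing shell that "throws when enabled" can be pictured roughly as follows. This is a hedged sketch: the class name `RoutingEngineSketch`, the generic query and result types, the `BooleanSupplier` standing in for the PPL_DISTRIBUTED_ENABLED setting, and the choice of `UnsupportedOperationException` are all assumptions, not the actual DistributedExecutionEngine code.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;

/** Hypothetical routing shell; names and types are illustrative only. */
public class RoutingEngineSketch<Q, R> {
  // Assumption: stands in for a lookup of the distributed-enabled setting.
  private final BooleanSupplier distributedEnabled;
  // Assumption: stands in for the existing (legacy) execution engine.
  private final Function<Q, R> legacyEngine;

  public RoutingEngineSketch(BooleanSupplier distributedEnabled, Function<Q, R> legacyEngine) {
    this.distributedEnabled = distributedEnabled;
    this.legacyEngine = legacyEngine;
  }

  public R execute(Q query) {
    if (distributedEnabled.getAsBoolean()) {
      // Nothing sits behind the distributed path in this sketch, so fail
      // loudly rather than silently falling back to the legacy engine.
      throw new UnsupportedOperationException("distributed execution is not available");
    }
    return legacyEngine.apply(query);
  }
}
```

Reading the flag on every call, rather than once at construction, means a settings change takes effect without rebuilding the engine; whether the real setting is dynamic is not stated in this PR.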
…vent infinite loop

The processOnce() loop only passed output between adjacent operator pairs (i to i+1), never calling getOutput() on the last operator. Operators that buffer pages (e.g., PassThroughOperator) would never have their buffer drained, causing isFinished() to never return true and an infinite loop in run().
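The shape of the fix can be sketched as below. This is an illustration under assumptions, not the PR's code: the `Operator` contract, the `Source` and `PassThrough` classes, and the use of `String` as a stand-in for a page are hypothetical. Only the idea of draining the last operator inside processOnce() comes from the commit message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative pipeline driver; all names here are hypothetical. */
public class PipelineDrainSketch {

  /** Assumed operator contract; the real interface may differ. */
  public interface Operator {
    void addInput(String page);
    String getOutput();   // null when no page is ready
    void finish();        // signals that no more input will arrive
    boolean isFinished(); // true once finished and fully drained
  }

  /** Buffers pages and returns them unchanged. */
  public static class PassThrough implements Operator {
    private final Deque<String> buffer = new ArrayDeque<>();
    private boolean finishing;
    public void addInput(String page) { buffer.add(page); }
    public String getOutput() { return buffer.poll(); }
    public void finish() { finishing = true; }
    public boolean isFinished() { return finishing && buffer.isEmpty(); }
  }

  /** Emits a fixed list of pages. */
  public static class Source implements Operator {
    private final Deque<String> pages;
    public Source(List<String> pages) { this.pages = new ArrayDeque<>(pages); }
    public void addInput(String page) { throw new UnsupportedOperationException(); }
    public String getOutput() { return pages.poll(); }
    public void finish() { pages.clear(); }
    public boolean isFinished() { return pages.isEmpty(); }
  }

  static void processOnce(List<Operator> ops, List<String> sink) {
    // Move at most one page across each adjacent pair (i to i+1).
    for (int i = 0; i + 1 < ops.size(); i++) {
      Operator current = ops.get(i);
      Operator next = ops.get(i + 1);
      String page = current.getOutput();
      if (page != null) {
        next.addInput(page);
      }
      if (current.isFinished()) {
        next.finish();
      }
    }
    // The fix: also drain the last operator. Without this step its buffer
    // never empties, isFinished() stays false, and run() spins forever.
    String out = ops.get(ops.size() - 1).getOutput();
    if (out != null) {
      sink.add(out);
    }
  }

  public static List<String> run(List<Operator> ops) {
    List<String> sink = new ArrayList<>();
    Operator last = ops.get(ops.size() - 1);
    while (!last.isFinished()) {
      processOnce(ops, sink);
    }
    return sink;
  }
}
```

Collecting the drained pages into a sink list is a simplification for the sketch; a real driver would hand them to whatever consumes the pipeline's output.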
Summary
This PR implements a complete distributed PPL query engine for OpenSearch and refactors the core task scheduler for maintainability.
Distributed Query Engine (Phases 1A-5B)
- LuceneScanOperator reading directly from Lucene _source
- ExecuteDistributedTaskAction

Task Scheduler Refactoring
Refactored DistributedTaskScheduler.java from 2087 → ~706 lines by extracting 8 focused classes:
- SortKey: sort direction/field value object
- FieldMapping: column name/type/index mapping
- JoinInfo: join metadata from RelNode analysis
- InMemoryScannableTable: Calcite ScannableTable for in-memory data
- TemporalValueNormalizer: normalizes all OpenSearch date formats (compact, ordinal, week, epoch, ISO, T-separated, fractional seconds)
- RelNodeAnalyzer: extracts fields/limits/sorts/filters/joins from Calcite RelNode trees
- HashJoinExecutor: hash join on coordinator with build/probe phases
- QueryResponseBuilder: builds ExprValue QueryResponse from JDBC ResultSet with UDT type conversion

Dead Code Removal
- DistributedPhysicalPlanner, TaskOperator, DistributedQueryPlannerTest
- DistributedQueryPlanner to standalone files: PartitionDiscovery, RelNodeAnalysis, DistributedPlanAnalyzer
- taskOperator field from WorkUnit
- TransportExecuteDistributedTaskAction (only OPERATOR_PIPELINE path remains)
- ExecuteDistributedTaskRequest (removed workUnits, searchSourceBuilder, inputData)

Architecture Documentation
- docs/distributed-engine-architecture.md with class hierarchy, module layout, execution flow diagrams, and typical plan examples

Integration Test Fixes
- isDistributedEnabled() test helper with assumeFalse guards for known limitations:

Key Configuration
- plugins.ppl.distributed.enabled: single setting, now defaults to true

Test plan
- ./gradlew :integ-test:integTest --tests "org.opensearch.sql.calcite.remote.CalcitePPL*" to verify