Closed
Summary
In Spark 4.1.0, the ParquetColumnVector constructor signature changed: the memoryMode parameter was removed, so the constructor now takes 6 arguments instead of 7.
Details
- Spark Version: 4.1.0
- Change Type: Constructor signature change
Spark 3.5.0-4.0.x (7 arguments)
new ParquetColumnVector(
column: ParquetColumn,
vector: WritableColumnVector,
capacity: Int,
memoryMode: MemoryMode, // <-- removed in Spark 4.1.0
missingColumns: java.util.Set[ParquetColumn],
isTopLevel: Boolean,
defaultValue: Any
)
Spark 4.1.0+ (6 arguments)
new ParquetColumnVector(
column: ParquetColumn,
vector: WritableColumnVector,
capacity: Int,
// memoryMode removed
missingColumns: java.util.Set[ParquetColumn],
isTopLevel: Boolean,
defaultValue: Any
)
Impact
Code that creates ParquetColumnVector with 7 arguments will fail to compile:
too many arguments (found 7, expected 6) for constructor ParquetColumnVector
Affected Files
sql-plugin/src/main/spark350/scala/org/apache/spark/sql/execution/datasources/parquet/rapids/shims/ParquetCVShims.scala
Solution
Create version-specific ParquetCVShims:
For Spark 3.5.0-4.0.x (spark350/):
def newParquetCV(..., memoryMode: MemoryMode, ...): ParquetColumnVector = {
new ParquetColumnVector(column, vector, capacity, memoryMode, missingColumns, isTopLevel, defaultValue)
}
For Spark 4.1.0+ (spark410/):
def newParquetCV(..., missingColumns, ...): ParquetColumnVector = {
// No memoryMode parameter
new ParquetColumnVector(column, vector, capacity, missingColumns, isTopLevel, defaultValue)
}
References
- Spark 4.0.1: 7-arg constructor with memoryMode
- Spark 4.1.0: 6-arg constructor, memoryMode removed
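As a rough illustration of the Solution section, the sketch below models the two shims with stand-in types so it compiles without Spark on the classpath. Everything in it is hypothetical: `ShimSketch`, `ParquetColumnVector350`, `ParquetColumnVector410`, `ParquetCVShims350`, and `ParquetCVShims410` are names invented here, and the stand-in classes only mirror the argument lists quoted in this issue. It also assumes the 4.1.0 shim keeps `memoryMode` in its own signature and simply drops it, so call sites stay identical across versions; the issue does not confirm that choice.

```scala
// Stand-ins only: these are NOT the real Spark classes.
object ShimSketch {
  final class ParquetColumn(val name: String)
  final class WritableColumnVector
  sealed trait MemoryMode
  case object OnHeap extends MemoryMode

  // Mirrors the 7-argument constructor quoted for Spark 3.5.0-4.0.x.
  final class ParquetColumnVector350(
      val column: ParquetColumn,
      val vector: WritableColumnVector,
      val capacity: Int,
      val memoryMode: MemoryMode,
      val missingColumns: java.util.Set[ParquetColumn],
      val isTopLevel: Boolean,
      val defaultValue: Any)

  // Mirrors the 6-argument constructor quoted for Spark 4.1.0+.
  final class ParquetColumnVector410(
      val column: ParquetColumn,
      val vector: WritableColumnVector,
      val capacity: Int,
      val missingColumns: java.util.Set[ParquetColumn],
      val isTopLevel: Boolean,
      val defaultValue: Any)

  // Hypothetical shim for the spark350/ source tree: forwards memoryMode.
  object ParquetCVShims350 {
    def newParquetCV(
        column: ParquetColumn,
        vector: WritableColumnVector,
        capacity: Int,
        memoryMode: MemoryMode,
        missingColumns: java.util.Set[ParquetColumn],
        isTopLevel: Boolean,
        defaultValue: Any): ParquetColumnVector350 =
      new ParquetColumnVector350(
        column, vector, capacity, memoryMode, missingColumns, isTopLevel, defaultValue)
  }

  // Hypothetical shim for the spark410/ source tree: same parameter list
  // (an assumption), but memoryMode is not passed to the constructor.
  object ParquetCVShims410 {
    def newParquetCV(
        column: ParquetColumn,
        vector: WritableColumnVector,
        capacity: Int,
        memoryMode: MemoryMode, // accepted for call-site compatibility, unused
        missingColumns: java.util.Set[ParquetColumn],
        isTopLevel: Boolean,
        defaultValue: Any): ParquetColumnVector410 =
      new ParquetColumnVector410(
        column, vector, capacity, missingColumns, isTopLevel, defaultValue)
  }
}
```

In an actual build, both shims would presumably share one object name (ParquetCVShims) and live in separate version-specific source directories, with only one of them compiled per Spark version; the two differently named objects above exist only so the sketch fits in a single file.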