
[BUG] NoClassDefFoundError when attempting to read from Vertica #559

@padraic-mcatee

Description


Environment

  • Spark version: 3.5.0
  • Hadoop version: 3.3.6
  • Vertica version: 11
  • Vertica Spark Connector version: 3.3.5
  • Java version: 8
  • Additional Environment Information:
    • EMR 7.0.0

Problem Description

The job fails with java.lang.NoClassDefFoundError: org/apache/spark/sql/internal/SQLConf$LegacyBehaviorPolicy$ on the executors. I see vertica-spark is built against Spark 3.3; it looks like Spark 3.5 moved or removed that inner object, so the connector jar is binary-incompatible with the Spark 3.5.0 runtime on EMR 7.0.0. (A quick way to confirm the class is absent is sketched below.)
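
For triage, the missing class can be probed from a PySpark shell via the py4j gateway. This is a diagnostic sketch, not connector code; it assumes `spark` is an active SparkSession on the affected cluster:

```python
# Diagnostic sketch: ask the driver JVM whether the class the connector
# references still exists. Assumes `spark` is an active SparkSession
# (driver and executors run the same Spark distribution).
name = "org.apache.spark.sql.internal.SQLConf$LegacyBehaviorPolicy$"
jvm = spark.sparkContext._jvm
try:
    jvm.java.lang.Class.forName(name)
    print(f"{name} is present")
except Exception as exc:  # Py4JJavaError wrapping ClassNotFoundException on Spark 3.5
    print(f"{name} is missing: {exc}")
```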

  1. Steps to reproduce: read a table from Vertica through the connector on Spark 3.5.0 (EMR 7.0.0) and materialize the result with DataFrameWriterV2.createOrReplace.
  2. Expected behaviour: the read completes and the target table is created or replaced.
  3. Actual behaviour: executor tasks repeatedly fail with a NoClassDefFoundError and the job is aborted.
  4. Error message/stack trace:
py4j.protocol.Py4JJavaError: An error occurred while calling o254.createOrReplace.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3099 in stage 0.0 failed 4 times, most recent failure: Lost task 3099.3 in stage 0.0 (TID 3281) ([2600:1f18:41ad:2102:1022:9d72:bf0:2463] executor 102): java.lang.NoClassDefFoundError: org/apache/spark/sql/internal/SQLConf$LegacyBehaviorPolicy$
	at com.vertica.spark.datasource.fs.HadoopFileStoreLayer.openReadParquetFile(FileStoreLayerInterface.scala:380)
	at com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$startPartitionRead$2(VerticaDistributedFilesystemReadPipe.scala:429)
	at scala.util.Either.flatMap(Either.scala:341)
	at com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.startPartitionRead(VerticaDistributedFilesystemReadPipe.scala:416)
	at com.vertica.spark.datasource.core.DSReader.openRead(DSReader.scala:65)
	at com.vertica.spark.datasource.v2.VerticaBatchReader.<init>(VerticaDatasourceV2Read.scala:273)
	at com.vertica.spark.datasource.v2.VerticaReaderFactory.createReader(VerticaDatasourceV2Read.scala:261)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:84)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.hasNext(Unknown Source)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:441)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1409)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:486)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:425)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:491)
	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:388)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:143)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:629)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:95)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:632)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.internal.SQLConf$LegacyBehaviorPolicy$
	... 32 more
  5. Code sample or example on how to reproduce the issue:
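
A minimal sketch of the failing pattern (no snippet was included in the original report). Connection details are placeholders, and the option names are the connector's basic read options as I understand them (host, user, password, db, table, staging_fs_url):

```python
# Repro sketch (assumptions: placeholder connection details; the connector jar
# built against Spark 3.3 is on the classpath of a Spark 3.5.0 / EMR 7.0.0 cluster).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("vertica-read-repro").getOrCreate()

df = (
    spark.read.format("com.vertica.spark.datasource.VerticaSource")
    .option("host", "vertica-host.example.com")       # placeholder
    .option("user", "dbadmin")                        # placeholder
    .option("password", "***")                        # placeholder
    .option("db", "mydb")                             # placeholder
    .option("table", "my_table")                      # placeholder
    .option("staging_fs_url", "s3a://bucket/staging") # placeholder staging location
    .load()
)

# The error surfaces once executors start reading the staged Parquet files,
# e.g. when materializing the result via DataFrameWriterV2.createOrReplace
# (the call named in the Py4J error above; table name is a placeholder):
df.writeTo("my_catalog.db.target_table").createOrReplace()
```

For what it's worth, running on Spark 3.4.x (or an EMR release that ships it) should presumably sidestep the error until a connector build against Spark 3.5 is available, since the class appears to have been relocated only in 3.5.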

Spark Connector Logs

None provided.

Labels: bug