
[VL] ORC file read error when the schema in the ORC file and the table schema are inconsistent #5638

@liujp

Backend

VL (Velox)

Bug description

When the schema in the ORC file differs from the table schema (especially the field names), Gluten reads the data incorrectly. For example:
Schema in the ORC file:
[image: ORC file schema]
The user table is:
CREATE TABLE parquet_test
(
treatment TINYINT,
numerator DOUBLE,
denominator TINYINT,
numerator_pre BIGINT,
denominator_pre TINYINT,
Y DOUBLE,
X1 INT,
X2 INT,
X3 INT,
X3_string STRING,
X7_needcut BIGINT,
X8_needcut BIGINT,
weight DOUBLE,
distance DOUBLE
)

Then execute the query:
val df = spark.sql("SELECT count(Y) from udf_table")
An incorrect value is returned:
[image: query result]

But the query val df = spark.sql("SELECT count(_col5) from udf_table") returns the correct result:
[image: query result]
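
The fact that count(_col5) works suggests that the ORC file stores placeholder column names (_col0, _col1, ...), so looking up Y by name finds no matching column. The following is a minimal sketch in plain Scala of how a reader could fall back to positional matching in that case. It is an illustration only: OrcColumnResolution, hasPlaceholderNames and resolve are hypothetical names, not Gluten or Velox APIs, and the 14 placeholder names are an assumption based on the table definition above.

```scala
// Hypothetical helper for illustration; not part of Gluten, Velox, or Spark.
object OrcColumnResolution {

  // True when every physical column name is a positional placeholder
  // such as _col0 or _col5 (assumed to be what the ORC file contains).
  def hasPlaceholderNames(fileColumns: Seq[String]): Boolean =
    fileColumns.nonEmpty && fileColumns.forall(_.matches("_col\\d+"))

  // For each requested table column, returns the index of the ORC file
  // column to read, or None when no matching column exists.
  def resolve(
      tableColumns: Seq[String],
      fileColumns: Seq[String],
      requested: Seq[String]): Seq[Option[Int]] = {
    if (hasPlaceholderNames(fileColumns)) {
      // Positional mapping: the i-th table column maps to the i-th file column.
      requested.map { name =>
        val idx = tableColumns.indexWhere(_.equalsIgnoreCase(name))
        if (idx >= 0 && idx < fileColumns.length) Some(idx) else None
      }
    } else {
      // Name-based mapping (case-insensitive) when the file has real names.
      requested.map { name =>
        val idx = fileColumns.indexWhere(_.equalsIgnoreCase(name))
        if (idx >= 0) Some(idx) else None
      }
    }
  }
}
```

With the table columns from the definition above and file columns _col0 through _col13, resolving Y this way selects file column index 5, which is consistent with count(_col5) returning the expected result.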

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

Labels

bug, triage
