
[VL] Add metrics for lazy vector load #10726

Merged
FelixYBW merged 2 commits into apache:main from marin-ma:load-lazy-metrics on Sep 26, 2025



@marin-ma marin-ma commented Sep 16, 2025

By default, Velox enables lazy reads. The table scan operator decodes only the columns used by the filter; the other columns are kept as lazy vectors. Data fetching, decompression, and decoding are triggered only when these columns are flattened, so part of the table scan time is actually attributed to the next operator.
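To illustrate the deferred work, here is a minimal, self-contained sketch. It does not use Velox's real `LazyVector` API; `LazyColumn`, `ScanBatch`, and `scan` are hypothetical names made up for this example. The point is only that the expensive decode runs on the first `flatten()` call, not when the scan produces the batch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily loaded column; not Velox's LazyVector API.
class LazyColumn {
 public:
  using Loader = std::function<std::vector<int64_t>()>;

  explicit LazyColumn(Loader loader) : loader_(std::move(loader)) {
    assert(loader_ && "a loader is required");
  }

  bool isLoaded() const { return values_.has_value(); }

  // Runs the deferred fetch/decompress/decode work on first access only.
  const std::vector<int64_t>& flatten() {
    if (!values_) {
      values_ = loader_();
    }
    return *values_;
  }

 private:
  Loader loader_;
  std::optional<std::vector<int64_t>> values_;
};

// Hypothetical scan output: the filter column is decoded eagerly, the
// payload column stays lazy until a consumer flattens it.
struct ScanBatch {
  std::vector<int64_t> filterColumn;
  LazyColumn payloadColumn;
};

inline ScanBatch scan(int& decodeCalls) {
  return ScanBatch{
      {1, 2, 3},
      LazyColumn([&decodeCalls] {
        ++decodeCalls;  // counts how often the expensive decode really runs
        return std::vector<int64_t>{10, 20, 30};
      })};
}
```

Whoever calls `flatten()` first pays for the decode, which is why the cost can show up under a downstream operator rather than the scan.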

In the WholeStageResultIterator::nextInternal function, however, we flatten the vector before returning it to Spark, which triggers the lazy read. That time is missing from our tracker.

This PR adds a "time of loading lazy vectors" metric, measured in the nextInternal function, to the last operator of the transformer.
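A rough sketch of the idea, not the actual change in this PR: wrap the flatten step in a scoped timer and add the elapsed time to the metrics of the last operator. `OperatorMetrics`, `ScopedTimer`, `nextAndFlatten`, and the `loadLazyVectorNanos` field are hypothetical names chosen for illustration; the real Gluten and Velox types differ.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-operator metrics; the field mirrors the
// "time of loading lazy vectors" metric but is not Gluten's actual struct.
struct OperatorMetrics {
  std::string name;
  int64_t loadLazyVectorNanos = 0;
};

// Adds the elapsed wall time to a counter when it goes out of scope.
class ScopedTimer {
 public:
  explicit ScopedTimer(int64_t& target)
      : target_(target), start_(std::chrono::steady_clock::now()) {}

  ~ScopedTimer() {
    target_ += std::chrono::duration_cast<std::chrono::nanoseconds>(
                   std::chrono::steady_clock::now() - start_)
                   .count();
  }

  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

 private:
  int64_t& target_;
  std::chrono::steady_clock::time_point start_;
};

// Hypothetical stand-in for the iterator's "next" step: flatten the batch
// before handing it back, and charge the load time to the last operator.
inline std::vector<int64_t> nextAndFlatten(
    const std::function<std::vector<int64_t>()>& loadLazyColumns,
    std::vector<OperatorMetrics>& pipelineMetrics) {
  assert(!pipelineMetrics.empty());
  ScopedTimer timer(pipelineMetrics.back().loadLazyVectorNanos);
  return loadLazyColumns();
}
```

Charging the time to the last operator keeps the other operators' numbers unchanged while making the previously untracked load time visible.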


@github-actions github-actions bot added the VELOX label Sep 16, 2025
@FelixYBW

@marin-ma could you rebase?

@FelixYBW FelixYBW merged commit 89c4182 into apache:main Sep 26, 2025
99 of 101 checks passed