Commit f5d6dc3

feat(parquet): add sparse-column writer benchmarks (#9654)
# Which issue does this PR close?

- None, but relates to #9652

# Rationale for this change

Measure sparse and all-null cases in benchmarks.

# What changes are included in this PR?

Add three new benchmark cases to the arrow_writer benchmark suite for evaluating write performance on sparse and all-null data:

- `primitive_sparse_99pct_null`: a flat primitive column with 99% nulls, exercising long RLE runs in definition levels.
- `list_primitive_sparse_99pct_null`: a list-of-primitive column with 99% nulls, exercising null batching in the list level builder.
- `primitive_all_null`: a flat primitive column with 100% nulls, exercising the uniform_levels fast path for entirely-null columns.

# Are these changes tested?

N/A

# Are there any user-facing changes?

None.

Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>
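To illustrate why the sparse and all-null cases are interesting, here is a minimal std-only sketch of how definition levels for a flat nullable column collapse into a few long runs. It is not the parquet crate's encoder: `definition_levels`, `rle_runs`, and `uniform_level` are hypothetical names made up for this example, and the "uniform" check is only a guess at the kind of condition a fast path for all-null columns might test.

```rust
/// Definition levels for a flat nullable column: 1 = value present, 0 = null.
/// (Hypothetical helper, for illustration only.)
fn definition_levels(validity: &[bool]) -> Vec<i16> {
    validity.iter().map(|&v| if v { 1 } else { 0 }).collect()
}

/// Collapse consecutive equal levels into (level, run_length) pairs.
/// (Hypothetical helper; a real RLE encoder also bit-packs short runs.)
fn rle_runs(levels: &[i16]) -> Vec<(i16, usize)> {
    let mut runs: Vec<(i16, usize)> = Vec::new();
    for &level in levels {
        match runs.last_mut() {
            // Same level as the previous run: extend it.
            Some((last, len)) if *last == level => *len += 1,
            // First level, or the level changed: start a new run.
            _ => runs.push((level, 1)),
        }
    }
    runs
}

/// Returns Some(level) when every level is identical, None otherwise
/// (including for an empty slice). Assumed shape of a "uniform levels" check.
fn uniform_level(levels: &[i16]) -> Option<i16> {
    let (&first, rest) = levels.split_first()?;
    rest.iter().all(|&l| l == first).then_some(first)
}

fn main() {
    // One valid value in every 100 slots, i.e. 99% nulls.
    let sparse: Vec<bool> = (0..1000).map(|i| i % 100 == 0).collect();
    let runs = rle_runs(&definition_levels(&sparse));
    println!("sparse: {} runs for {} values", runs.len(), sparse.len());

    // Entirely null column: a single level repeated for the whole batch.
    let all_null = vec![false; 1000];
    let levels = definition_levels(&all_null);
    println!("all-null: uniform level = {:?}", uniform_level(&levels));
}
```

With 99% nulls the run count is a small fraction of the value count, and with 100% nulls the whole column reduces to one level, which is the situation the `primitive_all_null` case is meant to measure.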
1 parent c6ea0a5 commit f5d6dc3

1 file changed

+9
-0
lines changed

parquet/benches/arrow_writer.rs

Lines changed: 9 additions & 0 deletions
@@ -391,6 +391,15 @@ fn create_batches() -> Vec<(&'static str, RecordBatch)> {
     let batch = create_list_primitive_bench_batch_non_null(BATCH_SIZE, 0.25, 0.75).unwrap();
     batches.push(("list_primitive_non_null", batch));
 
+    let batch = create_primitive_bench_batch(BATCH_SIZE, 0.99, 0.75).unwrap();
+    batches.push(("primitive_sparse_99pct_null", batch));
+
+    let batch = create_list_primitive_bench_batch(BATCH_SIZE, 0.99, 0.75).unwrap();
+    batches.push(("list_primitive_sparse_99pct_null", batch));
+
+    let batch = create_primitive_bench_batch(BATCH_SIZE, 1.0, 0.75).unwrap();
+    batches.push(("primitive_all_null", batch));
+
     batches
 }
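
As a rough sketch of what the second argument in the calls above controls (0.99 for 99% nulls, 1.0 for 100% nulls, per the PR description), the following std-only example builds a validity mask from a null density. It is not the benchmark's actual generator: `Lcg` and `validity_mask` are hypothetical, the seed is arbitrary, and the third argument (0.75) is left out because this commit does not describe it.

```rust
/// Tiny deterministic linear congruential generator, so the sketch needs no
/// external crates. (Hypothetical; the real benchmark may use another RNG.)
struct Lcg(u64);

impl Lcg {
    /// Next pseudo-random value in [0, 1), built from the top 24 bits of state.
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 40) as f32) / ((1u64 << 24) as f32)
    }
}

/// Hypothetical helper: `true` marks a valid slot, `false` marks a null.
/// Each slot is null with probability `null_density`.
fn validity_mask(len: usize, null_density: f32, seed: u64) -> Vec<bool> {
    let mut rng = Lcg(seed);
    (0..len).map(|_| rng.next_f32() >= null_density).collect()
}

fn main() {
    const LEN: usize = 4096; // assumed batch length, for illustration only

    for density in [0.25_f32, 0.99, 1.0] {
        let mask = validity_mask(LEN, density, 42);
        let nulls = mask.iter().filter(|&&valid| !valid).count();
        println!("null_density {density}: {nulls} nulls out of {LEN}");
    }
}
```

Because the generated values are always strictly below 1.0, a density of 1.0 marks every slot null, matching the intent of the `primitive_all_null` case.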

0 commit comments
