Merging this PR will degrade performance by 18.91%
|
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.030x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.006x ➖, 0↑ 0↓)
duckdb / parquet (0.995x ➖, 0↑ 0↓)
|
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.011x ➖, 1↑ 2↓)
datafusion / vortex-compact (1.025x ➖, 0↑ 1↓)
datafusion / parquet (1.045x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.992x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.993x ➖, 0↑ 0↓)
duckdb / parquet (1.028x ➖, 0↑ 1↓)
|
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.081x ➖, 0↑ 2↓)
datafusion / vortex-compact (0.890x ➖, 3↑ 0↓)
datafusion / parquet (1.070x ➖, 0↑ 2↓)
duckdb / vortex-file-compressed (0.909x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.919x ➖, 0↑ 0↓)
duckdb / parquet (0.952x ➖, 0↑ 0↓)
|
Benchmarks: TPC-DS SF=1 on NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.019x ➖, 0↑ 4↓)
datafusion / vortex-compact (1.012x ➖, 0↑ 0↓)
datafusion / parquet (1.015x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.017x ➖, 0↑ 4↓)
duckdb / vortex-compact (1.011x ➖, 2↑ 4↓)
duckdb / parquet (1.012x ➖, 0↑ 4↓)
duckdb / duckdb (1.016x ➖, 0↑ 5↓)
|
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.988x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.887x ➖, 2↑ 0↓)
datafusion / parquet (1.028x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.048x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.021x ➖, 0↑ 0↓)
duckdb / parquet (0.973x ➖, 0↑ 0↓)
|
Benchmarks: TPC-H SF=10 on NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.011x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.963x ➖, 2↑ 0↓)
datafusion / parquet (1.008x ➖, 0↑ 0↓)
datafusion / arrow (1.008x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.012x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.006x ➖, 0↑ 0↓)
duckdb / parquet (1.009x ➖, 0↑ 0↓)
duckdb / duckdb (0.996x ➖, 0↑ 0↓)
|
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.004x ➖
datafusion / vortex-file-compressed (1.004x ➖, 0↑ 0↓)
|
Benchmarks: Clickbench on NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.994x ➖, 1↑ 0↓)
datafusion / parquet (1.007x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.946x ➖, 8↑ 1↓)
duckdb / parquet (0.956x ➖, 2↑ 0↓)
duckdb / duckdb (0.931x ➖, 7↑ 0↓)
|
Benchmarks: TPC-H SF=1 on NVMe
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.891x ✅, 12↑ 0↓)
datafusion / vortex-compact (0.903x ➖, 13↑ 0↓)
datafusion / parquet (0.939x ➖, 6↑ 2↓)
datafusion / arrow (0.879x ✅, 13↑ 0↓)
duckdb / vortex-file-compressed (0.913x ➖, 6↑ 0↓)
duckdb / vortex-compact (0.933x ➖, 0↑ 0↓)
duckdb / parquet (0.948x ➖, 2↑ 0↓)
duckdb / duckdb (0.947x ➖, 1↑ 0↓)
|
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.130x ➖, 1↑ 1↓)
datafusion / vortex-compact (0.979x ➖, 0↑ 0↓)
datafusion / parquet (1.058x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.983x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.001x ➖, 0↑ 0↓)
duckdb / parquet (0.948x ➖, 0↑ 0↓)
|
|
I think you need to rebase for the benchmarks to run, since some of the logic changed. |
Benchmarks: Random Access
Vortex (geomean): 0.854x ✅
unknown / unknown (0.966x ➖, 10↑ 4↓)
|
BENCHMARK FAILED
Benchmark |
44d4c59 to 6479854
6479854 to 857ddc7
File Sizes: PolarSignals Profiling
No file size changes detected. |
File Sizes: FineWeb NVMe
No file size changes detected. |
File Sizes: TPC-H SF=1 on NVMe
No file size changes detected. |
|
Did something go wrong here? This says that file sizes are changing too, which is somewhat unexpected, though it may be accurate if something in this PR is wrong. |
File Sizes: TPC-DS SF=1 on NVMe
No file size changes detected. |
|
I think that while the compressor is stable, if you change the order in which you feed files into it, the compression might change because different numbers are drawn from the generator. |
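To make the hypothesis in this comment concrete, here is a minimal sketch of how a deterministic, seeded generator can still give order-dependent results. It assumes "the generator" means a single seeded random number generator shared across files; `pick_sample_rows` and `sample_all` are hypothetical names for illustration, not the project's API, and nothing here confirms that the compressor works this way.

```python
import random


def pick_sample_rows(rng, n_rows, k=3):
    """Hypothetical stand-in for a compressor's sampling step."""
    return sorted(rng.sample(range(n_rows), k))


def sample_all(file_order, seed=0):
    # Assumption: one seeded generator is shared by every file, so each
    # file's draw depends on how many draws happened before it.
    rng = random.Random(seed)
    return {name: pick_sample_rows(rng, n_rows=1000) for name in file_order}


forward = sample_all(["a", "b"])
reverse = sample_all(["b", "a"])

# The first file processed always gets the first draw, whichever file it is.
print(forward["a"] == reverse["b"])
```

Rerunning with the same order reproduces the same samples, which is the sense in which the process is stable. Swapping the order hands each file the draw the other file had before, so a sampling-based choice could differ per file.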
File Sizes: TPC-H SF=10 on NVMe
No file size changes detected. |
File Sizes: Statistical and Population Genetics
No file size changes detected. |
|
This seems strange: why would the order in which I download things affect the order in which the compressor compresses files? Downloading files should be an atomic operation, in that we download all of the files and then continue, so I don't know why this changes things. I'll dig into this a bit more. |
File Sizes: Clickbench on NVMe
File Size Changes (1 file changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
857ddc7 to 61ea8c7
|
OK, I think that was a phantom from other changes on develop. Everything seems fine now except #7490 (comment)? |
Summary
Unifies the download management for benchmarks. Also makes the downloads smarter with AIMD (additive increase, multiplicative decrease) and adds nicer progress bars.
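For readers unfamiliar with AIMD, the sketch below shows the general technique applied to a download concurrency limit: grow the limit by a fixed step after each success, and cut it by a fixed factor after a failure. The class name, parameters, and defaults are all hypothetical, and the choice of concurrency as the controlled quantity is an assumption; this is not the implementation in this PR.

```python
class AimdLimit:
    """Hypothetical AIMD controller for the number of concurrent downloads."""

    def __init__(self, initial=4, minimum=1, maximum=64, increase=1, decrease=0.5):
        self.limit = initial
        self.minimum = minimum
        self.maximum = maximum
        self.increase = increase  # additive step applied on success
        self.decrease = decrease  # multiplicative factor applied on failure

    def on_success(self):
        # Additive increase: probe for more capacity one step at a time.
        self.limit = min(self.maximum, self.limit + self.increase)
        return self.limit

    def on_failure(self):
        # Multiplicative decrease: back off quickly on errors or throttling.
        self.limit = max(self.minimum, int(self.limit * self.decrease))
        return self.limit
```

A caller would invoke `on_success()` after each completed download and `on_failure()` after a timeout or throttling response, then cap in-flight requests at `limit`. The asymmetry (slow growth, fast backoff) is what lets the limit settle near the capacity the server tolerates.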
Testing
I just ran it in my terminal and it works well enough.
Let me know if we want a video for this and I can figure that out.
Screen.Recording.2026-04-16.at.1.33.27.PM.mov