Skip to content

Commit 4084a18

Browse files
kosiewalamb
andauthored
Simplify sqllogictest timing summary to boolean flag and remove top-N modes (apache#20598)
## Which issue does this PR close? - [Follow up action on apache#20569 ](apache#20569 (review)) ## Rationale for this change The previous implementation introduced multiple timing summary modes (`auto`, `off`, `top`, `full`) and a dedicated `--timing-top-n` flag. Upon review, this additional complexity does not appear necessary. The only functionality that requires internal support within `sqllogictests` is deterministic per-file timing capture and reporting. Output shaping (such as limiting results to the top N entries) can be achieved using standard Unix tooling, for example: ```bash cargo test --test sqllogictests -- --timing-summary | head -n 10 ``` This approach: * Keeps the core implementation simpler * Reduces the number of specialized flags and modes * Aligns with the Unix philosophy of composable tools * Improves long-term maintainability by avoiding feature creep In short, this change streamlines the timing functionality to focus solely on deterministic per-file timing output, while delegating output filtering to external tools. ## What changes are included in this PR? * Replaced `TimingSummaryMode` enum (`auto|off|top|full`) with a simple boolean `--timing-summary` flag * Removed `--timing-top-n` and related parsing logic * Removed environment variable `SLT_TIMING_TOP_N` * Simplified `print_timing_summary` to: * Print full deterministic per-file timing output when enabled * Write directly to stdout using a locked handle * Remove truncation and omission logic * Suppressed periodic CI progress output when deterministic timing summary is enabled to keep output stable * Updated README documentation to reflect: * Timing summary is disabled by default * Enabled via `--timing-summary` or `SLT_TIMING_SUMMARY=1` * Examples using shell tools (e.g., `head -n 10`) instead of built-in top-N modes ## Are these changes tested? No new tests were added. This change refactors CLI behavior and output formatting but does not alter core test execution logic. Existing `sqllogictest` coverage continues to validate execution behavior. The timing summary functionality can be manually verified by running: ```bash cargo test --test sqllogictests -- --timing-summary ``` Because the implementation is simplified (removing conditional branches and modes), overall surface area for bugs is reduced. ## Are there any user-facing changes? Yes. * The following flags and modes have been removed: * `--timing-summary auto|off|top|full` * `--timing-top-n` * `SLT_TIMING_TOP_N` * `--timing-summary` is now a boolean flag (disabled by default). * When enabled, it prints the full deterministic per-file timing summary. * Users who want only the top N results should use standard shell tooling (e.g., `head`, `tail`). This is a CLI behavior change but results in a simpler and more predictable interface. ## LLM-generated code disclosure This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
1 parent a074902 commit 4084a18

File tree

2 files changed

+35
-89
lines changed

2 files changed

+35
-89
lines changed

datafusion/sqllogictest/README.md

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -75,24 +75,25 @@ RUST_LOG=debug cargo test --test sqllogictests -- ddl
7575
The sqllogictest runner can emit deterministic per-file elapsed timings to help
7676
identify slow test files.
7777

78-
By default (`--timing-summary auto`), timing summary output is disabled in local
79-
TTY runs and shows a top-slowest summary in non-TTY/CI runs.
78+
Timing summary output is disabled by default and enabled with
79+
`--timing-summary` (or `SLT_TIMING_SUMMARY=true`).
8080

81-
`--timing-top-n` / `SLT_TIMING_TOP_N` must be a positive integer (`>= 1`).
81+
When timing summary is enabled, periodic `Progress:` lines are suppressed by
82+
default to keep output stable.
8283

8384
```shell
84-
# Show top 10 slowest files (good for CI)
85-
cargo test --test sqllogictests -- --timing-summary top --timing-top-n 10
85+
# Show deterministic per-file elapsed timings (sorted slowest first)
86+
cargo test --test sqllogictests -- --timing-summary
8687
```
8788

8889
```shell
89-
# Show full per-file timing table
90-
cargo test --test sqllogictests -- --timing-summary full
90+
# Keep only the top 10 lines using standard shell tooling
91+
cargo test --test sqllogictests -- --timing-summary | head -n 10
9192
```
9293

9394
```shell
94-
# Same controls via environment variables
95-
SLT_TIMING_SUMMARY=top SLT_TIMING_TOP_N=15 cargo test --test sqllogictests
95+
# Enable via environment variable
96+
SLT_TIMING_SUMMARY=1 cargo test --test sqllogictests
9697
```
9798

9899
```shell

datafusion/sqllogictest/bin/sqllogictests.rs

Lines changed: 25 additions & 80 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
// specific language governing permissions and limitations
1616
// under the License.
1717

18-
use clap::{ColorChoice, Parser, ValueEnum};
18+
use clap::{ColorChoice, Parser};
1919
use datafusion::common::instant::Instant;
2020
use datafusion::common::utils::get_available_parallelism;
2121
use datafusion::common::{DataFusionError, Result, exec_datafusion_err, exec_err};
@@ -44,7 +44,7 @@ use crate::postgres_container::{
4444
use datafusion::common::runtime::SpawnedTask;
4545
use futures::FutureExt;
4646
use std::fs;
47-
use std::io::{IsTerminal, stderr, stdout};
47+
use std::io::{IsTerminal, Write, stderr, stdout};
4848
use std::path::{Path, PathBuf};
4949
use std::str::FromStr;
5050
use std::sync::atomic::{AtomicUsize, Ordering};
@@ -62,14 +62,6 @@ const SQLITE_PREFIX: &str = "sqlite";
6262
const ERRS_PER_FILE_LIMIT: usize = 10;
6363
const TIMING_DEBUG_SLOW_FILES_ENV: &str = "SLT_TIMING_DEBUG_SLOW_FILES";
6464

65-
#[derive(Clone, Copy, Debug, Eq, PartialEq, ValueEnum)]
66-
enum TimingSummaryMode {
67-
Auto,
68-
Off,
69-
Top,
70-
Full,
71-
}
72-
7365
#[derive(Debug)]
7466
struct FileTiming {
7567
relative_path: PathBuf,
@@ -187,8 +179,11 @@ async fn run_tests() -> Result<()> {
187179
}
188180

189181
let num_tests = test_files.len();
190-
// For CI environments without TTY, print progress periodically
182+
// For CI environments without TTY, print progress periodically unless
183+
// deterministic timing summary output is requested.
191184
let is_ci = !stderr().is_terminal();
185+
let print_periodic_progress = is_ci && !options.timing_summary;
186+
let progress_interval = std::cmp::max(1, num_tests / 10);
192187
let completed_count = Arc::new(AtomicUsize::new(0));
193188

194189
let file_results: Vec<_> = futures::stream::iter(test_files)
@@ -307,8 +302,12 @@ async fn run_tests() -> Result<()> {
307302
let completed_count = Arc::clone(&completed_count);
308303
move |_| {
309304
let completed = completed_count.fetch_add(1, Ordering::Relaxed) + 1;
310-
// In CI (no TTY), print progress every 10% or every 50 files
311-
if is_ci && (completed.is_multiple_of(50) || completed == num_tests) {
305+
// Print progress at 10% intervals, every 50 files, and completion.
306+
if print_periodic_progress
307+
&& (completed.is_multiple_of(progress_interval)
308+
|| completed.is_multiple_of(50)
309+
|| completed == num_tests)
310+
{
312311
eprintln!(
313312
"Progress: {}/{} files completed ({:.0}%)",
314313
completed,
@@ -335,7 +334,7 @@ async fn run_tests() -> Result<()> {
335334
.then_with(|| a.relative_path.cmp(&b.relative_path))
336335
});
337336

338-
print_timing_summary(&options, &m, is_ci, &file_timings)?;
337+
print_timing_summary(&options, &file_timings)?;
339338

340339
let errors: Vec<_> = file_results
341340
.into_iter()
@@ -395,44 +394,23 @@ async fn run_tests() -> Result<()> {
395394
}
396395
}
397396

398-
fn print_timing_summary(
399-
options: &Options,
400-
progress: &MultiProgress,
401-
is_ci: bool,
402-
file_timings: &[FileTiming],
403-
) -> Result<()> {
404-
let mode = options.timing_summary_mode(is_ci);
405-
if mode == TimingSummaryMode::Off || file_timings.is_empty() {
397+
fn print_timing_summary(options: &Options, file_timings: &[FileTiming]) -> Result<()> {
398+
if !options.timing_summary || file_timings.is_empty() {
406399
return Ok(());
407400
}
408401

409-
let top_n = options.timing_top_n;
410-
debug_assert!(matches!(
411-
mode,
412-
TimingSummaryMode::Top | TimingSummaryMode::Full
413-
));
414-
let count = if mode == TimingSummaryMode::Full {
415-
file_timings.len()
416-
} else {
417-
top_n
418-
};
419-
420-
progress.println("Per-file elapsed summary (deterministic):")?;
421-
for (idx, timing) in file_timings.iter().take(count).enumerate() {
422-
progress.println(format!(
402+
let mut output = stdout().lock();
403+
writeln!(output, "Per-file elapsed summary (deterministic):")?;
404+
for (idx, timing) in file_timings.iter().enumerate() {
405+
writeln!(
406+
output,
423407
"{:>3}. {:>8.3}s {}",
424408
idx + 1,
425409
timing.elapsed.as_secs_f64(),
426410
timing.relative_path.display()
427-
))?;
428-
}
429-
430-
if mode != TimingSummaryMode::Full && file_timings.len() > count {
431-
progress.println(format!(
432-
"... {} more files omitted (use --timing-summary full to show all)",
433-
file_timings.len() - count
434-
))?;
411+
)?;
435412
}
413+
output.flush()?;
436414

437415
Ok(())
438416
}
@@ -448,16 +426,6 @@ fn is_env_truthy(name: &str) -> bool {
448426
})
449427
}
450428

451-
fn parse_timing_top_n(arg: &str) -> std::result::Result<usize, String> {
452-
let parsed = arg
453-
.parse::<usize>()
454-
.map_err(|error| format!("invalid value '{arg}': {error}"))?;
455-
if parsed == 0 {
456-
return Err("must be >= 1".to_string());
457-
}
458-
Ok(parsed)
459-
}
460-
461429
async fn run_test_file_substrait_round_trip(
462430
test_file: TestFile,
463431
validator: Validator,
@@ -927,20 +895,10 @@ struct Options {
927895
#[clap(
928896
long,
929897
env = "SLT_TIMING_SUMMARY",
930-
value_enum,
931-
default_value_t = TimingSummaryMode::Auto,
932-
help = "Per-file timing summary mode: auto|off|top|full"
898+
default_value_t = false,
899+
help = "Print deterministic per-file timing summary"
933900
)]
934-
timing_summary: TimingSummaryMode,
935-
936-
#[clap(
937-
long,
938-
env = "SLT_TIMING_TOP_N",
939-
default_value_t = 10,
940-
value_parser = parse_timing_top_n,
941-
help = "Number of files to show when timing summary mode is auto/top (must be >= 1)"
942-
)]
943-
timing_top_n: usize,
901+
timing_summary: bool,
944902

945903
#[clap(
946904
long,
@@ -952,19 +910,6 @@ struct Options {
952910
}
953911

954912
impl Options {
955-
fn timing_summary_mode(&self, is_ci: bool) -> TimingSummaryMode {
956-
match self.timing_summary {
957-
TimingSummaryMode::Auto => {
958-
if is_ci {
959-
TimingSummaryMode::Top
960-
} else {
961-
TimingSummaryMode::Off
962-
}
963-
}
964-
mode => mode,
965-
}
966-
}
967-
968913
/// Because this test can be run as a cargo test, commands like
969914
///
970915
/// ```shell

0 commit comments

Comments
 (0)