7986: metrics: add task schedule latency metric #84
martin-augment wants to merge 1 commit into master from
Conversation
This new metric tracks the amount of time between when a task is scheduled and when it is polled, also known as queue delay. This duration is recorded in a histogram, just like the poll time metric. This metric is useful for implementing queue management algorithms in systems using tokio. For example, it could be used to implement a generic http load shedder using the CoDel algorithm.
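As an illustration of the CoDel use case mentioned above, here is a minimal std-only sketch of a CoDel-style shedding decision driven by observed queue delay. This is not tokio API: `CoDelShedder`, `target`, and `interval` are hypothetical names, and a real load shedder would feed in delays observed via the histogram this PR adds.

```rust
use std::time::Duration;

// CoDel-style decision: shed work only when the queue delay has stayed
// above `target` for at least one full `interval`.
struct CoDelShedder {
    target: Duration,              // acceptable standing queue delay
    interval: Duration,            // how long delay must stay above target
    above_since: Option<Duration>, // logical time when delay first exceeded target
}

impl CoDelShedder {
    fn new(target: Duration, interval: Duration) -> Self {
        Self { target, interval, above_since: None }
    }

    // `now` is elapsed time since an arbitrary fixed origin; returns true
    // when the incoming request should be shed.
    fn should_shed(&mut self, now: Duration, queue_delay: Duration) -> bool {
        if queue_delay < self.target {
            // Delay recovered: reset the tracking window.
            self.above_since = None;
            return false;
        }
        match self.above_since {
            None => {
                self.above_since = Some(now);
                false
            }
            // Shed once delay has stayed above target for a full interval.
            Some(since) => now - since >= self.interval,
        }
    }
}
```

A real implementation would also back off its shedding rate after each drop, but the sketch shows how the schedule-latency signal plugs into the decision.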
Walkthrough

This pull request adds optional task schedule latency histogram instrumentation to the Tokio runtime. New builder methods enable and configure schedule latency tracking. The feature records the elapsed time between when a task is scheduled and when it begins polling. Per-task timestamps are captured in the scheduler when tasks are enqueued. The instrumentation propagates through runtime configuration, worker metrics, and batch processing. Both current-thread and multi-thread runtimes now support this feature behind `tokio_unstable`.
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances Tokio's runtime metrics by introducing the ability to measure and report task schedule latency. It provides a granular view into how long tasks wait in the scheduler queue before execution, offering valuable data for performance tuning and bottleneck identification. The changes include new configuration options for runtime builders and an expanded metrics API to access this new data, all while ensuring that the overhead is opt-in.
Code Review
This pull request introduces a new unstable metrics feature to the Tokio runtime, enabling the tracking of task schedule latencies. This involves adding configuration options to the Builder, updating internal metric structures (Config, MetricsBatch, RuntimeMetrics, WorkerMetrics), and modifying the schedulers and task core to record and utilize task scheduling timestamps for latency calculation. A new test case has been added to verify this functionality. The review comments suggest improving the documentation for this new feature by clarifying the phrasing regarding when Instant::now() is called for latency measurement, specifically that it occurs once at scheduling and once at the start of polling, rather than "twice per task poll."
```rust
/// execution and when it is polled.
///
/// Task schedule latencies are not instrumented by default as doing
/// so requires calling [`Instant::now()`] twice per task poll, which
```
The phrasing 'twice per task poll' is a bit misleading. This feature calls Instant::now() once when a task is scheduled, and once when it begins to be polled. For a task that is polled multiple times after being scheduled once, this is not 'twice per poll'. A more accurate description would be clearer for users trying to understand the performance overhead.
```diff
-/// so requires calling [`Instant::now()`] twice per task poll, which
+/// so requires calling [`Instant::now()`] when a task is scheduled and when it is polled, which
```
value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! The proposed wording of the docstring suggests that Instant::now() is called twice on each poll. This is not correct and it should be fixed to explain that the first call to Instant::now() is when the task is scheduled.
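To make the overhead being debated concrete, here is a minimal std-only sketch of the two clock reads in question: one when the task is scheduled, one when polling begins. The names are illustrative, not tokio internals.

```rust
use std::time::{Duration, Instant};

// Schedule latency = time between enqueue and the start of the next poll.
// Only two Instant::now() calls are involved per schedule/poll pair;
// repeated polls of an already-running task do not re-read the clock.
fn measure_schedule_latency() -> Duration {
    let scheduled_at = Instant::now();    // read #1: task is scheduled
    // ... the task would sit in the run queue here ...
    let poll_started_at = Instant::now(); // read #2: poll begins
    poll_started_at - scheduled_at
}
```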
```rust
/// schedule latencies.
///
/// Task schedule latencies times are not instrumented by default as doing
/// so requires calling [`Instant::now()`] twice per task poll. The feature
```
The phrasing 'twice per task poll' is a bit misleading. This feature calls Instant::now() once when a task is scheduled, and once when it begins to be polled. For a task that is polled multiple times after being scheduled once, this is not 'twice per poll'. A more accurate description would be clearer for users trying to understand the performance overhead.
```diff
-/// so requires calling [`Instant::now()`] twice per task poll. The feature
+/// so requires calling [`Instant::now()`] when a task is scheduled and when it is polled. The feature
```
value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! The proposed wording of the docstring suggests that Instant::now() is called twice on each poll. This is not correct and it should be fixed to explain that the first call to Instant::now() is when the task is scheduled.
🤖 Augment PR Summary

Summary: Adds an (unstable) runtime metric for task schedule latency: the time between a task being scheduled and when it is next polled.

Changes:
Technical Notes: The default histogram uses a low-footprint linear configuration (10 buckets × 100µs), with optional log histograms for higher granularity.
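The default linear configuration described above can be illustrated with a tiny bucketing sketch. The function below is illustrative only, not tokio's internal implementation of `HistogramConfiguration::linear`.

```rust
use std::time::Duration;

// Map a latency to a bucket index in a linear histogram: buckets of equal
// width, with the last bucket catching everything beyond the range.
// Assumes bucket_width > 0 and num_buckets > 0.
fn linear_bucket(latency: Duration, bucket_width: Duration, num_buckets: usize) -> usize {
    let idx = (latency.as_nanos() / bucket_width.as_nanos()) as usize;
    idx.min(num_buckets - 1)
}
```

With 10 buckets of 100µs, a 250µs latency lands in bucket 2 and anything above 1ms saturates into bucket 9, which keeps the memory footprint fixed.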
```rust
// is only read when polling the task, which can only happen after it's scheduled.
#[cfg(tokio_unstable)]
unsafe {
    task.set_scheduled_at(std::time::Instant::now());
```
task.set_scheduled_at(Instant::now()) runs for every schedule under cfg(tokio_unstable), even if the schedule-latency histogram feature wasn’t enabled; this seems to undermine the “not instrumented by default” contract (and can re-introduce time calls on targets where time metrics are otherwise gated). Consider guarding the timestamp write on whether a schedule-latency histogram is actually configured.
Severity: medium
Other Locations
tokio/src/runtime/scheduler/current_thread/mod.rs:683
value:useful; category:bug; feedback: The Augment AI reviewer is correct! The scheduled time is set for all unstable builds, no matter whether the metric is enabled or not. Prevents extra CPU work and memory usage when the metric is not enabled.
```rust
        poll_started_at: now,
    })
});
let schedule_latencies = worker_metrics
```
schedule_latencies is initialized even when maybe_now is None, but start_poll later falls back to Instant::now() when poll_timer is None, which appears to bypass the existing wasm32-unknown-unknown time-metrics gating. Consider gating schedule-latency histogram collection similarly (e.g., only enabling it when maybe_now is Some).
Severity: medium
Other Locations
tokio/src/runtime/metrics/batch.rs:233
value:useful; category:bug; feedback: The Augment AI reviewer is correct! There is no point in initializing schedule_latencies if maybe_now is None. Prevents using memory for something that won't be used later.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```rust
#[cfg(tokio_unstable)]
unsafe {
    task.set_scheduled_at(std::time::Instant::now());
}
```
Schedule timestamping always incurs now() overhead
Medium Severity
set_scheduled_at(Instant::now()) runs on every schedule in tokio_unstable, even when schedule-latency metrics are disabled. This adds avoidable timing overhead globally instead of only when enable_metrics_schedule_latency_histogram() is enabled.
value:useful; category:bug; feedback: The Bugbot AI reviewer is correct! The scheduled time is set for all unstable builds, no matter whether the metric is enabled or not. Prevents extra CPU work and memory usage when the metric is not enabled.
```rust
#[cfg(tokio_unstable)]
core.stats.start_poll(task.get_scheduled_at());
#[cfg(not(tokio_unstable))]
core.stats.start_poll(None);
```
Actionable comments posted: 4
🧹 Nitpick comments (1)
tokio/tests/rt_unstable_metrics.rs (1)
816-824: Pin the multi-thread case to 2 workers.

Leaving `new_multi_thread()` at the default can collapse to a single worker on constrained CI, which means this test stops exercising the threaded scheduling path. Setting `.worker_threads(2)` keeps the coverage deterministic.

🧪 Suggested tweak

```diff
 tokio::runtime::Builder::new_multi_thread()
+    .worker_threads(2)
     .enable_all()
     .enable_metrics_schedule_latency_histogram()
     .metrics_schedule_latency_histogram_configuration(HistogramConfiguration::linear(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tokio/tests/rt_unstable_metrics.rs` around lines 816 - 824, The multi-thread runtime in the test uses tokio::runtime::Builder::new_multi_thread() without fixing the worker count, which can collapse to one worker on constrained CI; update the Builder chain used in the test (tokio::runtime::Builder::new_multi_thread()) to call .worker_threads(2) so the multi-thread scheduling path is exercised deterministically alongside .enable_all() and the existing metrics configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tokio/src/runtime/metrics/batch.rs`:
- Around line 231-235: The fallback uses Instant::now() directly which bypasses
this module's wasm-safe time gate; change the unwrap_or_else(Instant::now) call
so it uses the module's wasm-safe now() (the same helper used elsewhere in this
file) instead, i.e., replace Instant::now with the local/same-module now
function when computing let now = self.poll_timer.as_ref().map(|p|
p.poll_started_at).unwrap_or_else(...), so duration_as_u64(now -
task_scheduled_at) and schedule_latencies.measure remain wasm-safe.
In `@tokio/src/runtime/metrics/runtime.rs`:
- Around line 578-579: Fix the small doc typo in the runtime metrics comment:
change the phrase "Task schedule latencies times" to "Task schedule latencies"
in the comment that also references Instant::now(), ensuring the rest of the
sentence remains unchanged.
In `@tokio/src/runtime/scheduler/multi_thread/worker.rs`:
- Around line 632-635: The LIFO-drained tasks in run_task (the loop that drains
core.lifo_slot) are missing schedule-latency samples; add a call that records
schedule latency for each task popped from core.lifo_slot before it is
polled—mirror the behavior of core.stats.start_poll(task.get_scheduled_at())
used for the first task by invoking the same sampling path (or a dedicated
record_schedule_latency(task.get_scheduled_at()) helper) for every lifo_slot
task; ensure this is done under the same cfg branches (tokio_unstable vs not)
and reference core.stats.start_poll / record_schedule_latency and
task.get_scheduled_at so the histogram receives samples for the fast-path tasks.
- Around line 1277-1283: The code unconditionally calls Instant::now() in the
#[cfg(tokio_unstable)] unsafe block (task.set_scheduled_at(...)), which runs on
every multi-thread schedule; wrap that write in a check of the histogram being
enabled (use self.shared.config.metrics_schedule_latency_histogram.is_some()) so
the timestamp is only captured when the histogram is configured, and apply the
same guard around the analogous timestamp write in current_thread::mod.rs (lines
referenced in the review) to avoid the extra work on the hot path.
---
Nitpick comments:
In `@tokio/tests/rt_unstable_metrics.rs`:
- Around line 816-824: The multi-thread runtime in the test uses
tokio::runtime::Builder::new_multi_thread() without fixing the worker count,
which can collapse to one worker on constrained CI; update the Builder chain
used in the test (tokio::runtime::Builder::new_multi_thread()) to call
.worker_threads(2) so the multi-thread scheduling path is exercised
deterministically alongside .enable_all() and the existing metrics
configuration.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d403ad45-2126-4834-8abc-7e6016c5d878
📒 Files selected for processing (11)
- tokio/src/runtime/builder.rs
- tokio/src/runtime/config.rs
- tokio/src/runtime/metrics/batch.rs
- tokio/src/runtime/metrics/runtime.rs
- tokio/src/runtime/metrics/worker.rs
- tokio/src/runtime/scheduler/current_thread/mod.rs
- tokio/src/runtime/scheduler/multi_thread/stats.rs
- tokio/src/runtime/scheduler/multi_thread/worker.rs
- tokio/src/runtime/task/core.rs
- tokio/src/runtime/task/mod.rs
- tokio/tests/rt_unstable_metrics.rs
```rust
if let Some(task_scheduled_at) = task_scheduled_at {
    if let Some(schedule_latencies) = &mut self.schedule_latencies {
        let now = self.poll_timer.as_ref().map(|p| p.poll_started_at).unwrap_or_else(Instant::now);
        let elapsed = duration_as_u64(now - task_scheduled_at);
        schedule_latencies.measure(elapsed, 1);
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify that `start_poll` currently bypasses the wasm-safe `now()` helper.
rg -n -C3 'fn now\(|pub\(crate\) fn start_poll\(|schedule_latencies|Instant::now\(' tokio/src/runtime/metrics/batch.rs
```

Repository: martin-augment/tokio
Length of output: 2620

🏁 Script executed:

```shell
# Check all direct Instant::now() usages in the file to understand scope
rg -n 'Instant::now' tokio/src/runtime/metrics/batch.rs
```

Repository: martin-augment/tokio
Length of output: 285

🏁 Script executed:

```shell
# Verify the now() function signature and check how/where it's used correctly
rg -n 'self\.now\(\)|\.now\(\)|\bor_else\(now\)' tokio/src/runtime/metrics/batch.rs
```

Repository: martin-augment/tokio
Length of output: 46

🏁 Script executed:

```shell
# Verify poll_timer can be None at line 233 and understand initialization
rg -n -B10 -A2 'pub\(crate\) fn start_poll' tokio/src/runtime/metrics/batch.rs | head -40
```

Repository: martin-augment/tokio
Length of output: 817

🏁 Script executed:

```shell
# Check how poll_timer field is declared and when it can be Some/None
rg -n -B5 'poll_timer:' tokio/src/runtime/metrics/batch.rs | head -30
```

Repository: martin-augment/tokio
Length of output: 235
Use the wasm-safe time gate in schedule-latency measurement.
Line 233 falls back to Instant::now() directly, which bypasses this module's now() guard for unsupported wasm32-unknown-unknown time metrics. This can break the target-specific safety contract and should be gated the same way as other timing paths.
🔧 Proposed fix

```diff
-if let Some(task_scheduled_at) = task_scheduled_at {
-    if let Some(schedule_latencies) = &mut self.schedule_latencies {
-        let now = self.poll_timer.as_ref().map(|p| p.poll_started_at).unwrap_or_else(Instant::now);
-        let elapsed = duration_as_u64(now - task_scheduled_at);
-        schedule_latencies.measure(elapsed, 1);
-    }
-}
+if let Some(task_scheduled_at) = task_scheduled_at {
+    if let Some(schedule_latencies) = &mut self.schedule_latencies {
+        if let Some(now) = self
+            .poll_timer
+            .as_ref()
+            .map(|p| p.poll_started_at)
+            .or_else(now)
+        {
+            let elapsed = duration_as_u64(now.saturating_duration_since(task_scheduled_at));
+            schedule_latencies.measure(elapsed, 1);
+        }
+    }
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tokio/src/runtime/metrics/batch.rs` around lines 231 - 235, The fallback uses
Instant::now() directly which bypasses this module's wasm-safe time gate; change
the unwrap_or_else(Instant::now) call so it uses the module's wasm-safe now()
(the same helper used elsewhere in this file) instead, i.e., replace
Instant::now with the local/same-module now function when computing let now =
self.poll_timer.as_ref().map(|p| p.poll_started_at).unwrap_or_else(...), so
duration_as_u64(now - task_scheduled_at) and schedule_latencies.measure remain
wasm-safe.
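For illustration, a hypothetical version of such a wasm-safe time helper could look like the sketch below. The name `now` and the exact cfg condition are assumptions for the example, not tokio's actual code; the point is that callers get an `Option` and must skip the measurement instead of calling `Instant::now()` unconditionally.

```rust
use std::time::Instant;

// Returns None on targets where reading the monotonic clock is not
// supported for metrics (e.g. wasm32-unknown-unknown), forcing callers
// to gate their timing paths on the Option instead of panicking.
fn now() -> Option<Instant> {
    if cfg!(all(target_arch = "wasm32", target_os = "unknown")) {
        None
    } else {
        Some(Instant::now())
    }
}
```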
```rust
/// Task schedule latencies times are not instrumented by default as doing
/// so requires calling [`Instant::now()`] twice per task poll. The feature
```
Fix small documentation typo.
Line 578 says “Task schedule latencies times”; remove “times” for clarity.
✏️ Suggested doc fix

```diff
-/// Task schedule latencies times are not instrumented by default as doing
+/// Task schedule latencies are not instrumented by default as doing
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```rust
/// Task schedule latencies are not instrumented by default as doing
/// so requires calling [`Instant::now()`] twice per task poll. The feature
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tokio/src/runtime/metrics/runtime.rs` around lines 578 - 579, Fix the small
doc typo in the runtime metrics comment: change the phrase "Task schedule
latencies times" to "Task schedule latencies" in the comment that also
references Instant::now(), ensuring the rest of the sentence remains unchanged.
value:good-to-have; category:documentation; feedback: The CodeRabbit AI reviewer is correct! "Latencies" implies "times", so there is no need to use them both.
```rust
#[cfg(tokio_unstable)]
core.stats.start_poll(task.get_scheduled_at());
#[cfg(not(tokio_unstable))]
core.stats.start_poll(None);
```
LIFO-slot polls are missing schedule-latency samples.
Only the first task in run_task goes through core.stats.start_poll(...). Tasks drained from core.lifo_slot at Lines 672-731 are polled without another schedule-latency record, so the new histogram undercounts exactly the fast-path tasks this metric is supposed to observe. A separate record_schedule_latency(task.get_scheduled_at()) path would preserve the existing poll-count/EWMA behavior without dropping those samples.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tokio/src/runtime/scheduler/multi_thread/worker.rs` around lines 632 - 635,
The LIFO-drained tasks in run_task (the loop that drains core.lifo_slot) are
missing schedule-latency samples; add a call that records schedule latency for
each task popped from core.lifo_slot before it is polled—mirror the behavior of
core.stats.start_poll(task.get_scheduled_at()) used for the first task by
invoking the same sampling path (or a dedicated
record_schedule_latency(task.get_scheduled_at()) helper) for every lifo_slot
task; ensure this is done under the same cfg branches (tokio_unstable vs not)
and reference core.stats.start_poll / record_schedule_latency and
task.get_scheduled_at so the histogram receives samples for the fast-path tasks.
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The fast-path tasks (tasks scheduled by a currently executing task that should run on the same CPU code) are not measured when polled. Prevents missing/skipping the measurement of tasks.
```rust
// SAFETY: There are no concurrent writes because tasks cannot be scheduled in
// multiple places concurrently. There are no concurrent reads because this field
// is only read when polling the task, which can only happen after it's scheduled.
#[cfg(tokio_unstable)]
unsafe {
    task.set_scheduled_at(std::time::Instant::now());
}
```
Gate timestamp capture on the histogram actually being enabled.
Instant::now() now runs on every multi-thread schedule in tokio_unstable, even when enable_metrics_schedule_latency_histogram() was never configured. That makes the feature non-optional in practice and adds avoidable work to a hot path. Please guard this write on self.shared.config.metrics_schedule_latency_histogram.is_some() and apply the same guard at Lines 678-684 in tokio/src/runtime/scheduler/current_thread/mod.rs.
⚙️ Suggested change

```diff
-#[cfg(tokio_unstable)]
-unsafe {
-    task.set_scheduled_at(std::time::Instant::now());
-}
+#[cfg(tokio_unstable)]
+if self.shared.config.metrics_schedule_latency_histogram.is_some() {
+    unsafe {
+        task.set_scheduled_at(std::time::Instant::now());
+    }
+}
```

🤖 Prompt for AI Agents
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tokio/src/runtime/scheduler/multi_thread/worker.rs` around lines 1277 - 1283,
The code unconditionally calls Instant::now() in the #[cfg(tokio_unstable)]
unsafe block (task.set_scheduled_at(...)), which runs on every multi-thread
schedule; wrap that write in a check of the histogram being enabled (use
self.shared.config.metrics_schedule_latency_histogram.is_some()) so the
timestamp is only captured when the histogram is configured, and apply the same
guard around the analogous timestamp write in current_thread::mod.rs (lines
referenced in the review) to avoid the extra work on the hot path.
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The scheduled time is set for all unstable builds, no matter whether the metric is enabled or not. Prevents extra CPU work and memory usage when the metric is not enabled.
PR Review for "metrics: add task schedule latency metric"

This PR adds a schedule latency histogram metric, closely mirroring the existing poll_count_histogram pattern. Overall the implementation is solid and well-documented. Here are my findings:

Potential Bugs
The line:

```rust
let elapsed = duration_as_u64(now - task_scheduled_at);
```

`Instant` subtraction panics in debug builds if `now < task_scheduled_at`. While `Instant` is monotonic on most platforms, certain platform quirks can cause this. The existing busy_duration calculation uses `.elapsed()`, which is safe. Consider:

```rust
let elapsed = duration_as_u64(now.checked_duration_since(task_scheduled_at).unwrap_or_default());
```

This matches the defensive style expected in a foundational runtime library.

Design / Correctness Observations
set_scheduled_at is called unconditionally in schedule_task() regardless of is_yield. This means that for tasks that call yield_now(), the metric measures the time between the yield and the next poll, not the initial schedule-to-first-poll latency. The documentation only says "time between when a task is scheduled for execution and when it is polled"; it should clarify that this includes re-scheduling after yield_now().
Checking worker_metrics(0) is an indirect proxy for "is the histogram configured". The existing poll_count_histogram_enabled has the same pattern so this is consistent, and it is safe because both runtime flavors always have at least one worker.
Calling metrics_schedule_latency_histogram_configuration() without first calling enable_metrics_schedule_latency_histogram() silently discards the configuration. The docs do not mention this requirement. Consider adding a note: This method has no effect unless enable_metrics_schedule_latency_histogram() is also called.
The docs say the feature requires calling Instant::now() "twice per task poll". In practice, when both the poll-time histogram and the schedule-latency histogram are enabled, the existing poll_started_at timestamp is reused. The "twice" wording should be corrected.

Code Quality
The existing poll_count_histogram field has a doc comment. The new schedule_latency_histogram field is missing one.
The line `let now = self.poll_timer.as_ref().map(|p| p.poll_started_at).unwrap_or_else(Instant::now);` should be split across lines for readability.

Test Coverage

The test covers both current_thread and multi_thread and validates that total counts equal N, which is a good baseline. Suggestions:
Summary

The implementation is well-structured, follows existing patterns well, has comprehensive API docs, and the unsafe usage is sound. The main actionable item is the checked_duration_since fix to avoid a potential panic.
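The checked_duration_since fix recommended in the review can be sketched as a small std-only helper (the name `safe_elapsed` is hypothetical, introduced only for this example):

```rust
use std::time::{Duration, Instant};

// Returns the elapsed time since `scheduled_at`, clamped to zero instead of
// panicking when `now` is (apparently) earlier than `scheduled_at`.
fn safe_elapsed(now: Instant, scheduled_at: Instant) -> Duration {
    now.checked_duration_since(scheduled_at).unwrap_or_default()
}
```

Plain `now - scheduled_at` would panic in debug builds in the reversed case; the clamped version simply records a zero latency, which is the desired behavior for a metric.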
value:good-to-have; category:documentation; feedback: The Claude AI reviewer is correct! The proposed wording of the docstring suggests that Instant::now() is called twice on each poll. This is not correct and it should be fixed to explain that the first call to Instant::now() is when the task is scheduled.

value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The new test won't make much sense if the machine where it is executed has only one CPU core. By using two worker threads it will always serve its purpose, no matter the hardware it is running on.


7986: To review by AI