
3628: docs: expand profiling guide with JVM and async-profiler coverage#43

Open
martin-augment wants to merge 2 commits into main from pr-3628-2026-03-06-11-41-39

Conversation

@martin-augment
Owner

3628: To review by AI

Rename profiling_native_code.md to profiling.md and add sections for
async-profiler (unified JVM + native flame graphs), Java Flight Recorder,
a tool comparison table, and practical tips for profiling Comet's mixed
JVM/Rust execution.
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the documentation for profiling Comet applications by consolidating and expanding existing guides into a single, comprehensive resource. It aims to provide developers with clear instructions and recommendations for using various profiling tools across JVM and native Rust components, improving the ability to diagnose performance issues and optimize code effectively.

Highlights

  • Expanded Profiling Guide: A new comprehensive profiling guide has been added, consolidating information on various tools for Comet development, which involves both JVM (Spark) and native (Rust) code.
  • Unified Profiling with async-profiler: The guide introduces async-profiler as the recommended tool for end-to-end Comet profiling, capable of generating unified flame graphs that include both JVM and native frames.
  • JVM Profiling with Java Flight Recorder (JFR): Detailed instructions and event types for using Java Flight Recorder (JFR) are provided for JVM-level investigations, such as GC pressure, allocations, and thread contention.
  • Native Rust Profiling with cargo-flamegraph: The guide includes updated information on profiling isolated Rust code using cargo-flamegraph for micro-benchmarks.
  • Consolidated Documentation: The previous 'Profiling Native Code' guide has been removed, and its relevant content, along with new comprehensive profiling strategies, has been integrated into the new 'Profiling' document.
Changelog
  • docs/source/contributor-guide/index.md
    • Updated the link for the profiling guide to point to the new, broader 'Profiling' document.
  • docs/source/contributor-guide/profiling.md
    • Added a new, comprehensive guide covering profiling tools and techniques for Comet development, including async-profiler, Java Flight Recorder (JFR), and cargo-flamegraph.
    • Included detailed installation instructions, usage examples for attaching to Spark applications, and options for passing profiler flags.
    • Provided guidance on choosing event types (cpu, wall, alloc, lock) and output formats for async-profiler.
    • Added platform-specific notes for Linux and macOS regarding profiler setup.
    • Detailed JFR usage, including adding flags to spark-submit, dynamic recording, viewing recordings, and useful JFR events for Comet debugging.
    • Explained how to use cargo-flamegraph for profiling native Rust code in isolation, including running micro-benchmarks.
    • Included tips for profiling Comet, such as using wall-clock profiling for JNI boundary overhead, alloc profiling for Arrow FFI, and correlating JVM and native frames.
  • docs/source/contributor-guide/profiling_native_code.md
    • Removed the outdated and less comprehensive guide on profiling native code.
Activity
  • The pull request was submitted for AI review.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@claude

claude bot commented Mar 6, 2026

PR Review: docs: expand profiling guide with JVM and async-profiler coverage

This is a well-written, substantial improvement to the contributor documentation. Replacing the narrow profiling_native_code.md with a unified profiling.md that covers async-profiler, JFR, and cargo-flamegraph is exactly the right approach for a mixed JVM/Rust codebase. The tool comparison table and practical tips section are particularly valuable additions.

I verified that the --async-profiler and --jfr flags referenced in the doc actually exist in benchmarks/tpc/run.py and match the described behavior.

A few issues to address:


Bug: Broken relative links (appears twice)

The link ../../benchmarks/tpc/README.md appears on lines 123 and 177. From the location of the file (docs/source/contributor-guide/profiling.md), two ../ levels up lands at docs/, not the project root — so this resolves to docs/benchmarks/tpc/README.md, which does not exist.

The correct relative path is ../../../benchmarks/tpc/README.md.

```diff
-See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) for details.
+See [benchmarks/tpc/README.md](../../../benchmarks/tpc/README.md) for details.
```
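The resolution described above can be checked mechanically with a throwaway directory tree (the `/tmp/repo` paths below are illustrative, not the real repository):

```shell
# Recreate the relevant layout to see where each relative link resolves
mkdir -p /tmp/repo/docs/source/contributor-guide /tmp/repo/benchmarks/tpc
touch /tmp/repo/benchmarks/tpc/README.md
cd /tmp/repo/docs/source/contributor-guide
# Two ../ levels land at docs/, where no such file exists:
realpath -m ../../benchmarks/tpc/README.md
# Three ../ levels reach the repo root, where the file does exist:
realpath -m ../../../benchmarks/tpc/README.md
```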

Bug: Missing mkdir -p before tar extraction

The installation snippet extracts into /opt/async-profiler with --strip-components=1, but that directory doesn't exist yet. The tar command will fail without creating it first:

```diff
+mkdir -p /opt/async-profiler
 tar xzf async-profiler-3.0-linux-x64.tar.gz -C /opt/async-profiler --strip-components=1
```

Nit: Hardcoded version in download URL

The install example pins v3.0, which will silently become outdated as new releases are made. Consider adding a note or using a variable:

```shell
# Replace 3.0 with the latest version from the releases page
ASPROF_VERSION=3.0
wget https://github.com/async-profiler/async-profiler/releases/download/v${ASPROF_VERSION}/async-profiler-${ASPROF_VERSION}-linux-x64.tar.gz
tar xzf async-profiler-${ASPROF_VERSION}-linux-x64.tar.gz -C /opt/async-profiler --strip-components=1
```

Suggestion: Note on distributed Spark output file locations

The -agentpath: spark-submit example writes output to a fixed local path (file=executor.html). In a multi-node Spark cluster, each executor writes to its own node's local filesystem, so these files won't be directly accessible from the driver. A brief note here would prevent confusion:

Note: In a distributed Spark cluster, executor output files are written to each executor's local filesystem. You'll need to collect them from each node (e.g., via HDFS, S3, or scp) after the run. The benchmarks/tpc/run.py --async-profiler integration handles this automatically.
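When per-process output files are still wanted on each node, async-profiler's `file=` option accepts placeholders such as `%p` (process id) and `%t` (timestamp), per its documentation, which keeps multiple executor JVMs on one host from clobbering each other. A hedged sketch (paths and flag values follow the example above; placeholder support depends on the installed version):

```shell
spark-submit \
  --conf "spark.executor.extraJavaOptions=-agentpath:/opt/async-profiler/lib/libasyncProfiler.so=start,event=cpu,file=/tmp/profiles/executor-%p-%t.html" \
  ...
```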


Minor: cd native tip could be clearer

In the "Isolate Rust-only performance issues" tip:

```shell
cd native
cargo flamegraph --root --bench <benchmark_name>
```

This works correctly since native/ is a Cargo workspace that includes both core and spark-expr (both have benches). A small clarifying comment like # native/ is the Cargo workspace root would help readers understand why this is the right directory.


Overall this is a great addition to the contributor guide — the async-profiler section in particular fills an important gap. Once the two broken links and the missing mkdir -p are fixed, this should be good to merge.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly improves the profiling documentation by consolidating information about native code profiling and expanding it to cover JVM profiling with JFR and unified profiling with async-profiler. The new profiling.md guide is comprehensive and provides clear instructions and examples for different tools and scenarios. I've made a couple of minor suggestions to improve clarity and correct a grammatical error in the new documentation.

For continuous recording without a fixed duration:

```shell
--conf "spark.driver.extraJavaOptions=-XX:StartFlightRecording=disk=true,maxsize=500m,filename=driver.jfr"
```


Severity: medium

This example for continuous JFR recording is presented as a single --conf flag, which is an incomplete command. For clarity and consistency with the preceding example, it would be better to show the full spark-submit command structure. This also provides an opportunity to include the corresponding spark.executor.extraJavaOptions which is important for profiling Spark applications.

Suggested change

```diff
- --conf "spark.driver.extraJavaOptions=-XX:StartFlightRecording=disk=true,maxsize=500m,filename=driver.jfr"
+ spark-submit \
+   --conf "spark.driver.extraJavaOptions=-XX:StartFlightRecording=disk=true,maxsize=500m,filename=driver.jfr" \
+   --conf "spark.executor.extraJavaOptions=-XX:StartFlightRecording=disk=true,maxsize=500m,filename=executor.jfr" \
+   ...
```
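For the dynamic-recording flow the guide also covers, JFR can be started on an already-running JVM with the JDK's `jcmd` tool. A sketch (the recording name `comet` is illustrative, and `<pid>` stands for the target JVM's process id):

```shell
# <pid> is the Spark driver or executor JVM process id (e.g. found via jps)
jcmd <pid> JFR.start name=comet settings=profile
jcmd <pid> JFR.dump name=comet filename=comet.jfr
jcmd <pid> JFR.stop name=comet
```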

```toml
harness = false
```

These benchmarks are useful when for comparing performance between releases or between feature branches and the


Severity: medium

There's a minor grammatical error here. The phrase 'when for' should likely be just 'for'.

Suggested change

```diff
-These benchmarks are useful when for comparing performance between releases or between feature branches and the
+These benchmarks are useful for comparing performance between releases or between feature branches and the
```

Owner Author


value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! There is a minor typo in the sentence that should be corrected to fix the grammar.

@augmentcode

augmentcode bot commented Mar 6, 2026

🤖 Augment PR Summary

Summary: This PR consolidates and expands the contributor profiling documentation into a single cross-stack guide for Comet.

Changes:

  • Replaces the old “Profiling Native Code” page with a new unified Profiling page in the contributor guide.
  • Adds an async-profiler walkthrough for end-to-end profiling that includes both JVM (Spark) and native (Rust) frames.
  • Documents async-profiler installation, attach/agent usage, common event types, output formats, and platform-specific notes.
  • Adds a Java Flight Recorder (JFR) section focused on JVM-only diagnostics (GC, allocation, locks, I/O) including spark-submit and jcmd examples.
  • Preserves Rust-only profiling guidance via cargo bench and cargo-flamegraph, and adds practical profiling tips for Comet-specific workflows.

Why: Provide a clearer “choose the right profiler” reference for Comet development, covering JVM + native hot paths as well as native-only micro-benchmark work.



@augmentcode augmentcode bot left a comment


Review completed. 4 suggestions posted.


Comment `augment review` to trigger a new review at any time.

```text
Adding a New Expression <adding_a_new_expression>
Tracing <tracing>
Profiling Native Code <profiling_native_code>
Profiling <profiling>
```


Since this renames/removes the old profiling_native_code page, it’s worth double-checking there aren’t any remaining references (internal links, bookmarks) to profiling_native_code that would now become broken in the built docs.

Severity: low


Owner Author


value:good-to-have; category:documentation; feedback: The Augment AI reviewer is correct! Since a documentation page is removed, it is good to check whether that page is referenced by any other page. I double-checked and there are no other references.

```shell
# Linux x64
wget https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz
tar xzf async-profiler-3.0-linux-x64.tar.gz -C /opt/async-profiler --strip-components=1
```


tar ... -C /opt/async-profiler will fail if /opt/async-profiler doesn’t already exist (and writing to /opt often requires root), so readers following this verbatim may hit an error.

Severity: medium

Fix This in Augment


Owner Author


value:good-to-have; category:documentation; feedback: The Augment AI reviewer is correct! The /opt folder may not exist, or may exist but usually requires root permissions. To avoid confusing readers, it would be better to use a folder in the user's $HOME.

```shell
spark-submit \
  --conf "spark.driver.extraJavaOptions=-agentpath:/opt/async-profiler/lib/libasyncProfiler.so=start,event=cpu,file=driver.html" \
  --conf "spark.executor.extraJavaOptions=-agentpath:/opt/async-profiler/lib/libasyncProfiler.so=start,event=cpu,file=executor.html" \
```


Using fixed output filenames like executor.html/executor.jfr can lead to profiles being overwritten if multiple executors/JVMs write into the same working directory on a host.

Severity: low

Other Locations
  • docs/source/contributor-guide/profiling.md:138

Fix This in Augment


Owner Author


value:good-but-wont-fix; category:documentation; feedback: The Augment AI reviewer is correct! The profiling output files would be overwritten, but usually this is the desired behavior: the developer makes an improvement and reloads the page in the browser to see whether it had a good effect. If the developer wants to compare results, then they need to generate unique file names, e.g. with a timestamp.
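If unique names are wanted for comparisons, a minimal sketch of the timestamp approach (the variable names are illustrative):

```shell
# Build a timestamped output name so repeated profiling runs don't overwrite each other
ts=$(date +%Y%m%d-%H%M%S)
out="executor-${ts}.html"
echo "$out"
```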

```markdown
### Integrated benchmark profiling

The TPC benchmark scripts in `benchmarks/tpc/` have built-in async-profiler support via
the `--async-profiler` flag. See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md)
```


The relative link to ../../benchmarks/tpc/README.md points outside the docs source tree, which may render as a broken link in the published documentation build (even though it works in the repo).

Severity: medium

Other Locations
  • docs/source/contributor-guide/profiling.md:177


Owner Author

@martin-augment Mar 6, 2026


value:useful; category:documentation; feedback: The Augment AI reviewer is correct! The link is broken. It should go up three levels (`../../../`) to point to the wanted document. Fixing this prevents releasing documentation with broken links.

@martin-augment
Owner Author

martin-augment commented Mar 6, 2026

Bug: Broken relative links (appears twice)

The link ../../benchmarks/tpc/README.md appears on lines 123 and 177. From the location of the file (docs/source/contributor-guide/profiling.md), two ../ levels up lands at docs/, not the project root — so this resolves to docs/benchmarks/tpc/README.md, which does not exist.

The correct relative path is ../../../benchmarks/tpc/README.md.

```diff
-See [benchmarks/tpc/README.md](../../benchmarks/tpc/README.md) for details.
+See [benchmarks/tpc/README.md](../../../benchmarks/tpc/README.md) for details.
```

value:useful; category:documentation; feedback: The Claude AI reviewer is correct! The link is broken. It should go up three levels (`../../../`) to point to the wanted document. Fixing this prevents releasing documentation with broken links.

@martin-augment
Owner Author

Bug: Missing mkdir -p before tar extraction

The installation snippet extracts into /opt/async-profiler with --strip-components=1, but that directory doesn't exist yet. The tar command will fail without creating it first:

```diff
+mkdir -p /opt/async-profiler
 tar xzf async-profiler-3.0-linux-x64.tar.gz -C /opt/async-profiler --strip-components=1
```

value:good-to-have; category:documentation; feedback: The Claude AI reviewer is correct! The /opt folder may not exist, or may exist but usually requires root permissions. To avoid confusing readers, it would be better to use a folder in the user's $HOME.
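Following that suggestion, a sketch of a user-local install (the install directory is illustrative, and a stand-in tarball is built here so the commands are self-contained; substitute the real downloaded release archive):

```shell
# Install into a user-writable prefix instead of /opt
INSTALL_DIR="$HOME/.local/async-profiler"
mkdir -p "$INSTALL_DIR"   # tar -C fails if the target directory is missing

# Stand-in for the downloaded release archive (replace with the real file)
mkdir -p /tmp/async-profiler-3.0-linux-x64/lib
touch /tmp/async-profiler-3.0-linux-x64/lib/libasyncProfiler.so
tar czf /tmp/ap.tar.gz -C /tmp async-profiler-3.0-linux-x64

# --strip-components=1 drops the versioned top-level directory
tar xzf /tmp/ap.tar.gz -C "$INSTALL_DIR" --strip-components=1
ls "$INSTALL_DIR/lib"
```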
