@josephdviviano
Collaborator

  • I've read the .github/CONTRIBUTING.md file
  • My code follows the typing guidelines
  • I've added appropriate tests
  • I've run pre-commit hooks locally

Description

A utility for benchmarking torchgfn against gflownet and gfnx.

@josephdviviano josephdviviano self-assigned this Dec 19, 2025
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a benchmarking utility that compares torchgfn against the gflownet and gfnx libraries across multiple environments (hypergrid, ising, box, bitseq). The implementation includes library-specific runners, configuration management, result aggregation, and detailed documentation.

Key changes:

  • Benchmarking framework with abstract base classes and environment-specific runners for three GFlowNet libraries
  • Support for multiple environments with library compatibility checking
  • Comprehensive timing, memory tracking, and result aggregation capabilities
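
As a rough illustration of the first two bullets, here is a minimal sketch of a runner base class with a compatibility check. It is not the code in this PR: `BenchmarkConfig`, `LibraryRunner`, `DummyTorchgfnRunner`, `PlaceholderRunner`, `compatible_runners`, and `run_benchmark` are all hypothetical names, and the placeholder runner's environment set is invented purely to exercise the check. Only the torchgfn environment list (hypergrid, ising, box, bitseq) comes from the file summary.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BenchmarkConfig:
    """Hypothetical config object; field names are illustrative only."""
    env_name: str
    n_iterations: int = 100


class LibraryRunner(ABC):
    """Hypothetical base class: one subclass per benchmarked library."""

    # Environments a library can run; each subclass overrides this.
    supported_envs: frozenset = frozenset()

    @classmethod
    def supports(cls, env_name: str) -> bool:
        return env_name in cls.supported_envs

    @abstractmethod
    def setup(self, config: BenchmarkConfig) -> None:
        """Build the environment and model for this library."""

    @abstractmethod
    def run_iteration(self) -> None:
        """Run a single training iteration."""


class DummyTorchgfnRunner(LibraryRunner):
    # Environment list taken from the file summary in this review.
    supported_envs = frozenset({"hypergrid", "ising", "box", "bitseq"})

    def setup(self, config: BenchmarkConfig) -> None:
        self.steps = 0

    def run_iteration(self) -> None:
        self.steps += 1  # stand-in for a real training step


class PlaceholderRunner(LibraryRunner):
    # Assumption: an invented subset, used only to exercise the check.
    supported_envs = frozenset({"hypergrid"})

    def setup(self, config: BenchmarkConfig) -> None:
        self.steps = 0

    def run_iteration(self) -> None:
        self.steps += 1


def compatible_runners(env_name, runner_classes):
    """Return the runner classes that declare support for env_name."""
    return [cls for cls in runner_classes if cls.supports(env_name)]


def run_benchmark(runner_cls, config: BenchmarkConfig) -> int:
    """Refuse unsupported environments, then run the requested iterations."""
    if not runner_cls.supports(config.env_name):
        raise ValueError(
            f"{runner_cls.__name__} does not support {config.env_name!r}"
        )
    runner = runner_cls()
    runner.setup(config)
    for _ in range(config.n_iterations):
        runner.run_iteration()
    return runner.steps
```

Declaring the supported environments on the class lets a driver script filter out incompatible library and environment pairs before any expensive setup runs.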

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pyproject.toml | Added pyright configuration to disable optional type checks |
| benchmark/sanity_check.py | Sanity check script for comparing JAX and PyTorch matrix multiplication performance |
| benchmark/lib_runners/base.py | Base classes defining the benchmarking interface and data structures |
| benchmark/lib_runners/torchgfn_runner.py | TorchGFN runner implementation supporting hypergrid, ising, box, and bitseq environments |
| benchmark/lib_runners/gfnx_runner.py | JAX-based GFNX runner with JIT compilation support |
| benchmark/lib_runners/gflownet_runner.py | GFlowNet runner using Hydra configuration system |
| benchmark/lib_runners/__init__.py | Module initialization exposing runner classes |
| benchmark/benchmark_libraries.py | Main benchmarking script with CLI interface |
| benchmark/README.md | Comprehensive documentation for the benchmark utility |
| benchmark/dependencies.sh | Dependency installation script |
| benchmark/gfnx | Git submodule reference for gfnx library |
| benchmark/gflownet | Git submodule reference for gflownet library |
| .gitmodules | Git submodule configuration |
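
For readers unfamiliar with the kind of timing, memory tracking, and result aggregation the overview mentions, a minimal standard-library sketch follows. It does not reproduce the data structures in `benchmark/lib_runners/base.py`: `RunResult` and `time_iterations` are hypothetical names, and `tracemalloc` is a swapped-in choice that only sees Python heap allocations, not GPU memory.

```python
import statistics
import time
import tracemalloc
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Hypothetical result record; field names are illustrative only."""
    library: str
    env_name: str
    iteration_times: list = field(default_factory=list)
    peak_memory_bytes: int = 0

    def summary(self) -> dict:
        """Aggregate per-iteration timings into summary statistics."""
        times = self.iteration_times
        return {
            "library": self.library,
            "env": self.env_name,
            "n": len(times),
            "mean_s": statistics.mean(times),
            "std_s": statistics.stdev(times) if len(times) > 1 else 0.0,
            "peak_memory_bytes": self.peak_memory_bytes,
        }


def time_iterations(step, n_iterations, n_warmup=2,
                    library="example", env_name="hypergrid"):
    """Time `step` n_iterations times after n_warmup untimed calls."""
    # Warm-up calls absorb one-off costs (e.g. JIT compilation) so they
    # do not skew the measured iterations.
    for _ in range(n_warmup):
        step()

    result = RunResult(library=library, env_name=env_name)
    tracemalloc.start()  # tracks Python heap only; adds some overhead
    for _ in range(n_iterations):
        start = time.perf_counter()
        step()
        result.iteration_times.append(time.perf_counter() - start)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    result.peak_memory_bytes = peak
    return result


if __name__ == "__main__":
    res = time_iterations(lambda: sum(range(10_000)), n_iterations=5)
    print(res.summary())
```

When the timed step runs on an accelerator, the step would also need to wait for the device to finish (for example `torch.cuda.synchronize()` in PyTorch or `block_until_ready()` on a JAX array), otherwise the timer may only capture the dispatch cost.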

Comment on lines +220 to +221
# See gflownet.py line 594 vs 601.
# This fix applies to ALL environments (hypergrid, ising, ccube).

Copilot AI Dec 19, 2025

The line number reference (594 vs 601) in the comment may become outdated as the external library evolves. Consider referencing the method name or a more stable identifier instead.

Suggested change
- # See gflownet.py line 594 vs 601.
- # This fix applies to ALL environments (hypergrid, ising, ccube).
+ # See the implementation of sample_batch in gflownet.py in the external
+ # gflownet library.

@codecov

codecov bot commented Dec 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.38%. Comparing base (a47bf73) to head (d8c0223).
⚠️ Report is 31 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #460       +/-   ##
===========================================
+ Coverage    0.55%   74.38%   +73.83%     
===========================================
  Files          48       47        -1     
  Lines        6845     6891       +46     
  Branches      802      825       +23     
===========================================
+ Hits           38     5126     +5088     
+ Misses       6806     1454     -5352     
- Partials        1      311      +310     
Flag Coverage Δ
unittests ?

