
Enhance GUI and benchmarking tools for COBA performance analysis #105

Closed
Hepbmstl wants to merge 16 commits into main from dev-fcn-optimizing-3.21

Conversation

Collaborator

@Hepbmstl Hepbmstl commented Mar 25, 2026

Description

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Code style/formatting
  • Tests
  • CI/CD updates
  • Other (please describe):

Changes Made

Testing

Test Configuration:

  • Python version:
  • JAX version:
  • OS:

Test steps:
1.
2.
3.

Checklist

  • My code follows the code style of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots/Outputs (if applicable)

Additional Context

Breaking Changes


Reviewer Notes:

Summary by Sourcery

Improve binary FCNMV CUDA scatter performance selection, strengthen COBA benchmarking utilities, and refine GUI-based performance analysis for backend-focused comparisons.

Enhancements:

  • Add warp-per-row CUDA scatter kernels and heuristic auto-dispatch between thread-per-row and warp-per-row scatter implementations across data types.
  • Update JAX binary FCNMV bindings to consistently treat spikes as boolean and route scatter/gather calls through the bool-based kernels.
  • Extend CSV benchmarking tooling with persistent tags, schema-safe appends, VRAM-aware parameter generation, and reusable memory limit checks.
  • Retarget the performance analysis GUI to use backend as the primary comparison dimension and adjust labels, defaults, and filters accordingly.
  • Introduce new COBA development and boundary benchmarking scripts, including a Tkinter-based performance boundary and speedup visualization app, and refactor existing COBA benchmarks to use the shared BenchmarkTools module.

Hepbmstl added 15 commits March 12, 2026 21:51
The float-to-bool conversion of the mv operator will proceed after thorough testing
- Implemented `COBA_2005_binary_fcnmv_boundary_CsvOuput.py` for benchmarking post and pre-synaptic connection updates using JAX.
- Created `boundary_dis.py` for a GUI application to visualize performance boundaries and speedup analysis with interactive features.
- Developed `dev_COBA_binary_fcnmv.py` for benchmarking with various connection probabilities and numbers, enhancing simulation capabilities.
- Added error handling and CSV recording functionalities to capture benchmarking results effectively.
Contributor

sourcery-ai bot commented Mar 25, 2026

Reviewer's Guide

Introduces warp-per-row CUDA scatter kernels with an auto-tuned TPR/WPR dispatcher for binary FCN matrix-vector ops, extends benchmarking/CSV tooling and adds new GUIs and scripts to explore COBA performance boundaries by backend, scale, connectivity, and VRAM limits.

Class diagram for CSV_record and BenchmarkTools enhancements

classDiagram
    class CSV_record {
        +str name
        +str suffix
        +str backend
        +str operator
        +str kernel
        +str mode
        +str dtype
        +Path output_dir
        +bool append
        +int width
        +list~dict~ rows
        +list~str~ fieldnames
        +dict _tags
        +__init__(CSV_name, kernel, mode, duration, conn, suffix, output_dir, append)
        +_write_csv(file_name, rows, fieldnames, mode)
        +add_tag(tag_name, tag_value)
        +add_row(row)
        +single_COBA_data_add(operator, data_type, backend, mode, conn_num, scale, elapsed, rate, duration, homo)
        +print_header(operator, data_type, backend, mode, conn_num, duration, homo, prob)
        +print_table_header(show_conn)
        +print_row(scale, size, elapsed, rate, conn_num)
        +record_finish(tag)
    }

    class BenchmarkToolsModule {
        +generate_params(dis_type, _N, limit_gb, target_samples, data_size, scale_max, conn_max) list
        +memory_limit(conn_nums, scale, _N, limit, data_type) bool
        +dump_jax_ir(func, args, kwargs, prefix)
    }

    class COBA_Benchmark_Script {
        +benchmark_post_conn(conn_num, conn_prob, data_type, duration, homo, backend, probs_or_conn, _N)
        +benchmark_pre_conn(conn_num, conn_prob, data_type, duration, homo, backend, probs_or_conn, _N)
    }

    CSV_record <.. BenchmarkToolsModule : uses
    COBA_Benchmark_Script ..> CSV_record : creates
    COBA_Benchmark_Script ..> BenchmarkToolsModule : calls

Class diagram for PerformanceBoundaryApp GUI

classDiagram
    class PerformanceBoundaryApp {
        +tk.Tk root
        +Optional~pd.DataFrame~ df
        +dict comboboxes
        +dict tabs
        +ttk.Combobox combo_compare_field
        +ttk.Combobox combo_target
        +ttk.Combobox combo_baseline
        +ttk.Entry entry_n
        +ttk.Entry entry_limit
        +ttk.Entry entry_contours
        +ttk.Entry entry_custom_lines
        +ttk.Entry entry_dpi
        +ttk.Entry entry_yellow_depth
        +ttk.Entry entry_blue_depth
        +ttk.Notebook notebook
        +ttk.Frame filter_row
        +__init__(root)
        +load_data()
        +update_plots()
        +export_image()
        +_setup_ui()
        +_on_compare_field_changed(event)
        +_rebuild_filter_row()
        +_subtitle(include_baseline) str
        +_render_scatter(tab, x, y, z, _N, limit_gb, x_min, x_max, y_max, z_label, cmap_name, subtitle)
        +_render_interpolation(tab, x, y, z, _N, limit_gb, x_min, x_max, y_max, z_label, cmap_name, subtitle)
        +_render_speedup_interp(tab, grid_x, grid_y, grid_z_masked, _N, limit_gb, x_min, x_max, y_max, subtitle)
        +_draw_boundaries(ax, _N, limit_gb, x_min, x_max)
        +_draw_contours_and_labels(ax, grid_x, grid_y, grid_z_masked, z_pts)
        +_draw_custom_contours(ax, grid_x, grid_y, grid_z_masked)
        +_setup_scatter_hover(tab, x, y, z, z_label)
        +_setup_interp_hover(tab, grid_x, grid_y, grid_z, z_label)
    }

    class CSV_record {
        +add_tag(tag_name, tag_value)
        +add_row(row)
        +record_finish(tag)
    }

    PerformanceBoundaryApp ..> CSV_record : reads CSV output

File-Level Changes

Change | Details | Files
Add warp-per-row CUDA scatter kernels and auto-dispatch between TPR and WPR based on n_pre/n_conn for all dtypes.
  • Introduce _bs_wpr_homo_kern and _bs_wpr_hetero_kern CUDA kernels that assign one warp per row and stride over n_conn with lane steps, issuing per-lane atomicAdd on the output.
  • Instantiate WPR kernels for float32, float64, float16, and bfloat16 in both homo and hetero variants via existing macro pattern.
  • Change FFI scatter entry points to choose between WPR and TPR using a polynomial threshold condition on (n_pre, n_conn), adjusting grid size for warps-per-block and guarding n_pre == 0.
  • Update comments and documentation around scatter strategy to describe WPR vs TPR crossover and refer to boundary_dis.py for the fitted threshold.
brainevent/_fcn/binary_fcnmv.cu
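As a rough illustration of the TPR/WPR dispatch described above — the actual polynomial threshold is fitted in binary_fcnmv.cu and explored via boundary_dis.py, so the function name, coefficients, and cutoffs below are placeholders, not the shipped heuristic:

```python
# Hypothetical sketch of the scatter-kernel dispatch; real coefficients
# come from the fitted boundary, these are illustrative only.
WARP_SIZE = 32

def choose_scatter_kernel(n_pre: int, n_conn: int) -> str:
    """Pick thread-per-row (TPR) or warp-per-row (WPR) scatter.

    WPR pays off when each row has enough connections for all 32 lanes
    to do useful atomicAdd work; TPR wins for many short rows.
    """
    if n_pre == 0:
        return "skip"  # guard described in the PR: nothing to launch
    # Illustrative crossover: prefer WPR once rows are wide enough and
    # the warp-per-row grid still fits a reasonable launch size.
    if n_conn >= 4 * WARP_SIZE and n_pre * WARP_SIZE <= 1 << 24:
        return "wpr"
    return "tpr"
```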
Wire binary scatter/gather CUDA bindings to use boolean spike masks consistently and select the new bool-based kernels.
  • Change JAX FFI kernel name selection for transpose (scatter) and non-transpose (gather) to use *_bool variants instead of dtype-specific spike suffixes.
  • Ensure spikes inputs are converted to boolean arrays before calling the FFI kernel in both transpose and non-transpose paths, while still handling float spike representations when needed.
brainevent/_fcn/binary.py
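The bool-mask convention described above can be pictured with NumPy standing in for the jax/brainunit conversion actually used in binary.py — any nonzero spike value is treated as active before the `*_bool` kernels are invoked:

```python
import numpy as np

def to_bool_spikes(spikes):
    """Convert any spike representation (float 0/1, int, bool) to a bool
    mask, matching the 'spikes are boolean' routing described in the PR."""
    return np.asarray(spikes, dtype=bool)
```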
Enhance CSV benchmarking utilities with robust schema evolution, persistent row tags, VRAM-aware parameter generation, and memory-limit checks.
  • Add internal _tags dict to CSV_record to store persistent tags that are merged into each subsequent row, with add_tag API to set tags and update fieldnames immediately.
  • Modify add_row to merge tags into each row (caller values override tags) and automatically extend fieldnames based on merged keys before queuing rows.
  • Update _write_csv to read existing headers when appending, compute the union of previous and new fieldnames (preserving order), and use that for DictWriter with a default restval.
  • Introduce generate_params helper to enumerate valid (scale, conn_num) pairs under VRAM and structural constraints using uniform/log/Monte Carlo sampling, and a memory_limit helper that checks per-config VRAM usage for various data types.
dev/fcn/BenchmarkTools.py
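A minimal sketch of the schema-safe append behaviour described for `_write_csv`: read the existing header on append, take the ordered union with the new fieldnames, and let `DictWriter` backfill absent cells via `restval`. The function name and simplifications (the header is not rewritten when new columns appear) are illustrative, not the actual BenchmarkTools API:

```python
import csv
import os

def append_rows(path, rows, fieldnames, restval=""):
    """Append dict rows to a CSV, tolerating schema evolution.

    Old columns keep their original order; any new columns are appended
    to the fieldname union. Missing cells are filled with `restval`.
    """
    old_cols = []
    if os.path.exists(path):
        with open(path, newline="") as f:
            old_cols = next(csv.reader(f), [])
    # Ordered union: previous columns first, then any new ones.
    merged = old_cols + [c for c in fieldnames if c not in old_cols]
    new_file = not old_cols
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=merged, restval=restval)
        if new_file:
            writer.writeheader()
        writer.writerows(rows)
```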
Retarget COBA benchmarking scripts to the shared BenchmarkTools, tweak experiment grids, and improve naming of CSV outputs.
  • Refactor COBA_2005_CsvOuput scripts to import dev.fcn.BenchmarkTools as BT instead of local CsvOutput and to use BT.CSV_record and BT.memory_limit consistently.
  • Adjust scale, backend, connection count, and probability presets for binary fcnmv/fcnmm benchmarks, including narrowing some sweeps to specific points and expanding probabilities when needed.
  • Update CSV record_finish labels to more descriptive scenario names (e.g., float_mode_single_point_with_spconn-scale, Nsight, boundary).
  • Ensure both compact and bitpack variants use the shared CSV recorder and benchmarking helpers for pre- and post-synaptic modes.
dev/fcn/COBA_2005_binary_fcnmv_CsvOuput.py
dev/fcn/COBA_2005_binary_fcnmm_CsvOuput.py
dev/fcn/COBA_2005_bitpack_binary_fcnmm_CsvOuput.py
dev/fcn/COBA_2005_compact_binary_fcnmm_CsvOuput.py
Shift GUI defaults and logic from data_type-centric to backend-centric comparisons for latency and COBA views.
  • Change default color encodings and target combo labels across latency and COBA tabs to use 'backend' instead of 'data_type'.
  • Update operator-change handler to derive baseline/target lists and defaults from the backend column, falling back to other categorical columns only if needed.
  • Simplify heatmap logic to require 'backend' explicitly (dropping data_type fallback), update error messages accordingly, and adjust filter exclusion sets.
  • Ensure speedup and COBA heatmaps explicitly treat backend as the comparison dimension and validate target vs baseline backend selections.
dev/fcn/gui.py
Add new tools and scripts to explore performance boundaries and drive boundary sampling for binary fcnmv benchmarks.
  • Introduce boundary_dis.py, a Tkinter+Matplotlib app that loads CSVs and visualizes elapsed time and speedup over (scale, conn_num) with VRAM/structural boundaries, interpolation-based heatmaps, contour lines, custom iso-lines, and interactive hover tooltips.
  • Add dev_COBA_binary_fcnmv.py to run targeted COBA binary fcnmv pre/post benchmarks (for Nsight/dev work), recording results via BenchmarkTools with memory-bound filtering and tags.
  • Add COBA_2005_binary_fcnmv_boundary_CsvOuput.py to sample (scale, conn_num) within VRAM limits using generate_params, then benchmark and record boundary-focused COBA binary fcnmv performance for both pre and post modes.
dev/fcn/boundary_dis.py
dev/fcn/dev_COBA_binary_fcnmv.py
dev/fcn/COBA_2005_binary_fcnmv_boundary_CsvOuput.py
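The VRAM feasibility check that drives the boundary sampling above can be sketched as follows; the byte model, defaults, and function name are assumptions for illustration, not the actual `BenchmarkTools.memory_limit` signature:

```python
# Per-dtype weight sizes; indices are assumed int32 (4 bytes each).
DTYPE_BYTES = {"float32": 4, "float64": 8, "float16": 2, "bfloat16": 2}

def exceeds_memory_limit(conn_num, scale, _N=4000, limit_gb=8.0,
                         data_type="float32"):
    """Return True if a (scale, conn_num) config would overflow VRAM.

    Models the dominant buffers of a dense FCN layout: an (n_rows, conn_num)
    int32 index matrix plus a weight matrix of the same shape.
    """
    n_rows = int(scale * _N)
    idx_bytes = n_rows * conn_num * 4
    w_bytes = n_rows * conn_num * DTYPE_BYTES[data_type]
    return (idx_bytes + w_bytes) > limit_gb * 1024 ** 3
```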



@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • In dev_COBA_binary_fcnmv.py’s benchmark_post_conn probabilistic branch, single_COBA_data_add is called with cn even though only actual_conn_num is defined there, which will raise a NameError or at least record incorrect connectivity metadata; this should use actual_conn_num instead.
  • In BenchmarkTools.CSV_record._write_csv, csv.DictWriter is now created with restval='default', which will literally write the string "default" into any missing field; if this is not intentional, consider using None or an empty string so that absent values remain visually distinguishable from a real string value.
  • In _binary_fcnmv_cuda_kernel (binary.py), the kernel name for both scatter and gather now always uses the _bool spike suffix and you explicitly cast spikes to bool, making spike_sfx effectively unused; if this behavioral change (dropping non-bool spike paths) is intended, consider simplifying the signature and comments, otherwise restore the previous selection logic.
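The `restval` pitfall flagged in the second point can be reproduced in a few lines — `csv.DictWriter` fills every absent field with `restval`, so `restval='default'` writes the literal string "default" into the CSV:

```python
import csv
import io

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["a", "b"], restval="default")
writer.writeheader()
writer.writerow({"a": 1})  # 'b' is missing, so restval is written
assert buf.getvalue().splitlines()[1] == "1,default"
```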
Prompt for AI Agents
Please address the comments from this code review:


## Individual Comments

### Comment 1
<location path="dev/fcn/dev_COBA_binary_fcnmv.py" line_range="117-103" />
<code_context>
                 for s in scales:
                     actual_conn_num = int(s * prob * _N)
                     if actual_conn_num < 1 : actual_conn_num = 1
-                    if memory_limit(actual_conn_num, scale=s): continue
+                    if BT.memory_limit(actual_conn_num, scale=s): continue
</code_context>
<issue_to_address>
**issue (bug_risk):** Prob-based post benchmark uses an undefined `cn` variable instead of `actual_conn_num` when recording results.

In the `probs_or_conn != 'conn'` branch of `benchmark_post_conn`, `cn` is undefined; this will either raise a `NameError` or cause incorrect metadata to be logged. Use `actual_conn_num` as the connection count argument instead, e.g.:

```python
csv_recorder.single_COBA_data_add(
    'fcnmv', data_type, back, 'post', actual_conn_num, s,
    elapsed, float(rate), duration,
    homo=('homo' if homo else 'hetero'),
)
```
</issue_to_address>

### Comment 2
<location path="brainevent/_fcn/binary.py" line_range="275-284" />
<code_context>
-        kernel_name = f'fcn_binary_mv.binary_fcnmv_scatter{mode_sfx}{spike_sfx}{sfx}'
+        #kernel_name = f'fcn_binary_mv.binary_fcnmv_scatter{mode_sfx}{spike_sfx}{sfx}'
+        kernel_name = f'fcn_binary_mv.binary_fcnmv_scatter{mode_sfx}_bool{sfx}'
     else:
         # Gather mode: y[i] = sum_k weights[i,k] * is_active(spikes[indices[i,k]])
         # Auto-dispatch inside CUDA: TPR for n_conn<=512, MR for n_conn>512.
</code_context>
<issue_to_address>
**issue (bug_risk):** `spk_f` construction is inconsistent and appears unused/incorrectly overwritten.

In the non-`transpose` branch, `spk_f` is first built as a float mask and then immediately replaced by `u.math.asarray(spikes, dtype=bool)`, making the earlier construction dead code. Also, in the shown snippet `spk_f` is never used, so the conversion appears to have no effect.

Please either:
- remove the initial `spk_f` construction and keep only the representation actually needed, or
- ensure `spk_f` is the value converted and passed into the kernel if that’s what the kernel expects.

Aligning this with the kernel’s expected type will avoid subtle type/semantics issues and confusion from unused variables.
</issue_to_address>


@Hepbmstl Hepbmstl closed this Mar 25, 2026