Improving OFI Transport Counter Polling Implementation#1218

Draft
markbrown314 wants to merge 2 commits intoSandia-OpenSHMEM:mainfrom
markbrown314:1217/development

Collaborator

markbrown314 commented Jan 27, 2026

The current libfabric counter polling implementation makes a series of separate calls, fi_cntr_read(), fi_cntr_readerr(), and fi_cq_read(), each of which acquires a lock. This can be simplified by using only fi_cntr_wait(). With the polling code removed from shmem_transport_ofi_put_quiet(), we see improvements in put latency benchmarks and no change in functionality.

Addresses Issue #1217
For evaluation purposes only.
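
To make the trade-off described above concrete, here is a minimal toy model, not the SOS or libfabric code. The `toy_cntr_t` type and the `toy_*` and `quiet_by_*` functions are hypothetical stand-ins that only loosely mirror the read, read-error, and wait calls on a libfabric counter. The assumption that each call costs exactly one lock acquisition is made for illustration; real locking behavior depends on the provider.

```c
#include <stdint.h>

/* Toy stand-in for a completion counter. This is NOT libfabric's fid_cntr;
 * every name here is hypothetical and exists only for illustration. */
typedef struct {
    uint64_t success;    /* operations completed so far */
    uint64_t errors;     /* operations that failed */
    uint64_t lock_count; /* modelled lock acquisitions (an assumption) */
} toy_cntr_t;

/* Simulate the network completing one more operation. */
static void toy_progress(toy_cntr_t *c) { c->success++; }

/* Models a counter read: one lock acquisition per call. */
static uint64_t toy_cntr_read(toy_cntr_t *c)
{
    c->lock_count++;
    toy_progress(c);
    return c->success;
}

/* Models an error-counter read: another lock acquisition per call. */
static uint64_t toy_cntr_readerr(toy_cntr_t *c)
{
    c->lock_count++;
    return c->errors;
}

/* Models a blocking wait: the lock is taken once, then progress is
 * driven internally until the threshold is reached. */
static int toy_cntr_wait(toy_cntr_t *c, uint64_t threshold)
{
    c->lock_count++;
    while (c->success < threshold) {
        if (c->errors)
            return -1;
        toy_progress(c);
    }
    return 0;
}

/* Pattern A: explicit polling, two modelled lock acquisitions per pass. */
static int quiet_by_polling(toy_cntr_t *c, uint64_t pending)
{
    for (;;) {
        uint64_t success = toy_cntr_read(c);
        uint64_t fail = toy_cntr_readerr(c);
        if (fail)
            return -1;
        if (success >= pending)
            return 0;
    }
}

/* Pattern B: a single blocking wait on the counter. */
static int quiet_by_wait(toy_cntr_t *c, uint64_t pending)
{
    return toy_cntr_wait(c, pending);
}
```

In this model, calling each helper on a fresh zero-initialized counter with a pending count of 8 leaves both counters at 8 completions, with 16 modelled lock acquisitions on the polling path and 1 on the wait path. The numbers come from the toy accounting only and say nothing about measured latency.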

markbrown314 self-assigned this Jan 27, 2026
markbrown314 changed the title from "1217/development" to "Improving OFI Transport Counter Polling Implementation" Jan 27, 2026
Do not force hard polling when XPMEM is enabled.
This conflicts with counter based polling when used with OFI transport.

Issue Sandia-OpenSHMEM#1217

Signed-off-by: Mark F. Brown <mark.f.brown@intel.com>
Replaced complex polling with simpler OFI counter wait

Issue Sandia-OpenSHMEM#1217

Signed-off-by: Mark F. Brown <mark.f.brown@intel.com>
markbrown314 force-pushed the 1217/development branch 2 times, most recently from 269cb34 to b096e60 January 31, 2026 17:34
cntr_put_attr.events = FI_CNTR_EVENTS_COMP;
cntr_get_attr.events = FI_CNTR_EVENTS_COMP;

#if 0
Collaborator

Did you mean to leave this dead code?

Collaborator Author

Yes, see comment below.


/* wait for put counter to meet outstanding count value */

/* Note: the communication routines increment pending put counters before
Collaborator

You might want to update this text, as it seems no longer correct.

Collaborator Author

This part is incomplete. The original code ran this complex counter wait mechanism before performing an OFI counter wait. I need to analyze why it was implemented this way in the first place (Chesterton's fence principle) before removing it completely.
