2 changes: 1 addition & 1 deletion agave
Submodule agave updated 109 files
46 changes: 46 additions & 0 deletions src/app/fdctl/config/default.toml
@@ -983,6 +983,52 @@ dynamic_port_range = "8900-9000"
# "operation not supported".
xdp_zero_copy = false

# This option moves the management of napi including
# when to poll as well as the poll budget, into userspace
# if in "prefbusy" mode. The fallback is "softirq" mode,
# which relies significantly more on linux to manage napi, through
# wakeups, softirqs and under higher network load, a seperate
Copilot AI Feb 24, 2026

Spelling error: "seperate" should be "separate".

Suggested change
# wakeups, softirqs and under higher network load, a seperate
# wakeups, softirqs and under higher network load, a separate

Copilot uses AI. Check for mistakes.
# ksoftirqd thread which linux creates and manages.
#
# Please note that even in SKB mode or copy mode, "prefbusy"
# poll mode should work and be effective.
#
# "prefbusy" mode is the recommended choice of mode,
# as this will also automatically fall back to "softirq"
# mode if preferred busy polling is not available or is not
# the right choice for whatever reason (e.g. on an older
# kernel). A warning will be emitted if this fallback is made.
#
# On Intel's 100Gbps NIC ice driver it is recommended to use
# "softirq" mode because the driver cannot support "prefbusy"
# mode, whereas Mellanox's mlx5 supports it well.
poll_mode = "softirq"

# AF_XDP socket configuration options which will eventually
# be moved to being fixed constants prior to the merge of
# prefbusy-poll-mode into main.
busy_poll_usecs = 100
gro_flush_timeout_nanos = 5000000

# This is the minimum time between napi polls if in prefbusy
# mode. This is important for protecting against a livelock
# scenario inwhich Firedancer is not given enough time in
Copilot AI Feb 24, 2026

Spelling error: "inwhich" should be "in which" (two words).

Suggested change
# scenario inwhich Firedancer is not given enough time in
# scenario in which Firedancer is not given enough time in

# userspace to do work.
#
# This is a protective mechanism against bugs as well
# as to ensure even in a low RX but high TX traffic scenario,
# TX is still given enough time to do work or else
# napi is polled whenever the xsk RX queue is empty
# which could starve userspace TX work in the edge
# case there is significantly more TX than RX traffic.
lwr_prefbusy_poll_timeout_micros = 5

# This is the maximum time between napi polls if in prefbusy
# mode. This is to call a napi poll in the case the normal
# prefbusy napi poll scheduling has stalled, napi polls
# can often resolve queue stalls so this increases robustness.
upr_prefbusy_poll_timeout_micros = 150
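
The interaction of the two bounds above can be sketched in C as follows. This is one reading of the config comments, not code from the PR: the name `sketch_prefbusy_poll_ready`, its parameters, and the choice to take all times in nanoseconds are hypothetical. It assumes that between the two bounds a napi poll is issued only when the xsk RX queue is empty.

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from the PR. Decides whether a napi
   poll may be issued now. All times are in nanoseconds. */
bool
sketch_prefbusy_poll_ready( long now_ns,
                            long last_poll_ns,
                            bool rx_queue_empty,
                            long lwr_timeout_ns,
                            long upr_timeout_ns ) {
  long elapsed_ns = now_ns - last_poll_ns;

  /* Too soon since the last poll: leave time for userspace work
     (e.g. TX) so that it is not starved. */
  if( elapsed_ns <  lwr_timeout_ns ) return false;

  /* Overdue: poll regardless of queue state, since a napi poll can
     often resolve a stalled queue. */
  if( elapsed_ns >= upr_timeout_ns ) return true;

  /* In between: poll only once the RX queue has been drained. */
  return rx_queue_empty;
}
```

With the defaults above (5 and 150 microseconds), a call 2 microseconds after the last poll would be refused even with an empty RX queue, while a call 150 microseconds after it would be allowed even with packets still queued.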

# XDP uses metadata queues shared across the kernel and
# userspace to relay events about incoming and outgoing packets.
# This setting defines the number of entries in these metadata
1 change: 1 addition & 0 deletions src/app/fdctl/main.c
@@ -38,6 +38,7 @@ configure_stage_t * STAGES[] = {
&fd_cfg_stage_ethtool_channels,
&fd_cfg_stage_ethtool_offloads,
&fd_cfg_stage_ethtool_loopback,
&fd_cfg_stage_sysfs_poll,
NULL,
};

2 changes: 2 additions & 0 deletions src/app/fddev/main.h
@@ -33,6 +33,7 @@ extern configure_stage_t fd_cfg_stage_kill;
extern configure_stage_t fd_cfg_stage_genesis;
extern configure_stage_t fd_cfg_stage_keys;
extern configure_stage_t fd_cfg_stage_blockstore;
extern configure_stage_t fd_cfg_stage_sysfs_poll;

configure_stage_t * STAGES[] = {
&fd_cfg_stage_kill,
@@ -46,6 +47,7 @@ configure_stage_t * STAGES[] = {
&fd_cfg_stage_keys,
&fd_cfg_stage_genesis,
&fd_cfg_stage_blockstore,
&fd_cfg_stage_sysfs_poll,
NULL,
};

46 changes: 46 additions & 0 deletions src/app/firedancer/config/default.toml
@@ -1065,6 +1065,52 @@ telemetry = true
# "operation not supported".
xdp_zero_copy = false

# This option moves the management of napi including
# when to poll as well as the poll budget, into userspace
# if in "prefbusy" mode. The fallback is "softirq" mode,
# which relies significantly more on linux to manage napi, through
# wakeups, softirqs and under higher network load, a seperate
Copilot AI Feb 24, 2026

Spelling error: "seperate" should be "separate".

Suggested change
# wakeups, softirqs and under higher network load, a seperate
# wakeups, softirqs and under higher network load, a separate

Copilot AI Feb 25, 2026

Corrected spelling of 'seperate' to 'separate'.

Suggested change
# wakeups, softirqs and under higher network load, a seperate
# wakeups, softirqs and under higher network load, a separate

# ksoftirqd thread which linux creates and manages.
#
# Please note that even in SKB mode or copy mode, "prefbusy"
# poll mode should work and be effective.
#
# "prefbusy" mode is the recommended choice of mode,
# as this will also automatically fall back to "softirq"
# mode if preferred busy polling is not available or is not
# the right choice for whatever reason (e.g. on an older
# kernel). A warning will be emitted if this fallback is made.
#
# On Intel's 100Gbps NIC ice driver it is recommended to use
# "softirq" mode because the driver cannot support "prefbusy"
# mode, whereas Mellanox's mlx5 supports it well.
poll_mode = "softirq"

# AF_XDP socket configuration options which will eventually
# be moved to being fixed constants prior to the merge of
# prefbusy-poll-mode into main.
busy_poll_usecs = 100
gro_flush_timeout_nanos = 5000000

# This is the minimum time between napi polls if in prefbusy
# mode. This is important for protecting against a livelock
# scenario inwhich Firedancer is not given enough time in
Copilot AI Feb 24, 2026

Spelling error: "inwhich" should be "in which" (two words).

Suggested change
# scenario inwhich Firedancer is not given enough time in
# scenario in which Firedancer is not given enough time in

Copilot AI Feb 25, 2026

Corrected spacing: 'inwhich' should be 'in which'.

Suggested change
# scenario inwhich Firedancer is not given enough time in
# scenario in which Firedancer is not given enough time in

# userspace to do work.
#
# This is a protective mechanism against bugs as well
# as to ensure even in a low RX but high TX traffic scenario,
# TX is still given enough time to do work or else
# napi is polled whenever the xsk RX queue is empty
# which could starve userspace TX work in the edge
# case there is significantly more TX than RX traffic.
lwr_prefbusy_poll_timeout_micros = 5

# This is the maximum time between napi polls if in prefbusy
# mode. This is to call a napi poll in the case the normal
# prefbusy napi poll scheduling has stalled, napi polls
# can often resolve queue stalls so this increases robustness.
upr_prefbusy_poll_timeout_micros = 150

# XDP uses metadata queues shared across the kernel and
# userspace to relay events about incoming and outgoing packets.
# This setting defines the number of entries in these metadata
1 change: 1 addition & 0 deletions src/app/firedancer/main.c
@@ -74,6 +74,7 @@ configure_stage_t * STAGES[] = {
&fd_cfg_stage_ethtool_loopback,
&fd_cfg_stage_snapshots,
&fd_cfg_stage_accdb,
&fd_cfg_stage_sysfs_poll,
NULL,
};

1 change: 1 addition & 0 deletions src/app/shared/Local.mk
@@ -31,6 +31,7 @@ $(call add-objs,commands/configure/fd_ethtool_ioctl,fdctl_shared)
$(call add-objs,commands/configure/hugetlbfs,fdctl_shared)
$(call add-objs,commands/configure/hyperthreads,fdctl_shared)
$(call add-objs,commands/configure/sysctl,fdctl_shared)
$(call add-objs,commands/configure/sysfs-poll,fdctl_shared)
$(call add-objs,commands/configure/snapshots,fdctl_shared)
$(call add-objs,commands/monitor/monitor commands/monitor/helper,fdctl_shared)
$(call add-objs,commands/watch/watch,fdctl_shared)
1 change: 1 addition & 0 deletions src/app/shared/commands/configure/configure.h
@@ -84,6 +84,7 @@ extern configure_stage_t fd_cfg_stage_bonding;
extern configure_stage_t fd_cfg_stage_ethtool_channels;
extern configure_stage_t fd_cfg_stage_ethtool_offloads;
extern configure_stage_t fd_cfg_stage_ethtool_loopback;
extern configure_stage_t fd_cfg_stage_sysfs_poll;
extern configure_stage_t fd_cfg_stage_snapshots;

extern configure_stage_t * STAGES[];
83 changes: 83 additions & 0 deletions src/app/shared/commands/configure/sysfs-poll.c
@@ -0,0 +1,83 @@
/* This stage configures the OS to support effective preferred busy
polling, allowing for significantly improved network stack (XDP)
performance if enabled. */

#include "configure.h"

#define NAME "sysfs-poll"

#include "../../../platform/fd_file_util.h"

#include <errno.h>
#include <stdio.h>
#include <unistd.h> /* access */
#include <linux/capability.h>

#define VERY_HIGH_VAL 1000000U

static char const setting_napi_defer_hard_irqs[] = "napi_defer_hard_irqs";

static char const setting_gro_flush_timeout[] = "gro_flush_timeout";

static int
enabled( config_t const * config ) {
Copilot AI Feb 24, 2026

Potential null pointer dereference: strcmp is called on config->net.xdp.poll_mode without checking if it's NULL first. If poll_mode is not set in the configuration, this could cause a crash. Consider adding a null check before the strcmp call.

Suggested change
enabled( config_t const * config ) {
enabled( config_t const * config ) {
if( !config->net.xdp.poll_mode )
return 0;
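
One caveat on the suggestion above: elsewhere in this diff, `fd_config.h` declares `poll_mode` as a fixed-size array (`char poll_mode[ 16 ]`), not a pointer. If that declaration is the one in effect here, `config->net.xdp.poll_mode` can never be NULL and the proposed guard would never fire. An unset field would be an empty string, which `strcmp` handles safely. The sketch below illustrates this with a simplified stand-in struct. `xdp_cfg_sketch` and `sketch_poll_stage_enabled` are hypothetical names, not part of the PR.

```c
#include <string.h>

/* Simplified stand-in for the relevant part of the config struct.
   The real struct in fd_config.h has many more fields. */
struct xdp_cfg_sketch {
  char poll_mode[ 16 ]; /* "prefbusy" or "softirq" */
};

/* Returns 1 only when poll_mode is exactly "prefbusy". A zeroed
   (unset) field is the empty string, so strcmp is safe on it and
   the stage is simply reported as disabled. */
int
sketch_poll_stage_enabled( struct xdp_cfg_sketch const * cfg ) {
  return !strcmp( cfg->poll_mode, "prefbusy" );
}
```

If an unset or unknown mode should be rejected rather than silently treated as disabled, that check likely belongs in config validation rather than in `enabled`.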

return !strcmp( config->net.xdp.poll_mode, "prefbusy" );
}

static void
init_perm ( fd_cap_chk_t * chk,
config_t const * config FD_PARAM_UNUSED ) {
fd_cap_chk_cap( chk, NAME, CAP_NET_ADMIN, "configure preferred busy polling via `/sys/class/net/*/{napi_defer_hard_irqs, gro_flush_timeout}`" );
}

static void
sysfs_net_set( char const * device,
char const * setting,
ulong value ) {
char path[ PATH_MAX ];
fd_cstr_printf_check( path, PATH_MAX, NULL, "/sys/class/net/%s/%s", device, setting );
FD_LOG_NOTICE(( "RUN: `echo \"%lu\" > %s`", value, path ));
fd_file_util_write_uint( path, (uint)value );
}

static void
init( config_t const * config ) {
sysfs_net_set( config->net.interface, setting_napi_defer_hard_irqs, VERY_HIGH_VAL );
sysfs_net_set( config->net.interface, setting_gro_flush_timeout, config->net.xdp.gro_flush_timeout_nanos );
}

static int
fini( config_t const * config,
int pre_init FD_PARAM_UNUSED ) {
sysfs_net_set( config->net.interface, setting_napi_defer_hard_irqs, 0U );
sysfs_net_set( config->net.interface, setting_gro_flush_timeout, 0U );
return 1;
}

static configure_result_t
check( config_t const * config,
int check_type FD_PARAM_UNUSED ) {
char path[ PATH_MAX ];
uint value;
fd_cstr_printf_check( path, PATH_MAX, NULL, "/sys/class/net/%s/%s", config->net.interface, setting_napi_defer_hard_irqs );
if( fd_file_util_read_uint( path, &value ) || value < VERY_HIGH_VAL ) {
NOT_CONFIGURED("Setting napi_defer_hard_irqs failed.");
}

fd_cstr_printf_check( path, PATH_MAX, NULL, "/sys/class/net/%s/%s", config->net.interface, setting_gro_flush_timeout );
if( fd_file_util_read_uint( path, &value ) || value != config->net.xdp.gro_flush_timeout_nanos ) {
NOT_CONFIGURED("Setting gro_flush_timeout failed.");
}

CONFIGURE_OK();
}

configure_stage_t fd_cfg_stage_sysfs_poll = {
.name = NAME,
.enabled = enabled,
.init_perm = init_perm,
.fini_perm = init_perm,
.init = init,
.fini = fini,
.check = check,
};
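
The write-then-verify flow of this stage can be sketched with plain C stdio. This is a minimal illustration under stated assumptions, not the PR's implementation: `sketch_write_uint` and `sketch_read_uint` are hypothetical stand-ins for `fd_file_util_write_uint` and `fd_file_util_read_uint`, whose exact behavior is not shown in this diff, and `sketch_apply_and_check` is a made-up name for the combined init and check steps.

```c
#include <stdio.h>

/* Hypothetical stand-in for fd_file_util_write_uint: writes value as
   decimal text to path. Returns 0 on success, -1 on failure. */
int
sketch_write_uint( char const * path, unsigned int value ) {
  FILE * f = fopen( path, "w" );
  if( !f ) return -1;
  int ok = fprintf( f, "%u\n", value ) > 0;
  if( fclose( f ) ) ok = 0; /* a failed flush also counts as failure */
  return ok ? 0 : -1;
}

/* Hypothetical stand-in for fd_file_util_read_uint: parses a decimal
   unsigned integer from path. Returns 0 on success, -1 on failure. */
int
sketch_read_uint( char const * path, unsigned int * out ) {
  FILE * f = fopen( path, "r" );
  if( !f ) return -1;
  int ok = fscanf( f, "%u", out ) == 1;
  fclose( f );
  return ok ? 0 : -1;
}

/* Mirrors the init-then-check flow for one setting: write the wanted
   value, read it back, and report whether it took effect. Returns 1
   when the value read back matches, else 0. */
int
sketch_apply_and_check( char const * path, unsigned int want ) {
  unsigned int got = 0U;
  if( sketch_write_uint( path, want ) ) return 0;
  if( sketch_read_uint ( path, &got ) ) return 0;
  return got == want;
}
```

On a real system the path would be something like `/sys/class/net/<device>/gro_flush_timeout` and the write would need `CAP_NET_ADMIN`. Note that the stage's own check for `napi_defer_hard_irqs` accepts any value at or above `VERY_HIGH_VAL`, whereas this sketch uses strict equality, as the `gro_flush_timeout` check does.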
6 changes: 6 additions & 0 deletions src/app/shared/fd_config.h
@@ -205,6 +205,12 @@ struct fd_config_net {
struct {
char xdp_mode[ 8 ];
int xdp_zero_copy;

Copilot AI Feb 24, 2026

Trailing whitespace: Line 204 has trailing whitespace at the end. This should be removed to maintain code cleanliness.

Suggested change

char poll_mode[ 16 ]; /* "prefbusy" or "softirq" */
uint busy_poll_usecs;
ulong gro_flush_timeout_nanos;
uint lwr_prefbusy_poll_timeout_micros;
uint upr_prefbusy_poll_timeout_micros;

uint xdp_rx_queue_size;
uint xdp_tx_queue_size;
5 changes: 5 additions & 0 deletions src/app/shared/fd_config_parse.c
@@ -184,6 +184,11 @@ fd_config_extract_pod( uchar * pod,
CFG_POP ( uint, net.ingress_buffer_size );
CFG_POP ( cstr, net.xdp.xdp_mode );
CFG_POP ( bool, net.xdp.xdp_zero_copy );
CFG_POP ( cstr, net.xdp.poll_mode );
CFG_POP ( uint, net.xdp.busy_poll_usecs );
CFG_POP ( ulong, net.xdp.gro_flush_timeout_nanos );
CFG_POP ( uint, net.xdp.lwr_prefbusy_poll_timeout_micros );
CFG_POP ( uint, net.xdp.upr_prefbusy_poll_timeout_micros );
Comment on lines +187 to +191
Copilot AI Feb 24, 2026

New XDP polling config fields are extracted here, but fd_config_validate() currently doesn't validate net.xdp.poll_mode (enumeration) or the new timeout ranges. This can lead to surprising runtime behavior (e.g. silent fallback) or invalid sysfs/socket settings. Consider adding explicit validation for allowed poll_mode values ("prefbusy"/"softirq") and sane bounds for the new numeric fields.
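
The kind of validation requested above might look like the following sketch. It is not taken from `fd_config_validate()`, whose interface is not shown in this diff: the function name `sketch_validate_poll_cfg`, its signature, and the error strings are hypothetical. It also assumes the lower timeout must be strictly below the upper timeout, as suggested in the review comment on `fd_net_tile_topo.c`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical validator, not part of the PR. Returns NULL when the
   settings are acceptable, else a static string describing the first
   problem found. */
char const *
sketch_validate_poll_cfg( char const * poll_mode,
                          unsigned int lwr_micros,
                          unsigned int upr_micros ) {
  /* poll_mode is an enumeration with exactly two allowed values. */
  if( strcmp( poll_mode, "prefbusy" ) && strcmp( poll_mode, "softirq" ) )
    return "net.xdp.poll_mode must be \"prefbusy\" or \"softirq\"";

  /* The minimum poll interval must be strictly below the maximum. */
  if( lwr_micros >= upr_micros )
    return "net.xdp.lwr_prefbusy_poll_timeout_micros must be less than "
           "net.xdp.upr_prefbusy_poll_timeout_micros";

  return NULL;
}
```

Rejecting a bad value at validation time, rather than adjusting it silently, keeps the failure visible to the operator. Whether the timeout check should apply in `softirq` mode as well is a design choice this sketch does not settle.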

CFG_POP ( uint, net.xdp.xdp_rx_queue_size );
CFG_POP ( uint, net.xdp.xdp_tx_queue_size );
CFG_POP ( uint, net.xdp.flush_timeout_micros );
1 change: 1 addition & 0 deletions src/app/shared_dev/commands/pktgen/pktgen.c
@@ -208,6 +208,7 @@ pktgen_cmd_fn( args_t * args FD_PARAM_UNUSED,
configure_stage( &fd_cfg_stage_bonding, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_ethtool_channels, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_ethtool_offloads, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_sysfs_poll, CONFIGURE_CMD_INIT, config );

fdctl_check_configure( config );
/* FIXME this allocates lots of memory unnecessarily */
1 change: 1 addition & 0 deletions src/app/shared_dev/commands/udpecho/udpecho.c
@@ -100,6 +100,7 @@ udpecho_cmd_fn( args_t * args,
configure_stage( &fd_cfg_stage_ethtool_channels, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_ethtool_offloads, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_ethtool_loopback, CONFIGURE_CMD_INIT, config );
configure_stage( &fd_cfg_stage_sysfs_poll, CONFIGURE_CMD_INIT, config );

fdctl_check_configure( config );
/* FIXME this allocates lots of memory unnecessarily */
8 changes: 8 additions & 0 deletions src/disco/net/fd_net_tile_topo.c
@@ -44,6 +44,14 @@ setup_xdp_tile( fd_topo_t * topo,
tile->xdp.zero_copy = net_cfg->xdp.xdp_zero_copy;
fd_cstr_ncpy( tile->xdp.xdp_mode, net_cfg->xdp.xdp_mode, sizeof(tile->xdp.xdp_mode) );

fd_cstr_ncpy( tile->xdp.poll_mode, net_cfg->xdp.poll_mode, sizeof(tile->xdp.poll_mode) );

tile->xdp.busy_poll_usecs = net_cfg->xdp.busy_poll_usecs;
tile->xdp.gro_flush_timeout_nanos = net_cfg->xdp.gro_flush_timeout_nanos;

tile->xdp.lwr_prefbusy_poll_timeout_ns = (long)net_cfg->xdp.lwr_prefbusy_poll_timeout_micros * 1000L;
tile->xdp.upr_prefbusy_poll_timeout_ns = (long)net_cfg->xdp.upr_prefbusy_poll_timeout_micros * 1000L;
Copilot AI Feb 24, 2026

Missing validation: There's no check to ensure that lwr_prefbusy_poll_timeout_micros is less than upr_prefbusy_poll_timeout_micros. If the lower timeout is greater than or equal to the upper timeout, the logic in net_prefbusy_poll_ready (lines 1156-1157 in fd_xdp_tile.c) may not work as intended. Consider adding validation to ensure lwr < upr.

Suggested change
tile->xdp.upr_prefbusy_poll_timeout_ns = (long)net_cfg->xdp.upr_prefbusy_poll_timeout_micros * 1000L;
tile->xdp.upr_prefbusy_poll_timeout_ns = (long)net_cfg->xdp.upr_prefbusy_poll_timeout_micros * 1000L;
if( tile->xdp.lwr_prefbusy_poll_timeout_ns >= tile->xdp.upr_prefbusy_poll_timeout_ns ) {
tile->xdp.upr_prefbusy_poll_timeout_ns = tile->xdp.lwr_prefbusy_poll_timeout_ns + 1000L;
}


tile->xdp.net.umem_dcache_obj_id = umem_obj->id;
tile->xdp.netdev_dbl_buf_obj_id = netlink_tile->netlink.netdev_dbl_buf_obj_id;
tile->xdp.fib4_main_obj_id = netlink_tile->netlink.fib4_main_obj_id;