-
Notifications
You must be signed in to change notification settings - Fork 2.1k
sys/net/link_layer/ieee802154/submac: move ack transmission #21973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
mguetschow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
Using the radio without The test mimics the |
|
For my use-case it makes more sense to use SubMAC directly rather than hacking 802.15.4 features into the netdev. I am told, that netdev should abstract various media types into a uniform API, therefore adding 802.15.4 specific stuff is questionable. This also allows for less overhead (netdev will typically run in a separate thread). |
|
I get the point that
This is still an assumption to me. In my opinion it should be coded in a way that a frame is read at only one place (submac then and no longer netdev_ieee802154_submac). Because the |
Maybe @jia200x as the original author of the radio HAL has some comments on this? In any case I would expect documentation in https://doc.riot-os.org/structieee802154__radio__ops.html#a99394547c65c7872bff808bba1d69c46 to establish a clear API contract of when calling this function returns correct results. Edit: it actually states
which probably translates to "the data could be overwritten" |
|
Alright. I just implemented an |
Many of us require to work with the radios directly without In cases where a component is certain that it's interacting with an IEEE 802.15.4 radio (such as with Zigbee, openDSME, OpenWSN, etc.), the This challenge has existed for quite some time, at least since the first attempts to port OpenWSN, and the rework was designed to allow us to use netdev as a "network interface" and not as a "radio interface." For more context, check out this year's RIOT Summit presentation on the Radio HAL and SubMAC
I agree that the test could be improved, but it was intended as an entry point for testing the Radio HAL. We've encountered cases where broken network tests were actually due to incomplete or inconsistent Radio HAL implementations. Testing the Radio HAL using the netdev API doesn't allow us to test key aspects like compliance with the Abstract State Machine, timing issues, and so on. In general, I’d suggest we avoid adding more workarounds on top of netdev. It has taken significant effort to create a more consistent experience across radios, and this has required careful attention to the layering. Continuing to refine this will help maintain long-term stability and flexibility. |
Based on previous experience working with other radios, it’s difficult to guarantee that this will work across all radios. In many cases, modifying states inside the read function may violate the assumptions of the Abstract State Machine. For this issue, I see two potential solutions: First solution (likely simpler): Allow peeking into the framebuffer by modifying the Radio HAL API. This would allow the upper layer (e.g., SubMAC) to access the SQN or any other frame section. We could make this a requirement for radios that don’t support AutoACK, such as the AT86RF2xx in Basic Mode or NRF52, which usually offer a mechanism to peek into the framebuffer. These radios typically expect the driver to handle the ACK logic. Second solution (a bit more controversial): Modify the drivers or SubMAC to handle the framebuffer directly, potentially changing the receive logic from a "pull" model (calling a read function) to a "push" model (subscribing to a framebuffer). This would decouple the state-changing logic from the read function and allow for better control. |
| (submac->rx_buf[0] & IEEE802154_FCF_ACK_REQ) && | ||
| (ieee802154_radio_get_frame_filter_mode(dev, &mode) < 0 || | ||
| mode == IEEE802154_FILTER_ACCEPT)) { | ||
| /* An ACK is sent synchronously to prevent that the upper layer (IPv6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, the SubMAC is considered asynchronous, and several of its components may yield before completing. Therefore, I don't think it is necessary to send the ACK synchronously. As @fabian18 noted, doing so can sometimes result in unexpected behaviors, such as assertions or race conditions.
The fact that the radio may be busy is not an issue, since the network interface uses gnrc_netif_pktq, which stores packets until the MAC finishes.
I suggest handling this in a structured way by adding an additional state to the SubMAC state machine, rather than introducing a sub-state within an existing state.
PS: Getting #21578 first could likely simplify such logic even more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now added a state for transmitting ACKs, which is entered on IEEE802154_FSM_EV_RX_DONE after the ACK is transmitted via _handle_fsm_ev_request_tx and submac->cb->rx_done was called.
The state will process IEEE802154_FSM_EV_TX_DONE events just like IEEE802154_FSM_STATE_TX, else it will return IEEE802154_FSM_STATE_INVALID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sending ACK asynchronously through ieee802154_submac_bh_request does not work.
2026-01-24 00:20:21,913 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,918 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,923 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,929 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,933 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,938 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
.... forever ....
needs investigation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The synchronous ACK I did because, as I wrote in a comment, icmpv6 sent the next ping before the ACK, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could work but needs further testing.
fabian.huessler@ml-pa.loc@MLPA-NB119:~/rdpd-dft-rtos/mlpa-rtd-sdk/ext/RIOT$ git diff
diff --git a/sys/net/link_layer/ieee802154/submac.c b/sys/net/link_layer/ieee802154/submac.c
index d17fb9bbea..4d77751362 100644
--- a/sys/net/link_layer/ieee802154/submac.c
+++ b/sys/net/link_layer/ieee802154/submac.c
@@ -25,7 +25,7 @@
#include "kernel_defines.h"
#include "errno.h"
-#define ENABLE_DEBUG 0
+#define ENABLE_DEBUG 1
#include "debug.h"
#define CSMA_SENDER_BACKOFF_PERIOD_UNIT_US (320U)
@@ -102,6 +102,10 @@ static ieee802154_fsm_state_t _tx_end(ieee802154_submac_t *submac, int status,
if (submac->fsm_state != IEEE802154_FSM_STATE_TX_ACK) {
submac->cb->tx_done(submac, status, info);
}
+ else {
+ DEBUG("IEEE802154 submac: ACK transmission done\n");
+ submac->cb->rx_done(submac);
+ }
return IEEE802154_FSM_STATE_IDLE;
}
@@ -195,7 +199,7 @@ static ieee802154_fsm_state_t _fsm_state_rx(ieee802154_submac_t *submac, ieee802
case IEEE802154_FSM_EV_RX_DONE:
while (ieee802154_radio_set_idle(dev, false) < 0) {}
submac->rx_len = ieee802154_radio_len(dev);
- assert(submac->rx_len < IEEE802154_FRAME_LEN_MAX);
+ assert(submac->rx_len <= IEEE802154_FRAME_LEN_MAX);
res = ieee802154_radio_read(dev, submac->rx_buf, submac->rx_len, &submac->rx_info);
assert(res == (int)submac->rx_len);
/* Make sure it's not an ACK frame */
@@ -205,15 +209,14 @@ static ieee802154_fsm_state_t _fsm_state_rx(ieee802154_submac_t *submac, ieee802
ieee802154_filter_mode_t mode;
if ((submac->rx_buf[0] & IEEE802154_FCF_TYPE_MASK) == IEEE802154_FCF_TYPE_DATA &&
(submac->rx_buf[0] & IEEE802154_FCF_ACK_REQ) &&
- (ieee802154_radio_get_frame_filter_mode(dev, &mode) < 0 ||
- mode == IEEE802154_FILTER_ACCEPT)) {
- res = _handle_fsm_ev_tx_ack(submac, ieee802154_get_seq(submac->rx_buf));
- submac->cb->rx_done(submac);
- if (res < 0) {
- DEBUG("IEEE802154 submac: Sending ACK failed with status: %d\n", res);
- return IEEE802154_FSM_STATE_IDLE;
- }
+ (ieee802154_radio_get_frame_filter_mode(dev, &mode) < 0 || mode == IEEE802154_FILTER_ACCEPT)) {
+ if ((res = _handle_fsm_ev_tx_ack(submac, ieee802154_get_seq(submac->rx_buf))) < 0) {
+ DEBUG("IEEE802154 submac: Sending ACK failed with status: %d\n", res);
+ }
+ else {
+ /* Do not call rx_done yet. ACK must be sent first */
return IEEE802154_FSM_STATE_TX_ACK;
+ }
}
}
submac->cb->rx_done(submac);
@@ -291,6 +294,7 @@ static ieee802154_fsm_state_t _fsm_state_prepare(ieee802154_submac_t *submac,
{
ieee802154_dev_t *dev = &submac->dev;
+ ieee802154_fsm_state_t tx_state = IEEE802154_FSM_STATE_INVALID;
switch (ev) {
case IEEE802154_FSM_EV_BH:
if (ftype == IEEE802154_FCF_TYPE_DATA
@@ -305,14 +309,16 @@ static ieee802154_fsm_state_t _fsm_state_prepare(ieee802154_submac_t *submac,
if (curr_be < submac->be.max) {
submac->backoff_mask = (submac->backoff_mask << 1) | 1;
}
+ tx_state = IEEE802154_FSM_STATE_TX;
}
else if (ftype == IEEE802154_FCF_TYPE_ACK) {
/* no backoff for ACK frames but wait for SIFSPeriod */
ztimer_sleep(ZTIMER_USEC, submac->sifs_period_us);
+ tx_state = IEEE802154_FSM_STATE_TX_ACK;
}
while (ieee802154_radio_request_transmit(dev) == -EBUSY) {}
- return IEEE802154_FSM_STATE_TX;
+ return tx_state;
case IEEE802154_FSM_EV_RX_DONE:
case IEEE802154_FSM_EV_CRC_ERROR:
/* This might happen in case there's a race condition between ACK_TIMEOUT
@@ -448,22 +454,14 @@ static ieee802154_fsm_state_t _fsm_state_wait_for_ack(ieee802154_submac_t *subma
static ieee802154_fsm_state_t _fsm_state_tx_ack(ieee802154_submac_t *submac,
ieee802154_fsm_ev_t ev)
{
- ieee802154_tx_info_t info;
- int res;
-
- /* This is required to prevent unused variable warnings */
- (void) res;
-
switch (ev) {
+ case IEEE802154_FSM_EV_BH:
+ return _fsm_state_prepare(submac, ev, IEEE802154_FCF_TYPE_ACK);
case IEEE802154_FSM_EV_TX_DONE:
- if ((res = ieee802154_radio_confirm_transmit(&submac->dev, &info)) >= 0) {
- return _fsm_state_tx_process_tx_done(submac, &info);
- }
- break;
+ return _fsm_state_tx(submac, ev);
default:
break;
}
-
return IEEE802154_FSM_STATE_INVALID;
}This is transmitting the asynchronously. The key is to not call rx_done before the ACK is sent.
States are TX_ACK -- (BH) --> TX_ACK -- (TX_DONE) --> rx_done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it is necessary to call rx_done as late as when the ack transmission is done.
If you were to set the fsm_state to tx_ack before calling rx_done in the else case,
icmpv6 wouldn't be able to transmit the next packets because the send function would return busy since
the submac isn't in rx or idle state but this is only my assumption.
The benefit would be that you could read the rx buffer while the radio
transmits the ack
| * @pre this function MUST be called either inside @ref ieee802154_submac_cb_t::rx_done | ||
| * or in SLEEP state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you can just remove this precondition? What is the initial value of it before receiving any frame? Isn't the buffer cleared at any point in time? (Should it maybe?)
Also, does it make sense to retrieve a frame length in any other state than RX and SLEEP?
(same below, obviously)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preconditions were present because these functions accessed the radio buffer directly. Now that we have a separate rx-buffer we can access it anytime. The initial values are set to zero at ieee802154_submac_init(). The buffer only gets overwritten, as soon as a new frame gets received.
We could restrict access in certain states and clear it after the first read, but this would create unnecessary overhead in my opinion.
|
I've added an if-statement to not call |
| info->rssi = submac->rx_info.rssi; | ||
| info->lqi = submac->rx_info.lqi; | ||
| } | ||
| memcpy(buf, submac->rx_buf, submac->rx_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a caller passes buf = NULL, this will likely result in a segmentation fault
| * This functions reads the received PSDU from the device (excluding FCS) | ||
| * | ||
| * @pre this function MUST be called either inside @ref ieee802154_submac_cb_t::rx_done | ||
| * or in SLEEP state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to remove the precondition here, because the
submac isn't thread-safe. Removing the precondition could lead to reading the
frame while the RX interrupt is simultaneously overwriting the buffer, which
creates a race condition
| DEBUG("IEEE802154 submac: Sending ACK failed with status: %d\n", res); | ||
| return IEEE802154_FSM_STATE_IDLE; | ||
| } | ||
| return IEEE802154_FSM_STATE_TX_ACK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right state transition.
Since you prepared the ACK frame in _handle_fsm_ev_tx_ack() and requested
the bh process to transmit it, I believe the state machine should
first transition to the PREPARE state.
If you go directly to TX_ACK state, the frame won't be transmitted because
the TX_ACK state doesn't handle the EV_BH event bc it ignores it.
This means the prepared ACK would never actually be sent
| case IEEE802154_FSM_EV_RX_DONE: | ||
| while (ieee802154_radio_set_idle(dev, false) < 0) {} | ||
| submac->rx_len = ieee802154_radio_len(dev); | ||
| assert(submac->rx_len < IEEE802154_FRAME_LEN_MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<= 🤔
| (submac->rx_buf[0] & IEEE802154_FCF_ACK_REQ) && | ||
| (ieee802154_radio_get_frame_filter_mode(dev, &mode) < 0 || | ||
| mode == IEEE802154_FILTER_ACCEPT)) { | ||
| /* An ACK is sent synchronously to prevent that the upper layer (IPv6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sending ACK asynchronously through ieee802154_submac_bh_request does not work.
2026-01-24 00:20:21,913 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,918 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,923 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,929 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,933 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
2026-01-24 00:20:21,938 # IEEE802154 submac: ieee802154_send(): Sending aborted, current state is TX_ACK
.... forever ....
needs investigation
| (submac->rx_buf[0] & IEEE802154_FCF_ACK_REQ) && | ||
| (ieee802154_radio_get_frame_filter_mode(dev, &mode) < 0 || | ||
| mode == IEEE802154_FILTER_ACCEPT)) { | ||
| /* An ACK is sent synchronously to prevent that the upper layer (IPv6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The synchronous ACK I did because, as I wrote in a comment, icmpv6 sent the next ping before the ACK, I believe.
Contribution description
The transmission of IEEE802.15.4 Acknowledgements should be handled by the SubMAC-Layer itself, rather than the SubMAC-netdev. This will allow applications using SubMAC directly, to reduce redundant code.
Testing procedure
Flash
tests/net/ieee802154_submaconto two devices which do not contain theIEEE802154_CAP_AUTO_ACKcapability.Send a message using
txtsndfrom one device to the other. Notice, thatNo ACKwill appear in the terminal, because SubMAC does not send Acks.Issues/PRs references
#13376
#14950