Description
The MT7925 driver experiences MCU command timeouts and firmware hangs during WiFi 7 MLO (Multi-Link Operation) roaming attempts. The firmware becomes unresponsive and requires a reset to recover.
This issue is partially mitigated by patches I recently submitted (PRs #1029, #1030, #1031, #1032, #1033, #1034), which prevent the kernel panics and deadlocks, but the underlying firmware issue remains.
Hardware
- WiFi Card: MediaTek MT7925 (RZ717)
- System: Framework Desktop (AMD Ryzen AI Max 300)
- Firmware: WM Build 20250721232943
Kernel Version
6.18.2 with mt7925 patches applied
Steps to Reproduce
- Connect to WiFi 7 AP with MLO enabled (2.4GHz + 5GHz + 6GHz)
- Establish MLO connection on multiple links
- Trigger roaming (signal degradation, wpa_cli roam, or natural roaming)
- Observe firmware timeout and recovery
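For a repeatable trigger, the `wpa_cli roam` step can be scripted. The following is a minimal sketch, not part of the original report: the helper names `build_roam_cmd` and `trigger_roam` are hypothetical, and the interface name and BSSID in the comment are taken from the log below. It assumes `wpa_cli` is on the PATH and that the caller has permission to talk to the wpa_supplicant control socket.

```python
import subprocess


def build_roam_cmd(iface, bssid):
    """Build the argument list for `wpa_cli -i <iface> roam <bssid>`."""
    return ["wpa_cli", "-i", iface, "roam", bssid]


def trigger_roam(iface, bssid, timeout=10):
    """Ask wpa_supplicant to roam to `bssid` and return wpa_cli's reply text."""
    result = subprocess.run(
        build_roam_cmd(iface, bssid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


# Example call (requires a running wpa_supplicant on this interface):
#   trigger_roam("wlp192s0", "d8:b3:70:f8:89:4f")
```

Running this in a loop between two APs while watching the kernel log should make the timeout easier to hit than waiting for natural roaming.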
Log Output
Jan 01 00:51:21 kernel: wlp192s0: disconnect from AP d8:b3:70:f8:9e:7b for new auth to d8:b3:70:f8:89:4f
Jan 01 00:51:21 kernel: wlp192s0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-22)
Jan 01 00:51:21 kernel: wlp192s0: failed to remove key (4, ff:ff:ff:ff:ff:ff) from hardware (-22)
Jan 01 00:51:22 kernel: wlp192s0: authenticated
Jan 01 00:51:22 wpa_supplicant: nl80211: kernel reports: Error fetching BSS for link
Jan 01 00:51:22 wpa_supplicant: wlp192s0: SME: Association request to the driver failed
Jan 01 00:51:25 kernel: wlp192s0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 1000 TU
Jan 01 00:51:26 kernel: wlp192s0: association with d8:b3:70:f8:9e:7b timed out
Jan 01 00:51:29 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 12) timeout
Jan 01 00:51:35 kernel: mt7925e 0000:c0:00.0: Message 00020003 (seq 13) timeout
Jan 01 00:51:38 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 14) timeout
Jan 01 00:51:41 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 15) timeout
Jan 01 00:51:44 kernel: mt7925e 0000:c0:00.0: Message 00020001 (seq 1) timeout
Jan 01 00:51:44 kernel: mt7925e 0000:c0:00.0: HW/SW Version: 0x8a108a10, Build Time: 20250721232852a
Jan 01 00:51:44 kernel: mt7925e 0000:c0:00.0: WM Firmware Version: ____000000, Build Time: 20250721232943
wpa_supplicant also reports:
nl80211: kernel reports: link ID must for MLO group key
nl80211: kernel reports: link ID must for MLO group key
nl80211: kernel reports: link ID must for MLO group key
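To compare runs, it helps to pull the MCU timeout lines out of a longer journal capture. The sketch below is an illustration only: the regular expression is written against the exact line format shown above, `parse_timeouts` is a hypothetical helper, and the placeholder year exists only because the short journal timestamp omits one.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Jan 01 00:51:29 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 12) timeout
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) kernel: mt7925e \S+ "
    r"Message (?P<cmd>[0-9a-fA-F]{8}) \(seq (?P<seq>\d+)\) timeout"
)


def parse_timeouts(lines, year=2000):
    """Return (timestamp, command word, sequence number) per MCU timeout line."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            # The log timestamp has no year, so a placeholder year is supplied.
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, int(m["cmd"], 16), int(m["seq"])))
    return events


SAMPLE = [
    "Jan 01 00:51:26 kernel: wlp192s0: association with d8:b3:70:f8:9e:7b timed out",
    "Jan 01 00:51:29 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 12) timeout",
    "Jan 01 00:51:35 kernel: mt7925e 0000:c0:00.0: Message 00020003 (seq 13) timeout",
    "Jan 01 00:51:38 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 14) timeout",
    "Jan 01 00:51:41 kernel: mt7925e 0000:c0:00.0: Message 00020002 (seq 15) timeout",
    "Jan 01 00:51:44 kernel: mt7925e 0000:c0:00.0: Message 00020001 (seq 1) timeout",
]

events = parse_timeouts(SAMPLE)
span = (events[-1][0] - events[0][0]).total_seconds()
print(f"{len(events)} MCU timeouts over {span:.0f} s")
```

On the excerpt above this finds five timeouts between 00:51:29 and 00:51:44. The sequence number restarting at 1 on the last line lines up with the firmware reload logged at the same second.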
Analysis
The failure cascade involves three issues:
1. MLO Group Key Removal Returns -EINVAL
When removing broadcast keys during roaming, mt7925_set_key() returns -EINVAL (-22). The wpa_supplicant message "link ID must for MLO group key" suggests the key removal path doesn't properly handle MLO link IDs for group keys.
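The `-22` in the kernel log can be checked against the errno table rather than taken on trust. This is a small sketch with a hypothetical helper, `decode_key_failure`, that extracts the key index, address, and error name from a "failed to remove key" line. It assumes the line format shown in the log above and the platform's standard errno numbering.

```python
import errno
import re

KEY_FAIL_RE = re.compile(
    r"failed to remove key \((?P<idx>\d+), (?P<addr>[0-9a-f:]{17})\) "
    r"from hardware \((?P<err>-?\d+)\)"
)


def decode_key_failure(line):
    """Return (key index, address, errno name) or None if the line does not match."""
    m = KEY_FAIL_RE.search(line)
    if m is None:
        return None
    code = abs(int(m["err"]))  # the kernel logs the negative errno value
    return int(m["idx"]), m["addr"], errno.errorcode.get(code, f"errno {code}")


line = (
    "Jan 01 00:51:21 kernel: wlp192s0: failed to remove key "
    "(1, ff:ff:ff:ff:ff:ff) from hardware (-22)"
)
print(decode_key_failure(line))
```

Both failing removals in the log use the broadcast address with key indices 1 and 4, which is consistent with the group-key explanation above.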
2. BSS Info Fetch Failure
The driver fails to provide BSS information for the target link during association, so nl80211 reports "Error fetching BSS for link" and the association request fails. This may be a race condition or missing BSS caching in the MLO paths.
3. Firmware Becomes Unresponsive
After the failed roaming sequence, firmware stops responding to MCU commands:
- 00020002 - Likely MCU_UNI_CMD_BSS_INFO or MCU_UNI_CMD_DEV_INFO
- 00020003 - Related BSS/device command
- 00020001 - MCU_UNI_CMD_STAREC
The driver correctly detects the timeout and triggers firmware reload, which succeeds. Without my patches, this would cause a kernel panic or system deadlock.
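The command words in the timeout messages can be split into fields to narrow down which commands are involved. The sketch below assumes a bit layout recalled from the mt76 connac MCU header (low byte as command ID, bits 8 to 15 as extended ID, bit 16 as a query flag, bit 17 as the unified-command flag). Treat that layout as an assumption and confirm it, along with the ID-to-name mapping, against `mt76_connac_mcu.h` in the kernel tree being tested. `decode_cmd` is a hypothetical helper.

```python
# Assumed field layout of the logged command word (verify against the driver).
ID_MASK = 0x000000FF      # command ID
EXT_ID_MASK = 0x0000FF00  # extended command ID
QUERY_FLAG = 1 << 16      # query vs. set
UNI_FLAG = 1 << 17        # "unified" (UNI) command


def decode_cmd(word):
    """Split a logged command word into its assumed fields."""
    return {
        "uni": bool(word & UNI_FLAG),
        "query": bool(word & QUERY_FLAG),
        "ext_id": (word & EXT_ID_MASK) >> 8,
        "id": word & ID_MASK,
    }


for word in (0x00020001, 0x00020002, 0x00020003):
    print(f"{word:08x} -> {decode_cmd(word)}")
```

Under this layout all three words are unified commands with IDs 0x01, 0x02, and 0x03, so looking those IDs up in the driver's UNI command enum should settle the names listed above.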
Impact
- With patches: ~30 second WiFi outage, automatic recovery
- Without patches: Kernel panic or system deadlock requiring hard reboot
Suggested Investigation Areas
- mt7925_set_key() - MLO group key handling for broadcast addresses
- BSS info caching - Ensure BSS info available during MLO roaming
- MCU command handling - What causes firmware to become unresponsive after failed roaming?
- Firmware - MediaTek firmware team should investigate the state machine
Related
- Framework Community: https://community.frame.work/t/issues-with-mediatek-mt7925-rz717-wi-fi-card/75815
- LKML patches: https://lore.kernel.org/linux-wireless/?q=mt7925+zbowling
- My test repo: https://github.com/zbowling/mt7925