
fix(runtime/smol): eliminate lost-wakeup race and mutex-poison panic #790

Closed
nightness wants to merge 11 commits into webrtc-rs:master from Brainwires:fix/smol-runtime-correctness

Conversation

@nightness

Summary

Two correctness bugs in the smol runtime backend:

1. SmolJoinHandle — mutex poison panic

detach() and abort() called .lock().unwrap() on a std::sync::Mutex. If another thread had panicked while holding the lock, the unwrap() would itself panic on the poison error; in a cleanup path that runs during unwinding, that second panic aborts the process. Fixed with .lock().unwrap_or_else(|e| e.into_inner()), which recovers the guard even from a poisoned mutex.
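A minimal sketch of the recovery pattern (the `Inner` struct and its field are illustrative stand-ins, not the actual SmolJoinHandle internals):

```rust
use std::sync::Mutex;

struct Inner {
    task: Option<String>, // stand-in for the stored smol task handle
}

fn detach(inner: &Mutex<Inner>) {
    // `lock()` returns Err only if another thread panicked while holding the
    // lock. `into_inner()` recovers the guard from the poison error, so this
    // cleanup path proceeds instead of panicking again and aborting.
    let mut guard = inner.lock().unwrap_or_else(|e| e.into_inner());
    guard.task.take(); // the real code would detach/abort the task here
}
```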

2. SmolNotify — lost-wakeup race (permanent task hang)

The previous implementation used try_lock() in notify_one() / notify_waiters(). If the lock was held by a concurrent notified() call registering itself as a waiter, the notification was silently discarded — causing any task waiting on notified() to hang permanently.

New design adds an AtomicBool "pending" flag:

  • notify_* sets the flag (Release) before attempting to acquire the waiter list — so even if the lock is contended, a concurrent notified() will see the flag.
  • notified() checks the flag (AcqRel swap) before and after acquiring the lock, so it cannot miss a notification that arrived between the two checks.
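The two bullets above can be sketched as follows; the struct and method names are illustrative, not the exact smol backend internals:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;
use std::task::Waker;

struct Notify {
    pending: AtomicBool,
    waiters: Mutex<Vec<Waker>>,
}

impl Notify {
    fn notify_one(&self) {
        // Set the flag with Release BEFORE touching the waiter list: even if
        // a concurrent notified() holds the lock, it will observe the flag.
        self.pending.store(true, Ordering::Release);
        if let Ok(mut waiters) = self.waiters.try_lock() {
            if let Some(w) = waiters.pop() {
                w.wake();
            }
        }
        // If try_lock failed, the registering task re-checks the flag after
        // registering, so the notification is not lost.
    }

    // Consume a pending notification exactly once.
    fn try_consume(&self) -> bool {
        self.pending.swap(false, Ordering::AcqRel)
    }

    // One poll of notified(): double-checked flag around waker registration.
    fn poll_notified(&self, waker: &Waker) -> bool {
        if self.try_consume() {
            return true; // fast path: a notification was already pending
        }
        self.waiters
            .lock()
            .unwrap_or_else(|e| e.into_inner())
            .push(waker.clone());
        // Re-check: a notify_* that lost the try_lock race above has still
        // set the flag, so it cannot be missed here.
        self.try_consume()
    }
}
```

The key invariant: the flag is always set before the lock is contended, and always re-checked after the lock is released, so no interleaving can drop a notification.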

Also eliminates the duplicated AsyncNotify impl body by delegating to the SmolNotify inherent methods.

Test Plan

  • cargo build (tokio, default) passes
  • cargo build --no-default-features --features runtime-smol passes

🤖 Generated with Claude Code

nightness and others added 11 commits April 1, 2026 04:15
Without a timeout, `resolve_host` in `gather_from_stun_server` could block
indefinitely on a misconfigured or unreachable DNS server, hanging ICE
gathering. STUN response timeouts are already handled by the Sans-IO
StunClient retry loop (7 attempts × 300ms RTO), so no separate fix is
needed for webrtc-rs#778.

Closes webrtc-rs#774

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
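The bounding idea can be sketched with only std; the real resolve_host is async, and the helper name and STUN default port 3478 here are assumptions for illustration, not the actual change:

```rust
use std::net::{SocketAddr, ToSocketAddrs};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Run the (potentially blocking) lookup on a helper thread and give up
// after `timeout`, so an unreachable DNS server cannot hang gathering.
fn resolve_with_timeout(host: String, timeout: Duration) -> Option<SocketAddr> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let addr = (host.as_str(), 3478u16)
            .to_socket_addrs()
            .ok()
            .and_then(|mut addrs| addrs.next());
        let _ = tx.send(addr);
    });
    // recv_timeout returns Err on timeout; the lookup thread is abandoned.
    rx.recv_timeout(timeout).ok().flatten()
}
```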
…#784)

Adds tests/datachannel_video_interop.rs verifying that an RTCPeerConnection
with both a video transceiver and a data channel negotiates successfully:
  - Offer SDP contains both m=video (active, with codecs) and m=application
  - Answer SDP contains both m-lines
  - ICE + DTLS + SCTP establish and data channel message is exchanged

Key finding: the feature works correctly in master when default codecs are
registered on both sides. Users must call register_default_codecs() on the
MediaEngine for video m-lines to have non-rejected codecs in the offer.

Closes webrtc-rs#784

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Shows that a DataChannel and a video transceiver coexist on the same
PeerConnection. The key callout — prominently documented in the module
doc — is that MediaEngine::register_default_codecs() must be called;
without it the offer emits m=video 0 (rejected), which misleads users
into thinking mixed DC+video is broken.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Re-export MulticastDnsMode from src/peer_connection/mod.rs
- Add PeerConnectionBuilder::with_mdns_mode() so callers explicitly
  enable the multicast socket alongside with_setting_engine(); no
  submodule changes required
- When mode != Disabled, PeerConnectionImpl::new() creates a multicast
  UDP socket via rtc_mdns::MulticastSocket and adds it to the driver's
  socket map; existing non-ICE routing forwards mDNS packets to the core
- Add examples/mdns-local-peers/ demonstrating two in-process peers
  using QueryAndGather mode without a STUN server

Closes webrtc-rs#782

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-rs#781)

- Add AsyncTcpListener + AsyncTcpStream traits to Runtime abstraction;
  implement for both Tokio and smol runtimes
- Add tokio io-util feature for AsyncReadExt/AsyncWriteExt
- Add rtc-shared direct dependency for tcp_framing utilities
- PeerConnectionImpl::new() binds TCP passive listeners from with_tcp_addrs()
  and passes them to the driver
- RTCIceGatherer emits tcptype passive host candidates for each TCP listener
  alongside existing UDP host candidates
- PeerConnectionDriver:
  - Spawns an accept loop per TCP listener; registers accepted connections
  - Per connection: dedicated read task (RFC 4571 decode → tcp_inbound_rx)
    and write task (tcp_write_txs channel → RFC 4571 encode)
  - select! loop handles tcp_new_conn_rx and tcp_inbound_rx alongside UDP
  - handle_write() routes TCP packets to write channels; dials out via
    runtime.connect_tcp() on first packet to an unknown peer (active mode)
- Add examples/ice-tcp/ demonstrating two in-process peers over TCP only

Closes webrtc-rs#781

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
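The RFC 4571 framing the per-connection read and write tasks perform reduces to a 2-byte big-endian length prefix. A minimal std-only sketch of the scheme (the real helpers are the tcp_framing utilities in rtc-shared; these function names are illustrative):

```rust
// Frame = 16-bit big-endian LENGTH followed by LENGTH payload bytes.
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let len = payload.len() as u16;
    let mut out = Vec::with_capacity(2 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    out
}

// Returns (payload, remaining bytes), or None if the buffer does not yet
// hold a complete frame (the read task would wait for more TCP data).
fn decode_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 2 {
        return None;
    }
    let len = u16::from_be_bytes([buf[0], buf[1]]) as usize;
    if buf.len() < 2 + len {
        return None;
    }
    Some((&buf[2..2 + len], &buf[2 + len..]))
}
```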
SmolJoinHandle: replace .lock().unwrap() with .lock()
.unwrap_or_else(|e| e.into_inner()) in both detach() and abort().
Recovering from a poisoned mutex is safe in cleanup paths; panicking
would cause a double-panic and abort the process.

SmolNotify: rewrite to eliminate the lost-wakeup race introduced by
using try_lock() in notify_one() / notify_waiters().

Previous design: try_lock() would silently discard a notification if
the lock was held by a concurrent notified() call, permanently hanging
any task that was registering itself as a waiter at that instant.

New design uses an AtomicBool "pending" flag:
- notify_*: sets the flag (Release) BEFORE attempting to acquire the
  waiter list, so any concurrent notified() will see the flag even if
  the lock is still held.
- notified(): checks the flag (AcqRel swap) before AND after acquiring
  the lock, so it cannot miss a notification that arrived between the
  two checks.

Also eliminates the duplicated AsyncNotify impl body by delegating to
the SmolNotify inherent methods.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nightness
Author

Superseded by #795 which consolidates all driver robustness fixes into a single PR.

@nightness
Author

Superseded by #795 which is a properly scoped PR for these driver/runtime fixes.

@nightness nightness closed this Apr 1, 2026
