
feat: Add WiFi resilience settings#188

Open
craigmillard86 wants to merge 2 commits into CarlosDerSeher:develop from anabolyc:feature/wifi-resilience

Conversation

@craigmillard86

@craigmillard86 craigmillard86 commented Jan 17, 2026

Adds compile-time configuration options for WiFi stability via idf.py menuconfig.

Lightsnapcast Player Settings

  • PLAYER_QUEUE_EMPTY_THRESHOLD (default 3) - Number of consecutive empty queue reads before triggering a hard resync. Provides tolerance for brief WiFi dropouts (~78ms at default). During empty reads, audio continues playing from DMA buffers while the sync algorithm maintains timing. If the threshold is exceeded, a clean hard resync is triggered to prevent audio glitches.
  • PLAYER_QUEUE_INSERT_TIMEOUT_MS (default 50) - Maximum time to wait when inserting audio chunks into the playback queue. Allows the queue to drain during burst packet arrivals rather than immediately dropping chunks. Helps handle WiFi jitter where packets arrive in bursts rather than evenly spaced.
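
Under ESP-IDF, options like these are typically declared in a component `Kconfig` file so they appear in `idf.py menuconfig`. A sketch of how the player options might be declared (option names, ranges, and defaults are taken from this PR's description; the prompts and help text are assumptions):

```kconfig
menu "Lightsnapcast Player Settings"

config PLAYER_QUEUE_EMPTY_THRESHOLD
    int "Consecutive empty queue reads before hard resync"
    range 1 10
    default 3
    help
      Number of consecutive empty queue reads tolerated before a hard
      resync is triggered (~78 ms of tolerance at the default). Audio
      keeps playing from DMA buffers during the tolerated gap.

config PLAYER_QUEUE_INSERT_TIMEOUT_MS
    int "Queue insert timeout (ms)"
    default 50
    help
      Maximum time to wait for queue space when inserting audio chunks,
      allowing the queue to drain during burst packet arrivals instead
      of dropping chunks immediately.

endmenu
```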

WiFi Resilience Settings

  • WIFI_TCP_NODELAY (default y) - Disables Nagle's algorithm on the snapserver TCP connection. Sends packets immediately rather than buffering small writes, reducing latency for time-sensitive audio data.
  • WIFI_RECONNECT_MIN_DELAY_MS (default 1000) - Initial delay before attempting to reconnect after connection loss. Starting point for exponential backoff.
  • WIFI_RECONNECT_MAX_DELAY_MS (default 30000) - Maximum reconnection delay with exponential backoff. Prevents hammering the server during extended outages. Includes random jitter to avoid multiple clients reconnecting simultaneously.
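
The WIFI_TCP_NODELAY option maps onto a standard BSD socket option. A minimal sketch of how it might be applied to the snapserver connection (the wrapper function name is hypothetical; on ESP-IDF the same call goes through lwIP's BSD socket layer via `lwip/sockets.h`):

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm so small, time-sensitive audio writes are
 * sent immediately instead of being coalesced into larger segments.
 * Returns 0 on success, -1 on error (as setsockopt does). */
int apply_tcp_nodelay(int sock)
{
    int flag = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}
```

In the firmware this would presumably be guarded by `#if CONFIG_WIFI_TCP_NODELAY` and called right after the server socket is connected.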

@CarlosDerSeher
Owner

Why do you think we need those things? Could you elaborate a bit? Did you encounter issues related with those settings?

@craigmillard86
Author

> Why do you think we need those things? Could you elaborate a bit? Did you encounter issues related with those settings?

I've been experiencing regular audio dropouts on a congested home network. My ping times typically hover around 40ms but spike up to 1400ms during congestion peaks. When these spikes occur, the player triggers a hard resync and takes several seconds to stabilize, causing noticeable audio interruption.

The Problem

The current hardcoded values assume a stable, low-latency network:

  • A single empty queue read triggers immediate hard resync
  • Fixed sync tolerance doesn't account for network jitter
  • No configurable buffer headroom for absorbing latency spikes

On congested WiFi, a 1400ms spike would cause the queue to empty, triggering an aggressive resync that often overshoots, leading to repeated corrections before stabilizing.

How These Settings Help

  • Queue Empty Threshold - Requires 3+ consecutive empty reads before resync, filtering out brief spikes
  • Fast Sync Tolerance - Larger latency buffer (50ms default, adjustable to 100ms) absorbs jitter without resyncing
  • Buffer Headroom - Extra queue capacity provides cushion for burst delays using spare PSRAM
  • Queue Insert Timeout - Prevents premature packet drops during brief congestion
  • Reconnect Delays - Exponential backoff prevents hammering the server during network issues
  • TCP No Delay - Optional tuning for latency vs throughput tradeoff
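
The reconnect backoff described above can be sketched as follows (the function and macro names are hypothetical; the defaults match the WIFI_RECONNECT_MIN/MAX_DELAY_MS values from the PR description):

```c
#include <stdint.h>
#include <stdlib.h>

#define RECONNECT_MIN_DELAY_MS 1000   /* cf. CONFIG_WIFI_RECONNECT_MIN_DELAY_MS */
#define RECONNECT_MAX_DELAY_MS 30000  /* cf. CONFIG_WIFI_RECONNECT_MAX_DELAY_MS */

/* Exponential backoff with jitter: the delay doubles per failed
 * attempt, is capped at the maximum, and gets up to 10% random jitter
 * added so several clients don't all reconnect in lockstep. */
uint32_t reconnect_delay_ms(unsigned failed_attempts)
{
    uint32_t delay = RECONNECT_MIN_DELAY_MS;

    while (failed_attempts-- > 0 && delay < RECONNECT_MAX_DELAY_MS)
        delay *= 2;
    if (delay > RECONNECT_MAX_DELAY_MS)
        delay = RECONNECT_MAX_DELAY_MS;

    /* up to 10% jitter on top of the base delay */
    return delay + (uint32_t)(rand() % (delay / 10 + 1));
}
```

At the defaults this yields roughly 1s, 2s, 4s, ... up to 30s between attempts, plus jitter.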

Results

With these settings tuned for my network (threshold=5, fast_sync=75000us, headroom=100%), the player now rides through the 1400ms spikes without triggering hard resync, maintaining stable playback where it previously would stutter and resync repeatedly.

The defaults remain conservative for good networks, but users with challenging WiFi conditions can now tune for their environment via the Advanced Settings UI.

Any thoughts are much appreciated, as I want to find a solution that works for everyone.

@CarlosDerSeher
Owner

CarlosDerSeher commented Jan 18, 2026

> Queue Empty Threshold - Requires 3+ consecutive empty reads before resync, filtering out brief spikes
>
> Queue Insert Timeout - Prevents premature packet drops during brief congestion
>
> Reconnect Delays - Exponential backoff prevents hammering the server during network issues
>
> TCP No Delay - Optional tuning for latency vs throughput tradeoff

I can see how these could make sense

> Fast Sync Tolerance - Larger latency buffer (50ms default, adjustable to 100ms) absorbs jitter without resyncing

What's this exactly?

> Buffer Headroom - Extra queue capacity provides cushion for burst delays using spare psram

If you have psram shouldn't you just set a higher buffer on the server if you have a bad network?

@craigmillard86
Author

> Fast Sync Tolerance - Larger latency buffer (50ms default, adjustable to 100ms) absorbs jitter without resyncing
>
> What's this exactly?
>
> Buffer Headroom - Extra queue capacity provides cushion for burst delays using spare psram
>
> If you have psram shouldn't you just set a higher buffer on the server if you have a bad network?

Fast Sync Tolerance

I believe this controls how much timing drift the player tolerates before triggering a resync. When a chunk arrives, the player compares actual vs expected playback time. If the difference exceeds this threshold, it triggers corrective action.

With the default 50ms tolerance, a network spike that delays packets by 60ms would trigger a resync. Increasing to 100ms lets the player absorb that spike and naturally catch up as the network recovers, rather than forcing an abrupt correction.

It's essentially "how late can a packet be before we panic" - on a jittery network, being more forgiving prevents constant resync thrashing.
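
A sketch of that check (the function name and the tolerance constant are hypothetical; timing is in microseconds to match the `fast_sync=75000us` figure used earlier in the thread):

```c
#include <stdbool.h>
#include <stdint.h>

#define FAST_SYNC_TOLERANCE_US 50000  /* 50ms default, adjustable */

/* Compare a chunk's expected playback time with its actual output
 * time; only report that corrective action is needed when the drift
 * (in either direction) exceeds the tolerance. */
bool needs_sync_correction(int64_t expected_us, int64_t actual_us)
{
    int64_t drift = actual_us - expected_us;

    if (drift < 0)
        drift = -drift;
    return drift > FAST_SYNC_TOLERANCE_US;
}
```

With the tolerance raised to 100000us, a 60ms spike would pass through without triggering a correction.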

Buffer Headroom vs Server Buffer

They serve different purposes:

  • Server Buffer (latency) - Sets the baseline delay before playback starts. Affects ALL audio - adds fixed latency to every packet regardless of network conditions.
  • Buffer Headroom (client) - Extra queue capacity to absorb temporary bursts without adding baseline latency. Only used when network spikes occur; otherwise it sits empty.

Increasing the server buffer from 1000ms to 2000ms adds 1 second of latency to all devices regardless of connection type (5 GHz, Ethernet, 2.4 GHz).

Buffer headroom lets the client queue hold extra packets during a burst catchup, then drain back to normal as the network recovers. The baseline latency stays the same, but you have capacity to absorb spikes.
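
The distinction comes down to simple sizing arithmetic; a sketch (all names and the base length are hypothetical, the 100% headroom figure is the one quoted earlier in the thread):

```c
/* Queue length with headroom: the extra slots sit empty in steady
 * state and only fill while a post-spike burst is drained, so the
 * baseline latency (set by the server buffer) is unchanged. */
unsigned queue_len_with_headroom(unsigned base_len, unsigned headroom_pct)
{
    return base_len + (base_len * headroom_pct) / 100;
}
```

For example, a base queue of 16 chunks with 100% headroom allocates 32 slots, but still normally holds only ~16.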

@CarlosDerSeher
Owner

Where do you use this Fast Sync Tolerance exactly?

Essentially you assume the router/switch accumulates packets because of congestion, and those are measures to address this without increasing latency. How about raspberry snapclient? I guess those devices won't have a problem because of much more RAM, and the network stack is buffering packets anyway. But couldn't we just increase the max allowed dynamic RX buffer through menuconfig too to tackle this?

One more thing: there is another update in the pipe, so have a look at the sync rework branch. You should base those changes on that branch. I just didn't find the time to merge it with dev yet.

@luar123
Contributor

luar123 commented Jan 18, 2026

Did not look into the details, but if I understand correctly, you want to keep the player playing even if it is out of sync? I think so far the approach was rather to hard resync when the sync is off by more than a few ms. I guess it depends on the use case: if you have one player per room you could tolerate a higher difference, but if you have a stereo pair you need to increase the server buffer.

Regarding the implementation: I would suggest keeping the lightsnapcast component more as a library with a defined interface, and not adding dependencies and calls to the settings_manager. And it seems you added blocking calls to the player task; that should be avoided. It is not really needed to change these settings on the fly, or is it? I would just apply them in init_player or start_player.

@CarlosDerSeher sync rework is merged into develop already. #180

@CarlosDerSeher
Owner

CarlosDerSeher commented Jan 18, 2026

sync rework is merged into develop already. #180

Thanks for the reminder. Seems I lost track since there is going on a lot currently :)

@craigmillard86
Author

> Where do you use this Fast Sync Tolerance exactly?
>
> Essentially you assume the router/switch accumulates packets because of congestion, and those are measures to address this without increasing latency. How about raspberry snapclient? I guess those devices won't have a problem because of much more RAM, and the network stack is buffering packets anyway. But couldn't we just increase the max allowed dynamic RX buffer through menuconfig too to tackle this?
>
> One more thing: there is another update in the pipe, so have a look at the sync rework branch. You should base those changes on that branch. I just didn't find the time to merge it with dev yet.

Ah, I hadn't considered the RX buffers; I will have a play with them instead. Looks like I was also on an older develop without the rework. I am reworking this now based on @luar123's comments and only implementing:

  • Queue Empty Threshold - Requires 3+ consecutive empty reads before resync, filtering out brief spikes
  • Queue Insert Timeout - Prevents premature packet drops during brief congestion
  • Reconnect Delays - Exponential backoff prevents hammering the server during network issues
  • TCP No Delay - Optional tuning for latency vs throughput tradeoff

Add configurable parameters to improve audio streaming stability on
congested WiFi networks:

- TCP_NODELAY: Disable Nagle's algorithm for lower latency
- Queue empty hysteresis: Require consecutive empty reads before hard resync
- Queue insert timeout: Configurable wait time for queue space
- Fast sync tolerance: Latency buffer for sync operations
- Reconnect delays: Exponential backoff with jitter for reconnection
- Buffer headroom: Extra capacity for WiFi jitter tolerance

All parameters are configurable via:
- menuconfig (compile-time defaults)
- New "Advanced Settings" Web UI tab (runtime with NVS persistence)

Remove runtime UI and NVS settings for WiFi resilience in favor of
compile-time Kconfig options. This keeps lightsnapcast as a clean
library without settings_manager dependency.

Changes:
- Remove advanced-settings.html and UI handlers
- Remove WiFi resilience functions from settings_manager
- Use CONFIG_PLAYER_QUEUE_* directly in player.c
- Use CONFIG_WIFI_* directly in main.c
- Remove unused fast_sync_latency option

Remaining Kconfig options:
- WIFI_TCP_NODELAY, WIFI_RECONNECT_MIN/MAX_DELAY_MS (main)
- PLAYER_QUEUE_EMPTY_THRESHOLD, PLAYER_QUEUE_INSERT_TIMEOUT_MS (lightsnapcast)
@craigmillard86 craigmillard86 force-pushed the feature/wifi-resilience branch from 068c75a to 03d967b Compare January 18, 2026 18:18
@craigmillard86 craigmillard86 changed the title feat: Add WiFi resilience settings with Advanced Settings UI feat: Add WiFi resilience settings Jan 18, 2026
@craigmillard86
Author

So I have reworked the code to minimise what is going on here.

Although while doing this I have discovered and fixed (I believe) the root of my WiFi issues here: #191. I am now unsure if these are worth adding, but they do provide some configuration options to improve network resilience.

@Hoerli1337

Hey!
Can the fix also solve this problem?
Is the problem identical?
sonocotta/esp32-audio-dock#78 (comment)

I bought two Hifi-ESP32-S3 and equipped them with the Snapclient firmware.
One plays the music after about 10-20 attempts and the latency normalizes to 1-5ms during operation - when muted, it is 50-500ms.
The second one doesn't work at all and keeps restarting due to corrupt values.
The latency is always between 60-5000 ms.
Distance to the AP ~4 meters.
RSSI -71

I'm also experiencing crashes with the second one using the ESP Audio Dock firmware.
Is there an error in the code, or have I unfortunately received a partially defective device?

@luar123
Contributor

luar123 commented Jan 24, 2026

Not sure if ping is a good measure. Snapclient uses a low-level implementation that blocks for up to 1s if no packets are arriving.

craigmillard86 added a commit to anabolyc/esp32-snapclient that referenced this pull request Jan 25, 2026
@Hoerli1337

> Not sure if ping is a good measure. Snapclient uses a low-level implementation that blocks for up to 1s if no packets are arriving.

At least it's an indication that something isn't working properly here.
If the latency rises above 20-30ms for 2-3 pings, there are dropouts in the sound.

I also noticed that, depending on its mood, one ESP32 takes between 30 seconds and 2 minutes after being switched on before the sound is reproduced cleanly.

@CarlosDerSeher
Owner

#191 suggests disabling power saving

```c
int msgWaiting = uxQueueMessagesWaiting(pcmChkQHdl);

// Track consecutive empty queue reads for hysteresis
static int consecutive_empty_count = 0;
```
Owner

Is this really necessary? I have a feeling that if you run out of samples there will be an audible offset between clients if you tolerate this.

Author

You raise a fair point, but this was to help on bad networks.

How it works: When the queue empties, the I2S DMA clock keeps running independently (looping its buffer). With threshold=3, that's ~78ms where no fresh samples are written. When new data arrives, the player is behind by the duration of the gap, and soft-sync (APLL/sample insertion) gradually corrects it, but during that window, clients could be slightly out of sync.

Why it exists: On congested WiFi (my network regularly sees ping spikes to 1400ms), a single empty queue read with threshold=1 triggers a full hard resync: mute, stop I2S, reset initialSync, re-establish sync from scratch. It's a multi-second audible disruption for what might be a 26ms network hiccup that resolves on its own. This hysteresis avoids that.

Threshold=1 gives tightest multi-client sync but is fragile on imperfect networks. Threshold=3 tolerates brief WiFi hiccups but risks ~78ms transient offset that soft-sync corrects over a few seconds. Both have audible impact, it's a question of which is less disruptive for the user's environment.

That's why it's a Kconfig setting (PLAYER_QUEUE_EMPTY_THRESHOLD, range 1-10, default 3) rather than a hardcoded change: users on clean networks can set it to 1 for tight sync, while those on congested WiFi can increase it to avoid constant resyncs.
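
A sketch of that hysteresis, building on the snippet under review (the function wrapper is hypothetical; in the player it would operate on the result of `uxQueueMessagesWaiting`):

```c
#include <stdbool.h>

#define QUEUE_EMPTY_THRESHOLD 3  /* cf. CONFIG_PLAYER_QUEUE_EMPTY_THRESHOLD */

/* Hysteresis on empty queue reads: only report that a hard resync is
 * needed once the threshold of consecutive empty reads is reached;
 * any successful read resets the counter. During the tolerated gap
 * the I2S DMA keeps the output clock running. */
bool queue_empty_needs_resync(int msg_waiting)
{
    static int consecutive_empty = 0;

    if (msg_waiting > 0) {
        consecutive_empty = 0;
        return false;
    }
    return ++consecutive_empty >= QUEUE_EMPTY_THRESHOLD;
}
```

With the default threshold of 3, two isolated empty reads pass silently; only the third in a row triggers the resync path.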

Happy to adjust the default if you think 1 is more appropriate for the typical use case, what do you think?

Owner

> multi-second audible disruption

I am wondering, the hard resync never takes that long, normally it's just a short "click" in the speakers and if you don't listen carefully and this just happens once it's almost not noticeable.

Is this a board/DAC specific problem maybe?
