# Architecture
This page explains how the Sendspin SDK components work together to enable synchronized multi-room audio playback.
```
┌─────────────────────────────────────────────────────────────────┐
│                        Your Application                         │
│   ┌─────────────────────────────────────────────────────┐       │
│   │ SyncCorrectionCalculator │ Your Resampler/Drop Logic│       │
│   │ (correction decisions)   │ (applies correction)     │       │
│   └─────────────────────────────────────────────────────┘       │
├─────────────────────────────────────────────────────────────────┤
│ SendspinClientService  │ AudioPipeline    │ IAudioPlayer        │
│ (protocol handling)    │ (orchestration)  │ (your impl)         │
├─────────────────────────────────────────────────────────────────┤
│ SendspinConnection     │ KalmanClockSync  │ TimedAudioBuffer    │
│ (WebSocket)            │ (timing)         │ (reports error)     │
├─────────────────────────────────────────────────────────────────┤
│ OpusDecoder            │ FlacDecoder      │ PcmDecoder          │
└─────────────────────────────────────────────────────────────────┘
```
Audio flows through the SDK like this:
```
Server                    SDK                          Your App
  │                        │                              │
  │  WebSocket binary      │                              │
  │  (audio chunks)        │                              │
  ├───────────────────────>│                              │
  │                        │                              │
  │                   ┌────┴────┐                         │
  │                   │ Decoder │                         │
  │                   │ (Opus/  │                         │
  │                   │  FLAC)  │                         │
  │                   └────┬────┘                         │
  │                        │ PCM samples                  │
  │                   ┌────┴────┐                         │
  │                   │  Timed  │                         │
  │                   │  Audio  │                         │
  │                   │ Buffer  │                         │
  │                   └────┬────┘                         │
  │                        │ samples + sync error         │
  │                   ┌────┴────┐                         │
  │                   │ Sample  │                         │
  │                   │ Source  │                         │
  │                   └────┬────┘                         │
  │                        │                              │
  │                        ├─────────────────────────────>│
  │                        │     IAudioPlayer.Read()      │
  │                        │                         ┌────┴────┐
  │                        │                         │ WASAPI/ │
  │                        │                         │  ALSA/  │
  │                        │                         │ CoreAu  │
  │                        │                         └────┬────┘
  │                        │                              │
  │                        │                         [Speakers]
```
## Namespaces

| Namespace | Purpose | Key Types |
|---|---|---|
| `Sendspin.SDK.Client` | Protocol client, capabilities | `SendspinClientService`, `ClientCapabilities` |
| `Sendspin.SDK.Connection` | WebSocket transport | `SendspinConnection`, `ConnectionState` |
| `Sendspin.SDK.Audio` | Audio pipeline, buffer, decoders | `AudioPipeline`, `TimedAudioBuffer`, `IAudioPlayer` |
| `Sendspin.SDK.Synchronization` | Clock sync | `KalmanClockSynchronizer`, `HighPrecisionTimer` |
| `Sendspin.SDK.Discovery` | mDNS discovery | `MdnsServerDiscovery`, `DiscoveredServer` |
| `Sendspin.SDK.Protocol` | Message types | `MessageSerializer`, protocol messages |
| `Sendspin.SDK.Models` | Data models | `GroupState`, `TrackMetadata`, `AudioFormat` |
## SendspinClientService

The main entry point for connecting to servers. Handles:

- WebSocket connection lifecycle
- Protocol handshake (`client/hello` / `server/hello`)
- Clock sync timing (burst of measurements on connect)
- Message routing (text → handlers, binary → audio pipeline)
- Command sending (play, pause, volume, etc.)

```csharp
var client = new SendspinClientService(
    logger,
    connection,
    clockSync,
    capabilities,
    audioPipeline); // Optional - enables automatic audio

// Connect and go!
await client.ConnectAsync(serverUri);
```

## AudioPipeline

Orchestrates the audio flow from network to speakers:
States:

- `Idle` → `Starting` → `Buffering` → `Playing` → `Stopping` → `Idle`
- Can transition to `Error` from any state

Responsibilities:

- Create decoder based on audio format
- Manage timed buffer lifecycle
- Wait for clock sync convergence
- Control audio player (play/pause/stop)

```csharp
var pipeline = new AudioPipeline(
    logger,
    decoderFactory,
    clockSync,
    bufferFactory,   // Creates ITimedAudioBuffer
    playerFactory,   // Creates IAudioPlayer
    sourceFactory,   // Creates IAudioSampleSource
    waitForConvergence: true);
```

## KalmanClockSynchronizer

Synchronizes local time with server time using a Kalman filter. Key concepts:
The Problem: Server uses monotonic time (starts near 0), client uses Unix epoch. The offset can be billions of microseconds - this is normal.
The Solution: NTP-style 4-timestamp exchange:
1. Client sends `client/time` with T1 (send time)
2. Server responds with T2 (receive time), T3 (reply time)
3. Client records T4 (receive time)
4. The Kalman filter estimates offset and drift
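From the four timestamps, the classic NTP estimates can be computed per exchange; the Kalman filter then refines them over time. A sketch using the standard NTP formulas (the function name and signature are illustrative, not SDK API):

```csharp
// Sketch: classic NTP offset/delay estimate from one 4-timestamp exchange.
// T1/T4 are client-clock microseconds; T2/T3 are server-clock microseconds.
static (long offsetUs, long rttUs) EstimateOffset(long t1, long t2, long t3, long t4)
{
    // Offset: how far the server clock is ahead of the client clock.
    long offset = ((t2 - t1) + (t3 - t4)) / 2;
    // Round-trip time, excluding the server's processing time (T3 - T2).
    long rtt = (t4 - t1) - (t3 - t2);
    return (offset, rtt);
}
```

Note that the one-way network delays cancel only if they are symmetric, which is why individual exchanges are noisy and a filter is needed.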
Key Properties:

- `IsConverged` - True after 5+ stable measurements
- `HasMinimalSync` - True after 2+ measurements (quick start)
- `StaticDelayMs` - Manual sync offset for speaker latency tuning

```csharp
// Convert server timestamp to local time
var localTime = clockSync.ServerToClientTime(serverTimestamp);
```

## TimedAudioBuffer

Stores PCM samples with server timestamps. Handles:
- Converting server timestamps to local playback time
- Calculating sync error (drift between expected and actual playback)
- Reporting when buffer is ready for playback
Key Properties:

- `SyncErrorMicroseconds` - Raw sync error (positive = behind, negative = ahead)
- `SmoothedSyncErrorMicroseconds` - EMA-filtered for stability
- `BufferedMilliseconds` - Current buffer depth
- `IsReadyForPlayback` - True when target buffer reached
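`SmoothedSyncErrorMicroseconds` is described as EMA-filtered; conceptually the smoothing works like this (a sketch only; the alpha of 0.5 is an illustrative choice, not the SDK's constant):

```csharp
// Sketch: exponential moving average over raw sync-error samples.
static double EmaStep(double previous, double sample, double alpha)
    => alpha * sample + (1 - alpha) * previous;

// Smooth a series, seeding with the first sample:
double[] raw = { 100, 0, 0, 0 };
double smoothed = raw[0];
foreach (var s in raw[1..])
    smoothed = EmaStep(smoothed, s, 0.5);
// The smoothed value decays toward the new level instead of jumping,
// which keeps correction decisions stable against momentary spikes.
```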
```csharp
// SDK calculates error, you apply correction
var syncError = buffer.SyncErrorMicroseconds;

// After applying drop/insert correction:
buffer.NotifyExternalCorrection(dropped, inserted);
```

## IAudioPlayer

The interface between the SDK and your audio backend:
```csharp
public interface IAudioPlayer : IAsyncDisposable
{
    // State
    AudioPlayerState State { get; }
    float Volume { get; set; }
    bool IsMuted { get; set; }
    int OutputLatencyMs { get; }

    // Lifecycle
    Task InitializeAsync(AudioFormat format, CancellationToken ct);
    void SetSampleSource(IAudioSampleSource source);
    void Play();
    void Pause();
    void Stop();
    Task SwitchDeviceAsync(string? deviceId, CancellationToken ct);

    // Events
    event EventHandler<AudioPlayerState>? StateChanged;
    event EventHandler<AudioPlayerError>? ErrorOccurred;
}
```

## Why Clock Sync Matters

For multi-room sync, all players need to agree on "now". A 10ms discrepancy is audible as an echo effect.
The SDK uses a Kalman filter (not simple averaging) because:
- Handles variable network latency
- Estimates clock drift rate (how fast clocks diverge)
- Adapts to changing network conditions
- Converges quickly (~300-500ms for basic sync)
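To make the "offset plus drift" idea concrete, here is a drastically simplified 2-state Kalman update (constant-drift model, measurement = one NTP-style offset estimate). The noise values `q`/`r` are illustrative; the SDK's actual filter and tuning are not shown here:

```csharp
// Sketch: Kalman update tracking clock offset (us) and drift (us per us).
// State x = [offset, drift]; covariance p = [p00, p01, p10, p11] (row-major).
static void KalmanUpdate(double[] x, double[] p, double measuredOffsetUs,
                         double dtUs, double q = 1e-3, double r = 1e8)
{
    // Predict: offset advances by drift * dt (F = [[1, dt], [0, 1]]).
    x[0] += x[1] * dtUs;
    double p00 = p[0] + dtUs * (p[1] + p[2]) + dtUs * dtUs * p[3] + q;
    double p01 = p[1] + dtUs * p[3];
    double p10 = p[2] + dtUs * p[3];
    double p11 = p[3] + q;

    // Correct with the measured offset (H = [1, 0]).
    double innovation = measuredOffsetUs - x[0];
    double s = p00 + r;                 // innovation variance
    double k0 = p00 / s, k1 = p10 / s;  // Kalman gains
    x[0] += k0 * innovation;
    x[1] += k1 * innovation;
    p[0] = (1 - k0) * p00;
    p[1] = (1 - k0) * p01;
    p[2] = p10 - k1 * p00;
    p[3] = p11 - k1 * p01;
}

// Feed it a steady measured offset: the estimate converges within a few updates.
var x = new double[] { 0, 0 };
var p = new double[] { 1e12, 0, 0, 1e-6 };
for (int i = 0; i < 5; i++)
    KalmanUpdate(x, p, measuredOffsetUs: 5_000_000, dtUs: 1_000_000);
```

The drift state is what simple averaging lacks: once the filter learns how fast the clocks diverge, it can predict the offset between measurements instead of merely reacting to them.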
## High-Precision Timing

Windows `DateTime` has ~15ms resolution - useless for audio sync. The SDK uses `Stopwatch.GetTimestamp()`, which provides ~100ns resolution:

```csharp
// Always use this for timing-critical code:
var timeMicroseconds = HighPrecisionTimer.Shared.GetCurrentTimeMicroseconds();
```

## External Sync Correction

Starting with v5.0, sync correction is external. The SDK reports error; you decide how to correct.
Different platforms have different optimal strategies:

- Windows: WDL resampler (high quality)
- Browser: Native `playbackRate` (Web Audio API)
- Linux: Hardware rate adjustment (ALSA)
- Embedded: Platform-specific DSP
```
SDK (reports only)                      Your App (applies)
─────────────────                       ─────────────────
TimedAudioBuffer                        SyncCorrectionCalculator
├─ SyncErrorMicroseconds          ──>   ├─ UpdateFromSyncError()
├─ SmoothedSyncErrorMicroseconds        ├─ TargetPlaybackRate
├─ ReadRaw()                            ├─ DropEveryNFrames
└─ NotifyExternalCorrection()     <──   └─ InsertEveryNFrames
```
| Error | Method | Notes |
|---|---|---|
| < 1ms | None | Imperceptible |
| 1-15ms | Rate adjust | Smooth, uses resampling |
| 15-500ms | Drop/insert | Faster correction |
| > 500ms | Re-anchor | Clear buffer, restart |
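The table maps naturally onto a small decision function in your correction logic. A sketch using the thresholds above (the function and its return values are illustrative, not SDK API):

```csharp
// Sketch: pick a correction strategy from the absolute sync error,
// using the thresholds from the table above.
static string ChooseCorrection(long syncErrorUs)
{
    long absUs = Math.Abs(syncErrorUs);
    if (absUs < 1_000) return "none";           // < 1ms: imperceptible
    if (absUs < 15_000) return "rate-adjust";   // 1-15ms: smooth resampling
    if (absUs < 500_000) return "drop-insert";  // 15-500ms: faster correction
    return "re-anchor";                         // > 500ms: clear buffer, restart
}
```

In practice you would add hysteresis around the boundaries (and feed the smoothed error, not the raw one) so the strategy doesn't flap between methods.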
## Server Discovery

Your app discovers servers and connects to them:

```csharp
var discovery = new MdnsServerDiscovery(logger);
await discovery.StartAsync();
// Discovers _sendspin-server._tcp services
```

Alternatively, your app advertises itself and servers connect to you:

```csharp
var hostService = new SendspinHostService(loggerFactory, capabilities, pipeline, clockSync);
await hostService.StartAsync();
// Advertises as _sendspin._tcp, servers connect to us
```

## Text Messages

| Type | Direction | Purpose |
|---|---|---|
| `client/hello` | C→S | Handshake with capabilities |
| `server/hello` | S→C | Server identification |
| `client/time` | C→S | Clock sync request |
| `server/time` | S→C | Clock sync response (4 timestamps) |
| `stream/start` | S→C | Audio stream begins |
| `stream/end` | S→C | Audio stream ends |
| `stream/clear` | S→C | Clear buffer (seek/track change) |
| `group/update` | S→C | Playback state, metadata |
| `client/command` | C→S | Play, pause, next, etc. |
## Binary Messages

Format: `[1 byte type][8 bytes timestamp][payload]`
| Type (first byte) | Purpose |
|---|---|
| 4-7 | Audio data (slots 0-3) |
| 8-11 | Artwork (slots 0-3) |
| 16-23 | Visualizer data |
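Splitting such a frame is straightforward; a sketch (big-endian timestamp is an assumption here — the document does not specify byte order, so verify against the protocol spec):

```csharp
// Sketch: split a binary frame into [1 byte type][8 bytes timestamp][payload].
// Big-endian timestamp is an ASSUMPTION, not confirmed by this page.
static (byte type, long timestamp, byte[] payload) ParseFrame(byte[] frame)
{
    if (frame.Length < 9)
        throw new ArgumentException("Frame too short for the 9-byte header");

    byte type = frame[0];
    long timestamp = System.Buffers.Binary.BinaryPrimitives
        .ReadInt64BigEndian(frame.AsSpan(1, 8));
    byte[] payload = frame[9..];
    return (type, timestamp, payload);
}
```

The type byte then routes the payload: 4-7 to the audio pipeline (slot = type - 4), 8-11 to artwork handling, 16-23 to visualizer consumers.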
## See Also
- Building a Minimal Player - Complete working example
- API Reference - Interface quick reference