Multinet: improve link auto‑recovery (TCP keepalive, SO_ERROR check, backoff reset) #4

Open
mkostersitz wants to merge 1 commit into pkoning2:main from mkostersitz:feature/multinet-auto-recover
@mkostersitz
Summary

- Add TCP keepalive on outbound and accepted sockets (Windows/Linux/macOS) to detect half‑open peers and trigger a reconnect.
- Verify that a non‑blocking TCP connect actually succeeded by checking SO_ERROR after POLLOUT, to avoid false positives that leave the link stalled.
- Reset the connection backoff on a successful connect/bind to avoid long post‑outage delays.
- Minor: fix factory error logging to use the name, since the error is logged before the instance exists.
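
The SO_ERROR check can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: `start_connect` and `connect_succeeded` are hypothetical names, and `select()` writability stands in for POLLOUT.

```python
import errno
import os
import select
import socket


def start_connect(addr):
    """Begin a non-blocking TCP connect and return the socket (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    rc = sock.connect_ex(addr)
    # 0 means already connected; EINPROGRESS/EWOULDBLOCK mean "still pending".
    if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(rc, os.strerror(rc))
    return sock


def connect_succeeded(sock, timeout=1.0):
    """Return True only if the pending connect finished without error."""
    # Writability corresponds to POLLOUT; the exceptional set is watched too,
    # because some platforms report a failed connect there instead.
    _, writable, failed = select.select([], [sock], [sock], timeout)
    if not writable and not failed:
        return False  # still pending; the caller should check again later
    # Being woken up is not proof of success: a refused connect also wakes
    # the socket. SO_ERROR carries the real result (0 means success).
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
```

A caller would mark the link connected only when `connect_succeeded` returns True, and otherwise close the socket and go through the normal reconnect path.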

Rationale

In field use, when a tunnel path or peer dies, links sometimes don’t recover until router restart. OS keepalives and robust connect result checks ensure dead connections are detected; the existing state machine’s reconnect path is then exercised. Resetting backoff on success avoids prolonged recovery delays after a return to health.
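
To illustrate the backoff point, here is a small sketch of a reconnect delay that grows after each failed attempt and snaps back after a success. The `Backoff` class and its parameters are hypothetical stand-ins; the actual `conntmr` interface in decnet is not reproduced here.

```python
class Backoff:
    """Exponential reconnect delay with a reset (hypothetical stand-in)."""

    def __init__(self, initial=1.0, maximum=120.0, factor=2.0):
        self.initial = initial
        self.maximum = maximum
        self.factor = factor
        self.delay = initial

    def next_delay(self):
        """Return the delay to wait now, then lengthen the following one."""
        current = self.delay
        self.delay = min(self.delay * self.factor, self.maximum)
        return current

    def reset(self):
        """Call after a successful connect/bind so the next outage starts short."""
        self.delay = self.initial
```

Without the `reset()` call, a link that came back after a long outage would start its next retry sequence at the longest delay instead of the shortest.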

Implementation notes

- decnet/host.py: add set_tcp_keepalive with platform‑specific tuning; apply in create_connection.
- decnet/multinet.py:
  - Connect mode: SO_ERROR verification in check_connection; conntmr.reset() on success.
  - Listen mode: conntmr.reset() after successful bind; enable keepalive on accepted sockets.
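
For reviewers unfamiliar with the per-platform options, a keepalive helper along these lines is one way to do it. The function name matches the one added in decnet/host.py, but the body, the parameter names, and the timing values below are illustrative assumptions rather than a copy of the PR.

```python
import socket
import sys


def set_tcp_keepalive(sock, idle=30, interval=10, count=3):
    """Enable TCP keepalive on sock; the tuning values are illustrative only."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if sys.platform.startswith("win"):
        # Windows takes (on/off, idle ms, interval ms) in a single ioctl.
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, idle * 1000, interval * 1000))
        return
    if sys.platform == "darwin":
        # macOS calls the idle option TCP_KEEPALIVE (0x10); older Python
        # versions do not export the constant, hence the fallback.
        idle_opt = getattr(socket, "TCP_KEEPALIVE", 0x10)
    else:
        idle_opt = getattr(socket, "TCP_KEEPIDLE", None)
    options = (
        (idle_opt, idle),                                    # seconds idle before probing
        (getattr(socket, "TCP_KEEPINTVL", None), interval),  # seconds between probes
        (getattr(socket, "TCP_KEEPCNT", None), count),       # failed probes before drop
    )
    for opt, value in options:
        if opt is None:
            continue  # option not available on this platform
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass  # tuning is best effort; SO_KEEPALIVE itself is already on
```

With values like these, a half-open peer would be noticed after roughly `idle + interval * count` seconds, at which point reads and writes on the socket fail and the reconnect path can run.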

Tests

Light regression testing via the existing Multinet tests (connect/reconnect/accept paths). No public API changes.

Let me know what you think :)

… and accepted sockets (Windows/Linux/macOS)
- Verify non-blocking connect success via SO_ERROR before marking connected
- Reset connect backoff timer on successful connect/bind to avoid long delays
- Minor: fix factory error logging to use name before instance exists

Helps links recover without router restart when a peer or path drops.