Add TCP fallback and automatic reconnection with exponential backoff #36

Merged
infinityabundance merged 4 commits into main from copilot/improve-network-connection-resilience
Feb 13, 2026
Copilot AI commented Feb 13, 2026

Summary

Implements multi-layer network resilience: UDP remains the primary transport, a TCP fallback activates when UDP fails (firewalls, NAT issues), and automatic reconnection with exponential backoff handles transient failures.

Details

  • New feature
  • Bug fix
  • Performance improvement
  • Documentation / tooling

What changed?

Transport Abstraction Layer

  • rootstream_net_send_encrypted() dispatches to UDP or TCP based on peer->transport
  • rootstream_net_recv() polls both UDP socket and active TCP peers
  • rootstream_net_handshake() tries UDP first, falls back to TCP on failure
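
The dispatch and fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: every `demo_*` name is hypothetical, the real functions take a `rootstream_ctx_t`, and the transports are replaced by stand-in stubs so the control flow is visible on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TRANSPORT_UDP = 1, TRANSPORT_TCP = 2 } transport_type_t;

/* Hypothetical, trimmed-down peer: only the field the dispatch needs. */
typedef struct {
    transport_type_t transport;
} demo_peer_t;

/* Stand-ins for the real transports (assumptions, for illustration only). */
static bool demo_udp_blocked = false;
static int demo_udp_sent = 0;
static int demo_tcp_sent = 0;

static int demo_udp_handshake(demo_peer_t *p) { (void)p; return demo_udp_blocked ? -1 : 0; }
static int demo_tcp_connect(demo_peer_t *p)   { (void)p; return 0; }

static int demo_udp_send(demo_peer_t *p, const uint8_t *d, size_t n) {
    (void)p; (void)d; demo_udp_sent++; return (int)n;
}
static int demo_tcp_send(demo_peer_t *p, const uint8_t *d, size_t n) {
    (void)p; (void)d; demo_tcp_sent++; return (int)n;
}

/* Try UDP first; fall back to TCP only if the UDP handshake fails. */
int demo_handshake(demo_peer_t *peer) {
    if (!peer) return -1;
    if (demo_udp_handshake(peer) == 0) { peer->transport = TRANSPORT_UDP; return 0; }
    if (demo_tcp_connect(peer) == 0)   { peer->transport = TRANSPORT_TCP; return 0; }
    return -1;
}

/* Route an (already encrypted) packet to whichever transport the peer uses. */
int demo_send(demo_peer_t *peer, const uint8_t *data, size_t size) {
    if (!peer || !data || size == 0) return -1;
    switch (peer->transport) {
    case TRANSPORT_UDP: return demo_udp_send(peer, data, size);
    case TRANSPORT_TCP: return demo_tcp_send(peer, data, size);
    default:            return -1;  /* unknown or unset transport */
    }
}
```

Keeping the transport choice in a single field on the peer means callers never need to know which path is active; only the handshake decides.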

TCP Fallback Transport (src/network_tcp.c)

  • Non-blocking connect with 5s timeout
  • Stream-based packet reassembly (TCP has no message boundaries)
  • Platform-agnostic (POSIX/Winsock abstraction)
  • Same ChaCha20-Poly1305 encryption as UDP
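
Because TCP delivers a byte stream, the receiver has to rebuild packet boundaries itself. The PR text does not say how packets are framed on the wire, so the sketch below assumes a hypothetical 2-byte big-endian length prefix; the `demo_*` names and `DEMO_MAX_PACKET` are illustrative and are not taken from `src/network_tcp.c`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_PACKET 1500   /* assumed payload limit */
#define DEMO_HDR        2      /* assumed 2-byte big-endian length prefix */

typedef struct {
    uint8_t buf[DEMO_HDR + DEMO_MAX_PACKET];
    size_t  len;               /* bytes currently buffered */
} demo_stream_t;

/* Append raw bytes as returned by recv(); -1 if they do not fit. */
int demo_stream_feed(demo_stream_t *s, const uint8_t *data, size_t n) {
    if (n > sizeof(s->buf) - s->len) return -1;
    memcpy(s->buf + s->len, data, n);
    s->len += n;
    return 0;
}

/*
 * Pop one complete packet into out.
 * Returns the payload length, 0 if more bytes are needed,
 * or -1 if the announced length is invalid or exceeds out_cap.
 */
int demo_stream_next(demo_stream_t *s, uint8_t *out, size_t out_cap) {
    if (s->len < DEMO_HDR) return 0;
    size_t plen = ((size_t)s->buf[0] << 8) | s->buf[1];
    if (plen == 0 || plen > DEMO_MAX_PACKET || plen > out_cap) return -1;
    if (s->len < DEMO_HDR + plen) return 0;      /* packet still incomplete */
    memcpy(out, s->buf + DEMO_HDR, plen);
    s->len -= DEMO_HDR + plen;
    memmove(s->buf, s->buf + DEMO_HDR + plen, s->len);  /* keep leftover bytes */
    return (int)plen;
}
```

A caller would feed each `recv()` result into the buffer and then call `demo_stream_next()` in a loop until it returns 0, since one read may contain several packets or only part of one.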

Automatic Reconnection (src/network_reconnect.c)

  • Exponential backoff: 100ms → 200ms → 400ms → ... → 30s (max)
  • 10 attempts max (~81s total), then peer removed
  • Tries UDP first, TCP fallback on each attempt
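
The backoff schedule above can be reproduced with integer math alone. The sketch below is an illustration under stated assumptions, not the code in `src/network_reconnect.c`: the `demo_*` names and constants are hypothetical, and it assumes one delay per attempt with the tenth delay capped at 30s.

```c
#include <stdint.h>

#define DEMO_BACKOFF_INITIAL_MS 100u
#define DEMO_BACKOFF_MAX_MS     30000u
#define DEMO_MAX_ATTEMPTS       10u

/* Delay before the given attempt (0-based): doubles each time, capped at the max. */
uint32_t demo_backoff_delay_ms(unsigned attempt) {
    uint32_t delay = DEMO_BACKOFF_INITIAL_MS;
    /* Stop doubling once the cap is reached, so the value cannot overflow. */
    for (unsigned i = 0; i < attempt && delay < DEMO_BACKOFF_MAX_MS; i++) {
        delay *= 2;
    }
    return delay > DEMO_BACKOFF_MAX_MS ? DEMO_BACKOFF_MAX_MS : delay;
}

/* Total time spent waiting if every attempt fails. */
uint32_t demo_backoff_total_ms(void) {
    uint32_t total = 0;
    for (unsigned a = 0; a < DEMO_MAX_ATTEMPTS; a++) {
        total += demo_backoff_delay_ms(a);
    }
    return total;
}
```

Under these assumptions the first nine delays sum to 51,100 ms and the tenth is capped at 30,000 ms, giving 81,100 ms in total, which agrees with the ~81s figure quoted above.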

Connection Monitoring (src/service.c)

  • 30s timeout triggers reconnection
  • check_peer_health() in main loop
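
A timeout check of this kind might look like the sketch below. It assumes millisecond timestamps like those stored in `last_received`; the function name, the constant, and the choice of `>=` for the comparison are assumptions, not details confirmed by this PR.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PEER_TIMEOUT_MS 30000u   /* 30s of silence triggers reconnection */

/* True when nothing has been received from the peer for the timeout period. */
bool demo_peer_timed_out(uint64_t now_ms, uint64_t last_received_ms) {
    /* Guard against a clock that appears to run backwards: treat as healthy. */
    if (now_ms < last_received_ms) return false;
    return (now_ms - last_received_ms) >= DEMO_PEER_TIMEOUT_MS;
}
```

The main loop would call this once per peer and start the reconnection logic when it returns true.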

Data Structures

typedef enum {
    TRANSPORT_UDP = 1,  // Primary: low latency
    TRANSPORT_TCP = 2,  // Fallback: works through firewalls
} transport_type_t;

typedef struct {
    // ... existing fields ...
    transport_type_t transport;
    void *transport_priv;      // TCP connection state
    void *reconnect_ctx;       // Backoff state
    uint64_t last_received;    // For timeout detection
} peer_t;

Rationale

Problem: When UDP is blocked by a firewall or NAT, streaming fails immediately, and when the network drops there is no recovery path.

Solution: Multi-tier fallback aligns with RootStream's goal of reliability without complexity:

  • UDP first (low latency preserved)
  • TCP second (works everywhere)
  • Auto-reconnect (handles transient failures)

Typical UDP latency: 1-5ms. TCP adds 20-50ms overhead but maintains connectivity through restrictive networks.

Testing

  • Built successfully (make)
  • Syntax verification (gcc -c on new modules)
  • Code review completed, feedback addressed:
    • Extracted duplicate health check → check_peer_health()
    • Integer math for backoff (removed fmin())
    • Backward iteration when removing peers
  • Basic streaming tested
  • Tested on:
    • Distro: (requires full build environment)
    • Kernel:
    • GPU & driver:

Test Scenarios to Validate:

  1. Open network → UDP streaming (verify low latency maintained)
  2. Block UDP port → TCP fallback activates (verify handshake logs)
  3. Disconnect network 5s → auto-reconnect (verify backoff timing)
  4. Peer offline → 10 retries then removal (verify cleanup)

Notes

  • Latency impact: None on UDP path. TCP adds 20-50ms when fallback activates.
  • Resource usage: Negligible (~400 bytes per peer for reconnect state, TCP socket overhead only when active)
  • Follow-up work:
    • STUN/TURN for extreme NAT scenarios (out of scope)
    • Connection quality metrics (future enhancement)
    • Integration testing on actual networks with firewall restrictions
Original prompt

PHASE 4: Robust Networking & Connection Resilience

Current State

  • src/network.c - Core UDP protocol, encryption, handshake (fully implemented)
  • src/network_stub.c - Stub when NO_CRYPTO build
  • Missing: TCP fallback when UDP blocked/fails
  • Missing: NAT traversal/relay support infrastructure
  • Missing: Automatic reconnection logic with exponential backoff
  • Missing: Connection status monitoring and recovery
  • Missing: Peer state machine improvements

Problem

Currently in network code:

  • UDP protocol is solid but has no fallback if blocked by firewall/NAT
  • Connection failures are fatal (not recoverable)
  • No automatic retry with backoff
  • No relay/TURN infrastructure for extreme cases
  • Single network path = single point of failure

On networks with:

  • Strict firewalls blocking UDP
  • NAT traversal issues
  • Packet loss
  • Intermittent connectivity

Streaming fails immediately with no recovery path.

Solution: Multi-Layer Network Resilience

Tier 1: Direct UDP (Primary)

  • File: src/network.c (already exists)
  • Status: ✅ Complete and working
  • Method: Encrypted UDP P2P
  • Performance: Lowest latency, best quality
  • Availability: Open networks, home networks

Tier 2: TCP Fallback (New)

  • File: src/network_tcp.c (NEW)
  • Method: Encrypted TCP tunnel
  • Performance: Higher latency (~20-50ms more)
  • Availability: Works when UDP blocked

Tier 3: Connection Recovery (New)

  • File: src/network_reconnect.c (NEW)
  • Method: Automatic reconnect with exponential backoff
  • Performance: Seamless reconnect on temporary failures
  • Availability: Handles network interruptions

Implementation

File 1: src/network_tcp.c - TCP Fallback Transport

/*
 * network_tcp.c - TCP fallback transport for when UDP is blocked
 *
 * Encrypted TCP tunnel for restrictive networks.
 * Uses the same encryption/packet format as UDP for compatibility.
 * Slower, but works wherever TCP is available.
 */

#include "../include/rootstream.h"
#include <stdbool.h>   /* bool, used in tcp_peer_ctx_t */
#include <stdint.h>    /* uint8_t, uint64_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <poll.h>

typedef struct {
    int fd;                    /* TCP socket FD */
    struct sockaddr_in addr;
    bool connected;
    uint64_t connect_time;
    uint8_t read_buffer[MAX_PACKET_SIZE];
    size_t read_offset;
} tcp_peer_ctx_t;

/*
 * Try to establish TCP connection to peer
 */
int rootstream_net_tcp_connect(rootstream_ctx_t *ctx, peer_t *peer) {
    if (!ctx || !peer) return -1;

    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) {
        fprintf(stderr, "ERROR: Cannot create TCP socket: %s\n", strerror(errno));
        return -1;
    }

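    /* Set non-blocking; bail out if the flags cannot be read or updated */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        fprintf(stderr, "ERROR: Cannot set TCP socket non-blocking: %s\n", strerror(errno));
        close(fd);
        return -1;
    }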

    struct sockaddr_in *addr = (struct sockaddr_in *)&peer->addr;
    
    /* Connect (non-blocking) */
    if (connect(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0) {
        if (errno != EINPROGRESS) {
            fprintf(stderr, "ERROR: TCP connect failed: %s\n", strerror(errno));
            close(fd);
            return -1;
        }
    }

    /* Wait for connection with timeout */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ret = poll(&pfd, 1, 5000);  /* 5 second timeout */

    if (ret <= 0) {
        /* ret == 0: timeout expired; ret < 0: poll() itself failed */
        fprintf(stderr, "ERROR: TCP connect %s\n",
                ret == 0 ? "timeout" : strerror(errno));
        close(fd);
        return -1;
    }

    /* Check for connection errors */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) {
        err = errno;  /* getsockopt() itself failed; report its errno */
    }
    if (err != 0) {
        fprintf(stderr, "ERROR: TCP connect error: %s\n", strerror(err));
        close(fd);
        return -1;
    }

    /* Connection successful */
    tcp_peer_ctx_t *tcp = calloc(1, sizeof(tcp_peer_ctx_t));
    if (!tcp) {
        close(fd);
        return -1;
    }

    tcp->fd = fd;
    memcpy(&tcp->addr, addr, sizeof(*addr));
    tcp->connected = true;
    tcp->connect_time = get_timestamp_ms();
    tcp->read_offset = 0;

    peer->transport_priv = tcp;
    peer->transport = TRANSPORT_TCP;
    
    printf("✓ TCP connection established to %s\n", peer->hostname);
    return 0;
}

/*
 * Send packet via TCP
 */
int rootstream_net_tcp_send(rootstream_ctx_t *ctx, peer_t *peer,
                           const uint8_t *data, size_t size) {
    if (!ctx || !peer || !data || size == 0) return -1;
    if (!peer->transport_priv) return -1;

    tcp_peer_ctx_t *tcp = (tcp_peer_ctx_t *)peer->transport_priv;
    if (!tcp->connected) return -1;

    size_t sent = 0;
    while (sent < size) {
        ssize_t ret = send(tcp->fd, data + sent, size - sent, MSG_NOSIGNAL);
        
        if (ret < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                /* Socket buffer full, try again later */
                break;
            } else {...

</details>



*This pull request was created from Copilot chat.*

Copilot AI and others added 3 commits February 13, 2026 01:51
…straction

Co-authored-by: infinityabundance <255699974+infinityabundance@users.noreply.github.com>
Co-authored-by: infinityabundance <255699974+infinityabundance@users.noreply.github.com>
…math for backoff, iterate backwards when removing peers

Co-authored-by: infinityabundance <255699974+infinityabundance@users.noreply.github.com>
Copilot AI changed the title [WIP] Add TCP fallback and connection recovery logic Add TCP fallback and automatic reconnection with exponential backoff Feb 13, 2026
@infinityabundance infinityabundance marked this pull request as ready for review February 13, 2026 02:03
@infinityabundance infinityabundance merged commit fa57584 into main Feb 13, 2026
1 of 6 checks passed
@infinityabundance infinityabundance deleted the copilot/improve-network-connection-resilience branch February 19, 2026 20:59