Update dmsgpty-ui
Cli improvements
Fix dmsgget secret key error
GetServers: replace log.Fatal with log.Error + retry loop on dmsg-discovery errors and fix broken recursive call. ListenAndServe: replace log.Fatal with returned error on dmsg listen failure. Update dependencies.
Add GET /dmsg-discovery/servers/clients to return all client entries
grouped by delegated server, and GET /dmsg-discovery/server/{pk}/clients
for a single server. Includes store, API, and client implementations.
- Simplify response format: return client PKs instead of full entry objects
- /servers/clients: { "server_pk": ["client_pk1", ...], ... }
- /server/{pk}/clients: ["client_pk1", "client_pk2", ...]
Replace global math/rand.Shuffle with a locally-seeded random generator using crypto/rand. This ensures each dmsg client connects to servers in a truly random order, preventing load imbalance when multiple clients start simultaneously.
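A minimal sketch of the locally-seeded shuffle described above, assuming the seed is read from crypto/rand and fed into a private math/rand generator. The helper names `newSeededRand` and `shuffleStrings` are hypothetical and are not taken from the dmsg codebase.

```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	mrand "math/rand"
)

// newSeededRand returns a private math/rand generator whose seed comes
// from crypto/rand, so two processes started together do not share a seed.
func newSeededRand() (*mrand.Rand, error) {
	var b [8]byte
	if _, err := crand.Read(b[:]); err != nil {
		return nil, err
	}
	seed := int64(binary.LittleEndian.Uint64(b[:]))
	return mrand.New(mrand.NewSource(seed)), nil
}

// shuffleStrings shuffles the slice in place (hypothetical helper).
func shuffleStrings(items []string) error {
	r, err := newSeededRand()
	if err != nil {
		return err
	}
	r.Shuffle(len(items), func(i, j int) {
		items[i], items[j] = items[j], items[i]
	})
	return nil
}

func main() {
	servers := []string{"server-a", "server-b", "server-c", "server-d"}
	if err := shuffleStrings(servers); err != nil {
		fmt.Println("shuffle failed:", err)
		return
	}
	fmt.Println(servers) // order varies between runs
}
```

Returning an error rather than panicking keeps the caller in control if the system entropy source is unavailable.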
* Add HTTP endpoint documentation and examples to service help menus
- dmsg-discovery: Added endpoint list and example JSON responses
- dmsg-server: Added endpoint documentation
JSON examples are colorized using tidwall/pretty.
* Remove errant ANSI escape codes from flag descriptions
The ANSI reset codes (\033[0m) and newlines (\n\r) in flag descriptions were causing issues with Cobra help menu rendering, disrupting color output and formatting. These codes have been removed from all flag descriptions across the codebase.
* Revert "Remove errant ANSI escape codes from flag descriptions"
This reverts commit 5a8ed71.
* Vendor skywire 48c9b3e7bf79 (help menu improvements)
* Vendor skywire f11c468 with coloredcobra help template fix
Updates skywire to include color functions in the custom help template for usage=false mode, enabling colored output in dmsg CLI help.
* Add complete response examples for dmsg-discovery endpoints
Includes examples for:
- GET /health
- GET /dmsg-discovery/entry/{pk} (client and server entries)
- POST /dmsg-discovery/entry/ (new and update responses)
- DEL /dmsg-discovery/entry
- GET /dmsg-discovery/entries
- GET /dmsg-discovery/visorEntries
- GET /dmsg-discovery/available_servers
- GET /dmsg-discovery/all_servers
- GET /dmsg-discovery/servers/clients
- GET /dmsg-discovery/server/{pk}/clients
* Use actual buildinfo in health example with fallback values
* Show actual JSON arrays for list endpoint examples
* Use actual DMSG server entries from embedded deployment config in examples
* Style flags with defaults on new line to match skywire services
…kycoin#340) When a dmsg server returns a non-public IP address (e.g., from a LAN dmsg server), the client now logs a warning and continues trying other servers instead of immediately returning an error. This ensures visors connected to local dmsg servers can still obtain their public IP for survey generation.
* Pass TERM environment variable from client to PTY
- Add Env field to CommandReq struct for passing environment variables
through the RPC protocol
- Update Pty.Start to accept and merge client environment variables
with host environment (client vars override host vars)
- Capture essential env vars (TERM, COLORTERM, LANG, LC_ALL) from
CLI client and pass to remote PTY
- Default to TERM=xterm-256color if not set by client
- Set TERM=xterm-256color for UI sessions (web terminal)
- Remove debug log.Print("xxxx") in ui.go
- Update dependencies
This fixes terminal rendering issues caused by missing TERM variable
when connecting to remote PTY sessions.
* Improve dmsgpty-ui with resize support and reconnection
UI improvements:
- Add terminal resize support: client sends resize events to server,
server updates PTY dimensions via SetPtySize
- Add automatic reconnection with up to 5 retry attempts
- Add visual connection status messages (connecting, connected,
disconnected, reconnecting)
- Improve terminal styling with VS Code-like dark theme
- Add cursor blinking and better font settings
- Debounce resize events to prevent excessive updates
- Fix duplicate HTML closing tags at end of term.html
Server improvements:
- Add wsReader that intercepts JSON resize messages from WebSocket
- Parse resize messages with type="resize", cols, rows fields
- Forward regular terminal data to PTY unchanged
- Add bounds checking for terminal dimensions
Use binary WebSocket mode instead of text mode for PTY data.
PTY output contains bytes that aren't valid UTF-8 (raw terminal escape sequences, binary data). Text mode WebSocket frames require valid UTF-8, causing "Could not decode a text frame as UTF-8" errors and disconnects. Binary mode handles raw terminal data correctly while the frontend already supports ArrayBuffer messages.
Changes:
- ui.go: Change websocket.MessageText to websocket.MessageBinary
…nce (skycoin#345)
* Fix accept loop exits that permanently kill stream/connection acceptance
Non-fatal errors in accept loops were causing permanent exit of the goroutine. The listener/session remains open but never accepts again, silently stopping all new connections until process restart.
Fixed in: dmsgctrl ServeListener, server session smux and yamux stream accept loops.
* Fix panics that can crash dmsg client, server, and pty host
- client.go: Fix send-on-closed-channel race in session serve goroutine by adding mutex protection and select with default case
- client.go: Fix unsafe type assertion on ctx.Value("dmsgServer") that panics if value is not a string
- stream.go: Convert prepareFields from panic to error return so callers can handle noise initialization failure gracefully
- host.go: Change whitelist error from Panic to Error log level so transient whitelist errors don't crash the pty host
- noise.go: Replace panic in RemoteStatic with error log and empty key return so corrupted handshake data doesn't crash the process
- read_writer.go: Replace panic in ReadRawFrame Discard with error return so reader errors propagate instead of crashing
* Fix CI lint errors: errcheck on writeIPRequest, naked returns in stream.go
- Add error check for prepareFields in writeIPRequest (errcheck)
- Replace all naked returns with explicit returns in writeRequest, writeIPRequest, and readRequest to satisfy nakedret linter
* Fix pre-existing lint errors in dmsgpty
- ui.go: Check bw.Close() return value (errcheck)
- ui_html.go: Remove blank lines after opening brace (gofmt)
- ui_html.go: Cap gzip decompression with LimitReader (gosec G110)
* Fix errcheck lint: use nolint directive for deferred bw.Close()
* Fix gofmt alignment on nolint comment
Use errors.Is(err, dmsg.ErrEntityClosed) to cleanly exit the accept loop on shutdown instead of logging warnings for every pending accept.
* Fix DialStream to try alternative servers on stream failure DialStream returned immediately on the first existing session's stream dial failure without trying other delegated servers. This caused persistent failures when a dmsg server relay was broken, even though other servers could relay successfully. Now both phases (existing sessions and new sessions) continue to the next server when DialStream fails, matching the fallback pattern already used by LookupIP. * Fix FallbackRoundTripper request body consumption on retry When the first transport fails, RoundTrip consumes the request body. Subsequent transports receive an empty body, causing POST/PUT requests to silently fail. Buffer the body upfront and reset it for each retry attempt. * Fix multiple bugs in core dmsg server/client and CI lint error - Fix gosec G104 lint error in FallbackRoundTripper (CI fix) - Defer wg.Done() in server session goroutine to prevent Close() hang on panic - Move delEntry inside once.Do so Server.Close() is fully idempotent - Add missing continue after empty server discovery to avoid falling through to connection logic with zero entries - Reset client backoff to initial value on successful session establishment - Fix data race: hold sessionsMx when reading c.sessions in startUpdateEntryLoop and initilizeClientEntry * Fix 32 bugs across dmsg codebase noise: - Fix DecryptWithNonceMap never recording used nonces (replay attack) - Fix TCP conn leak in establishConn on post-dial failure - Fix Listener.Accept leaking conn on handshake failure - Fix handshake goroutine leak on timeout (set deadline to unblock) - Panic on DH crypto errors instead of silently returning zero key disc: - Deep-copy DelegatedServers slice in Entry.Copy() - Fix PutEntry corrupting caller's entry sequence on failure - Fix PutEntry returning wrong error variable on Entry() failure - Drain response body before close for HTTP connection reuse dmsgcurl: - Fix response body leak on maxSize error path - Fix division by 
zero in ProgressWriter when Content-Length unknown - Replace log.Fatal with error return in Download() - Fix -t 0 (unlimited retries) doing zero iterations dmsgpty: - Add mutex to protect global whitelist state from data races - Fix open() returning stale ErrNotExist after creating config file - Buffer excess WebSocket data in wsReader instead of discarding - Remove infinite keep-alive loop from writeWSError - Store exec.Cmd and call Wait() to prevent zombie processes (Unix) - Close ConPty handle on Spawn failure (Windows) - Use defer f.Close() in WriteConfig to prevent fd leak - Fix discarded strings.ReplaceAll result in conf.go dmsgctrl: - Add write mutex to prevent concurrent conn.Write corruption - Protect c.err with mutex to fix data race in Close()/Err() - Close leaked Control+connection when ServeListener channel full dmsg-discovery: - Fix inverted nil check that always overwrites caller's logger - Add defer r.Body.Close() in delEntry handler - Fix net.ParseIP nil dereference on hostname input - Use errors.Is() for wrapped error matching in handleError dmsg-server: - Buffer error channel to prevent deadlock - Fix deferred listener close racing with running goroutine - Move mutex lock before map reads in updateAverageNumberOfPacketsPerMinute * Fix additional bugs found in second pass cmd/dmsgcurl: - Fix defer inside loop leaking response bodies on retry - Fix closeAndCleanFile always seeing nil error (closure capture) - Fix division by zero in progress writer when Content-Length unknown cmd/dmsg-discovery: - Fix recursive getServers discarding return value - Fix data race on package-level err variable from goroutines cmd/dmsgweb: - Replace TrimRight with TrimSuffix for domain suffix stripping - Preserve signal context instead of replacing with Background() cmd/dmsgwebsrv: - Preserve signal context instead of replacing with Background() pkg/noise: - Copy ReadRawFrame data before Discard to prevent buffer aliasing - Remove no-op slice expression pkg/dmsghttp: - 
Fix goroutine leak in ListenAndServe when Serve returns early - Add nil check for server.Server before accessing ServerType pkg/direct: - Use direct map lookup instead of O(n) scan in Entry() pkg/disc: - Remove duplicate nolint comment
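The FallbackRoundTripper fix above (buffer the body upfront, reset it for each attempt) can be sketched as follows. This is an illustration and not the dmsg implementation: `fallbackRoundTripper`, `roundTripFunc`, and `sendWithFallback` are hypothetical names, and the two transports are in-memory fakes.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// roundTripFunc adapts a function to http.RoundTripper (test scaffolding).
type roundTripFunc func(*http.Request) (*http.Response, error)

func (fn roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return fn(r) }

// fallbackRoundTripper tries each transport in order until one succeeds.
type fallbackRoundTripper struct {
	transports []http.RoundTripper
}

func (f fallbackRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	// Buffer the body once so every attempt can be given a fresh reader.
	var body []byte
	if req.Body != nil {
		b, err := io.ReadAll(req.Body)
		closeErr := req.Body.Close()
		if err != nil {
			return nil, err
		}
		if closeErr != nil {
			return nil, closeErr
		}
		body = b
	}
	lastErr := errors.New("no transports configured")
	for _, rt := range f.transports {
		attempt := req.Clone(req.Context())
		if body != nil {
			attempt.Body = io.NopCloser(bytes.NewReader(body))
		}
		resp, err := rt.RoundTrip(attempt)
		if err == nil {
			return resp, nil
		}
		lastErr = err
	}
	return nil, lastErr
}

// sendWithFallback posts payload through a failing transport followed by
// an echoing one, and returns what the second transport received.
func sendWithFallback(payload string) (string, error) {
	failing := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		_, _ = io.Copy(io.Discard, r.Body) // consume the body, then fail
		return nil, errors.New("first transport down")
	})
	echo := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		b, err := io.ReadAll(r.Body)
		if err != nil {
			return nil, err
		}
		return &http.Response{StatusCode: http.StatusOK, Body: io.NopCloser(bytes.NewReader(b))}, nil
	})
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/", strings.NewReader(payload))
	if err != nil {
		return "", err
	}
	rt := fallbackRoundTripper{transports: []http.RoundTripper{failing, echo}}
	resp, err := rt.RoundTrip(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	got, err := sendWithFallback("payload")
	fmt.Printf("%q %v\n", got, err)
}
```

Buffering trades memory for correctness, so a real implementation would likely pair it with a body size limit.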
e2e-style tests (pkg/dmsgtest/e2e_test.go):
- TestBidirectionalStreams: bidirectional data transfer at 32B/4KB/64KB
- TestMultiServerStreams: streams across multiple servers and clients
- TestConcurrentStreams: 20 simultaneous streams with data integrity
- TestSessionReconnect: client reconnects after server shutdown
- TestListenerAcceptAll: listener accepts multiple connections
- TestPortOccupied: duplicate listen returns ErrPortOccupied
- TestDialNonexistentClient: dial unknown PK returns ErrDiscEntryNotFound
direct client tests (pkg/direct/client_test.go):
- Entry lookup, post, delete, put operations
- AvailableServers/AllServers filtering
- AllEntries enumeration
- ClientsByServer/AllClientsByServer grouping
- GetClientEntry and GetAllEntries utility functions
ioutil tests (pkg/ioutil/buf_read_test.go):
- BufRead with exact fit, short buffer, empty data, large data
noise nonce tests (pkg/noise/nonce_test.go):
- DecryptWithNonceMap replay prevention
- Out-of-order decryption with nonce map
- Encrypt/decrypt roundtrip
- Large payload (64KB) roundtrip
* Add tests for disc, dmsghttp, dmsgctrl, dmsgcurl, dmsgpty, dmsgserver and update dependency graph Improve test coverage across core packages: - pkg/disc: 24% -> 85.6% (client lifecycle, HTTP client, entry validation) - pkg/dmsghttp: 23.8% -> 65.5% (transport, GetServers, ListenAndServe) - pkg/dmsgctrl: 49.3% -> 84.9% (ServeListener, ping/pong, concurrency) - pkg/dmsgcurl: 16.2% -> 44.2% (URL parsing, progress writer, CancellableCopy) - pkg/dmsgpty: 43.1% -> 47.5% (whitelist, RPC utils, config) - pkg/dmsgserver: 0% -> 88.2% (config generation, flush) Also update README to use `go run github.com/loov/goda@latest` and regenerate the dependency graph SVG. * Eliminate internal packages to enable external testing Move all internal packages to pkg/ so they can be imported and tested by external packages, addressing the testing infrastructure limitation. Package moves: - internal/servermetrics -> pkg/dmsg/metrics - internal/discmetrics -> pkg/disc/metrics - internal/cli + internal/flags -> pkg/dmsgclient (merged) - internal/dmsg-discovery/api -> pkg/discovery/api - internal/dmsg-discovery/store -> pkg/discovery/store - internal/dmsg-server/api -> pkg/dmsgserver (merged with existing config) - internal/fsutil -> deleted (inlined os.Stat at single call site) Only internal/e2e/ remains, containing integration test infrastructure that is legitimately test-only. 
API renames in pkg/dmsgserver: API -> ServerAPI, New -> NewServerAPI * Refactor: extract cmd boilerplate, convert to go:embed, fix panics, add CloseQuietly Command boilerplate: - Add ExecName() and Execute() helpers to pkg/dmsgclient - Replace duplicated Use: expression and Execute() in all 13 cmd packages Embedded HTML: - Convert pkg/dmsgpty/ui_html.go from 5738-line hex literal to //go:embed with term.html.gz asset file (same runtime behavior) Panic fixes (library code only, tests left as-is): - pkg/dmsg/types.go: SignBytes, MakeSignedStreamRequest/Response now return errors - pkg/dmsg/util.go: encodeGob now returns error - pkg/dmsg/const.go: shuffleServers now returns error - pkg/dmsg/metrics/victoria_metrics.go: invalid delta logs instead of panicking - pkg/dmsgcurl/dmsgcurl.go: String() returns error string instead of panicking - pkg/dmsgpty/ui.go: writeHeader returns error instead of panicking - All callers updated to handle new error returns Error suppression: - Add pkg/ioutil.CloseQuietly for deferred Close() calls Also regenerate dependency graph SVG. * Split large files and add composable sub-interfaces File splits (same package, no API changes): - pkg/dmsg/client.go (723 lines) -> client.go + client_sessions.go + client_dial.go - cmd/dmsg-discovery/commands/dmsg-discovery.go (533 lines) -> dmsg-discovery.go + examples.go - pkg/dmsgclient/cli.go (553 lines) -> cli.go + cli_fallback.go Interface segregation (backwards-compatible, existing interfaces unchanged): - pkg/disc: Add EntryReader and EntryWriter sub-interfaces; APIClient now embeds them - pkg/discovery/store: Add EntryStore, ServerLister, EntryEnumerator sub-interfaces; Storer now embeds them All existing implementations continue to satisfy the original interfaces. The new sub-interfaces allow callers to accept narrower types. 
* Fix CI lint errors: gofmt formatting and errcheck in test files - Run gofmt on cmd files with formatting issues - Add //nolint:errcheck to test cleanup Close() calls - Fix indentation in test files * Fix bugs: resource leaks, race conditions, missing error handling HIGH: - cmd/dmsgweb: Fix nil deref crash when url.Parse fails in reverse proxy - cmd/dmsgweb: Add missing wg.Done() in SOCKS5 goroutine (deadlock on shutdown) - cmd/dmsgweb: Use defer for wg.Done() in proxyHTTPConn (deadlock if panic) - pkg/dmsg/client: Make errCh send non-blocking to prevent goroutine hang MEDIUM: - pkg/dmsg/server: Server.Close() now returns actual error instead of nil - pkg/dmsg/server: Close conn when smux/yamux Server init fails (TCP leak) - pkg/dmsg/client_sessions: Close conn on makeClientSession/mux failure (TCP leak) - pkg/dmsg/client: Retry initial post on failure instead of giving up - pkg/dmsg/entity_common: Copy session keys while holding mutex (was empty) - pkg/dmsg/entity_common: Wrap error context in getServerEntry/getClientEntry - pkg/dmsg/listener: Close drained streams on listener shutdown (resource leak) - pkg/dmsgserver: Acquire mutex in SetDmsgServer (data race) - pkg/dmsgcurl: Use caller's context for dmsgC.Serve (cancellation propagation) - cmd/dmsgweb: Fix duplicate DmsgDiscURL check (second should be DmsgDiscAddr) - cmd/dmsgweb: Close both conns after io.Copy to unblock goroutine LOW: - pkg/dmsg/client: Fix typo "successed" -> "succeeded", stop ticker leak - pkg/dmsgpty: Use %w instead of %v for error wrapping * Fix CI lint: gofmt, errcheck, gosec, misspellings in test files - Run gofmt on dmsg-server commands and dmsgserver api - Add //nolint:errcheck,gosec to test helper functions - Fix "cancelled" -> "canceled" misspelling (3 test files) - Add //nolint:gosec for G304 (file variable in test) and G114 (test http.Serve) * Fix remaining CI lint: gofmt, errcheck, gosec annotations * Fix CI lint: gofmt all files, add errcheck/gosec nolint on conn.Close * Fix CI: 
add gosec to remaining nolint annotations * Update vendor dependencies - github.com/skycoin/skycoin v0.28.3 -> v0.28.5-alpha1 (90b668188f85) - github.com/skycoin/skywire v1.3.35 -> v1.3.37 - golang.org/x/crypto v0.48.0 -> v0.49.0 - golang.org/x/net v0.51.0 -> v0.52.0 - golang.org/x/sys v0.41.0 -> v0.42.0 - Various other minor updates (smux, VictoriaMetrics, etc.) * Update CI to Go 1.26.x and golangci-lint v2.11.4 - Bump go-version from 1.25.x to 1.26.x in CI workflow - Bump golangci-lint from v2.6.1 to v2.11.4 (built with go1.26) - Simplify Makefile lint target to single ./... pass * Suppress new gosec G118/G115 rules from golangci-lint v2.11.4 * Move gosec G118 nolint to go func() line where error is reported * Update Dockerfiles to Go 1.26 (matches go.mod)
* Add pprof flags to all long-running dmsg services Add --pprofmode and --pprofaddr flags matching the skywire visor pattern to dmsg-server, dmsg-discovery, dmsgweb, dmsgpty-host, dmsghttp, and dmsg-socks5. Extract shared pprof utility to pkg/cmdutil/pprof.go to eliminate duplication. Supports cpu, mem, mutex, block, trace, and http profiling modes. * Update vendor dependencies go-toml/v2 v2.2.4 -> v2.3.0 skycoin v0.28.5-alpha1 -> v0.28.5 * Fix lint errors in pprof utility Fix nakedret and gosec G104 violations by using explicit returns and handling file close errors. * Vendor skycoin commit f48988877c68 Update github.com/skycoin/skycoin to f48988877c68c8f92773008b6d73ce7f6f357d1e
* Add ephemeral keypair pool for noise handshakes
Pre-generate secp256k1 keypairs in a background goroutine and serve them from a buffered channel pool. This eliminates the per-handshake cost of EC key generation under load, allowing burst handling of concurrent handshakes without blocking on crypto operations.
* Optimize noise handshake and encrypt/decrypt hot paths
- Eliminate per-encrypt nonce buffer allocation by using a reusable [8]byte field in the Noise struct
- Pre-allocate output buffer in EncryptUnsafe to avoid append growth
- Add sync.Pool for write frame buffers to reduce allocation pressure
- Skip redundant NewPubKey/NewSecKey validation in DH() since keys are already validated by the noise state machine (ECDH still validates)
- Skip cipher.NewPubKey validation in RemoteStatic() since the key was already verified during the handshake
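A rough sketch of the buffered-channel keypair pool idea. Note the swap: the commit uses secp256k1 keys, while this example uses X25519 from the standard library's crypto/ecdh only so that it runs without external dependencies. `keyPool`, `newKeyPool`, `Get`, and `Close` are hypothetical names, and the inline-generation fallback on an empty pool is an assumption.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// keyPool hands out pre-generated keypairs from a buffered channel.
type keyPool struct {
	keys chan *ecdh.PrivateKey
	done chan struct{}
}

// newKeyPool starts a background goroutine that keeps the pool full.
func newKeyPool(size int) *keyPool {
	p := &keyPool{
		keys: make(chan *ecdh.PrivateKey, size),
		done: make(chan struct{}),
	}
	go p.fill()
	return p
}

// fill generates keys until Close is called; it blocks while the pool is full.
func (p *keyPool) fill() {
	for {
		k, err := ecdh.X25519().GenerateKey(rand.Reader)
		if err != nil {
			select {
			case <-p.done:
				return
			default:
				continue // retry on a transient entropy error
			}
		}
		select {
		case p.keys <- k:
		case <-p.done:
			return
		}
	}
}

// Get returns a pooled key, or generates one inline if the pool is empty,
// so a burst larger than the pool never blocks.
func (p *keyPool) Get() (*ecdh.PrivateKey, error) {
	select {
	case k := <-p.keys:
		return k, nil
	default:
		return ecdh.X25519().GenerateKey(rand.Reader)
	}
}

// Close stops the background generator.
func (p *keyPool) Close() { close(p.done) }

func main() {
	pool := newKeyPool(8)
	defer pool.Close()
	k, err := pool.Get()
	if err != nil {
		fmt.Println("keygen failed:", err)
		return
	}
	fmt.Println("public key bytes:", len(k.PublicKey().Bytes()))
}
```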
- actions/checkout: v3/v4 → v5
- actions/setup-go: v5 → v6
- docker/login-action: v2 → v4
- golangci/golangci-lint-action: v7 → v9
Node.js 20 actions are deprecated and will be forced to Node.js 24 starting June 2nd, 2026.
* Fix server CPU exhaustion under high stream load
- Enforce maxSessions limit: reject new TCP connections when at capacity instead of accepting and logging a debug message
- Add per-session concurrent stream limit (2048) using a semaphore to prevent a single session (e.g. setup-node) from spawning unbounded goroutines that starve the CPU
- Add backoff delay (50ms) on non-fatal stream accept errors to prevent tight CPU spin loops when persistent errors occur
- Streams that exceed the concurrency limit are immediately closed rather than queued, providing backpressure to the client
* Revert maxSessions rejection to original behavior
maxSessions only controls discovery advertisement, not connection acceptance. Services and visors connect to all servers regardless of advertised load, so rejecting sessions would break connectivity.
* Add stream read deadline and fix indentation
- Add read deadline (HandshakeTimeout) on initial stream request read so slow or malicious clients cannot hold goroutines and semaphore slots indefinitely. Deadline is cleared before the long-lived bidirectional copy loop.
- Remove stale TODO comment in server accept loop
- Fix indentation from previous revert
* Ensure pprof HTTP server remains responsive under high load
Run the pprof HTTP server on a dedicated OS thread via runtime.LockOSThread() and bump GOMAXPROCS by 1 to reserve a thread for it. This ensures the kernel scheduler gives pprof CPU time even when the Go runtime is saturated with thousands of stream-handling goroutines, which is exactly when pprof is needed most to diagnose the problem.
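The per-session stream limit can be sketched with a buffered channel used as a counting semaphore. This is a minimal illustration, assuming a non-blocking acquire so that excess streams are rejected rather than queued; `streamLimiter`, `tryAcquire`, and `release` are hypothetical names.

```go
package main

import "fmt"

// streamLimiter caps the number of concurrently served streams.
type streamLimiter struct {
	slots chan struct{}
}

func newStreamLimiter(max int) *streamLimiter {
	return &streamLimiter{slots: make(chan struct{}, max)}
}

// tryAcquire takes a slot if one is free. It never blocks: a false result
// means the caller should close the stream immediately (backpressure).
func (l *streamLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by a successful tryAcquire.
func (l *streamLimiter) release() { <-l.slots }

func main() {
	lim := newStreamLimiter(2)
	for i := 1; i <= 3; i++ {
		if lim.tryAcquire() {
			fmt.Printf("stream %d: serving\n", i)
		} else {
			fmt.Printf("stream %d: limit reached, closing\n", i)
		}
	}
	lim.release()
	fmt.Println("after release, acquire:", lim.tryAcquire())
}
```

In a server, `release` would typically run in a `defer` inside the goroutine that serves the stream, so a panic or early return cannot leak a slot.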
…skycoin#354)
Streams that complete the handshake but never receive data would block in smux.waitRead indefinitely, holding their ephemeral port forever. Over time this exhausts all ~16K ports (49152-65535) on the Porter, causing "ephemeral port space exhausted" errors for new streams.
Fix by adding a 2-minute idle timeout (StreamIdleTimeout) that is:
- Set as a read deadline after the stream handshake completes
- Refreshed on every successful read, so active streams are unaffected
- Applied on both initiating (DialStream) and responding (acceptStream)
Stale streams will time out, the caller gets an error, and the stream is closed, releasing its ephemeral port back to the pool.
* Fix audit findings: panics, deadlocks, underflow, and error handling - Replace panic with error return in updateServerEntry for empty addr - Fix integer underflow in available sessions calculation (clamp to 0) - Fix deadlock risk: move session callbacks outside sessionsMx lock, have callbacks acquire lock themselves to avoid recursive locking - Fix double-close in SessionCommon.Close: use else-if so only the active mux (smux or yamux) is closed, not both - Fix unbounded backoff growth when maxBO is 0 - Add logging when listener accept buffer is full (was silent drop) - Log when error channel is full and errors are dropped * Fix second audit pass: panics, unbounded reads, missing limits - Replace panic() with error return in hostMux.Handle and ServeConn path match — prevents crashes from malformed URL patterns - Cap PtyGateway.Read allocation to 64KB to prevent memory exhaustion from malicious or buggy RPC requests - Add MaxHeaderBytes (16KB) to dmsghttp server to mitigate slowloris - Remove stray println() debug output in dmsgpty-cli - Fix context.Background() replacing parent context in dmsghttp proxy setup — signal cancellation was being lost - Add 50ms backoff on temporary accept errors in dmsgpty host to prevent CPU spin on persistent transient errors * Fix path traversal in dmsgcurl output file handling When output is a directory, the URL path was joined directly without sanitization, allowing paths like ../../etc/passwd to escape the intended output directory. Use filepath.Base to extract only the filename component. * Vendor skywire commit a5facdc74e72 Update github.com/skycoin/skywire to a5facdc74e72d4a3562e90cf7318e0f235b6d48f Also updates skycoin, pgx, goldmark, and resolves genproto module conflict.
* Implement server-to-server mesh for cross-server client connectivity
Enable clients connected to different dmsg servers to communicate by
having servers peer with each other. This removes the scaling limitation
where clients must be on the same server to reach each other.
Design:
- Servers peer as clients to each other using existing session mechanism
(TCP + noise XK handshake + yamux), requiring no new transport code
- Peers configured via static config (no discovery dependency)
- When a server can't find destination client locally, it tries
forwarding through peer server sessions
- 1-hop maximum: peer servers only check local sessions, no further
forwarding (prevents loops without TTL)
- Original SignedObject forwarded as-is (client signature preserved)
- Backward compatible: no wire protocol changes, existing clients
work unchanged
Key changes:
- ServerConfig.Peers: static peer server list (PK + address)
- Server.peerSessions: outbound connections to peer servers
- Server.peerPKs: identifies incoming sessions as peer servers
- SessionCommon.isPeer: relaxes SrcAddr.PK check for forwarded requests
- ServerSession.forwardViaPeer: iterates peers on local lookup failure
- maintainPeerConnection: persistent connection with reconnect backoff
Config example:
"peers": [{"public_key": "02abc...", "address": "1.2.3.4:8081"}]
* Auto-discover peer servers from discovery
Servers now automatically discover and peer with all other servers
registered in dmsg discovery, in addition to statically configured
peers. A background loop queries AllServers periodically and
establishes peer connections to any new servers found.
Static config peers take priority and are always connected. Discovery-
based peers are additive — they're discovered and connected without
requiring any config changes.
This means in the current deployment, all dmsg servers will
automatically mesh with each other as long as they share the same
dmsg discovery.
* Add mesh fallback in DialStream and cross-server e2e test
DialStream now falls back to trying all existing sessions when the
target's delegated servers are unreachable. If the client's server is
meshed with the target's server, the request is forwarded through the
peer connection transparently.
The e2e test verifies: two servers peered via static config, each with
one isolated client (separate filtered discovery), cross-server dial
succeeds with bidirectional 1KB data transfer through the mesh.
* Prefer existing sessions over new connections in DialStream
Reorder DialStream to try mesh forwarding through existing sessions
before attempting to establish new server connections. The new order:
1. Existing sessions matching target's delegated servers (direct, free)
2. All other existing sessions via mesh (free, already connected)
3. New sessions to delegated servers (expensive, last resort)
This avoids unnecessary TCP+noise+yamux handshakes when the client
is already connected to meshed servers that can forward the request.
* Fix session handshake timeout and DefaultMaxSessions inconsistency
- Replace hardcoded 5s timeout in initClient/initServer with the
HandshakeTimeout constant (20s). The 5s was too aggressive and
inconsistent with the exported constant used elsewhere.
- Change DefaultMaxSessions from 100 to 2048 to match the actual
production default in dmsgserver config.
- Use dmsg.DefaultMaxSessions in dmsgserver GenerateDefaultConfig
instead of a hardcoded 2048, ensuring a single source of truth.
* Update vendor dependencies
bytedance/sonic/loader v0.5.0 -> v0.5.1
gin-contrib/sse v1.1.0 -> v1.1.1
* Add useful Makefile targets from skywire
Add targets ported from skywire's Makefile:
- update-dep: go get -u, tidy, vendor, auto-commit
- update-skywire: update skywire dep to latest develop
- update-skycoin: update skycoin dep to latest develop
- push-deps: commit and push vendor changes
- sync-upstream-develop: sync fork's develop with upstream
- tidy: standalone go mod tidy
- format now depends on tidy (like skywire)
- dep now depends on tidy
* Fix TODO audit: whitelist, waitgroup, kill workaround, stale comments
- Implement SOCKS5 whitelist enforcement: connections from PKs not in
the --wl list are now rejected (was a no-op despite accepting the flag)
- Add waitgroup to Client for clean goroutine shutdown on Close()
- Remove kill.go force-exit workaround: all commands now use
cmdutil.SignalContext for proper signal handling
- Document why timestamp tracking passes 0: concurrent streams from the
same client can arrive out of order, and noise nonce tracking already
prevents replay at the session level
- Remove resolved TODO on pty_client.go error choice
* Trigger CI re-run
* Improve test reliability for CI flaky tests
- TestControl_Ping: use require.NoError for fail-fast, close controls
in correct order (responder first) to avoid EOF race on pipe cleanup
- TestHTTPTransport_RoundTrip: use graceful srv.Shutdown() instead of
raw lis.Close() to let in-flight HTTP requests finish before closing,
preventing race between handler goroutines and listener teardown
* Fix data race on peerPKs map access
peerPKs was read in isPeerPK (from handleSession goroutines) and
written in discoverAndConnectPeers without synchronization. Protect
both accesses with peerSessionsMx.
The NonceMap (map[uint64]struct{}) grew forever on long-lived sessions,
accumulating one entry per decrypted message. For the setup-node
handling thousands of streams, this leaked megabytes of memory over time.
Replace with NonceWindow: a sliding window using a 1024-bit bitmap
(128 bytes) that tracks the highest nonce seen and the last 1024 nonces
for out-of-order replay detection. Memory usage is constant regardless
of session lifetime.
Since the transport is reliable (TCP via yamux/smux), nonces arrive
mostly in order, so a 1024-entry window is more than sufficient.
Nonces older than the window are rejected as replays.
The old NonceMap and DecryptWithNonceMap are kept but deprecated for
backward compatibility.
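A sketch of a 1024-bit sliding replay window along the lines described above. It is an illustration of the technique and not the dmsg NonceWindow itself: the type `nonceWindow`, the method `accept`, and the ring-indexed bitmap layout are assumptions, and uint64 wraparound is ignored.

```go
package main

import "fmt"

const windowSize = 1024

// nonceWindow tracks the highest nonce seen plus a bitmap of the last
// windowSize nonces. Memory use is fixed at 128 bytes for the bitmap.
type nonceWindow struct {
	highest uint64
	bits    [windowSize / 64]uint64
}

func (w *nonceWindow) isSet(n uint64) bool {
	i := n % windowSize
	return w.bits[i/64]&(1<<(i%64)) != 0
}

func (w *nonceWindow) set(n uint64) {
	i := n % windowSize
	w.bits[i/64] |= 1 << (i % 64)
}

func (w *nonceWindow) clear(n uint64) {
	i := n % windowSize
	w.bits[i/64] &^= 1 << (i % 64)
}

// accept records n and returns true if it is fresh. It returns false for
// a replay or for a nonce that has fallen behind the window.
func (w *nonceWindow) accept(n uint64) bool {
	if n > w.highest {
		if n-w.highest >= windowSize {
			w.bits = [windowSize / 64]uint64{} // everything old left the window
		} else {
			// Slots being reused belonged to nonces that are now too old.
			for m := w.highest + 1; m <= n; m++ {
				w.clear(m)
			}
		}
		w.highest = n
		w.set(n)
		return true
	}
	if w.highest-n >= windowSize {
		return false // older than the window: treated as a replay
	}
	if w.isSet(n) {
		return false // already seen
	}
	w.set(n)
	return true
}

func main() {
	var w nonceWindow
	for _, n := range []uint64{1, 3, 2, 2, 2000, 3} {
		fmt.Printf("nonce %d accepted: %v\n", n, w.accept(n))
	}
}
```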
* Update README with badges, mesh docs, and dependency graph - Replace dead Travis CI badge with GitHub Actions badges (test, deploy, release), Go Report Card, OpenSSF Scorecard, go.mod version, and Arch Linux package badges - Document server-to-server mesh architecture and configuration - Add descriptions for dmsgweb, dmsghttp, and dmsg-socks5 tools - Expand architecture section with key concepts (sessions, streams, mesh) - Regenerate dependency graph with goda * Update README and disable failing deploy workflow - Remove deploy badge (always fails), add GoDoc badge - Replace "mesh" terminology with "relay" and "server-to-server connections" for accuracy — dmsg is an anonymous relay system - Hide deploy.yml workflow by renaming to .deploy.yml (keeps the file but GitHub Actions won't run it) - Document the dial order for cross-server relay - Clarify that relay servers cannot read stream contents * Remove GoDoc badge (no license file in repo) * Skip CI tests when only docs/non-code files change Add paths-ignore to test workflow so PRs that only modify markdown, docs, LICENSE, .gitignore, or CHANGELOG don't trigger the full test suite across all three platforms.
smux (unlike yamux) has no built-in ping. Implement it using a lightweight stream-level ping protocol:
Client side (SessionCommon.Ping):
- Opens a temporary smux stream
- Writes a 2-byte zero marker [0x00, 0x00] (ping)
- Reads 2-byte echo, measures RTT
- Closes stream (5s deadline)
Server side (serveStream):
- Reads first 2 bytes of each new stream
- If [0x00, 0x00]: echoes the marker back and closes (ping response)
- Otherwise: passes the bytes through to readRequest via MultiReader
The [0x00, 0x00] marker is safe because it represents a zero-length object, which cannot occur in normal session traffic (valid SignedObjects always have length > 0). Yamux sessions continue to use the built-in yamux.Ping().
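The server-side peek-and-passthrough step can be sketched as below. This is an illustration, not the dmsg code: `serveStream` here is a simplified stand-in with an assumed signature, and `duplex` and `runStream` are hypothetical scaffolding that replaces a real smux stream with in-memory buffers.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// serveStream reads the first two bytes of a stream. A [0x00, 0x00]
// marker is echoed back as a ping reply. Anything else is handed to
// handle with the two peeked bytes put back in front of the stream.
func serveStream(rw io.ReadWriter, handle func(io.Reader) error) (bool, error) {
	var marker [2]byte
	if _, err := io.ReadFull(rw, marker[:]); err != nil {
		return false, err
	}
	if marker == [2]byte{0x00, 0x00} {
		_, err := rw.Write(marker[:])
		return true, err
	}
	return false, handle(io.MultiReader(bytes.NewReader(marker[:]), rw))
}

// duplex is an in-memory stand-in for a stream: reads come from in,
// writes are captured in out.
type duplex struct {
	in  *bytes.Reader
	out bytes.Buffer
}

func (d *duplex) Read(p []byte) (int, error)  { return d.in.Read(p) }
func (d *duplex) Write(p []byte) (int, error) { return d.out.Write(p) }

// runStream feeds input through serveStream and reports whether it was a
// ping, what was echoed, and what the request handler received.
func runStream(input []byte) (bool, []byte, []byte, error) {
	d := &duplex{in: bytes.NewReader(input)}
	var passed []byte
	isPing, err := serveStream(d, func(r io.Reader) error {
		b, rErr := io.ReadAll(r)
		passed = b
		return rErr
	})
	return isPing, d.out.Bytes(), passed, err
}

func main() {
	isPing, echoed, _, err := runStream([]byte{0x00, 0x00})
	fmt.Println(isPing, echoed, err)

	isPing, _, passed, err := runStream([]byte{0x00, 0x05, 'h', 'e', 'l', 'l', 'o'})
	fmt.Printf("%v %q %v\n", isPing, passed, err)
}
```

Using io.MultiReader means the request parser sees the exact original byte sequence, so no parsing code needs to know that two bytes were peeked.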
If Close() runs before Serve() calls wg.Add(1), the WaitGroup counter is 0, Wait() returns immediately, and then Serve() calls Add(1) on a completed WaitGroup — a data race. Check the done channel before wg.Add(1) so Serve() returns ErrClosed if the server is already shut down.
skycoin#361)

* Optimize DialStream with route caching, latency sorting, and entry caching

- Add route cache: remember which server successfully reached a destination, try it first on subsequent dials, evict on failure
- Sort sessions by measured ping latency so lowest-latency server is tried first instead of random map iteration order
- Cache discovery entry lookups with 30s TTL to avoid re-querying HTTP discovery on every request
- Background ping loop measures all session RTTs every 30s

* Fix dmsgweb proxy: propagate request context and fix error handling

- Use http.NewRequestWithContext to propagate browser request context to dmsg dial, so cancellations stop the stream dial immediately instead of waiting for the full 20s HandshakeTimeout
- Remove impossible c.String(500) after c.Status() was already written, which caused "Headers were already written" warnings in gin

* Refactor HTTPTransport to use http.Transport with dmsg DialContext

Replace manual stream-per-request dial/write/read pattern with Go's http.Transport using a custom DialContext. Keep-alives are disabled because dmsg streams use noise-encrypted per-stream handshakes that make connection reuse unreliable (server ReadTimeout can expire between requests, and POST requests cannot be retried on stale connections).

Benefits:
- Proper request context propagation through the transport
- Standard error handling and timeout support
- Removes manual wrappedBody response draining hack
- Normalizes dmsg:// URLs to http:// for Go's transport
- Cleans up idle connections on context cancellation

* Fix TCP proxy race, gin server leak, ReverseProxy Director, and HTTP timeouts

- Fix TCP proxy io.Copy race: close both connections after first copy returns to unblock the second, preventing goroutine leak
- Replace dlog.Fatal with error return on port overflow (was killing process)
- Replace gin r.Run() with http.Server and graceful Shutdown on context cancel, preventing goroutine leak on shutdown
- Pass context to proxyTCPConn/proxyHTTPConn for proper cancellation
- Fix silent ReverseProxy Director failure: parse URL before creating proxy, return 500 on parse error instead of forwarding to wrong URL
- Add 30s timeout to HTTP clients in dmsghttp/util.go to prevent hanging

* Harden dmsgweb: connection limits, body limits, close error logging

- Add connection semaphore (max 256) to server-side TCP proxy to prevent unbounded goroutine growth from many simultaneous connections
- Fix server-side TCP proxy io.Copy race: close both connections after first copy returns, wait for goroutine with done channel
- Add 10MB request body limit via http.MaxBytesReader in HTTP proxy
- Log close errors at debug level instead of silently ignoring them

* Fix CI lint errors: gosec, misspell, and unhandled errors

- Fix G104 (gosec): handle Close() errors with debug logging instead of ignoring them in TCP proxy
- Fix G112 (gosec): add ReadHeaderTimeout to HTTP server to prevent Slowloris attacks
- Fix G118 (gosec): use parent context for DialStream instead of context.Background(); add nolint for intentional Background in graceful shutdown
- Fix misspell: cancelled -> canceled in comment

* Revert HTTPTransport to direct stream approach for CI compatibility

The http.Transport wrapper with DisableKeepAlives caused timeouts on Windows CI and hangs on Linux CI due to Go's transport adding overhead (Connection: close headers, persistConn goroutines) that interacts poorly with noise-encrypted streams under concurrent load.

Revert to the proven direct approach: dial stream, write request, read response, wrap body to close stream. Keep the dmsg:// URL normalization.

* Remove unnecessary nolint:govet directive
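The TCP proxy io.Copy fix mentioned in the commits above (close both connections once the first copy returns, then wait for the second) might look roughly like this. It is a sketch under assumptions: `proxyTCP` and `demoProxy` are hypothetical names, and `net.Pipe` with an echo goroutine stands in for the real dmsg stream and upstream TCP connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// proxyTCP copies in both directions. When either direction finishes, both
// connections are closed so the other io.Copy unblocks instead of leaking.
func proxyTCP(a, b net.Conn) {
	done := make(chan struct{}, 2)
	cp := func(dst, src net.Conn) {
		_, _ = io.Copy(dst, src)
		done <- struct{}{}
	}
	go cp(a, b)
	go cp(b, a)

	<-done        // first direction finished
	_ = a.Close() // closing both ends unblocks the remaining copy
	_ = b.Close()
	<-done // wait for the second goroutine before returning
}

// demoProxy pushes a message through the proxy to an echo peer, closes the
// client side, and checks that proxyTCP returns.
func demoProxy(msg string) (string, error) {
	client, proxyA := net.Pipe()
	proxyB, upstream := net.Pipe()

	// Echo peer: whatever arrives on upstream is written straight back.
	go func() { _, _ = io.Copy(upstream, upstream) }()

	finished := make(chan struct{})
	go func() {
		proxyTCP(proxyA, proxyB)
		close(finished)
	}()

	if _, err := client.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(client, reply); err != nil {
		return "", err
	}
	_ = client.Close()

	select {
	case <-finished:
		return string(reply), nil
	case <-time.After(2 * time.Second):
		return "", errors.New("proxy goroutines did not exit")
	}
}

func main() {
	reply, err := demoProxy("hello")
	fmt.Println("reply:", reply, "err:", err)
}
```

Without the two `Close()` calls, the copy still reading from the healthy side would block indefinitely after its peer direction ended, which is the goroutine leak the commit describes.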
* Fix shutdown hang: add timeout to discovery entry deletion

Client.Close() called delEntry(context.Background()) which makes HTTP requests to the discovery server with no timeout. When discovery is accessed over dmsg (the transport being closed), the Entry() lookup falls back to HTTP-over-dmsg which hangs forever since the dmsg client is already shut down. Add a 5-second timeout context so Close() always completes.

* Add hidden --with-kill flag for force-exit safety net

Add a hidden persistent flag --with-kill that enables the force-exit goroutine (3x Ctrl+C = os.Exit). Available on all subcommands as a safety net when graceful shutdown hangs.

Usage: skywire dmsg web --with-kill
Fixes #
Changes:
How to test this PR: