This document specifies the communication protocol, message formats, and usage patterns for the DNDM library.
DNDM is designed for controlled environments where:
- The set of intents and interests is known at design time
- The set of clients/producers is known and controlled
- The primary use case is an internal message bus for robotics
- Runtime validation is minimal (known routes, trusted environment)
- Checks, limits, and validations can be added later (modular architecture)
- In-Process: Go channels via Direct endpoint (zero-copy)
- LAN: TCP/UDP via Network endpoint (computer-to-computer)
- Serial: Serial ports (computer-to-embedded, e.g., RPI ↔ RP2040/ESP32)
- User Interface: Network, NATS, or CLI on device (e.g., RPI with Bluetooth keyboard/joystick)
The system may use UDP broadcast to reduce network congestion:
- Messages are sent to all instances on the network
- Instances without interest should reject packets early (at network layer)
- Only interested instances process the message
- This allows efficient broadcast while minimizing processing overhead
All messages are framed with the following format:
[Magic Number: 4 bytes]
[Total Size: 4 bytes]
[Header Size: 4 bytes]
[Header: variable]
[Message Size: 4 bytes]
[Message: variable]
Magic Numbers:
- 0xFADABEDA: Standard frame with header
- 0xCEBAFE4A: Headerless frame (for future use)
Size Fields:
- All size fields are 32-bit big-endian unsigned integers
- Total size includes header size, message size, and size fields
- Header size is the size of the protobuf-encoded header
- Message size is the size of the protobuf-encoded message
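The framing rules above can be sketched in Go as follows. This is a minimal illustration, not the library's implementation: the names EncodeFrame, DecodeFrame, and MagicStandard are hypothetical, and the sketch assumes that Total Size counts every byte that follows the Total Size field (both inner size fields plus header and message), which the specification does not state explicitly.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// MagicStandard is the magic number of a standard frame with a header.
const MagicStandard uint32 = 0xFADABEDA

// EncodeFrame wraps already-serialized header and message bytes in a frame.
// Assumption: Total Size counts the bytes that follow the Total Size field.
func EncodeFrame(header, message []byte) []byte {
	total := 4 + len(header) + 4 + len(message)
	buf := make([]byte, 8+total)
	binary.BigEndian.PutUint32(buf[0:], MagicStandard)
	binary.BigEndian.PutUint32(buf[4:], uint32(total))
	binary.BigEndian.PutUint32(buf[8:], uint32(len(header)))
	copy(buf[12:], header)
	off := 12 + len(header)
	binary.BigEndian.PutUint32(buf[off:], uint32(len(message)))
	copy(buf[off+4:], message)
	return buf
}

// DecodeFrame validates the magic number and size fields of one complete
// frame and returns the raw header and message bytes.
func DecodeFrame(frame []byte) (header, message []byte, err error) {
	if len(frame) < 16 {
		return nil, nil, errors.New("frame too short")
	}
	if binary.BigEndian.Uint32(frame[0:]) != MagicStandard {
		return nil, nil, errors.New("unexpected magic number")
	}
	total := int(binary.BigEndian.Uint32(frame[4:]))
	if total != len(frame)-8 {
		return nil, nil, fmt.Errorf("total size %d does not match frame length", total)
	}
	hdrLen := int(binary.BigEndian.Uint32(frame[8:]))
	if hdrLen > len(frame)-16 {
		return nil, nil, errors.New("header size exceeds frame")
	}
	off := 12 + hdrLen
	msgLen := int(binary.BigEndian.Uint32(frame[off:]))
	if msgLen != len(frame)-off-4 {
		return nil, nil, errors.New("message size does not match frame")
	}
	return frame[12:off], frame[off+4:], nil
}

func main() {
	frame := EncodeFrame([]byte("hdr"), []byte("payload"))
	h, m, err := DecodeFrame(frame)
	fmt.Println(len(frame), string(h), string(m), err)
}
```

In a real stream reader, the magic number and Total Size would be read first so that the rest of the frame can be read in a single call.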
The header is a protobuf message with the following fields:
message Header {
  uint64 receive_timestamp = 1; // Set by receiver, overwritten on reception
  uint64 timestamp = 2;         // Set by sender when message is constructed
  Type type = 3;                // Message type enum
  bool want_result = 4;         // Request response
  bytes signature = 5;          // For authentication (future)
  string route = 6;             // Route identifier
}

Timestamp Handling:
- timestamp: Set by sender when message is constructed (includes marshaling overhead)
- receive_timestamp: Set by receiver when message is received (overwrites any value)
- INTENT (Type_INTENT = 2):
  - Advertises availability to publish data on a route
  - Contains: route, hops, ttl, register flag
  - Sent when Intent is created
  - Can be unregistered (register=false)
- INTENTS (Type_INTENTS = 3):
  - Batch advertisement of multiple intents
  - Contains: list of Intent messages
  - Used for efficient bulk operations
- INTEREST (Type_INTEREST = 4):
  - Advertises desire to receive data on a route
  - Contains: route, hops, ttl, register flag
  - Sent when Interest is created
  - Can be unregistered (register=false)
- INTERESTS (Type_INTERESTS = 5):
  - Batch advertisement of multiple interests
  - Contains: list of Interest messages
  - Used for efficient bulk operations
- NOTIFY_INTENT (Type_NOTIFY_INTENT = 6):
  - Notifies Intent that Interest is available
  - Contains: route
  - Sent when Intent-Interest link is established
- RESULT (Type_RESULT = 7):
  - Response to a request message
  - Contains: nonce, error code, description
  - Sent when want_result=true
- MESSAGE (Type_MESSAGE = 1):
  - Actual data payload
  - Contains: user-defined protobuf message
  - Route identifies the message type and path
- PING (Type_PING = 8):
  - Latency measurement request
  - Contains: payload
  - Answered with PONG
- PONG (Type_PONG = 9):
  - Latency measurement response
  - Contains: receive_timestamp, ping_timestamp, payload
- HANDSHAKE (Type_HANDSHAKE = 50):
  - Connection establishment
  - Contains: me (local peer), you (remote peer), stage, intents, interests
  - Stages: INITIAL, FINAL
- PEERS (Type_PEERS = 51):
  - Peer discovery/announcement
  - Contains: remove flag, list of peer IDs
- ADDRBOOK (Type_ADDRBOOK = 52):
  - Address book synchronization
  - Contains: address book entries
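For quick reference, the type values listed above can be collected into a Go enumeration. This is a hand-written sketch with hypothetical Go names; in practice the generated protobuf code would normally supply the equivalent constants, and only the numeric values are taken from this document.

```go
package main

import "fmt"

// Type mirrors the message type enum carried in the header.
type Type int32

const (
	TypeMessage      Type = 1
	TypeIntent       Type = 2
	TypeIntents      Type = 3
	TypeInterest     Type = 4
	TypeInterests    Type = 5
	TypeNotifyIntent Type = 6
	TypeResult       Type = 7
	TypePing         Type = 8
	TypePong         Type = 9
	TypeHandshake    Type = 50
	TypePeers        Type = 51
	TypeAddrbook     Type = 52
)

// typeNames maps each known value to the name used in this document.
var typeNames = map[Type]string{
	TypeMessage: "MESSAGE", TypeIntent: "INTENT", TypeIntents: "INTENTS",
	TypeInterest: "INTEREST", TypeInterests: "INTERESTS",
	TypeNotifyIntent: "NOTIFY_INTENT", TypeResult: "RESULT",
	TypePing: "PING", TypePong: "PONG", TypeHandshake: "HANDSHAKE",
	TypePeers: "PEERS", TypeAddrbook: "ADDRBOOK",
}

// String returns the documented name, or a numeric fallback for unknown values.
func (t Type) String() string {
	if name, ok := typeNames[t]; ok {
		return name
	}
	return fmt.Sprintf("Type(%d)", int32(t))
}

func main() {
	fmt.Println(TypeIntent, TypeHandshake, Type(99))
}
```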
Format: TypeName@path
Examples:
- Foo@example.foobar
- Bar@sensors.temperature
- Image@cameras.front
Rules:
- Path must not contain @ or # characters
- TypeName is the protobuf message name
- Path is hierarchical (dot-separated)
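A plain route could be split and validated along these lines. ParseRoute is a hypothetical helper written for illustration, not part of the DNDM API; it enforces only the rules listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseRoute splits a route of the form TypeName@path and checks that the
// path contains neither '@' nor '#'.
func ParseRoute(route string) (string, string, error) {
	typeName, path, ok := strings.Cut(route, "@")
	if !ok || typeName == "" || path == "" {
		return "", "", fmt.Errorf("route %q is not in TypeName@path form", route)
	}
	if strings.ContainsAny(path, "@#") {
		return "", "", fmt.Errorf("path %q contains '@' or '#'", path)
	}
	return typeName, path, nil
}

func main() {
	typeName, path, err := ParseRoute("Bar@sensors.temperature")
	fmt.Println(typeName, path, err)

	// The hierarchical path can be split into its dot-separated segments.
	fmt.Println(strings.Split(path, "."))
}
```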
Format: prefix#Base64Hash
Examples:
- example#AbCdEfGhIjKlMnOpQrStUvWxYz1234567890
- sensors#XYZ123...
Rules:
- Prefix must not contain @ or # characters
- Hash is Base64-encoded SHA1 of TypeName@fullpath
- Prefix is used for routing, hash for security
Format: scheme://address/path?param1=value1&param2=value2
Examples:
- tcp://192.168.1.100:8080/robot.sensors
- udp://example.com:9999/cloud.processors
- serial:///dev/ttyUSB0/embedded.actuators?baud=115200
Components:
- scheme: Transport protocol (tcp, udp, serial, etc.)
- address: Network address or device path
- path: Hierarchical peer path for routing
- params: Query parameters for configuration
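Peer addresses in this form can be taken apart with Go's standard net/url package. The PeerAddr type and ParsePeerAddr function below are hypothetical names used for illustration. For addresses without a host part, such as the serial example, the sketch assumes the last path segment is the peer path and everything before it is the device path; the specification does not define that split.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// PeerAddr holds the four components of a peer address.
type PeerAddr struct {
	Scheme  string
	Address string
	Path    string
	Params  url.Values
}

// ParsePeerAddr splits scheme://address/path?params into its components.
func ParsePeerAddr(raw string) (PeerAddr, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return PeerAddr{}, err
	}
	if u.Scheme == "" {
		return PeerAddr{}, errors.New("peer address has no scheme")
	}
	addr, path := u.Host, strings.TrimPrefix(u.Path, "/")
	if addr == "" {
		// Assumption: host-less form, e.g. serial:///dev/ttyUSB0/peer.path.
		i := strings.LastIndex(u.Path, "/")
		if i < 0 {
			return PeerAddr{}, errors.New("peer address has no address or path")
		}
		addr, path = u.Path[:i], u.Path[i+1:]
	}
	return PeerAddr{Scheme: u.Scheme, Address: addr, Path: path, Params: u.Query()}, nil
}

func main() {
	for _, raw := range []string{
		"tcp://192.168.1.100:8080/robot.sensors",
		"serial:///dev/ttyUSB0/embedded.actuators?baud=115200",
	} {
		a, err := ParsePeerAddr(raw)
		fmt.Printf("%+v %v\n", a, err)
	}
}
```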
// Publisher
intent, err := router.Publish("example.data", &MyMessage{})
if err != nil {
	// handle error
}
defer intent.Close()

// Wait for interest before sending
select {
case <-intent.Interest():
	// A subscriber is linked; send messages
	intent.Send(ctx, &MyMessage{Data: "hello"})
case <-ctx.Done():
	// Context was cancelled before any interest appeared
}

// Subscriber
interest, err := router.Subscribe("example.data", &MyMessage{})
if err != nil {
	// handle error
}
defer interest.Close()

// Receive messages
for msg := range interest.C() {
	myMsg, ok := msg.(*MyMessage)
	if !ok {
		continue // skip messages of an unexpected type
	}
	// Process message
	_ = myMsg
}

// Multiple publishers can publish to same route
go publisher1(ctx, router)
go publisher2(ctx, router)
// Multiple subscribers can subscribe to same route
go subscriber1(ctx, router)
go subscriber2(ctx, router)

Routes are matched exactly:
- Foo@example matches only Foo@example
- Foo@example.bar is different from Foo@example
Peer paths use prefix matching:
- Peer path example.foo matches routes starting with example.foo
- Route example.foo.bar matches peer path example.foo
- Route example.bar does not match peer path example.foo
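The two matching rules can be expressed in a few lines of Go. RouteEqual and PeerPathMatches are hypothetical helpers, not library functions. The sketch applies a plain string prefix test, as the text describes; under that reading a route path such as example.foobar would also match peer path example.foo, and the specification does not say whether matching should instead stop at segment boundaries.

```go
package main

import (
	"fmt"
	"strings"
)

// RouteEqual reports whether two routes are the same route (exact match).
func RouteEqual(a, b string) bool {
	return a == b
}

// PeerPathMatches reports whether a route path falls under a peer path,
// using a plain string prefix test.
func PeerPathMatches(peerPath, routePath string) bool {
	return strings.HasPrefix(routePath, peerPath)
}

func main() {
	fmt.Println(RouteEqual("Foo@example", "Foo@example.bar"))
	fmt.Println(PeerPathMatches("example.foo", "example.foo.bar"))
	fmt.Println(PeerPathMatches("example.foo", "example.bar"))
}
```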
- Message Ordering:
  - Should messages maintain ordering across network?
  - Should we support ordered vs unordered delivery?
  - How to handle out-of-order messages?
- Message Delivery:
  - Should we support guaranteed delivery?
  - How to handle message acknowledgments?
  - Should we support message retransmission?
- Message Priority:
  - Should we support message priority levels?
  - How to handle priority in routing?
  - Should control messages have higher priority?
- Advertisement Timing:
  - When should intents/interests be advertised?
  - Should we support re-advertisement?
  - How to handle advertisement failures?
- Batch Advertisement:
  - When should INTENTS/INTERESTS be used?
  - Should we support automatic batching?
  - How to handle batch size limits?
- Advertisement Scope:
  - Should advertisements be scoped to peer?
  - How to handle advertisement propagation?
  - Should we support advertisement filtering?
- Route Discovery:
  - How should routes be discovered in mesh networks?
  - Should we support route advertisement protocol?
  - How to handle route conflicts?
- Route Caching:
  - Should we cache route mappings?
  - How to handle route updates?
  - Should we support route expiration?
- Wildcard Routes:
  - Should we support wildcard routes?
  - How to match wildcard routes?
  - Should we support regex routes?
- Connection Establishment:
  - How should connections be established?
  - Should we support connection retries?
  - How to handle connection failures?
- State Synchronization:
  - How should state be synchronized after connection?
  - Should we exchange intents/interests during handshake?
  - How to handle state conflicts?
- Peer Discovery:
  - How should peers discover each other?
  - Should we support peer announcement?
  - How to handle peer updates?
- Error Propagation:
  - How should errors be propagated?
  - Should we support error channels?
  - How to distinguish error types?
- Error Recovery:
  - How should errors be recovered?
  - Should we support automatic retry?
  - How to handle permanent errors?
- Error Reporting:
  - What errors should be reported to application?
  - Should we support error callbacks?
  - How to handle protocol errors?
- Message Batching:
  - When should messages be batched?
  - How to determine batch size?
  - How to handle batch timeouts?
- Compression:
  - Should we compress messages?
  - Which compression algorithms?
  - When should compression be used?
- Flow Control:
  - Should we implement flow control?
  - How to handle backpressure?
  - Should we support rate limiting?
- Authentication:
  - How should peers authenticate?
  - Should we support TLS? mTLS?
  - How to handle authentication failures?
- Authorization:
  - How to control access to routes?
  - Should we support ACLs?
  - How to handle authorization failures?
- Message Signing:
  - Should messages be signed?
  - Which signing algorithm?
  - How to manage signing keys?
- Type Safety:
  - How to improve type safety at API level?
  - Should we support typed wrappers?
  - How to avoid type casting?
- Error Handling:
  - Should we use error channels?
  - How to distinguish error types?
  - Should we support error callbacks?
- Configuration:
  - How should the library be configured?
  - Should we support configuration files?
  - How to handle configuration validation?
- EasyRobot Integration:
  - How to migrate from EasyRobot?
  - What compatibility should be maintained?
  - How to handle breaking changes?
- Multi-Device Communication:
  - How to handle multiple sensors/actuators?
  - How to handle multi-board SOCs?
  - How to integrate with online servers?
- Deployment:
  - How to deploy in different environments?
  - How to handle configuration across devices?
  - How to manage updates?
- Unit Testing:
  - How to test intent/interest behavior?
  - How to test network code?
  - How to test concurrent behavior?
- Integration Testing:
  - How to test mesh networks?
  - How to test failure scenarios?
  - How to test performance?
- End-to-End Testing:
  - How to test complete systems?
  - How to test multi-device scenarios?
  - How to test distributed behavior?