Description
Nodes must be able to update their software at different times. To avoid errors due to incompatible versions, nodes must know for each peer message what version the peer used.
#606 adds versioning per node over the /status endpoint. Nodes with a version mismatch are treated as offline. This works most of the time and is therefore a valid solution to the immediate problem.
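For illustration only, the status-based check could look roughly like the sketch below. This is not the code from #606; `PeerStatus`, `peers_considered_online`, and the version strings are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerStatus:
    """Hypothetical view of what a peer reports on /status."""
    node_id: str
    version: str


def peers_considered_online(local_version, statuses):
    """Return the IDs of peers whose reported version equals ours.

    Peers with any other version are treated as offline.
    """
    return [s.node_id for s in statuses if s.version == local_version]


statuses = [
    PeerStatus("a", "1.2.0"),
    PeerStatus("b", "1.3.0"),  # mismatch, treated as offline
    PeerStatus("c", "1.2.0"),
]
online = peers_considered_online("1.2.0", statuses)
```

Note that the check only looks at what each peer reports at the moment of the status call.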
However, this isn't a complete design. Checking a peer's version now and applying it to every message from that peer that we process from now on is not correct. A message we process may have been sent in the past, under a different version. Delays can happen for many reasons, from network latency to node-internal buffers.
We discussed that a version field on every message would probably be the cleanest solution. It would give a clear architectural boundary and ensure we never handle messages that are not compatible.
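A minimal sketch of what such a boundary could look like, assuming a simple envelope type and strict version equality. `Envelope`, `IncompatibleVersion`, and `accept` are hypothetical names, not existing code.

```python
from dataclasses import dataclass


class IncompatibleVersion(Exception):
    """Raised when a message was created under a version we cannot handle."""


@dataclass(frozen=True)
class Envelope:
    """Hypothetical wrapper around every peer message."""
    sender: str
    version: str  # version the sender was running when it created the message
    payload: bytes


def accept(envelope, local_version):
    """Boundary check: hand out the payload only if the versions match."""
    if envelope.version != local_version:
        raise IncompatibleVersion(
            f"message from {envelope.sender} uses {envelope.version}, "
            f"local node runs {local_version}"
        )
    return envelope.payload


# A message created before the sender upgraded still carries the old version,
# so it is rejected no matter what the sender reports on /status today.
stale = Envelope(sender="a", version="1.2.0", payload=b"old")
fresh = Envelope(sender="a", version="1.3.0", payload=b"new")
```

With this shape, the decision depends only on the message itself, not on when the peer's status was last checked.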
This wouldn't replace the version field on /status, as we still need to know which nodes to include in new protocols.
From a practical point of view, however, it may not be such a high priority. Because a node's version only changes when it restarts, which takes a while, it is hard to think of practical cases where the current approach fails. As long as a restart also clears all pending messages, it's basically impossible to have messages buffered locally from a peer running an older or a newer version.
But it will become a practical issue if:
- we add persistent queues for received peer messages
- we start accepting messages from other versions and use more fine-grained compatibility resolution
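To make the second case concrete, here is a rough sketch of one possible fine-grained rule, applied to the version a message was created under. It assumes "major.minor" style version strings and a rule of "same major, message minor not newer than ours". Both the rule and the names `parse_version` and `is_compatible` are assumptions for illustration, not a decided design.

```python
def parse_version(text):
    """Parse 'major.minor[.patch]' into a (major, minor) tuple of ints."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def is_compatible(message_version, local_version):
    """Assumed rule: same major version, and the message's minor version
    is not newer than what the local node understands."""
    msg_major, msg_minor = parse_version(message_version)
    loc_major, loc_minor = parse_version(local_version)
    return msg_major == loc_major and msg_minor <= loc_minor
```

A rule like this can only be evaluated per message if each message records the version it was created under. The same holds for a persistent queue: a message read back after a restart would need its own version, since the peer's current /status says nothing about it.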
My suggestion: Postpone per-message versioning until we need it. As the system evolves and upgrades become more dynamic, we just have to keep in mind that the current design is somewhat brittle.