Add support of transient federates under decentralized coordination and physical connections#574

Open
ChadliaJerad wants to merge 24 commits into transient-fed from transient-fed-dec

@ChadliaJerad ChadliaJerad commented Apr 4, 2026

This PR builds on the centralized transient support from PR #358 on the transient-fed branch. It extends the reactor-c runtime to support transient federates under decentralized coordination, where all connections are P2P and the RTI forwards no messages. It also adds support for physical connections involving transient federates under centralized coordination, since these are P2P connections as well.

lf-lang/lingua-franca#2609

Protocol extensions for decentralized transients

Connection

  • lf_connect_to_federate(), called when executing the preamble, now accepts an is_transient flag. Transient outbounds are queried once (no retry loop) and connected only when the RTI sends MSG_TYPE_OUTBOUND_CONNECTED. Non-transient outbounds keep the original pattern, retrying every ADDRESS_QUERY_RETRY_INTERVAL until connected.
  • The RTI keeps track of which transients are outbound peers of each federate in the outbound_transients array and updates number_of_outbound_transients accordingly.
  • MSG_TYPE_OUTBOUND_CONNECTED and MSG_TYPE_OUTBOUND_DISCONNECTED are new messages. The RTI sends them to a federate when one of its downstream transient peers connects or disconnects. When an outbound transient connects, the federate queries its address from the RTI and connects to it. When it disconnects, the federate skips sending messages to it, thus avoiding an error writing to a broken pipe.
  • MSG_TYPE_ADDRESS_QUERY is extended with an is_transient byte so the RTI can register the querying federate's outbound-transient relationships.
  • outbound_p2p_connection_is_transient[NUMBER_OF_FEDERATES] and inbound_p2p_connection_is_transient[NUMBER_OF_FEDERATES] are added to federate_instance_t.
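
The connect/disconnect bookkeeping described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the sketch_ names, the message codes, the federation size, and the outbound_is_connected array are all assumptions made for the example. Only the idea of a per-peer "is transient" flag comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUMBER_OF_FEDERATES 4 // Assumed federation size for the sketch.

// Hypothetical message codes; the runtime defines its own values.
enum sketch_msg {
    SKETCH_OUTBOUND_CONNECTED,
    SKETCH_OUTBOUND_DISCONNECTED
};

// Hypothetical per-federate bookkeeping, loosely mirroring the
// outbound_p2p_connection_is_transient array described above.
typedef struct {
    bool outbound_is_transient[SKETCH_NUMBER_OF_FEDERATES];
    bool outbound_is_connected[SKETCH_NUMBER_OF_FEDERATES];
} sketch_federate_t;

// Record an RTI notification about an outbound transient peer.
// Where this sketch only flips a flag, a real handler would also query the
// peer's address and open (or close) the P2P socket.
void sketch_on_outbound_notification(sketch_federate_t* fed, enum sketch_msg msg, uint16_t peer) {
    assert(peer < SKETCH_NUMBER_OF_FEDERATES);
    if (!fed->outbound_is_transient[peer]) {
        return; // Persistent peers are handled by the startup retry loop.
    }
    fed->outbound_is_connected[peer] = (msg == SKETCH_OUTBOUND_CONNECTED);
}

// Decide whether to write to a peer. Persistent peers are assumed to be
// connected once startup has completed.
bool sketch_should_send(const sketch_federate_t* fed, uint16_t peer) {
    assert(peer < SKETCH_NUMBER_OF_FEDERATES);
    return !fed->outbound_is_transient[peer] || fed->outbound_is_connected[peer];
}
```

Checking sketch_should_send() before each write is what lets a sender skip a departed transient instead of hitting a broken pipe.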

Adaptations

  • mark_inputs_known_absent(): for inbound transients, the tag is set to env->current_tag instead of FOREVER_TAG when the P2P socket closes. FOREVER_TAG permanently blocks ports from being updated after a transient rejoins, causing spurious "Attempt to update to earlier tag" errors and outbound STP violations. env->current_tag is sufficient to unblock the scheduler while leaving the port open to future updates.
  • Deferred P2P connection in get_start_time_from_rti(): a MSG_TYPE_OUTBOUND_CONNECTED that arrives during the start-time handshake is now deferred. The federate ID is drained from the socket immediately, but lf_connect_to_federate() (which itself reads from the RTI socket) is called only after MSG_TYPE_TIMESTAMP is received. This eliminates a race condition where the address-query reply consumed the timestamp bytes, leading to "Unexpected reply of type 2".
  • notify_federate_disconnected(): now calls send_outbound_disconnected_locked() in addition to the existing send_upstream_disconnected_locked() calls, so inbound federates of a departing transient close their outbound P2P sockets.
  • RTI shutdown: suppressed the spurious "WARNING: Failed to accept the socket. Invalid argument" message, which fired because accept() returns EINVAL when unblocked by the intentional shutdown_socket() call at the end of execution.
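
The deferral in get_start_time_from_rti() can be illustrated with a small sketch that reads from an in-memory byte stream instead of a socket. Everything named sketch_ is hypothetical, as are the message codes and the 2-byte ID / 8-byte timestamp layout in host byte order. The real runtime uses its own wire encoding and socket helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PENDING 8 // Assumed bound on deferred notifications.

// Hypothetical message codes, chosen only for this sketch.
enum { SKETCH_MSG_TIMESTAMP = 2, SKETCH_MSG_OUTBOUND_CONNECTED = 30 };

// In-memory stand-in for the RTI socket.
typedef struct {
    const uint8_t* data;
    size_t len;
    size_t pos;
} sketch_stream_t;

// Federate IDs whose connection is postponed until the handshake is done.
typedef struct {
    uint16_t ids[SKETCH_MAX_PENDING];
    size_t count;
} sketch_pending_t;

// Copy n bytes out of the stream; return -1 if fewer than n remain.
static int sketch_read(sketch_stream_t* s, void* out, size_t n) {
    if (s->len - s->pos < n) {
        return -1;
    }
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return 0;
}

// Read messages until the timestamp arrives. Notifications seen before it
// are drained and queued, so nothing else reads the stream in between.
// Returns 0 on success and -1 on a short read, overflow, or unknown type.
int sketch_get_start_time(sketch_stream_t* rti, int64_t* start_time, sketch_pending_t* pending) {
    assert(rti != NULL && start_time != NULL && pending != NULL);
    pending->count = 0;
    for (;;) {
        uint8_t type;
        if (sketch_read(rti, &type, 1) != 0) {
            return -1;
        }
        if (type == SKETCH_MSG_OUTBOUND_CONNECTED) {
            uint16_t id;
            if (sketch_read(rti, &id, sizeof id) != 0 || pending->count == SKETCH_MAX_PENDING) {
                return -1;
            }
            pending->ids[pending->count++] = id; // Connect later, not now.
        } else if (type == SKETCH_MSG_TIMESTAMP) {
            return sketch_read(rti, start_time, sizeof *start_time);
        } else {
            return -1;
        }
    }
}
```

After the function returns successfully, the caller would walk pending->ids and connect to each peer. At that point the timestamp has been consumed, so an address-query reply can no longer be confused with it.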

Tracing and visualization (fedsd)

  • send_OUTBOUND_CONNECTED / receive_OUTBOUND_CONNECTED and send_OUTBOUND_DISCONNECTED / receive_OUTBOUND_DISCONNECTED trace events are added to trace_types.h and fedsd.py.
  • P2P_MSG arrows now connect sender and receiver correctly in fedsd: matching is done by physical-time ordering rather than partner_id (which is -1 on both sides for direct P2P tracepoints).
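
To make the physical-time matching concrete, here is one way such a pairing could work, written in C for consistency with the runtime even though fedsd itself is a Python tool. The event struct, the function name, and the matching rule (same logical tag, earliest unmatched receive that is not earlier than the send) are assumptions for illustration. The actual rule in fedsd.py may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical trace record; real trace rows carry more fields.
typedef struct {
    int64_t physical_time;
    int64_t logical_time;
    uint32_t microstep;
    int match; // Index of the paired event in the other array, or -1.
} sketch_trace_event_t;

// Pair each send with the earliest unmatched receive that has the same tag
// and a physical time not earlier than the send. Both arrays are assumed to
// be sorted by physical_time in ascending order. Returns the number of pairs.
size_t sketch_match_p2p(sketch_trace_event_t* sends, size_t n_sends,
                        sketch_trace_event_t* recvs, size_t n_recvs) {
    assert(sends != NULL || n_sends == 0);
    assert(recvs != NULL || n_recvs == 0);
    for (size_t i = 0; i < n_sends; i++) {
        sends[i].match = -1;
    }
    for (size_t j = 0; j < n_recvs; j++) {
        recvs[j].match = -1;
    }
    size_t pairs = 0;
    for (size_t i = 0; i < n_sends; i++) {
        for (size_t j = 0; j < n_recvs; j++) {
            if (recvs[j].match < 0 &&
                recvs[j].logical_time == sends[i].logical_time &&
                recvs[j].microstep == sends[i].microstep &&
                recvs[j].physical_time >= sends[i].physical_time) {
                sends[i].match = (int)j;
                recvs[j].match = (int)i;
                pairs++;
                break;
            }
        }
    }
    return pairs;
}
```

A rule like this does not depend on partner_id, but it does assume that physical clocks on the two sides are close enough for a receive never to appear earlier than its send.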

@ChadliaJerad ChadliaJerad changed the title Transient fed dec Add support of transient federates under decentralized coordination and physical connections Apr 4, 2026
…dicating whether each federate is transient. This is useful for outbound messages manipulation
…f lf_handle_p2p_connection_to_transients thread
…dshake completes, avoiding the interleaved socket read that may cause a crash
…of downstream and inbound instead of upstream