Add support of transient federates under decentralized coordination and physical connections #574
Open
ChadliaJerad wants to merge 24 commits into `transient-fed`
…dicating whether each federate is transient. This is useful for outbound messages manipulation
…f lf_handle_p2p_connection_to_transients thread
…dshake completes, avoiding the interleaved socket read that may cause a crash
…of downstream and inbound instead of upstream
d3093b6 to 2900198
…into transient-fed-dec
…into transient-fed-dec
This PR builds on the centralized transient support from PR #358 on the `transient-fed` branch, extending the `reactor-c` runtime to support transient federates under decentralized coordination, where all connections are P2P and no RTI message forwarding occurs. It also adds support for physical connections involving transient federates under centralized coordination, which are P2P connections as well.

lf-lang/lingua-franca#2609
Protocol extensions for decentralized transients
Connection
- `lf_connect_to_federate()`, called when executing the preamble, now accepts an `is_transient` flag. Transient outbounds are queried once (no retry loop) and connected only when the RTI sends `MSG_TYPE_OUTBOUND_CONNECTED`. Non-transient outbounds still use the original pattern and keep retrying every `ADDRESS_QUERY_RETRY_INTERVAL` until connected.
- … `outbound_transients` array and updates `number_of_outbound_transients` accordingly.
- `MSG_TYPE_OUTBOUND_CONNECTED` and `MSG_TYPE_OUTBOUND_DISCONNECTED` are new messages. The RTI sends them to a federate when one of its downstream peers connects or disconnects. When an outbound transient connects, the federate queries that transient's address from the RTI. When it disconnects, the federate skips sending messages to it, which avoids an error from writing to a broken pipe.
- `MSG_TYPE_ADDRESS_QUERY` is extended with an `is_transient` byte so the RTI can register the querying federate's outbound-transient relationships.
- `outbound_p2p_connection_is_transient[NUMBER_OF_FEDERATES]` and `inbound_p2p_connection_is_transient[NUMBER_OF_FEDERATES]` are added to `federate_instance_t`.

Adaptations
- `mark_inputs_known_absent()`: for inbound transients, the tag is set to `env->current_tag` instead of `FOREVER_TAG` when the P2P socket closes. `FOREVER_TAG` permanently blocks ports from being updated after a transient rejoins, causing spurious `Attempt to update to earlier tag` errors and outbound STP violations. `env->current_tag` is sufficient to unblock the scheduler while leaving the port open to future updates.
- `get_start_time_from_rti()`: a `MSG_TYPE_OUTBOUND_CONNECTED` arriving during the start-time handshake is now deferred. The federate ID is drained from the socket immediately, but `lf_connect_to_federate()` (which itself reads from the RTI socket) is called only after `MSG_TYPE_TIMESTAMP` is received. This eliminates a race condition in which the address-query reply consumed the timestamp bytes, leading to `Unexpected reply of type 2`.
- `notify_federate_disconnected()`: now calls `send_outbound_disconnected_locked()` in addition to the existing `send_upstream_disconnected_locked()` calls, so inbound federates of a departing transient close their outbound P2P sockets.
- … `WARNING: Failed to accept the socket. Invalid argument` that fired because `accept()` returns `EINVAL` when unblocked by the intentional `shutdown_socket()` call at end of execution.

Tracing and visualization (`fedsd`)

- `send_OUTBOUND_CONNECTED` / `receive_OUTBOUND_CONNECTED` and `send_OUTBOUND_DISCONNECTED` / `receive_OUTBOUND_DISCONNECTED` trace events are added to `trace_types.h` and `fedsd.py`.
- `P2P_MSG` arrows now connect sender and receiver correctly in `fedsd`: matching is done by physical-time ordering rather than `partner_id` (which is `-1` on both sides for direct P2P tracepoints).
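
For reviewers unfamiliar with the arrow matching, the physical-time ordering idea from the last item can be sketched roughly as follows. This is not the code in `fedsd.py`: the function name and the record layout (`kind`, `federate`, `physical_time`) are hypothetical, and the greedy oldest-first pairing is only one assumed way to match tracepoints when `partner_id` is unavailable.

```python
def match_p2p_by_physical_time(events):
    """Pair each receive with the oldest unmatched send from another federate.

    `events` is a list of dicts with hypothetical keys:
      "kind": "send" or "receive"
      "federate": id of the federate that recorded the tracepoint
      "physical_time": physical timestamp of the tracepoint (e.g. in ns)
    Returns a list of (send_event, receive_event) pairs, one per arrow.
    """
    pending_sends = []  # unmatched sends, oldest first
    arrows = []
    # sorted() is stable, so tracepoints with equal timestamps keep input order.
    for ev in sorted(events, key=lambda e: e["physical_time"]):
        if ev["kind"] == "send":
            pending_sends.append(ev)
            continue
        # Receive: take the oldest pending send recorded by a different federate.
        for i, send in enumerate(pending_sends):
            if send["federate"] != ev["federate"]:
                arrows.append((pending_sends.pop(i), ev))
                break
        # A receive with no candidate send is left unmatched (no arrow drawn).
    return arrows
```

Because a send is only eligible once it appears earlier in physical time, every arrow produced this way points forward in time. The sketch says nothing about clock skew between hosts, which a real matcher may need to consider.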