This repository was archived by the owner on Jun 2, 2025. It is now read-only.

In the case of the connection to photon failing, need to define the recovery and visibility of the failure #61

@daviddawson

Currently, when the connected photon restarts, the following happens:

  • Shutting down shared-route due to channel failure - Muon Core detects the photon instance is no longer available
  • Error in subscription Connection lost to remote service, the channel has shut down due to a transport failure - Newton drops the active replay
  • Subscribing to event stream 'newton-sample/Task' for full local replay - attempts to reconnect to the streams
  • NewtonEvent subscription has ended, will attempt to reconnect in 5000ms - backoff behaviour
  • Subscribing from index 26 to event stream saga-manager-newton-sample/Task 'newton-sample/Task' - Connection succeeds, replay now continues.

The issue here is the gap between disconnect and reconnect, during which events can still be emitted. Currently MuonEventSourceRepository calls EventClient.event() without checking the return value. This means that events can be emitted, but not fully persisted.

Adding a check that fails on event failure will then expose a second issue: what to do when a failure occurs. A failed emit takes a maximum of 1000ms (less in some circumstances) to be declared an error. If photon drops, a full reconnect over the AMQP transport takes on the order of 5-7s. As such, failure is fairly expensive. When an event emit fails, should the event persist operation be retried, or should it simply fail? Lastly, should the event protocol be updated to have a fallback SEDA mode, so that the transport can give further reliability?
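To make the retry-or-fail question concrete, here is a minimal sketch of a bounded retry around an emit call. It does not use the real muon-java API: `EmitResult`, `EventPersistException` and `emitOrFail` are hypothetical names, and the `Supplier` stands in for a call to `EventClient.event()`, whose actual return type is not shown in this issue. The attempt count and backoff are placeholders, not recommendations.

```java
import java.util.function.Supplier;

public class EmitRetrySketch {

    /** Hypothetical stand-in for the value returned by an event emit. */
    public static final class EmitResult {
        public final boolean persisted;
        public final String cause;

        public EmitResult(boolean persisted, String cause) {
            this.persisted = persisted;
            this.cause = cause;
        }
    }

    /** Hypothetical exception raised when an event could not be persisted. */
    public static final class EventPersistException extends RuntimeException {
        public EventPersistException(String message) {
            super(message);
        }
    }

    /**
     * Calls emit up to maxAttempts times, sleeping backoffMs between attempts.
     * Returns the first successful result, or throws once attempts run out,
     * so a failed persist is never silently ignored.
     */
    public static EmitResult emitOrFail(Supplier<EmitResult> emit, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        EmitResult last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            last = emit.get();
            if (last.persisted) {
                return last;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up rather than keep retrying.
                    Thread.currentThread().interrupt();
                    throw new EventPersistException("interrupted while waiting to retry");
                }
            }
        }
        throw new EventPersistException(
                "event not persisted after " + maxAttempts + " attempts: " + last.cause);
    }
}
```

With `maxAttempts` set to 1 this is the "simply fail" option. Larger values give the retry option, at the cost of blocking the caller for roughly `maxAttempts` times the sum of the error timeout and the backoff, which matters given the 5-7s reconnect window described above.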

Updates to muon-core will improve this via client-side load balancing (muoncore/muon-java#62), making HA photon that much more reliable and enabling transparent failover.
