Better camera/gimbal message rate handling#3679
Better camera/gimbal message rate handling#3679robertlong13 wants to merge 6 commits intoArduPilot:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces a lease-based MessageRateManager to centralize MAVLink message interval negotiation and reduce repeated/unsupported message spam, while updating camera/gimbal components to rely on interval negotiation rather than periodic request loops.
Changes:
- Added
MessageRateManager+MessageRateLeasefor per-message interval leasing, monitoring, and default-rate restoration. - Integrated the manager into
MAVLinkInterface, and adjusted camera/gimbal startup flows to negotiateSET/GET_MESSAGE_INTERVAL. - Updated camera/gimbal UI/control paths to use leases (streaming/tracking) and replaced
Console.WriteLinedebug output with an opt-in helper.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| ExtLibs/ArduPilot/Mavlink/MessageRateManager.cs | New lease-based manager for requesting/monitoring/restoring message rates. |
| ExtLibs/ArduPilot/Mavlink/MAVLinkInterface.cs | Instantiates/owns the manager; triggers OnConnectionOpen; awaits gimbal init. |
| ExtLibs/ArduPilot/Mavlink/GimbalManagerProtocol.cs | Switches gimbal discovery to interval negotiation; refactors attitude frame handling/logging. |
| ExtLibs/ArduPilot/Mavlink/CameraProtocol.cs | Switches camera discovery to interval negotiation; adds rate leasing for streaming + tracking; video stream retry helper. |
| ExtLibs/ArduPilot/CurrentState.cs | Uses camera rate update method; removes periodic gimbal Discover call. |
| Controls/GimbalVideoControl.cs | Converts always-on console debug output to opt-in debug helper; uses camera tracking lease API. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var rm = parent.parent.RateManager; | ||
| double hz = ratehz; | ||
| string owner = $"Camera({parent.sysid},{parent.compid})"; | ||
|
|
||
| // Request FOV status | ||
| Task.Run(async () => | ||
| // Build new lease set, then swap and dispose old | ||
| var newLeases = new List<MessageRateLease> | ||
| { | ||
| await parent.parent.doCommandAsync( | ||
| parent.sysid, parent.compid, | ||
| MAVLink.MAV_CMD.SET_MESSAGE_INTERVAL, | ||
| (float)MAVLink.MAVLINK_MSG_ID.CAMERA_FOV_STATUS, | ||
| interval_us, | ||
| 0, 0, 0, 0, 0, | ||
| false // Don't wait for response | ||
| ).ConfigureAwait(false); | ||
| }); | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_FOV_STATUS, hz, owner), | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_SETTINGS, hz, owner), | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_CAPTURE_STATUS, hz, owner), | ||
| }; |
There was a problem hiding this comment.
Regression vs the previous behavior: CAMERA_SETTINGS and CAMERA_CAPTURE_STATUS are now leased unconditionally, whereas the previous code gated these requests based on capability flags (HasModes/HasZoom/HasFocus, CAPTURE_*). This can reintroduce the exact "unsupported message"/NACK/statustext spam the PR aims to reduce, especially on cameras/firmware that don’t implement these messages. Consider restoring the capability-based gating when building newLeases (and similarly for CAMERA_FOV_STATUS if it’s not universally supported).
There was a problem hiding this comment.
This was intentional and done after careful consideration. Gating the subscription based on HasModes/HasZoom/HasFocus would require me to track these flags as they potentially change, and those two messages I was gating are small and always compiled in (on AP anyway). Even so, the new rate manager now guarantees that I back off after one nack, so I can afford to try and fail and not have it lead to an error message every 38s.
Gating FOV_STATUS would be much more useful, but there isn't a capability flag for that, so there's no choice but to try once and back off when it NACKs.
| bool confirmed = false; | ||
| var sub = port.SubscribeToPacketType( | ||
| MAVLink.MAVLINK_MSG_ID.MESSAGE_INTERVAL, | ||
| msg => | ||
| { | ||
| var data = msg.ToStructure<MAVLink.mavlink_message_interval_t>(); | ||
| if (data.message_id != camInfoId) | ||
| return true; | ||
|
|
||
| if (data.interval_us == 0) | ||
| log.Info("Camera: CAMERA_INFORMATION not supported (interval_us=0)"); | ||
| else | ||
| log.InfoFormat("Camera: CAMERA_INFORMATION interval confirmed at {0} us", data.interval_us); | ||
|
|
||
| confirmed = true; | ||
| return true; | ||
| }, | ||
| parent.sysid, parent.compid); |
There was a problem hiding this comment.
confirmed is written from the SubscribeToPacketType callback thread and read from the async retry loop thread, but it’s not synchronized (volatile/locking/etc.). This risks the retry loop not observing the update (and continuing retries unnecessarily), and is an easy place for rare race bugs. Prefer a TaskCompletionSource<bool> / ManualResetEventSlim / Volatile.Read/Write pattern for cross-thread signaling rather than a plain local boolean.
There was a problem hiding this comment.
That race window is
- benign (one extra set attempt)
- tiny
That's not worth fussing over.
| // Build new lease set, then swap and dispose old | ||
| var newLeases = new List<MessageRateLease> | ||
| { | ||
| await parent.parent.doCommandAsync( | ||
| parent.sysid, parent.compid, | ||
| MAVLink.MAV_CMD.SET_MESSAGE_INTERVAL, | ||
| (float)MAVLink.MAVLINK_MSG_ID.CAMERA_FOV_STATUS, | ||
| interval_us, | ||
| 0, 0, 0, 0, 0, | ||
| false // Don't wait for response | ||
| ).ConfigureAwait(false); | ||
| }); | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_FOV_STATUS, hz, owner), | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_SETTINGS, hz, owner), | ||
| rm.Subscribe(parent.sysid, parent.compid, MAVLink.MAVLINK_MSG_ID.CAMERA_CAPTURE_STATUS, hz, owner), | ||
| }; | ||
|
|
||
| // Get camera settings | ||
| if (HasModes || HasZoom || HasFocus) | ||
| { | ||
| Task.Run(async () => | ||
| { | ||
| await parent.parent.doCommandAsync( | ||
| parent.sysid, parent.compid, | ||
| MAVLink.MAV_CMD.SET_MESSAGE_INTERVAL, | ||
| (float)MAVLink.MAVLINK_MSG_ID.CAMERA_SETTINGS, | ||
| interval_us, | ||
| 0, 0, 0, 0, 0, | ||
| false // Don't wait for response | ||
| ).ConfigureAwait(false); | ||
| }); | ||
| } | ||
| var old = _streamingLeases; | ||
| _streamingLeases = newLeases; | ||
| foreach (var lease in old) | ||
| lease.Dispose(); | ||
|
|
||
| // We use the capability flags directly here, and NOT whether we are currently able to do these things | ||
| var can_capture_video = (CameraInformation.flags & (int)MAVLink.CAMERA_CAP_FLAGS.CAPTURE_VIDEO) > 0; | ||
| var can_capture_image = (CameraInformation.flags & (int)MAVLink.CAMERA_CAP_FLAGS.CAPTURE_IMAGE) > 0; | ||
| if (can_capture_video || can_capture_image) | ||
| { | ||
| Task.Run(async () => | ||
| { | ||
| await parent.parent.doCommandAsync( | ||
| parent.sysid, parent.compid, | ||
| MAVLink.MAV_CMD.SET_MESSAGE_INTERVAL, | ||
| (float)MAVLink.MAVLINK_MSG_ID.CAMERA_CAPTURE_STATUS, | ||
| interval_us, | ||
| 0, 0, 0, 0, 0, | ||
| false // Don't wait for response | ||
| ).ConfigureAwait(false); | ||
| }); | ||
| } | ||
| _lastRateHz = ratehz; |
There was a problem hiding this comment.
_streamingLeases and _lastRateHz are mutated without synchronization, but they can be reached from multiple threads (e.g., ParseMessages on the receive thread triggers TakeStreamingLeases, while UpdateRateIfChanged is typically called from a different periodic loop/UI thread). This can cause missed disposals, double-dispose, or inconsistent _lastRateHz/lease state. Use a private lock around lease swap/dispose + _lastRateHz updates (and similarly in ReleaseStreamingLeases, SubscribeTracking, StopTracking).
There was a problem hiding this comment.
Yeah, I guess. This is another extraordinarily tiny window, but it's not self-healing, so I might as well lock.
| public async Task StartID(byte sysid, byte compid) | ||
| { | ||
| if (first_discover) | ||
| mavint.OnPacketReceived += MessagesHandler; |
There was a problem hiding this comment.
StartID now always attaches MessagesHandler to OnPacketReceived with no guard and no corresponding unsubscribe. Previously first_discover prevented repeated subscription. If StartID is called more than once per connection/component (reconnects, multiple detections, etc.), this will register duplicate handlers and apply parsing multiple times per packet (plus leak delegates). Add an idempotency guard (e.g., _started + Interlocked.Exchange) and/or uninstall the handler when appropriate (dispose/stop).
There was a problem hiding this comment.
Yeah, this is a good shout. It isn't currently possible for it to refire, but I've added a defensive idempotency guard as it's cheap, easy, and obviously correct.
| await MAVlist[tuple.Item1, tuple.Item2] | ||
| .GimbalManager.StartID(tuple.Item1, tuple.Item2) | ||
| .ConfigureAwait(false); |
There was a problem hiding this comment.
StartID contains up to ~15s of retry delay (3 * 5s), and this change now awaits it directly inside OnMAVDetected. That can stall the rest of the detection/initialization pipeline for that MAV/component and may delay other startup tasks. Consider making this fire-and-forget (capture/log the task) or decoupling the retry loop so it doesn’t block detection flow.
There was a problem hiding this comment.
this change now awaits it directly inside OnMAVDetected
No, it most certainly does not. You've missed the fact that this is inside of a Task.Run
| _packetCounts.TryGetValue(key, out var count); | ||
| _tickSnapshots[key] = (count, Stopwatch.GetTimestamp()); |
There was a problem hiding this comment.
FireAndForgetSet writes _tickSnapshots[key] even when the packet counter has already been removed (e.g., lease released concurrently with a tick sending a SET, or other race windows). That can reintroduce stale keys into _tickSnapshots and prevent the dictionaries from staying consistent. Only update _tickSnapshots if the key is still tracked (e.g., if (_packetCounts.TryGetValue(key, out var count)) { ... }) or check _leases.ContainsKey(key) under the same lock before writing.
| _packetCounts.TryGetValue(key, out var count); | |
| _tickSnapshots[key] = (count, Stopwatch.GetTimestamp()); | |
| if (_packetCounts.TryGetValue(key, out var count)) | |
| { | |
| _tickSnapshots[key] = (count, Stopwatch.GetTimestamp()); | |
| } |
There was a problem hiding this comment.
Yup, good point and good suggested fix. Done.
- Lease-based wrapper around GET/SET_MESSAGE_INTERVAL. - Fastest active lease wins; restores baseline on release.
Attitude status sends all the flags we need to determine the yaw frame, we don't need to hit up manager status for that. Manager status flags aren't even correctly populated in ArduPilot, but we lucked out because - Nothing was requesting manager status, and it isn't sent by default - No flags at all gets interpreted as body-frame yaw, which happens to be accidentally correct for ArduPilot anyway. We don't actually need manager status at all now, so I have removed it.
Changed to a no-op method that can be uncommented for debugging/testing.
6142a3a to
dc2348a
Compare
|
@meee1, I have addressed all copilot comments (3 good catches, 3 false positives) |
I meant to fix this up ages ago. When I first introduced the new camera/gimbal features in #3386, I took a pretty lazy approach to handling the message requests. Currently, we send out camera, gimbal, and video stream information requests every 38s. This is annoying for several reasons:
MessageRateManagerI added a dedicated message rate manager so classes that need certain a message can add it to a list to be requested and monitored. This could certainly be handy for some plugins too. The manager handles the requesting, monitoring, and re-requesting, and ultimately (when the caller is done with it) the restoring back to default rate. It also monitors for unsupported mesages (by subscribing to MESSAGE_INTERVAL responses with 0 interval, meaning "unsupported"), and stops spamming the aircraft for a message it can't send.
Almost all of this is done with fire-and-forget, with only the restore back to default rate requiring ACK. This checks giveComport first and backs off and comes back later if it was in-use. That narrows the race window significantly for double-giveComport.
Overall, this makes for much cleaner handling, especially in the future for any new classes that need to request messages.
I tested everything here in SITL, with multiple vehicles, multiple components, tons of leases subbing/unsubbing rapidly and I'm confident that it's solid.
Camera and Gimbal Managers
Instead of manually requesting a one-off message for camera, gimbal manager, and video stream information every 30s, I instead set the interval for these, waiting for acknowledgement that the requested rate was accepted (using MESSAGE_INTERVAL responses, not command ACKs), and then forgetting about them. I didn't use the new rate manager for these because I don't want the manager to keep trying to re-request them if they aren't sending. If an autopilot doesn't have a camera, it will respond "yes, I will send you camera information every 30s" but then not send those messages unless it actually has a camera. This is fine and good, what I'm negotiating here is "hey, when someone configures a camera on you, you'll let me know every 30s right?" and then "yup" and then I stop managing it and just listen if any come in. In the happy path, right after param download, the autopilot gets a few message interval SET/GETs and then nothing else unless necessary. Once a camera is detected, it sets up leases for the more rapid periodic messages it needs.
I chucked a couple drive-by fixes in here that I noticed too.
I've tested this against my old RealFlight testing setup I used during #3386 development, and it's still working.