
Publishing large message blocks all callback groups #559

@ottojo

Description

Generated by Generative AI

No response

Operating System:

Linux ade 6.8.0-101-generic #101~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 11 13:19:54 UTC x86_64 x86_64 x86_64 GNU/Linux

ROS version or commit hash:

kilted

RMW implementation (if applicable):

rmw_cyclonedds_cpp

RMW Configuration (if applicable):

<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
  <Domain Id="any">
    <General>
      <Interfaces>
        <NetworkInterface autodetermine="false" name="lo" priority="default" multicast="default" />
        <!-- WiFi link to laptop -->
        <NetworkInterface autodetermine="false" address="192.168.142.0" priority="default"
          multicast="default" presence_required="false" />
      </Interfaces>
      <AllowMulticast>true</AllowMulticast>
      <MaxMessageSize>65500B</MaxMessageSize>
    </General>
    <Discovery>
      <MaxAutoParticipantIndex>100</MaxAutoParticipantIndex>
    </Discovery>
    <Internal>
      <SocketReceiveBufferSize min="10MB" />
      <Watermarks>
        <WhcHigh>500kB</WhcHigh>
      </Watermarks>
    </Internal>
  </Domain>
</CycloneDDS>

Client library (if applicable):

rclcpp

'ros2 doctor --report' output

ros2 doctor --report
   NETWORK CONFIGURATION
inet         : 127.0.0.1
inet4        : ['127.0.0.1']
inet6        : ['::1']
netmask      : 255.0.0.0
device       : lo
flags        : UP,LOOPBACK,RUNNING,MULTICAST
mtu          : 65536
inet         : 192.168.142.2
inet4        : ['192.168.142.2']
netmask      : 255.255.255.0
device       : enp87s0
flags        : UP,BROADCAST,RUNNING,MULTICAST
mtu          : 1500
broadcast    : 192.168.142.255
inet         : 100.64.0.25
inet4        : ['100.64.0.25']
netmask      : 255.255.255.255
device       : tailscale0
flags        : UP,POINTOPOINT,RUNNING,NOARP,MULTICAST
mtu          : 1280
inet         : 172.17.0.1
inet4        : ['172.17.0.1']
netmask      : 255.255.0.0
device       : docker0
flags        : UP,BROADCAST,MULTICAST
mtu          : 1500
broadcast    : 172.17.255.255
device       : enx00e04c440cc0
flags        : UP,BROADCAST,MULTICAST
mtu          : 1500

   PACKAGE VERSIONS
rclcpp                                    : latest=29.5.7, local=29.5.6
rmw_cyclonedds_cpp                        : latest=4.0.2, local=4.0.2

   PLATFORM INFORMATION
system           : Linux
platform info    : Linux-6.17.0-14-generic-x86_64-with-glibc2.39
release          : 6.17.0-14-generic
processor        : x86_64

   QOS COMPATIBILITY LIST
compatibility status    : No publisher/subscriber pairs found

   RMW MIDDLEWARE
middleware name    : rmw_cyclonedds_cpp

   ROS 2 INFORMATION
distribution name      : kilted
distribution type      : ros2
distribution status    : active
release platforms      : {'debian': ['bookworm'], 'rhel': ['9'], 'ubuntu': ['noble']}

Steps to reproduce issue

On robot host: Create a node that simultaneously

  1. Processes incoming data (published from the same host) from a subscription at high frequency
  2. Publishes large messages from a timer callback at low frequency

The timer and the subscription callback are in different callback groups, and the node is spun by a multithreaded executor.

On remote host (connected via a fast WiFi link, iperf3 shows >600 Mbit/s): Subscribe to the large messages.

Expected behavior

The timer callback takes a long time, because publishing is synchronous while a remote host is subscribed, but this has no impact on the other callback group, which keeps processing incoming data.

Actual behavior

While the timer callback is publishing, no subscription callbacks are executed.
Here is a measured execution timeline; the subscriber on the remote host is started at 8500ms.
I added a sleep in the first part of the timer callback (orange) to verify that callbacks are actually executed in parallel and that the entire executor is only blocked during publisher_->publish().
The subscription callbacks (pink) appear to be queued up while the executor is blocked and are all executed at once when publishing finishes.
[Image: measured execution timeline]

Additional information

No response

Metadata

Assignees

No one assigned

Labels

bug: Something isn't working
