Skip to content

Service clients freeze on multi-client cases. #372

@kjjpc

Description

@kjjpc

Bug report

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • ros-jazzy-ros-base=0.11.0-1 of apt package
  • DDS implementation:
    • Fast DDS, Cyclone DDS
  • Client library (if applicable):
    • rclcpp, rclpy

Steps to reproduce issue

I run 1 service server and 3 service clients which call service on high frequency as a stress test.
Attached are python codes, but same result on c++ codes.

Service Server

import rclpy
from rclpy.node import Node
from example_interfaces.srv import AddTwoInts

class MinimalService(Node):
    def __init__(self):
        super().__init__('minimal_service')
        self.srv = self.create_service(AddTwoInts, 'add_two_ints', self.add_two_ints_callback)

    def add_two_ints_callback(self, request, response):
        response.sum = request.a + request.b
        return response

def main(args=None):
    rclpy.init(args=args)
    minimal_service = MinimalService()
    rclpy.spin(minimal_service)
    rclpy.shutdown()

if __name__ == '__main__':
    main()

Service Client (run on 3 terminals)

import rclpy
from rclpy.node import Node
from example_interfaces.srv import AddTwoInts

class MinimalClientAsync(Node):
    def __init__(self):
        super().__init__('minimal_client_async')
        self.cli = self.create_client(AddTwoInts, 'add_two_ints')
        while not self.cli.wait_for_service(timeout_sec=1.0):
            self.get_logger().info('service not available, waiting again...')

    def send_request(self, a, b):
        self.req = AddTwoInts.Request()
        self.req.a = a
        self.req.b = b
        self.future = self.cli.call_async(self.req)
        rclpy.spin_until_future_complete(self, self.future)
        return self.future.result()

def main(args=None):
    rclpy.init(args=args)
    minimal_client = MinimalClientAsync()
    for i in range(1000000000):
        response = minimal_client.send_request(1, 2)
        if i%10000 == 0:
            minimal_client.get_logger().info(
                'Result of add_two_ints: for %d + %d = %d' %
                (1, 2, response.sum))

    minimal_client.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()

Expected behavior

Continue to display results from service server repeatedly

Actual behavior

The service client sometimes freezes in spin_until_future_complete. The service client doesn't progress at all when it freezes. Even though 2 of 3 clients are terminated, the freeze client doesn't continue the computation loop.
The freeze occurs not only high frequency cases but also low frequency multi-client cases.

Additional information

This issue is reported on ros2/rcl#1163.
The freeze is resolved by changing qos parameter like following.

        qos = qos_profile_services_default
        qos.history = HistoryPolicy.KEEP_ALL
        self.cli = self.create_client(AddTwoInts, 'add_two_ints', qos_profile = qos)

The default qos parameter of service is keep last 10 data, and it is not enough for multi-client cases.

In ROS2, all service response from service server is published to all clients, it means the buffer can be filled up by the responses for other callers and the response for the caller can be lost.

The default qos parameter of service client is not suitable for multi-client cases, which are often the case in ROS2. The default qos parameters should be changed from KEEP_LAST to KEEP_ALL.
The parameter is defined in qos_profiles.h.

RMW_QOS_POLICY_HISTORY_KEEP_LAST,

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions