azerupi commented Jan 1, 2026

This PR is a work in progress to add service introspection support to rclrs.

  • Configure service introspection for ClientState
  • Configure service introspection for ServiceState

I'm not all that familiar with the RCL interface or with how to get and pass around the proper pointers. If anyone more familiar could double-check the calls to RCL, that would be appreciated.

I'm currently looking to add similar support for ServiceState, but I can't implement it the same way because it only stores a NodeHandle instead of the full Node, and I'm not sure how to get the clock from it. I've asked on the Matrix chat about the reason for this asymmetry. I'm not sure whether there is a way to work around this or whether we need to make a breaking change and unify the APIs.


Edit

I've made the API changes in a second commit and implemented the introspection for ServiceState as well. If that is not the desired way forward, it is easy to drop that commit.

azerupi commented Jan 1, 2026

It looks like CI is failing (on at least one runner) because a transitive dependency no longer supports rustc 1.75:

error: package `async-lock v3.4.2` cannot be built because it requires rustc 1.85 or newer, while the currently active rustc version is 1.75.0
  Either upgrade to rustc 1.85 or newer, or use
  cargo update async-lock@3.4.2 --precise ver
  where `ver` is the latest version of `async-lock` supporting rustc 1.75.0

This is a dependency of async-std. Solving this would require us to either use a Cargo.lock or add the problematic transitive dependencies to the Cargo.toml to pin them to a specific version.

azerupi commented Jan 1, 2026

The API changes in the second commit seem to have broken something, but I'm not sure what. When I run cargo test repeatedly on my machine after the changes, tests fail intermittently with error messages like:

Test executable failed (signal: 6 (SIGABRT) (core dumped)).

stderr:
malloc_consolidate(): unaligned fastbin chunk detected

By making the Service keep a reference to the
Node instead of the NodeHandle we were creating a
circular reference that prevented the Node from being
dropped.

NodeState -> ParameterInterface -> ParameterService
-> Service -> ServiceHandle -> NodeState

This change modifies the Service to keep a reference
to the NodeHandle again, along with the Clock, breaking
the circular reference.

azerupi commented Jan 1, 2026

By making the ServiceState keep a reference to the Node instead of the NodeHandle we were creating a circular reference that prevented the Node from being dropped.

NodeState -> ParameterInterface -> ParameterService -> Service -> ServiceHandle -> NodeState
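
To illustrate the chain above, here is a minimal sketch of the same kind of cycle. The types are hypothetical stand-ins, not the real rclrs definitions: the intermediate ParameterInterface and ParameterService layers are collapsed into a single field, and the `dropped` flag exists only so the effect can be observed.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

// Hypothetical stand-in; the real rclrs NodeState carries much more state.
struct NodeState {
    // Collapses NodeState -> ParameterInterface -> ParameterService -> Service.
    parameter_service: Mutex<Option<Arc<ServiceState>>>,
    // Set when the node is actually destroyed, so the caller can observe it.
    dropped: Arc<AtomicBool>,
}

impl Drop for NodeState {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

// Hypothetical stand-in for the service side of the chain.
struct ServiceState {
    // Strong reference back to the node: this is what closes the cycle.
    _node: Arc<NodeState>,
}

/// Builds a node whose service points back at it, drops the caller's
/// reference, and reports whether the node was destroyed.
fn node_dropped_with_cycle() -> bool {
    let dropped = Arc::new(AtomicBool::new(false));
    let node = Arc::new(NodeState {
        parameter_service: Mutex::new(None),
        dropped: Arc::clone(&dropped),
    });
    let service = Arc::new(ServiceState {
        _node: Arc::clone(&node),
    });
    *node.parameter_service.lock().unwrap() = Some(service);

    // The service still owns a strong reference, so the count never reaches zero.
    drop(node);
    dropped.load(Ordering::SeqCst)
}
```

In this sketch `node_dropped_with_cycle()` returns `false`: the node and its service keep each other alive after the last external reference is gone.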

I'm assuming the tests run in parallel and are interfering with each other when the nodes are not being dropped properly.

I reverted the ServiceState to hold a NodeHandle again, and it now also holds a reference to the clock that we need to configure introspection. I feel like this is an easy trap to fall into. Is there a way to make it more obvious when we have a cycle? I wouldn't be surprised if someone else hits this issue in some way in the future.
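
One possible way to surface such cycles, sketched here with hypothetical stand-in types rather than the real rclrs ones, is a test helper that downgrades the Arc to a Weak, drops the strong reference, and checks that the Weak can no longer be upgraded. The `is_freed_on_drop` helper and the simplified `NodeHandle`, `Clock`, `NodeState`, and `ServiceState` below are assumptions for illustration only. The sketch also shows the handle-plus-clock layout described above.

```rust
use std::sync::{Arc, Mutex, Weak};

// Hypothetical stand-ins for the rclrs types discussed above.
struct NodeHandle;

#[derive(Clone)]
struct Clock;

struct NodeState {
    handle: Arc<NodeHandle>,
    clock: Clock,
    parameter_service: Mutex<Option<Arc<ServiceState>>>,
}

struct ServiceState {
    // Only the pieces the service needs; no strong reference to NodeState.
    _handle: Arc<NodeHandle>,
    _clock: Clock,
}

/// Creates a node with a service that borrows the handle and clock only.
fn create_node() -> Arc<NodeState> {
    let node = Arc::new(NodeState {
        handle: Arc::new(NodeHandle),
        clock: Clock,
        parameter_service: Mutex::new(None),
    });
    let service = Arc::new(ServiceState {
        _handle: Arc::clone(&node.handle),
        _clock: node.clock.clone(),
    });
    *node.parameter_service.lock().unwrap() = Some(service);
    node
}

/// Returns true if nothing else keeps the value alive once `value` is dropped.
fn is_freed_on_drop<T>(value: Arc<T>) -> bool {
    let weak: Weak<T> = Arc::downgrade(&value);
    drop(value);
    // If any other strong reference exists (for example via a cycle),
    // the upgrade still succeeds.
    weak.upgrade().is_none()
}
```

A regression test could then assert `is_freed_on_drop(create_node())` for each kind of node setup. This only catches cycles on the code paths the test exercises, so it complements review rather than replacing it.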
