ROS graph information tools implementation discussion

Thanks to @NikolausDemmel for making my point better than I did :wink: Yes, that is exactly what I was trying to suggest.

if the command line tool wants to use the ROS interface to request
information from the daemon it again needs to wait for the discovery
phase to finish before it can do so.

Not necessarily. See http://eprosima-fast-rtps.readthedocs.io/en/latest/advanced.html#matching-endpoints-the-manual-way, for example.

This functionality would need to be exposed through the rmw interface in an abstract way, and it needs to be implementable with all current vendors. I am not sure whether Connext / OpenSplice provide a similar API and, if they do, how their configuration options differ.

FWIW, http://design.ros2.org/articles/discovery_and_negotiation.html talks about static vs dynamic discovery. The way I understood it, static discovery is exactly what we are talking about here (just in this case specifically only for one connection, from tool --> daemon). Is static discovery not within the scope of ROS2? (I believe static discovery is relevant not only for this use case.)

An additional advantage of relying on ROS2 communication is that other consumers of the same API that don’t necessarily worry about short start-up time will just as well work using dynamic discovery.

As for the general question, maybe @Jaime_Martin_Losa could shed some light on this.

For RTI Connext, you can pre-supply discovery peers, at least.

See “add_peer” in https://community.rti.com/static/documentation/connext-dds/5.2.0/doc/manuals/connext_dds/html_files/RTI_ConnextDDS_CoreLibraries_UsersManual/index.htm#UsersManual/ConfigPeersListUsed_inDiscov.htm#discovery_507287096_336417 This seems to be a Connext-specific API.

CoreDX (which you don’t use, but just as another datapoint) goes even further and implements the whole daemon-based central discovery solution for you.

Therefore, it might actually make sense to provide the topic list (or something from which it can directly be derived) in the rmw API, and let the vendor-specific solution use the best vendor-specific approach for supplying it. We would then need to fall back to our own daemon only when necessary.

By the way, this is not an outlandish feature in general, and it is such an obvious optimization that I would be very surprised if a vendor did not support some means of realizing it.

It isn’t necessarily an optimisation from the point of view of a lot of DDS use cases. The lack of need for a central discovery daemon was a major design goal in the way DDS works.

However, there are use cases where it is an optimisation, as you say. Perhaps using that optimisation, when it’s available, via rmw, and providing our own daemon as part of the rmw implementation when it’s not, is a valid implementation.
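To make the fallback idea concrete, here is a minimal Python sketch. It only illustrates the control flow. `VendorTopicSource`, `DaemonTopicSource`, and `list_topics` are hypothetical names made up for this post, not part of rmw or of any DDS vendor's API, and the assumption that a vendor signals "not supported" by returning `None` is mine.

```python
from typing import List, Optional


class VendorTopicSource:
    """Stand-in for a middleware with a built-in fast topic list (assumption).

    Pass None to model a vendor that has no such feature.
    """

    def __init__(self, topics: Optional[List[str]]):
        self._topics = topics

    def topic_names(self) -> Optional[List[str]]:
        # None means "this vendor cannot answer without waiting for discovery".
        return None if self._topics is None else list(self._topics)


class DaemonTopicSource:
    """Stand-in for our own daemon, which has been accumulating the graph."""

    def __init__(self, topics: List[str]):
        self._topics = list(topics)

    def topic_names(self) -> List[str]:
        return list(self._topics)


def list_topics(vendor: VendorTopicSource, daemon: DaemonTopicSource) -> List[str]:
    """Prefer the vendor-specific answer; use the daemon only when necessary."""
    topics = vendor.topic_names()
    if topics is None:
        topics = daemon.topic_names()
    return sorted(topics)
```

A tool calling `list_topics` would not need to know which path answered, which is the point of hiding the choice behind the rmw implementation.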

Thoughts on this approach?

Does anyone (@iluetkeb) have any data on which DDS implementations provide a list of available topics in near-instant time, and which do not? Remember that providing access to the list of topics seen by the participant since it started is not the same thing as what we need. We need something that accumulates the list of topics currently available, including those that the rostopic tool may not have seen itself, and can provide that list on demand. So far the CoreDX implementation that @iluetkeb mentioned sounds like it provides this functionality already, but I’m not aware of any others that do.
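As a rough illustration of what "accumulates the list of topics currently available" means, here is a small Python sketch of a cache fed by discovery events. `TopicCache` and its method names are hypothetical; I am assuming the middleware can report endpoints appearing and disappearing, which is not something every implementation is confirmed to offer in this form.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TopicCache:
    """Tracks which endpoints currently exist on each topic."""

    # topic name -> identifiers of the endpoints currently using it
    endpoints: Dict[str, Set[str]] = field(default_factory=dict)

    def on_endpoint_added(self, topic: str, endpoint_id: str) -> None:
        self.endpoints.setdefault(topic, set()).add(endpoint_id)

    def on_endpoint_removed(self, topic: str, endpoint_id: str) -> None:
        remaining = self.endpoints.get(topic)
        if remaining is None:
            return
        remaining.discard(endpoint_id)
        if not remaining:
            # A topic with no endpoints left is no longer "currently available".
            del self.endpoints[topic]

    def current_topics(self) -> List[str]:
        """Answer on demand, without waiting for a discovery phase."""
        return sorted(self.endpoints)


cache = TopicCache()
cache.on_endpoint_added("/chatter", "pub-1")
cache.on_endpoint_added("/chatter", "sub-1")
cache.on_endpoint_added("/scan", "pub-2")
cache.on_endpoint_removed("/scan", "pub-2")
print(cache.current_topics())  # ['/chatter']
```

The difference from "topics seen since start-up" is the removal path: `/scan` was seen, but it is gone from the answer once its last endpoint disappears.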

One of the reasons for choosing a well-known topic was so that discovery can be short-circuited in DDS implementations that support it.

Hi guys,

On Fast RTPS you can listen to discovery data:

http://eprosima-fast-rtps.readthedocs.io/en/latest/advanced.html#subscribing-to-discovery-topics

Also, you can set the endpoints for discovery:

http://eprosima-fast-rtps.readthedocs.io/en/latest/pubsub.html#defining-input-and-output-channels

I started writing a prototype of the approach discussed in this thread. But then Work™ had other ideas, as it so often seems to.

The code is up on GitHub. It’s nowhere near complete, but perhaps there is enough to see where I was going. It’s not just rostopic; it’s intended to be a complete set of command line tools for working with the ROS graph.

I still want to keep working on it myself, but I think progress would be faster as a group effort.

Having said that, there has been activity recently on the ros2 github to implement a rosnode tool directly, so perhaps this work is moot.

@gbiggs: Are you referring to this [0] repo? If so, it’s true that we have two scripts, rosnode_list and rostopic_list, but they are not meant to be in any releasable or complete state. They are mainly debugging tools that we needed during the current development on our side (they are neither built on the buildfarm nor listed in the repos file). We are currently not actively working on implementing these tools, given the discussion in this thread and the remaining open questions.
We may add more scripts to that repository, which may then be helpful to others as well, but we would recommend not opening PRs against it just yet.

[0] https://github.com/ros2/cli_tools

Yes, that’s the stuff I was talking about, @karsten. Thanks for explaining their purpose.

Hi all, I’d like to kick this thread back up.

Problem:

We are only able to list all nodes and all topics/types.
No equivalent of rosnode info exists for ROS2, and these daemons are still not available.

Goal:

Expose the node graph for gui and cli tools to show the user.
Expose the following per node via rcl API:

  • services
  • subscriptions
  • publications
  • actions

Solution:

Regardless of the daemon or on-demand approach to accessing node graph details, exposing the node graph through the rcl layer for tools to use would be advantageous to developers.

These interfaces will be fulfilled via the Simple Discovery Protocol, which uses well-defined unicast and multicast ports for each participant to listen to meta-traffic.

Are there any objections to using SDP to discover the node graph?

What do you mean by “daemon” here?

I don’t see a need to introduce a secondary communication protocol. You should be able to retrieve the necessary information from the underlying middleware. This will likely require extending the rmw interface, but each existing implementation should be able to provide the necessary information.

I agree with @dirk-thomas. The underlying middleware knows all that stuff already, and DDS has well-defined ports and a discovery protocol. I don’t see any advantage to using SDP, except as an alternative when DDS is not being used as the underlying middleware.

Regardless of which tool accesses the node graph, the rcl layer should supply an interface to retrieve the node graph data. This “daemon” refers to the ros2cli daemon that is running for the current node graph tools.

SDP is the “Simple Discovery Protocol”, which Fast RTPS and RTI Connext use. However, there has been some concern about using multicast for discovery in DDS in some production architectures. My proposal is to use the current discovery protocol if there are no objections.

Exactly, however it is not exposed through rcl. I propose I push code to expose this information through the rcl->rmw layer with FastRTPS.

I can’t speak for Dirk, but I got it mixed up with SSDP (Simple Service Discovery Protocol), which is the IETF-defined one. Sorry for the confusion.

There is a daemon provided in the ros2cli repository:

@dirk-thomas added it over a year ago. I haven’t looked into how complete it is or how it compares to the original proposal, but I think the intention is that there is one instance running on each computing node and it can be accessed at a known port on localhost.

I think your proposal to fulfil the interfaces in rcl using SDP doesn’t take into account whether or not this daemon is running.

Exactly, as fulfilling these interfaces does not require a daemon, but it will work with ros2cli’s daemon.

There are quite a few discovery protocols out there, so I’ll switch to using full names instead of acronyms when describing approaches.

Since there does not seem to be any objection to using DDS’s Simple Discovery Protocol, I will create the pull requests that expose subscribers and publishers per node and post them here. Then I will demonstrate that capability with the ros2cli node info verb.

I’d recommend giving slightly more detail about what you need to expose through the rmw API and how you’re going to represent it (e.g. are they entity IDs, GUIDs, something else?) before committing the resources to make the full set of pull requests. That seems to me to be the most likely place for us to run into issues early on.

It’s totally solvable, but I just don’t want you to spend a huge amount of time only to have to refactor large parts due to miscommunication.

Also, something to keep in mind while implementing, is that currently there is one node per participant, but that’s actually been identified as a high impact performance issue, so at some point in the future we’d like to have multiple nodes per participant (not necessarily mapped by the user, but instead having one participant per process or something like that). So just keep that in mind when you’re working on this. It might make sense to depend on the one node per participant assumption for now, but if there are two reasonably similar solutions and one doesn’t depend on that assumption, you might want to take that one.

Excellent. As a rule, I believe the only outbound data from the RMW layer should be ROS concepts, hiding any DDS notions such as entity IDs or GUIDs. I will post the representation API shortly.

This is interesting, is there a discussion behind the scenes that I could read about and join in?

Here is an example of the API we plan on implementing. I have chosen explicit functions for each kind of node info over a more complex data structure for rcl to consume. If this is acceptable we can move forward. Let me know!

That looks good to me, especially passing the node handle to avoid needing any GUIDs or anything like that.

I do have a few things to point out however:


I’m somewhat concerned about the performance of the API design. As you said, it allows you to avoid any new or complicated data structures by calling these functions once for each node and entity type. However, two cases are less efficient: “listing all topics (publishers or subscribers), or services, for all nodes, by node” and “listing all topics and services on a given node”. Both require multiple calls to the API, and the former in particular may require many calls if there are many nodes.

That all being said, I’m not very concerned about it since I don’t think performance will be a must-have in those cases (mostly for tools). And we can always add new APIs later that make this more efficient.
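To put a number on the call count, here is a small Python sketch. It is not the proposed rmw/rcl API: `CountingGraph`, the `get_*_by_node` method names, and the example graph are all hypothetical, and the only thing it models is "one call per node and entity type".

```python
from typing import Dict, List, Tuple

NamesAndTypes = List[Tuple[str, str]]  # (name, type) pairs


class CountingGraph:
    """Hypothetical per-node query API that counts how often it is called."""

    def __init__(self, graph: Dict[str, Dict[str, NamesAndTypes]]):
        self._graph = graph
        self.calls = 0

    def get_node_names(self) -> List[str]:
        self.calls += 1
        return sorted(self._graph)

    def _query(self, node: str, kind: str) -> NamesAndTypes:
        self.calls += 1
        return list(self._graph[node][kind])

    def get_publishers_by_node(self, node: str) -> NamesAndTypes:
        return self._query(node, "publishers")

    def get_subscribers_by_node(self, node: str) -> NamesAndTypes:
        return self._query(node, "subscribers")

    def get_services_by_node(self, node: str) -> NamesAndTypes:
        return self._query(node, "services")


def dump_graph(api: CountingGraph) -> Dict[str, Dict[str, NamesAndTypes]]:
    """List everything for all nodes, by node: 1 + 3 * N calls for N nodes."""
    return {
        node: {
            "publishers": api.get_publishers_by_node(node),
            "subscribers": api.get_subscribers_by_node(node),
            "services": api.get_services_by_node(node),
        }
        for node in api.get_node_names()
    }


EXAMPLE = {
    "/talker": {"publishers": [("/chatter", "std_msgs/String")],
                "subscribers": [], "services": []},
    "/listener": {"publishers": [],
                  "subscribers": [("/chatter", "std_msgs/String")], "services": []},
}
api = CountingGraph(EXAMPLE)
dump_graph(api)
print(api.calls)  # 7
```

With two nodes that is seven calls, and the count grows linearly with the number of nodes, which is the case a later bulk API could collapse into a single call.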


My other concern is related to the association of publishers to nodes. In the rmw_take_with_info call you can get the GUID of the publisher which sent the data (caller id), but that doesn’t necessarily let me figure out which node it came from (unless it was my own node and I have a handle to the publisher that sent it). So I was hoping to eventually get not just the name and type of the remote publishers but also the GUID.

This is definitely not a blocker for this work, but it’s something to consider, since you might not even have the proposed signatures if you could get publisher by GUID (and therefore maybe create a struct that serves as a proxy for remote publishers which encapsulates information like GUID, node GUID, topic name, and type).
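For what such a proxy might look like, here is a hedged Python sketch. `RemotePublisherInfo` and `PublisherIndex` are hypothetical names; the fields are just the ones listed above (GUID, node GUID, topic name, type), and representing GUIDs as `bytes` is an assumption for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RemotePublisherInfo:
    """Hypothetical proxy for a remote publisher."""

    publisher_guid: bytes
    node_guid: bytes
    topic_name: str
    type_name: str


class PublisherIndex:
    """Maps a publisher GUID (e.g. from a received message) back to its node."""

    def __init__(self) -> None:
        self._by_guid: Dict[bytes, RemotePublisherInfo] = {}

    def add(self, info: RemotePublisherInfo) -> None:
        self._by_guid[info.publisher_guid] = info

    def lookup(self, publisher_guid: bytes) -> Optional[RemotePublisherInfo]:
        # Returns None when the publisher has not been discovered.
        return self._by_guid.get(publisher_guid)


index = PublisherIndex()
index.add(RemotePublisherInfo(b"\x01", b"\xaa", "/chatter", "std_msgs/String"))
info = index.lookup(b"\x01")
print(info.node_guid if info else None)  # b'\xaa'
```

With something like this, the GUID that comes back with a message would be enough to find the sending node, which the name-and-type lists alone do not allow.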


So like I said, I think this proposed API makes sense and is a good first step to tackle this issue.

If you want we can setup a short video call to hash out any other issues/questions you have and then post the result here on discourse.

I completely agree. I was thinking of exposing an unzipped data structure containing lists ordered in such a way that the rcl user could cache what they need. However, that interface cannot be partially implemented if the RMW implementation is unable to access certain data. We can add bulk implementations going forward to address this issue, should we need them.

This is true, and will be extremely interesting when we attempt to inspect a working system message by message. This API is meant for node graph tools that simply show connectivity, not for inspection on a per-message basis.

Let’s set up a call and hammer out this API to get some node graph tools up and running.