In our iRobot presentation (link) we discussed, among other things, some of the scalability issues that arise when developing for hundreds of robots on the same network. We proposed some workarounds that involved changing the configuration of the DDS layer (mainly the discovery settings) through its configuration file. However, these workarounds are specific to each DDS implementation.
I talked to some people at ROSCon, such as @Sumanth-Nirmal from ApexAI, who were also interested in augmenting the ROS 2 client APIs to enable more control over the DDS settings. In particular, for iRobot's use case, we would like to be able to specify the discovery settings, maybe by passing some arguments through rclcpp::init.
I would like to know how many people are already thinking of these issues and see if we can kick-start a discussion about it.
could you specify what you want to control via rclcpp? what do you mean by discovery settings? white and black lists? (BTW, localhost restriction is already implemented https://github.com/ros2/ros2/issues/798)
i think the question is whether it has to be ROS 2 generic or DDS specific.
@tomoyafujita, in our case we are interested in setting discovery options such as enabling multicast/unicast, setting the list of initial peers, and specifying the ports for discovery.
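For context on the "ports for discovery" part: as far as I know, the defaults come from the well-known port mapping in the DDSI-RTPS specification. Here is a minimal sketch of that mapping, assuming the spec's default parameters (vendors let you override them, so check your implementation). The function name `rtps_ports` is just something I made up for illustration.

```python
# Default port parameters from the DDSI-RTPS well-known port mapping.
PB = 7400  # port base
DG = 250   # domain ID gain
PG = 2     # participant ID gain
D0 = 0     # offset for discovery multicast
D1 = 10    # offset for discovery unicast
D2 = 1     # offset for user-data multicast
D3 = 11    # offset for user-data unicast


def rtps_ports(domain_id: int, participant_id: int) -> dict:
    """Return the default RTPS ports for one participant in one domain.

    Hypothetical helper for illustration; not part of any ROS 2 or DDS API.
    """
    base = PB + DG * domain_id
    return {
        "discovery_multicast": base + D0,
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_multicast": base + D2,
        "user_unicast": base + D3 + PG * participant_id,
    }


if __name__ == "__main__":
    print(rtps_ports(domain_id=0, participant_id=0))
    print(rtps_ports(domain_id=42, participant_id=3))
```

Note that the unicast ports depend on the participant ID, which is one reason a unicast peers list usually has to account for several possible ports per host.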
@gavanderhoorn, the native handlers are still a workaround, since as you say, the implementation will vary among different DDS vendors.
i am not sure exactly what your requirements, use cases, and reasons are, but sharing that information could be a good first step. i would like to join that discussion.
@tomoyafujita, in the initial post I added a link to the ROSCon presentation we gave, in which we talked about some of the scalability issues we ran into when having hundreds of robots on the same network. For some use cases, for example, multicast-based discovery can be prohibitive because it saturates the network.
so disabling multicast means your use case is NOT a distributed system, right?
why do these robots (or systems) have to be in the same network? what about software-defined networking? to me, it sounds like a network configuration issue.
You can still have unicast-based discovery for a distributed system if you have an initial peers list. Multicast just gives you the advantage that the network packet multiplexing is done by the switch, rather than by the DDS participant itself.
Having multiple software-defined networks is impractical for us. A single network has the capacity to handle all of these robots (most of the discovery packets are useless most of the time) if the system is correctly configured. DDS scalability problems have also been investigated by some researchers [1].
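To make the multicast vs. unicast trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simplified model that is mine, not from any spec: every participant announces itself once per period, unicast means one packet per peer in a full peers list, and multicast means one packet that the switch replicates. Real implementations differ (announcement periods, multiple ports per peer, and so on).

```python
def packets_sent_by_participants(n_participants: int, multicast: bool) -> int:
    """Packets the participants themselves emit per announcement period.

    Simplified model: one announcement per participant per period.
    """
    if multicast:
        # One packet per participant; the switch does the replication.
        return n_participants
    # One packet per participant per peer in a full initial peers list.
    return n_participants * (n_participants - 1)


def packets_delivered(n_participants: int) -> int:
    """Packets that arrive at participants per period, in either mode."""
    return n_participants * (n_participants - 1)


if __name__ == "__main__":
    for n in (10, 100, 300):
        print(
            n,
            packets_sent_by_participants(n, multicast=True),
            packets_sent_by_participants(n, multicast=False),
            packets_delivered(n),
        )
```

In this model the number of deliveries grows quadratically either way, which is why a restricted peers list (so that each robot only announces to the hosts it actually needs) matters more than the choice of transport alone.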
so you are saying that all of the systems/robots on the network can be connected directly? (i believe the answer is no; you mean something more like P2P discovery and connection, right?)
i was thinking that in your use case one host acts as a kind of server that connects to each of the robots in a specific network.
I am really interested in this flexibility via ROS 2, but since an RMW implementation does not have to be DDS (as far as I know), I feel this could be too specific to DDS. I would really like to hear more opinions.
Most of the RMW implementations currently available are DDS-based, and some of the latest features introduced (QoS liveliness, deadline, etc.) are part of the DDS specification. It's up to the non-DDS implementations to decide whether to implement them or not.
I think the reason it would be nice to have control over discovery and connection APIs is the way DDS works (for example, multicast by default). In an eventual non-DDS implementation the main problem wouldn't be present, and it should be easy to enable a P2P-based connection mechanism.
maybe we could have an online discussion on this? no commitment, we can just talk frankly.
it would be better to include someone who knows the DDS side and the use cases.
how does that sound to you? or do you know of a suitable WG that already exists? if so, we could propose this as an agenda item.
Can I get an invite if the meeting happens?
I am working on the micro-ROS rmw layer, which is not DDS-based, and I am personally interested in this matter.
Right now we have to list all possible IP addresses of the computers that can participate in discovery in a vendor-specific XML file whose syntax seems to change sometimes… A ROS 2 way to do that would be very good.
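As an illustration of the kind of file this means, here is a small sketch that generates a unicast peers configuration for Cyclone DDS from a plain list of addresses. The element names (`Domain`, `General/AllowMulticast`, `Discovery/Peers/Peer`) are what I recall from the Cyclone DDS configuration reference, so treat them as assumptions and check them against the version you run. The function name `build_cyclonedds_config` is made up, and a real setup may need further settings.

```python
import xml.etree.ElementTree as ET


def build_cyclonedds_config(peer_addresses, allow_multicast=False):
    """Build a Cyclone DDS style XML config with a static peers list.

    Hypothetical helper: element and attribute names are assumptions
    and should be verified against the vendor documentation.
    """
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain", {"id": "any"})

    general = ET.SubElement(domain, "General")
    allow = ET.SubElement(general, "AllowMulticast")
    allow.text = "true" if allow_multicast else "false"

    discovery = ET.SubElement(domain, "Discovery")
    peers = ET.SubElement(discovery, "Peers")
    for address in peer_addresses:
        # One <Peer> entry per host that may take part in discovery.
        ET.SubElement(peers, "Peer", {"address": address})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cyclonedds_config(["192.168.1.10", "192.168.1.11"]))
```

Generating the file from one list at least keeps the vendor-specific syntax in a single place, although it does not remove the dependency on that syntax.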
We also want things like forbidding communication between certain peers. For example, if we have 100 agents, we want them to communicate with a control station, but we would like to prevent them from communicating directly with each other. Or maybe allow just certain topics to be sent between these agents.
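One way to approximate the agents-to-station pattern with today's tools is to give each host a different peers list. The sketch below only computes those lists; the function name `star_peer_lists` and the addresses are hypothetical. It assumes multicast is disabled and that the station does not relay discovery data, and it only limits who discovers whom. It is not access control; enforcing that (or per-topic rules) would need something like DDS Security.

```python
from typing import Dict, List


def star_peer_lists(station: str, agents: List[str]) -> Dict[str, List[str]]:
    """Return the initial peers list each host should be configured with.

    Hypothetical helper: agents only list the control station, while the
    station lists every agent, so agents never announce to each other.
    """
    peers = {station: list(agents)}
    for agent in agents:
        peers[agent] = [station]
    return peers


if __name__ == "__main__":
    station = "10.0.0.1"
    agents = ["10.0.1.%d" % i for i in range(1, 4)]
    for host, host_peers in star_peer_lists(station, agents).items():
        print(host, "->", host_peers)
```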
FYI, Erik (@eboasson) added domainTag to the DDSI 2.3 spec (section 8.5.3.2, SPDPdiscoveredParticipantData) for the iRobot use case: finding a Roomba by serial number among ~1,000 robots on the network. It addresses the need discussed above.
DDSI domainTag makes things talk if domain + tag match:
//CycloneDDS/Domain/Discovery/Tag
Text
String extension for domain id that remote participants must match to be discovered.
The default value is: ""
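The matching rule described above can be sketched in a few lines. This is only a model of the "domain + tag must match" behaviour, not Cyclone DDS code; the `Participant` class, the `can_discover` function, and the serial-number tags are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Minimal stand-in for a DDS participant (illustration only)."""
    domain_id: int
    domain_tag: str = ""  # empty string is the default tag


def can_discover(a: Participant, b: Participant) -> bool:
    """Two participants discover each other only if domain and tag match."""
    return a.domain_id == b.domain_id and a.domain_tag == b.domain_tag


if __name__ == "__main__":
    app = Participant(domain_id=0, domain_tag="SERIAL-0001")
    robot_a = Participant(domain_id=0, domain_tag="SERIAL-0001")
    robot_b = Participant(domain_id=0, domain_tag="SERIAL-0002")
    print(can_discover(app, robot_a))
    print(can_discover(app, robot_b))
```

With a per-robot tag such as a serial number, a client on a crowded network would only match the one robot it is looking for, even though everything shares the same domain id.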