Investigation into alternative middleware solutions

Hello ROS community,

Every year, as part of the development for the next ROS 2 release, Open Robotics puts out a roadmap detailing what the core development team is going to work on. That roadmap, along with roadmap items from other TSC members, can be found here.

This year, the OSRC team at Intrinsic plans to do something slightly different. Rather than address a myriad of different tasks, the team is going to take on one very large task: the development of a non-DDS RMW implementation.

If you aren’t familiar with the concept of an “RMW”, it stands for “ROS MiddleWare”, and is the API that sits between the client libraries (i.e. ROS code) and the underlying communication mechanism (there’s a diagram of this layer here). The RMW interface is an abstraction layer that allows ROS to swap out its underlying communication mechanism at both compile and run time. As long as an RMW implements that API, it can be used as a communication mechanism.
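To make the idea of a swappable layer concrete, here is a minimal Python sketch. It is not the real RMW API, which is a C interface; the `Middleware` class, the two toy implementations, the registry keys, and `load_middleware` are all hypothetical names used for illustration. The only detail taken from ROS 2 itself is that the RMW is selected at run time through the `RMW_IMPLEMENTATION` environment variable.

```python
import os
from abc import ABC, abstractmethod


class Middleware(ABC):
    """Hypothetical stand-in for the RMW interface (the real one is a C API)."""

    @abstractmethod
    def publish(self, topic: str, payload: bytes) -> str:
        """Send payload on topic and return a description of what happened."""


class ToyDds(Middleware):
    """Toy implementation standing in for a DDS-based RMW."""

    def publish(self, topic: str, payload: bytes) -> str:
        return f"dds: {len(payload)} bytes on {topic}"


class ToyAlternate(Middleware):
    """Toy implementation standing in for a non-DDS RMW."""

    def publish(self, topic: str, payload: bytes) -> str:
        return f"alternate: {len(payload)} bytes on {topic}"


# Registry of available implementations, keyed by made-up identifiers.
REGISTRY = {"toy_dds": ToyDds, "toy_alternate": ToyAlternate}


def load_middleware(default: str = "toy_dds") -> Middleware:
    """Pick an implementation at run time from RMW_IMPLEMENTATION."""
    name = os.environ.get("RMW_IMPLEMENTATION", default)
    if name not in REGISTRY:
        raise ValueError(f"unknown middleware: {name}")
    return REGISTRY[name]()
```

Code written against `Middleware` does not change when the environment variable points at a different implementation, which is the property the RMW layer is meant to provide.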

Each RMW is assigned a ROS “support tier” roughly based on code quality and the amount of integration testing performed on the RMW. There are currently three Tier 1 RMW implementations, all of which are based on DDS and are tested nightly:

But others have created other RMW implementations over the years:

This brings us back to the development of what we are calling “rmw_alternate” (for now). All of the current Tier 1 RMW implementations use DDS, which means that DDS-specific details can sometimes bias the RMW API. Further, DDS struggles in some situations, particularly where multicast UDP might be disabled or other network restrictions are in place (such as one might find in an office). We’ve also heard feedback from some members of the ROS community that DDS can be overwhelming and complex. rmw_alternate is meant to be an alternative for those who can’t or don’t want to use DDS, and aims to deliver a better out-of-the-box solution for educators and hobbyists.

I’m going to stress that DDS will always be part of ROS 2. Certain parts of our community use ROS 2 because it is based on DDS, and would not be able to use a different RMW. But having an alternative Tier-1 RMW implementation will help ensure that we don’t bias the RMW API with DDS details, and will give people who are struggling with DDS issues another path to take. Note that we are not targeting this to be the default RMW at this time; only another available option.

What does this mean for all of you?

We are purposely calling this “rmw_alternate” because we haven’t yet chosen what the underlying communication mechanism will be. What we want to hear from the ROS community is how you are using ROS 2, the issues you are facing, and what things you think would make your experience better. We want to generate our rmw_alternate requirements directly from feedback from the ROS community. To capture that feedback, please take the time to fill out this survey, which should take about 10-15 minutes. The responses will be anonymous, and we’ll leave the survey open for approximately 2 weeks.

We’ll use the data from this survey to come up with a list of requirements, and see how well the individual middlewares fit those requirements. Once we’ve chosen a middleware, we’ll spend the bulk of our Jazzy development time working on the new RMW.

If you have questions or comments about any of this, please feel free to respond to this thread.

44 Likes

Perhaps something like TCPROS?
Would be great to have that as a middleware layer for flawless transition into ROS 2 :sunglasses:.

11 Likes

If you put it into the survey as an option, we will consider it.

2 Likes

Why reinvent the wheel here? What’s the reason for this development, and for taking on the huge burden of maintenance and security? Rather, just give people who are struggling with DDS issues an easier way to solve them.

1 Like

Sorry, can you be more specific? Which part do you think is reinventing the wheel and taking on additional maintenance?

Sorry, maybe I didn’t understand your post correctly. My question was: DDS is good, right? What is the need for another middleware?
You mentioned these:
DDS specific details can sometimes bias the RMW API. Further, DDS struggles in some situations, particularly where multicast UDP might be disabled or other network restrictions are in place (such as one might find in an office)

Wouldn’t it make more sense to address these issues instead of creating a new middleware?

Sorry if I came across as rude, I am just confused :frowning:

Fantastic idea!

rmw_zenoh please.

7 Likes

OK, yeah. That’s a good question.

For the first one, it is very difficult to keep DDS-isms out of the RMW API when there is no alternative except DDS. People are naturally inclined to look at what is currently available, and make decisions based on that. This isn’t unreasonable, which is why we want to make another RMW as Tier-1. But on its own, I agree this isn’t a sufficient reason to add another one.

The other question has to do with DDS not working in some situations. Given our experience over the last 6 years of ROS 2, we’ve found that there are certain parts of the DDS specifications that just make it hard to work in all environments. I will be quick to say that many people are successfully using DDS in ROS 2, and have been able to work around these limitations or configure their networks to avoid them. But it is often not an easy experience, and the ramp-up time to get familiar with DDS and the configuration necessary in a particular network can be substantial. So the idea with the new RMW is to make something that works with less configuration, while still leaving DDS in place for those who need it or want to use it.

4 Likes

We have been planning to develop a ROS2 RMW using Robot Raconteur, but it has not been started yet. I have created an issue to discuss the possibility if anyone is interested: ROS2 RMW using Robot Raconteur · Issue #121 · robotraconteur/robotraconteur · GitHub . Robot Raconteur has been around for over a decade so the technology is mature at this point.

We have a RR<->ROS2 bridge currently that allows for ROS2 nodes to be accessed using the plug-and-play capabilities provided by Robot Raconteur: GitHub - robotraconteur-contrib/robotraconteur_ros2_bridge: Robot Raconteur to ROS 2 bridge service node

There have been past discussions in the ROS-I circles around an OPC UA RMW.

1 Like

Great idea. Fantastic

A good alternative would be the CCSDS Asynchronous Message Service. It is located in the ION DTN package. It is feature rich and developed by NASA.

Very happy to see this proposal!

Any RMW that approaches the ease-of-use of the good old ROS1 transport would be a big step forward (and btw, there’s a reason why TCP is so popular, no shame in using an ancient protocol).
Here are the things that bother me about the current ROS2 transports that I’ve used (cyclone and fastrtps):

  • there is an element of “squirreliness” to them. I was very impressed the first time I used ROS2 and the remote hosts were discovered automatically. But every now and then they aren’t. And then I don’t know why not. Some combination of node restart and “ros2 daemon stop” eventually gets things back to working, until they stop working again.
  • the lack of asynchronous publishing. Supposedly fastrtps can do it, but configuring it is very complicated. As it stands right now, if my robot publishes images over slow wifi, the publisher will block. Moreover, hardly any messages get through to the subscriber because wifi is fairly lossy and without tcp retransmissions, most images don’t get transmitted without packet loss.
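The blocking behaviour described in the second bullet can be illustrated with a minimal sketch of asynchronous publishing. This is not how Fast DDS or any RMW implements it; `AsyncPublisher` and its `send` callback are hypothetical names, and the policy of dropping the oldest queued message when the queue is full is an assumption chosen because stale camera frames are usually worthless. A single publishing thread is assumed.

```python
import queue
import threading


class AsyncPublisher:
    """Hypothetical publisher: publish() enqueues, a worker thread sends."""

    def __init__(self, send, depth: int = 2):
        self._send = send  # possibly slow transport call, e.g. over wifi
        self._queue = queue.Queue(maxsize=depth)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def publish(self, msg) -> None:
        """Never waits on the transport: if the queue is full, drop the oldest."""
        while True:
            try:
                self._queue.put_nowait(msg)
                return
            except queue.Full:
                try:
                    self._queue.get_nowait()  # discard the oldest message
                except queue.Empty:
                    pass  # the worker drained the queue in the meantime

    def _run(self) -> None:
        """Worker loop: send queued messages in order until the sentinel."""
        while True:
            msg = self._queue.get()
            if msg is None:  # sentinel queued by close()
                return
            self._send(msg)

    def close(self) -> None:
        """Queue the shutdown sentinel and wait for the worker to exit."""
        self.publish(None)
        self._worker.join()
```

With this shape, a slow link costs dropped frames rather than a stalled control loop, which is the trade-off a robot streaming images over wifi usually wants.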

For many situations in academia or hobby use, a static network configuration as in ROS1 is perfectly acceptable. The vast majority of students and hobby users do not work on swarm robotics; they have a single robot controlled from a laptop. They can live without discovery, but need a simple setup and predictable behavior.

I’m elated to hear that a tier-1 RMW implementation is on the horizon that may close that gap.

11 Likes

Hi Bernd,

Actually, Fast DDS async publishing was the default behavior until Humble. It is really easy to change. See this post on how to do it:

Just switching to async was simple, but it is only easy when using UDP. And with UDP, wifi transmission of images (2 megapixels) was terrible because, without packet retransmission, virtually every frame experiences a dropped packet. I tried configuring TCP and eventually gave up. Here is more to the discussion.
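A rough back-of-the-envelope model shows why large images suffer so badly over lossy UDP. The numbers below are illustrative assumptions, not measurements: an uncompressed 2-megapixel image at 3 bytes per pixel, about 1472 payload bytes per datagram, independent packet loss, and no retransmission at any layer. Under those assumptions a frame arrives intact only if every one of its fragments does. Real middleware fragments and recovers data differently, so treat this as a simplification.

```python
import math


def frame_success_probability(frame_bytes: int, payload_bytes: int,
                              packet_loss: float) -> float:
    """Probability that every fragment of one frame arrives (no retransmission)."""
    fragments = math.ceil(frame_bytes / payload_bytes)
    return (1.0 - packet_loss) ** fragments


# Assumed values, for illustration only.
FRAME_BYTES = 2_000_000 * 3  # 2 megapixels at 3 bytes per pixel
PAYLOAD_BYTES = 1472         # typical UDP payload under a 1500-byte MTU

for loss in (0.0001, 0.001, 0.01):
    p = frame_success_probability(FRAME_BYTES, PAYLOAD_BYTES, loss)
    print(f"loss={loss:.2%}  intact frames={p:.3%}")
```

Because the success probability decays exponentially with the number of fragments, even a 0.1% per-packet loss rate leaves fewer than 2 in 100 frames intact in this model, which matches the experience described above.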

I tend to land on the side of making it easier to use DDS rather than exploring alternatives. There was a reason DDS was chosen in the first place. And it has been shown to work well in varying environments although, as mentioned, the ramp-up time to get it working might be longer for a novice. So making it easier to use for the vast majority of use cases would be time better spent.

I would also rather spend the effort on integrating additional useful DDS features into the rmw that would help when scaling to larger systems, e.g. time-based filters, DDS keys, etc. Anytime you start exploring DDS alternatives, you end up with the limited set of DDS features that we have now, since a lot of DDS features don’t map directly to alternatives. Adding alternatives makes it that much harder to overcome the inertia against adding additional DDS features. I feel like we are not using the full power of DDS because of this. Anyway, my 2 cents.

3 Likes

OpenCyphal (previously UAVCAN) could be an option; they already considered it: An exploratory study: UAVCAN as a middleware for ROS - Applications & Usage - OpenCyphal Forum

4 Likes

Hi @thejeeb

Indeed, that is what we do with the tools and enhancements we provide on Vulcanexus.org (Fast DDS, Micro-ROS, the ROS2 discovery server, the ROS2 router, ROS2 Monitor, and many more).

At eProsima, we are committed to the DDS RMW and its tools, creating better docs and tutorials and exposing more DDS features. We have more than 20 engineers working on these areas, so new features are constantly added.

So, you can rest assured our team will continue to improve the DDS RMW. Moreover, we have a couple of important announcements in this direction to be made in the following months that will significantly increase our contributions: extra funding and a larger team. Stay tuned.

I’ll offer up MQTT as an option that I think makes simpler assumptions about the setup of the underlying network than DDS, and thus can be a little simpler to grok, at the expense of adding in a central broker.

There are also multiple mature brokers and libraries, so long term maintenance of an rmw could be simpler than some of the more exotic options mentioned here, plus there are no signs it will go away any time soon.

2 Likes

I think that you have a point. MQTT is mature enough to be included in ROS2 as an alternative middleware that works out of the box.

1 Like