While supporting companies every day in developing autonomous robotics applications, I have noticed a tendency to keep starting R&D stages with ROS instead of ROS 2, and I really cannot understand why.
ROS has an old code style, it is server-centered, it does not have a safe communication middleware, and it has many other disadvantages with respect to ROS 2.
Why should a company choose to do R&D with ROS instead of ROS 2?
In my opinion the main reason is that there are no clear documents explaining that ROS is reaching EOL (not so close, but close enough for a company's R&D schedule) and explaining the main advantages of ROS 2, or at least they are not highlighted enough.
Companies should clearly understand that if they begin an R&D stage with ROS, they will be very limited, and by the time they are ready to release a product they will have to restart from the beginning because ROS will have reached EOL… May 2025 is not so far away.
What do you think?
The inertia of legacy code is a powerful thing.
No amount of deprecation warnings and documentation will stop some hardware out there from running Ubuntu 14.04 and ROS Indigo because "it still works".
For us, it's the out of the box experience where ROS wins. Maybe Jazzy with the simplified non-DDS comms will get a bit closer to that. It's difficult if you don't know whether your algorithm doesn't work because it's wrong, or just because you were unable to configure the comms right. In ROS, the second option is negligible.
@peci1 I understand this point of view, but I do not agree with it.
Once you understand DDS and QoS, the advantages that you get are surely more important than the little time spent on it.
In my opinion, DDS is the strong point of ROS 2, it brings it closer to a certifiable security system. Ignoring DDS is terrible and looking at your profile I'm sorry to see that you are a researcher. You should be the first to understand that evolution is the most important thing.
I'm a researcher at a university. That's quite a different environment. For us, each additional technology students need to understand before they can start working with a system is a big obstacle.
And I think in the early stages of R&D, many companies have the same standpoint. We're currently helping a company to ROSify their robot, and as they did not know ROS at all before, they told us they need to take it step by step. And (temporarily) getting rid of DDS is a big help for them when debugging a system they barely understand. But there is a clear point on the roadmap where the system will switch over to ROS 2 and DDS in the future. I hope that Zenoh transport in Jazzy will allow this step-by-step approach completely within ROS 2, so it might really be the next big thing ROS 2 needed to get much more adopted.
looking at your profile I'm sorry to see that you are a researcher.
Holy gatekeeping, Batman! That was one of the most unnecessary and rude things I've read in a while.
I think if you're a researcher, you don't particularly care to spend your time tuning DDS or whatever, and it's true that the out-of-the-box performance of ROS 1 is currently better than ROS 2 in most cases.
Usability also took a big hit in ROS 2 for the Python client library, which many researchers use. rospy was waaaayyyy easier to use than rclpy currently is, in that you never had to worry about threading, deadlocks, and async programming in ROS 1.
I know many researchers that have worked with both ROS 1 and ROS 2 and have decided to ride the Noetic train as long as they can for these reasons. Justifiably so!
As a researcher attempting to quickly provide proof-of-concepts, why would a certifiable security system be a priority for me? Similarly, why would evolution be at all important? If anything it's a detraction: I don't want to be messing around with a middleware that I have to think about, I want it to be effectively invisible so that I can focus on the research, not debugging the transport layer. ROS 2 offers a lot but ROS 1 still meets my needs without me needing to deal with a still-changing API.
I cannot agree more…
Overall, the development environment is way better in ROS1.
I agree that ROS2 is better and has more features than ROS1 in terms of specifications.
In fact, I recently started to "update" a system that works with ROS1 to ROS2, mainly for multi-robot collaboration.
However, I realized that many undocumented niceties that made development comfortable with ROS1 are missing in ROS2.
I never imagined that a major version upgrade to ROS2 would bring the build prompt back to uncolored and lose the lovely ability to move between package directories with the simple command roscd (yes, there is a colcon-cd, but it's obviously immature compared to roscd and not installed by default), or that I would have to struggle with parameter and launch-file concepts that are more complicated than necessary for most use cases. These should be simple enough to understand that they help developers and researchers save time by riding the ecosystem.
This reminds me of a couple other/older threads, such as Are there plans for a community supported ROS 1 release after Noetic?
Why should a company choose to do R&D with ROS instead of ROS 2?
In point of fact, most of them are not using ROS 1. I am working on the 2023 Metrics report and we are very close to having 60% of the community using ROS 2.
In my opinion the main reason is that there are no clear documents explaining that ROS is reaching EOL (not so close, but close enough for a company's R&D schedule) and explaining the main advantages of ROS 2, or at least they are not highlighted enough.
There are lots of resources that discuss the transition. Can you be more specific about what you think would help? We can't make changes without concrete suggestions. Honestly it is very frustrating for many of us to be vaguely told, "X isn't good enough, do it better!" The core contributors are a small team of people with limited time and resources. There are opportunity costs in choosing to do one thing over another.
Ignoring DDS is terrible and looking at your profile I'm sorry to see that you are a researcher. You should be the first to understand that evolution is the most important thing.
@Myzhar I don't think your opinion is appropriate or particularly productive here. Individuals may have different priorities than you.
Having worked with and spoken to multiple companies in the mobile robotics field, I disagree. For everybody using ROS on a single computer (so no ROS network com), DDS doesn't bring immediate wins. And the lack of a plug-and-play experience slowed down ROS 2 adoption in the beginning.
For instance:
- Fast-DDS Service Reliability sometimes hangs lifecycle manager · Issue #3033 · ros-planning/navigation2 · GitHub
- Obstacle Layer, Voxel Layer, and Costmap Topic Collision Checker not working in Humble/Rolling due to Fast-DDS Regression · Issue #3014 · ros-planning/navigation2 · GitHub
It is much better now.
And I know companies with significant funding that are struggling to do the ROS 2 transition due to issues with ROS 2 and network com performance/configuration.
For context, I am a strong ROS 2 advocate and led the ROS 1 → ROS 2 transition at a couple of companies. Its advantages outweighed its pain points, but around me, DDS is definitely not the ROS 2 strong point.
We just made the move from ROS1 to ROS2, as 20.04 was EOL. And oh my, did this move go south quickly.
As already mentioned, there were the tiny things one can nitpick about, like:
- missing shortcuts in colcon
- docs.ros2.org/latest/api/rclcpp/ pointing to Foxy, with no redirect to the new location
- Downgrade of the parameter system
- The default of ROS2 to connect to everything in the network (for those who did not move yet: it takes a while to debug that you are talking to your colleague's node, and not your local one)
But as I mentioned, this is more or less nitpicking.
Then came the bad stuff like:
- Python nodes eating your CPU if message frequency is > 100 Hz
- FastDDS service recovery not working reliably
  - Note: Cyclone just works, but it took us half a month to try the switch, as we thought OUR code was broken before
- The FIFO scheduler giving no warning if a timer takes longer than its scheduled period
  - This leads to an endless loop, where no further messages are processed
- The async API used in services and actions
  - Either you need to program a giant state machine with all corner cases for errors, or you need to enter multi-threaded hell
  - Every call needs to be context aware, depending on whether a spinner is running or not
- Bugs, lots of them
  - Astonishingly obvious stuff like: call get_status() in an action callback on the handle, and you will deadlock
  - We have multiple open merge requests now
  - We need to run our own custom-built ROS 2 with various bugfixes
Also to be fair here you will mostly be fine, if you do not require the MultiThreaded executor. A lot of our bugs originated from race conditions.
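As a side note on the missing timer-overrun warning from the list above: something like it can be approximated in user code by timing the callback yourself. This is only a minimal sketch in plain Python, not part of rclpy or any ROS 2 API; `warn_on_overrun` is a hypothetical helper name, and the period is assumed to be known to the caller.

```python
import time
import warnings
from typing import Callable


def warn_on_overrun(period_s: float, callback: Callable[[], None]) -> Callable[[], None]:
    """Wrap a timer callback so it warns when one run exceeds its period.

    Hypothetical helper for illustration; not an rclpy function.
    """

    def wrapped() -> None:
        start = time.perf_counter()
        callback()
        elapsed = time.perf_counter() - start
        if elapsed > period_s:
            # The callback cannot keep up with its own schedule.
            warnings.warn(
                f"timer callback took {elapsed:.3f}s, "
                f"longer than its {period_s:.3f}s period",
                RuntimeWarning,
            )

    return wrapped
```

The wrapped function could then be handed to whatever timer mechanism is in use instead of the raw callback. It only reports the overrun; it does not prevent the backlog described above.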
Coming from this experience I can understand why people still prefer ROS1.
I also talked about this with some people from my local North German robot research community, and the response was mainly "at the start of the last project we tried ROS2, hit some showstopper and then moved back to ROS1".
To sum this up, ROS2 still has a lot of rough edges and you will currently hit them. Therefore people are still hesitant to make the switch.
First of all @Katherine_Scott don't get this wrong, this is just curiosity on my side.
The download numbers of packages in the metrics report feel off to me, relative to the amount of bug reports and general activity I am seeing.
Do you have a way to filter these numbers for automated downloads, e.g. from build pipelines?
It would also be interesting to know how many people participated in the TSC voting.
@scastro I apologize for being rude with direct personal observations. It was not my intention.
I have also been a researcher (many years ago) and I speak from experience. As a researcher, I have always tried to be as innovative as possible by trying to use cutting-edge tools, not "something that works to be faster", this is normally how Companies work. But this is just a personal point of view and my personal opinion. Better not to discuss it anymore.
@Katherine_Scott I'm glad that the statistics say that 60% of the community is using ROS 2, my daily experience does not reflect this, but it may be an isolated case.
I know that there are a lot of third-party discussions comparing ROS and ROS 2, focusing on the advantages and disadvantages of both of them, but in your list, you pointed to the only official document that I could find, that is very old (written in 2015, updated in 2017). It was the document that convinced me that it was a good moment to start moving from ROS to ROS 2, but today it's not enough anymore.
Third-party webpages are important, but they can be seen as a "personal point of view" (like mine here). Official updated documents are more important, and this is what is missing (again, in my opinion).
A good starting point could be the homepage of the ROS Wiki; the note in the right column is too little (again, in my opinion).
If the goal is to progress faster (which imo. it should be) then this requires:
- Someone to come up with:
  - The vision (where do we want this project to go to),
  - The strategy to get there,
  - Funding to realize that strategy,
- And someone to:
  - Translate the strategy into concrete project goals,
  - Ensure that exhaustive specifications are written for each goal,
  - Set priorities,
  - Enthuse people to work on it.
In a regular company, these would be the roles of the CEO and CTO respectively, and "enthuse" would rather be "assign".
For a community project this is obviously less straightforward, yet it surprises me that the OSRF does not seem to take on an active role with respect to the above points.
Maybe they don't see this as their core responsibility (which is obviously at their discretion and that of their board). Or maybe they rate their potential impact, or the chances of gathering enough funding, lower than I do.
I for one would like to see the OSRF (or some other entity) move in a similar direction as Open Navigation LLC, and actively seek funding, set goals and get them done.
Being still relatively new to ROS, I don't have a clear overview of the project history, but there seem to have been periods of higher progress pace, and more funding (e.g. the ROSin project). Compared to that, the donate form is a bit… underwhelming.
As for other "concrete suggestion for changes": I have plenty, but each of those would only mean extra work for:
The download numbers of packages in the metrics report feel off to me, relative to the amount of bug reports and general activity I am seeing. Do you have a way to filter these numbers for automated downloads, e.g. from build pipelines?
No, we can't filter based on automated downloads. Honestly, I think the converse is actually true: the more automated downloads the better. Pipeline downloads show people using ROS in a production system. Given the way the world is moving with Docker, binary downloads aren't exactly a great metric either. Anyway go nuts.
It would also be interesting to know how many people participated in the TSC voting.
344. This just tells you how many people read ROS Discourse and were interested in voting. Don't take this the wrong way, but I don't think most ROS users really care about what happens with TSC elections, just like how maybe 25% of voters show up for off-year elections. The vast majority of people are focused on solving their particular problem.
My two biggest issues are "Python nodes eating CPU" and "the async API used for services and actions". I have many cases with service calls inside service calls or inside topic callbacks. What I am missing in the tutorials is how to handle these cases. It is hard to find examples of how to do the programming, at least in Python, and to find out things like the correct way to wait for the result of a service call and so on.
Does the issue with Python nodes eating CPU apply to a single message stream at 100 Hz, or to many different messages that add up to 100 Hz? I do not have 100 Hz messages, but I have maybe some 50, 20 and 10 Hz messages handled by the same node. Is that also an issue? I cannot decide if my bad performance is me doing something stupid or if it is rclpy behaving badly.
You are looking for the well-hidden "Using Callback Groups" page in the ROS 2 Documentation (Iron).
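To make the "service call inside a callback" problem a bit more tangible, here is a rough analogy in plain Python. It is not rclpy code: a standard-library thread pool with one worker stands in for an executor that can only run one callback at a time, and `nested_call` is a made-up name for illustration. The assumption is that blocking on a nested call inside a callback fails for the same basic reason in both cases, namely that nothing is free to run the inner work. The callback groups page remains the reference for the actual rclpy behavior.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def nested_call(workers: int, timeout_s: float = 0.5) -> str:
    """Run an outer callback that blocks on an inner one in the same pool."""
    pool = ThreadPoolExecutor(max_workers=workers)

    def inner() -> str:
        return "inner done"

    def outer() -> str:
        # Blocking wait inside a callback: the inner job needs a free worker.
        future = pool.submit(inner)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            # Without the timeout this would wait forever (a deadlock).
            future.cancel()
            return "stuck: no free worker to run the inner callback"

    try:
        return pool.submit(outer).result()
    finally:
        pool.shutdown(wait=True)
```

Calling `nested_call(1)` ends in the "stuck" branch, because the only worker is busy waiting, while `nested_call(2)` completes. Loosely speaking, the second case corresponds to letting callbacks run in parallel, and the alternative is not to block at all and to handle the result asynchronously.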
If the SUM of all processed events is above ~100 Hz, rclpy will use way too much CPU.
For example, we have a node that publishes one message at startup and is only subscribed to /clock, running at 100 Hz. This node takes 18% CPU on a Ryzen 3800X.
I share the same sentiment. Technically, the approach that made ROS 1 successful was also the big obstacle to making it a mature, production-ready product today. ROS 1 succeeded with the kitchen-sink approach, providing everything for a robot developer. This sounds great for the user, but it is also an approach that cannot be sustained in the long run. ROS 2 was a good opportunity to rethink and pay off the technical debt, but the design didn't show the courage to make the paradigm shift. The resulting product isn't too different from the original, yet its performance and maturity aren't great, apparently. Then why the change?
So, the current governance carries this baggage, and they will not admit the failure of not reaching production quality, and they behave reactively to any slight critical question; one exception was Brian Gerkey in the interview after the acquisition. They can give numbers and all the statistics, but the lack of enthusiasm couldn't be hidden. The acquisition could provide a fresh start, but I think it is very unlikely at this point after one year.
In my opinion, the use of such strong wording as "admit the failure" is inappropriate, and counter-productive.
You seem to have a strong sentiment that the OSRF / TSC / maintainers / etc. have some sort of obligation towards you and the community. But they don't. They do their best in good conscience to develop great robot software; if you like it, you can use it, but if you don't like it: that's ok as well.
If you feel you can do better, and are ready to spend the necessary effort to actually do it and actively participate by writing the software, tooling, documentation or anything else that you feel is missing (or pay someone to do it), then they will for sure be happy to accept your pull requests.
As long as you keep using this kind of abusive wording to express your opinions, I will no longer react to your messages, as it seems your main goal is to satisfy an urge to troll, and I do not want to participate in that.