Thank you so much for this detailed review. This means ROS1 is still at least three times as fast as ROS2 in this example. That is quite a disappointing result, especially since the issue has been known for a couple of years now.
I’d say that’s a feeling a huge number of ROS users share, and it should get much higher priority. I’ve been teaching ROS1 and ROS2 for more than 10 years now, and ROS2 still brings much more trouble when debugging communication problems, which is frustrating for newcomers and far beyond their scope. (On top of some other hurdles for beginners, including the build process, especially for Python.)
So +1 to make ROS2 just as easy to use and implement as ROS1 was.
Forgive me if this exists and I was too lazy to find it: is there an architecture diagram somewhere that follows the path of a message from one node to another, including any potential serialization, all the way down to the network protocol (or local shared memory) in use? It’d be great to get this drawing for ROS1 and compare it to a few of the different RMWs in ROS2, just to get a starting point. A comparison of this drawing for custom types vs. primitives would also be helpful.
I think the main innovation in ROS2 is not necessarily DDS but the RMW layer (see the ROS 2 middleware interface design document).
It decouples ROS2 from DDS, Zenoh, or any other protocol. Which brings me to the idea that ROS2 could get a new RMW implementation based on TCPROS (the protocol ROS1 is built on), so people who prefer ROS1 for its out-of-the-box performance could use it.
There is already an RMW based on SMTP, so why not an RMW based on TCPROS?
Another point is that the default ROS2 DDS profile is not well suited to workloads with small messages where latency matters.
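For example, with rmw_fastrtps_cpp one tuning that is often suggested for small-message latency is switching the publish mode from asynchronous to synchronous via an XML profile, loaded by pointing `FASTRTPS_DEFAULT_PROFILES_FILE` at the file. This is only a sketch, assuming I remember the Fast DDS profile schema correctly:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <!-- Applied to all publishers unless a more specific profile matches. -->
  <publisher profile_name="low_latency_pub" is_default_profile="true">
    <qos>
      <!-- Send on the caller's thread instead of a background writer
           thread, avoiding one hand-off per small message. -->
      <publishMode>
        <kind>SYNCHRONOUS</kind>
      </publishMode>
    </qos>
  </publisher>
</profiles>
```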
I don’t know if this is helpful for the discussion, but I did some naïve performance testing a few years ago to compare ROS 2 to another project. The results are on page 6: https://files2.wasontech.com/RobotRaconteur_CASE2023.pdf . The RTT for ROS 2 was around 100–300 µs in my tests, but with a high deviation. The source code is here: GitHub - johnwason/rr_ros_latency_tests
I can find the original dataset if people are interested.
Another consideration is that the latency may be caused by the context-switch performance of Linux. I have noticed that at times the latency to switch threads when new data arrives is the problem, rather than the performance of the communication or the code. Context-switch performance is affected by the computer’s power settings and a myriad of kernel settings. Anecdotally, Windows has better default context-switch performance when receiving small amounts of data at high frequencies, because its scheduler prioritizes that scenario. Linux, in my experience, can have pretty high latency when context switching rapidly to receive small amounts of data.
I also did a quick benchmark of the serialization alone (I’m probably not the first though).
It’s a bit rough, so take it with a grain of salt:
Noetic:

```
Run on (12 X 4213.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)
Load Average: 0.56, 0.59, 0.88
---------------------------------------------------------------------------------------------------------
Benchmark                                       Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------
BM_serialization/std_msgs_header             35.4 ns         35.4 ns     20223211 bytes_per_second=565.588M/s
BM_serialization/std_msgs_bool               31.8 ns         31.8 ns     20964098 bytes_per_second=30.019M/s
BM_serialization/geometry_msgs_pose_array     883 ns          883 ns       806462 bytes_per_second=11.8422G/s
```
Galactic (using rmw_fastrtps_cpp):

```
Run on (12 X 4500 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)
Load Average: 0.36, 0.65, 0.94
---------------------------------------------------------------------------------------------------------
Benchmark                                       Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------
BM_serialization/std_msgs_header             1166 ns         1166 ns       595924 bytes_per_second=17.9911M/s
BM_serialization/std_msgs_bool               1155 ns         1155 ns       606509 bytes_per_second=6.60702M/s
BM_serialization/geometry_msgs_pose_array    8226 ns         8225 ns        84909 bytes_per_second=1.27132G/s
```
There’s quite a big overhead in ROS2, which appears to come from rmw_serialize calling get_message_typesupport_handle and reconstructing the MessageTypeSupport structure on every call. Maybe caching that information would be a low-hanging fruit?