[REP-2014] RFC - Benchmarking performance in ROS 2

Thanks everyone for the feedback provided here so far :+1: . We’ll be discussing the input in the upcoming Hardware Acceleration WG, meeting #13 (LinkedIn event). I’ll block a big chunk of the meeting for it and prepare a summary of the most relevant items discussed above. We’ll go through each one of them to collect the group’s input. Please bring questions and/or additional thoughts to the meeting!

A few remarks from my side from the discussion above:

I believe there’s no need to relax those sentences, and so far nobody except you has read them that way on a first pass (in fact it took a further clarification for us to follow your argument), so I’d like to hear more feedback about this. The message being conveyed is important to educate the reader and stresses the principle that "robots are deterministic machines and their performance should be understood by considering various metrics". In my view, performance in robotics does not equal throughput (or any other metric in isolation), and this document should instruct how to benchmark performance in robotics. This is also especially important to educate roboticists about compute architectures and hardware acceleration (i.e. there’s no single accelerator that solves all cases, and things need to be properly assessed).

This is a wrong understanding of ros2_tracing and, in our group’s experience, the project can be easily extended to support other tracing frameworks (you just need to make sure to meet CTF if you wish to merge/mix traces). The fact that it currently only supports LTTng is due to limited resources (and us all not contributing enough). Note that instrumentation is defined through a series of headers and preprocessor directives which allow you to abstract away the OS, frameworks, etc. Have a look at how we instrumented the image_pipeline ROS 2 package for reference; a minimal sketch of the pattern follows below.
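For illustration only (the header and macro names below are hypothetical, loosely modeled on the tracetools pattern rather than ros2_tracing’s actual API), this is roughly what that abstraction looks like:

```cpp
// my_project/my_tracetools.h -- hypothetical header, loosely modeled
// on the header/preprocessor pattern used by ros2_tracing's tracetools.
#ifndef MY_PROJECT__MY_TRACETOOLS_H_
#define MY_PROJECT__MY_TRACETOOLS_H_

#ifdef TRACETOOLS_LTTNG_ENABLED
// When built against LTTng-UST, forward to the tracepoint provider.
#include "my_project/tp_provider.h"
#define MY_TRACEPOINT(event, ...) \
  tracepoint(my_provider, event, __VA_ARGS__)
#else
// On platforms or builds without a supported tracer, the very same
// instrumentation compiles down to a no-op: zero runtime overhead.
#define MY_TRACEPOINT(event, ...) ((void) 0)
#endif

#endif  // MY_PROJECT__MY_TRACETOOLS_H_
```

With this layout, supporting another backend only means touching this header and its provider; the instrumented code itself never changes.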

QNX’s SAT is what you’d be looking for, and it can be easily enabled in ros2_tracing (that said, I hear some people are using LTTng on QNX). We’re also working with Microsoft’s folks to try and align REP-2014 with Windows ROS 2 deployments.

I don’t think this argument is valid. And again, ROS 2 is already instrumented with ros2_tracing for a reason. Let’s not reinvent the wheel to serve business interests.

This is also very much wrong. Though we can’t claim holistic support across all different heterogeneous hardware (there’s no such thing, unfortunately), ros2_tracing can be extended and used to provide an understanding of the interaction between different heterogeneous hardware. The lowest-hanging fruit is leveraging LTTng-HSA, which makes it easy to use ros2_tracing on AMD GPUs. We’re also in the process of extending the ROBOTCORE Framework (which implements REP-2009, among other things) to support tracing across various accelerators.

Nevertheless, what confuses me is how the argument is being twisted here. Above, you claimed repeatedly that we had to focus on benchmarking at the input/output of test subjects, and we discussed how ros2_tracing can do that perfectly fine and more efficiently than other mechanisms. Now you seem to care about introspection and try to discard ros2_tracing for that reason, which is really hard to argue given the amount of research supporting ros2_tracing and LTTng. A sketch of what input/output benchmarking looks like in practice is shown below.
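To make the input/output point concrete, here is a minimal sketch (reusing the hypothetical `MY_TRACEPOINT` macro from the sketch above; node and event names are illustrative, not part of ros2_tracing’s predefined instrumentation) of a test subject traced at its boundaries, with input-to-output latency derived offline from the recorded trace rather than measured inside the node:

```cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/string.hpp>

#include "my_project/my_tracetools.h"  // hypothetical header from the sketch above

// A test subject instrumented at its input/output boundaries.
class TestSubject : public rclcpp::Node
{
public:
  TestSubject()
  : Node("test_subject")
  {
    pub_ = create_publisher<std_msgs::msg::String>("output", 10);
    sub_ = create_subscription<std_msgs::msg::String>(
      "input", 10,
      [this](std_msgs::msg::String::SharedPtr msg) {
        // Event at the input boundary of the test subject.
        MY_TRACEPOINT(subject_input, msg->data.c_str());
        auto out = process(*msg);  // the computation under test
        // Event at the output boundary; the tracer timestamps both
        // events, so latency is computed offline from the trace.
        MY_TRACEPOINT(subject_output, out.data.c_str());
        pub_->publish(out);
      });
  }

private:
  std_msgs::msg::String process(const std_msgs::msg::String & in)
  {
    return in;  // placeholder for the real workload
  }

  rclcpp::Publisher<std_msgs::msg::String>::SharedPtr pub_;
  rclcpp::Subscription<std_msgs::msg::String>::SharedPtr sub_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<TestSubject>());
  rclcpp::shutdown();
  return 0;
}
```

Because the probes sit exactly at the subject’s input and output, this measures the black-box behavior you asked for, while keeping the measurement overhead out of the node itself.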

@debjit it’d be great if you, RTI and other DDS vendors could have a look at REP-2014, share feedback, and try using it down the road for benchmarking DDS performance. We all know about the situation that happened not so long ago when comparing open source DDS implementations. REP-2014 can help address future issues in this direction.
