ROS2 Foxy & RMW Fast DDS: Improved Intra-process & Inter-process performance

Thank you for your compliments, @rex-schilasky.

The intra-process and inter-process latencies do not differ that much for the 2.1.0_sync mode

It may not be clear in the graphs, but the results do differ substantially between intra-process and inter-process. For the last point on each line, which corresponds to the PointCloud8m data type, the latency is 2.65 vs 3.68. More importantly, apart from the slightly better latency, the number of samples transmitted is 730 vs 208.

Is there a need to copy memory for intra-process communication in sync mode at all or wouldn’t it make sense to just “forward the allocated memory” to the connected subscriber?

This is exactly what 2.1.0 is bringing in. When the publisher and the subscriber are in the same process, payloads are not copied; their reference counts are updated instead.
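As a rough illustration of the idea (not the actual Fast DDS implementation), here is a minimal sketch in which `std::shared_ptr` plays the role of the reference-counted payload. The names `Payload`, `Publisher` and `Subscriber` are hypothetical and exist only for this example.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical payload type: a byte buffer shared by reference.
using Payload = std::shared_ptr<const std::vector<std::uint8_t>>;

// Hypothetical in-process subscriber that keeps a reference to the last sample.
struct Subscriber {
    Payload last;
    // Copying the shared_ptr bumps the reference count; no bytes are copied.
    void on_data(const Payload& p) { last = p; }
};

// Hypothetical publisher that hands the same buffer to every local subscriber.
struct Publisher {
    std::vector<Subscriber*> local_subscribers;
    void publish(const Payload& p) {
        for (Subscriber* s : local_subscribers) {
            s->on_data(p);
        }
    }
};
```

After `publish`, every subscriber points at the same bytes as the publisher, and the buffer is released once the last reference goes away.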

Maybe that doesn’t matter for the next “zero copy” release anymore.

The zero-copy mechanism mainly involves API extensions that let the user get pointers into the payload buffers (on both the publication and the subscription side). It will only be available for POD types though, because the pointer being returned would point into the serialized payload buffer, in order to avoid (de)serialization of the data type.
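To make the POD restriction more concrete, here is a small sketch of what working directly on a payload buffer could look like. It is not the real API: `PayloadBuffer`, `loan_sample` and `read_sample` are hypothetical names, and the sketch assumes that the in-memory layout of the type is used as-is for the payload.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>

// Hypothetical POD sample type: fixed size, no pointers, no dynamic members.
struct Point {
    float x, y, z;
    std::uint32_t intensity;
};

constexpr std::size_t kPayloadSize = 64;

// Hypothetical payload buffer owned by the middleware.
struct PayloadBuffer {
    alignas(std::max_align_t) unsigned char bytes[kPayloadSize];
};

// Publication side: returns a pointer into the payload buffer so the user
// fills the sample in place instead of serializing it afterwards.
template <typename T>
T* loan_sample(PayloadBuffer& buf) {
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_standard_layout<T>::value,
                  "only POD-like types can live directly in the payload");
    static_assert(sizeof(T) <= kPayloadSize, "sample does not fit in the payload");
    return new (buf.bytes) T();  // construct the sample inside the buffer
}

// Subscription side: reads the sample straight from the payload buffer,
// without a deserialization step.
template <typename T>
const T* read_sample(const PayloadBuffer& buf) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only POD-like types can be read in place");
    return reinterpret_cast<const T*>(buf.bytes);
}
```

A type holding a `std::string` or a `std::vector` would fail the `static_assert`, since its data lives outside the buffer and would still need to be serialized.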

With these new API extensions in place, we would have intra-process zero-copy available. We are also working on an inter-process data-sharing mechanism, in which the payloads will be created on a shared memory segment / mapped file. The combination of these two mechanisms will provide inter-process zero-copy.
