Talk: Using zero-copy data transfer in ROS 2

We at Apex.AI recently integrated Eclipse iceoryx in ROS 2 Galactic. This allows application developers to use zero-copy memory transfer (under certain conditions). This is of particular interest if large data such as point clouds has to be transmitted.

This talk, Using zero-copy data transfer in ROS 2 - Virtual Eclipse Community Meetup - Crowdcast, will show how to use shared memory in ROS 2 by enabling iceoryx in CycloneDDS.
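
For those who want to try this ahead of the talk, the rough recipe is to point CYCLONEDDS_URI at a config file that enables shared memory and to have a RouDi instance running. A minimal sketch of such a config, assuming the Galactic-era Cyclone DDS with iceoryx support (element names may differ slightly between versions):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<CycloneDDS xmlns="https://cdds.io/config">
  <Domain id="any">
    <SharedMemory>
      <!-- switch on the iceoryx-backed shared memory transport -->
      <Enable>true</Enable>
      <!-- log level of the shared memory layer (illustrative choice) -->
      <LogLevel>info</LogLevel>
    </SharedMemory>
  </Domain>
</CycloneDDS>
```

Then export CYCLONEDDS_URI=file:///path/to/cyclonedds.xml and RMW_IMPLEMENTATION=rmw_cyclonedds_cpp before starting RouDi and your nodes; treat this as a starting point, not the authoritative setup.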

Everyone is welcome to join or contact me for additional questions.

Date: 20 July 2021, 17:00 CEST (tomorrow)

11 Likes

thanks for the heads-up :+1: I am interested in this framework.

quick question from an architecture perspective. I’ve been going through the source code, but cannot figure this out.

what if the RouDi daemon crashes? how can we recover the system using shared memory? AFAIK, we need to restart the entire system, including all applications using iceoryx. is my understanding correct?

our requirement is more like a distributed system where components fail independently. do you have any good practices to share for this use case?

thanks in advance :smiley:

1 Like

rmw_iceoryx was updated to iceoryx v1.0.1 for ROS 2 Foxy. It’s far from finished, but feel free to give it a try rmw_iceoryx.

2 Likes

@ZhenshengLee

we’ve already tried to use rmw_iceoryx and then ran into this question.

what if the RouDi daemon crashes? how can we recover the system using shared memory? AFAIK, we need to restart the entire system, including all applications using iceoryx. is my understanding correct?

probably yes then? i just would like to confirm the architecture and design for our use case.

thanks

Thanks for your question. iceoryx is a tool that provides shm-based IPC, and only that, which is true zero copy. It uses a centralized memory pool so that every process can access the data.
The existence of the RouDi daemon of course reflects this centralized software architecture.
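
To illustrate what shm-based, true zero-copy IPC looks like at the API level, here is a minimal sketch in the style of the iceoryx v1.x icedelivery example; the service description and payload type are made up for illustration:

```cpp
#include "iceoryx_posh/popo/publisher.hpp"
#include "iceoryx_posh/runtime/posh_runtime.hpp"

// Illustrative fixed-size payload that lives directly in shared memory.
struct CounterTopic
{
    uint64_t counter;
};

int main()
{
    // Register this process with the running RouDi daemon.
    iox::runtime::PoshRuntime::initRuntime("iox-counter-publisher");

    iox::popo::Publisher<CounterTopic> publisher({"Example", "Counter", "Topic"});

    // loan() hands out a sample inside the shared memory pool;
    // publish() makes it visible to subscribers without any copy.
    publisher.loan().and_then([](auto& sample) {
        sample->counter = 42U;
        sample.publish();
    });

    return 0;
}
```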

I suggest that you:

  1. read the iceoryx docs to understand its design goals.
  2. go to the iceoryx GitHub page and create an issue about the fault tolerance strategy.

Thanks.

@tomoyafujita the daemon has pros and cons. We discuss this a lot, but we stick with it for the time being. Sure, it is a single point of failure. For me this is comparable to a part of the operating system or other local services that run on one device. It is another dimension if you have a central broker, like in MQTT, that is needed to enable communication between different devices. Indeed the RouDi daemon is not needed for the actual communication. It does the following:

  • Single point of configuration. In iceoryx each subscriber can have its own queue size and a different number of samples that are held on the user side. The needed size of shared memory cannot be derived per publisher but is more an overall consideration (see the config sketch after this list).
  • Rights management. The daemon creates the shared memory partitions and can configure who is allowed to read or write which partition.
  • It provides built-in topics for introspection, debugging and discovery. This could maybe also be solved without a daemon; at least DDS implementations have similar functionality without a daemon. As far as I know ROS 2 also starts a daemon when you use the command line interface. Having a central instance that aggregates the information makes things easier.
  • Monitoring. As applications could crash while holding a loan for a sample in shared memory, we use the daemon for monitoring and cleanup.
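
For reference, the overall shared memory sizing mentioned in the first bullet is done via RouDi's mempool configuration. A minimal sketch in the TOML format used by iceoryx v1.x (sizes and counts are purely illustrative):

```toml
# Sketch of a RouDi mempool configuration (illustrative values).
[general]
version = 1

[[segment]]

# chunks of 1 KiB for small messages
[[segment.mempool]]
size = 1024
count = 10000

# chunks of 1 MiB for large messages such as point clouds
[[segment.mempool]]
size = 1048576
count = 500
```

RouDi can then be started with this file (e.g. iox-roudi -c roudi_config.toml, if I remember the flag correctly).
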
1 Like

@michael-poehnl

thanks for the explanation :+1:

the daemon has pros and cons. We discuss this a lot, but we stick to it for the time being. Sure it is a single point of failure.

yeah, understood. i did not mean that a single point of failure is bad, i think that just depends on the use case.

appreciate your detailed description :grinning_face_with_smiling_eyes:

@tomoyafujita Sorry for the late reply. As was already pointed out, the middleware daemon RouDi is a single point of failure. If it is shut down, everything relying on it has to be restarted (all applications depending on it will also be shut down). This holds if it crashes as well, but of course this should not happen. These aspects are a topic for a talk of its own. :slight_smile:

There is a certain robustness guarantee in iceoryx in the sense that crashing applications will not lead to state corruption or to blocking indefinitely in the daemon, since the algorithms involved are lock-free. This will not necessarily hold if used within e.g. Cyclone DDS, as there are still locks.

Not having RouDi as a single point of failure is a challenge. At its core it is a memory manager with lock-free transmission queues. It also keeps track of all the applications that exist and use its shared memory, and some other details. To decentralize this, one would have to replicate this information on the application side, but there always must be some kind of consensus on the system state (which is fairly large). While this is doable, it has a lot of overhead and would negate many of the performance advantages that are the main point of its existence.

In other words, we have to accept this single point of failure and think of the daemon as something at almost the same level as some internal OS process.

Regarding the talk.

It is unfortunate that the live demo did not work, but you should be able to recreate anything I intended to show with the repository mentioned in the talk: GitHub - ApexAI/ros2_shm_demo: Demonstrate how to use zero-copy Shared Memory data transfer in a single independent example. The middleware used is Eclipse CycloneDDS which integrates Eclipse iceoryx for Shared Memory transfer.
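
The core of what the demo shows is the loaned message API in rclcpp. A minimal sketch (the topic name and message type are illustrative, not taken from the demo repository):

```cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/u_int64.hpp>  // fixed-size type, suitable for zero-copy

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("shm_demo_publisher");
  auto publisher = node->create_publisher<std_msgs::msg::UInt64>("shm_topic", 10);

  // Ask the middleware for a sample; with rmw_cyclonedds + iceoryx enabled this
  // can be allocated directly in shared memory, otherwise rclcpp falls back to
  // a regular in-process allocation.
  auto loaned_msg = publisher->borrow_loaned_message();
  loaned_msg.get().data = 42U;

  // Publishing a loaned message hands ownership back to the middleware,
  // avoiding a copy on the zero-copy path.
  publisher->publish(std::move(loaned_msg));

  rclcpp::shutdown();
  return 0;
}
```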

For how to use the introspection, see rmw_cyclonedds/shared_memory_support.md at master · ros2/rmw_cyclonedds · GitHub. This allows you to verify that shared memory is used while running the example.
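
If you have built the iceoryx introspection, the check can be as simple as running the introspection client next to the demo; a sketch (binary name from iceoryx v1.x, please verify against your installation):

```sh
# shows processes, shared memory usage and ports registered with RouDi
iox-introspection-client --all
```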

I would like to upload a pdf containing the slides if this is ok (and considered useful), but I do not have permission on ROS Discourse to do so; this should be correctable. There were also questions at the end of the talk I could not answer or missed entirely; I will expand on this in a separate post.

Regarding the questions in the talk: it is rather suboptimal that Crowdcast does not allow for discussion after the talk. I therefore missed some of them entirely, but I looked them up afterwards and maybe the answers are still of use.

  1. Will sensor_msgs/Image be carried via zero-copy transport?

No. Since sensor_msgs/Image contains a string, it is not of fixed (or bounded) size. We may support data types like this in some future iteration, but it can never be as performant as using a fixed-size type, since any non-fixed-size type will require a serialization operation (which a fixed-size type does not). The reason is that the data may reside in external memory in a format not suitable for shared memory transfer. I suggest using a specifically crafted fixed-size message type instead.
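
As an illustration of such a "specifically crafted fixed-size message type", a hypothetical .msg definition that avoids strings and unbounded arrays could look like this (the name and dimensions are made up; this is not an official ROS 2 type):

```
# FixedImage.msg - hypothetical fixed-size image for zero-copy transfer
uint32 height
uint32 width
uint32 step
uint8 encoding            # encode the pixel format as an enum value instead of a string
uint8[6220800] data       # fixed capacity, e.g. 1920 x 1080 x 3 bytes
```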

  2. If one is currently using ROS 2’s intra-process communication to achieve zero-copy transfer, is there a benefit in switching to this method?

Intra-process communication is more restricted, as it works only within the same process. Therefore it does not compete with shared memory, which is used for inter-process communication within the same machine (intra-machine).
I do not know the implementation, but since under the hood intra-process communication can be as simple as exchanging a pointer, I think it would be faster than communication via iceoryx. But as soon as we communicate between processes with sufficiently large/frequent messages, there are definite performance benefits in using iceoryx.
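
For completeness, intra-process communication in rclcpp is enabled per node and is independent of the shared memory transport; a minimal sketch (node name is illustrative):

```cpp
#include <rclcpp/rclcpp.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);

  // Opt the node into rclcpp's intra-process communication; publishers and
  // subscriptions inside this process can then exchange messages without
  // going through the rmw layer.
  auto options = rclcpp::NodeOptions().use_intra_process_comms(true);
  auto node = rclcpp::Node::make_shared("intra_process_node", options);

  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}
```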

  3. Can zero-copy transport be applied to Docker containers? For example, communication between host and container, or data transport across multiple containers.

No, each Docker container essentially acts as a separate machine, so shared memory communication does not work across Docker containers. The reason is that POSIX shared memory resources are only visible in the container itself, not in other containers. Furthermore, the middleware daemon has to run within the machine that uses shared memory, and if it runs in one container it is not able to manage applications/memory in another.

  4. How do these shared memory settings interact with the standard ROS quality of service?

The default ROS QoS settings are Reliable, KeepLast, Volatile. These settings are supported with iceoryx (up to some limit for KeepLast). For an application there is virtually no difference in the reception of messages but potentially a performance gain.
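
Spelled out in code, a QoS profile on the shared memory friendly side could look like this (history depth, topic and message type are illustrative):

```cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/u_int64.hpp>

// Reliable, keep-last, volatile: the ROS 2 defaults, compatible with the
// iceoryx-backed shared memory path (keep-last depth within iceoryx limits).
auto make_publisher(rclcpp::Node & node)
{
  rclcpp::QoS qos{rclcpp::KeepLast(10)};
  qos.reliable().durability_volatile();
  return node.create_publisher<std_msgs::msg::UInt64>("shm_topic", qos);
}
```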

  5. Is iceoryx a version of Cyclone DDS?

iceoryx is an independent project but is used in Cyclone DDS. There is a project to use iceoryx without Cyclone DDS in ROS 2: GitHub - ros2/rmw_iceoryx: rmw implementation for iceoryx. You can also use iceoryx without ROS 2 as an inter-process communication mechanism.

  6. Is 64 KB the message fragmentation limit for Cyclone DDS?

This cannot be fully explained as of now; the 64 KB boundary requires further investigation and more benchmark data. At the moment we have no conclusive data as to why we see this increasing gap between network and shared memory transfer above 64 KB message size. The current explanation is that a UDP datagram has at most this size. We also have to consider that sending a message always incurs some basic cost (waitset, semaphores, executor etc.), regardless of its size. There should be a point where this effort gets negligible compared to serialization and deserialization in the network case, but the exact point depends on many factors.

Furthermore, I believe throughput already benefits from shared memory at much smaller (but frequent) messages, but the benchmark does not cover this. There is definitely some optimization potential, but there is no doubt that, where it is supported, shared memory leads to performance gains.

2 Likes

Are you 100% sure regarding this? I have mounted /dev/shm on different containers before.

1 Like

Same, taking care to set up the container with IPC sharing, which is similar to network sharing. The tl;dr is: set ipc=host and everything uses the same /dev/shm. I’ve yet to find a boundary worth crossing from a container that couldn’t be crossed…
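
In other words, something along these lines (image name and commands are placeholders):

```sh
# Share the host IPC namespace (and thus /dev/shm) with each container,
# so iceoryx shared memory segments are visible everywhere.
docker run --ipc=host --name talker   my_ros2_image ros2 run demo_nodes_cpp talker
docker run --ipc=host --name listener my_ros2_image ros2 run demo_nodes_cpp listener
```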

1 Like

@mkillat @russkel @mvollrath

I do not know the implementation, but since under the hood intra-process communication can be as simple as exchanging a pointer, I think it would be faster than communication via iceoryx.

besides performance, I think the following are benefits of intra-process communication.

  • saving TLB entries.
  • fewer system calls, including page faults (because the same virtual address space is used, there is no need to map/unmap).

Are you 100% sure regarding this? I have mounted /dev/shm on different containers before.

Same, taking care to set up the container with IPC sharing, which is similar to network sharing. The tl;dr is: set ipc=host and everything uses the same /dev/shm.

i was going to ask the same question. i think this can be done with the following:

  • bind /dev/shm to container
  • bind /dev/mqueue to container

those are just namespaces; setting ipc=host does all the tricks. see Docker run reference | Docker Docs

I’ve yet to find a boundary worth crossing from a container that couldn’t be crossed…

I would not just give out access to the host IPC namespace from a container, since there are other system applications relying on it. but if we can specify the filesystem with access permissions to enable shared memory transport, that would be useful.

I was just explaining what we can do, probably not what we want to do :grinning_face_with_smiling_eyes:

thanks

1 Like

@russkel I am not absolutely sure, I have to admit, but we have used different Docker containers to simulate a network and we could not communicate with RouDi due to the way /dev/shm was mounted (in container 1 only). Maybe with additional configuration it is possible; if someone knows more, let me know.

The daemon RouDi has to run somewhere, though, and I would assume it would be in one container, call it container 1. In this case it naturally sees the shared memory in container 1. If we could mount this memory from container 1 in container 2 as well, it would likely work.

I do not have enough knowledge about Docker containers to say for sure. Maybe someone with more Docker knowledge can tell whether and how this is possible. After all I have read in this topic, I think it might be feasible; we just never explored the use case.

A pdf version of the slides I used (with minor corrections) is available here
Using_Zero_Copy_In_ROS2.pdf

1 Like

I touch on this a bit in my own question and answer below, but one alternative to using the host for IPC is to use a shareable IPC namespace with a donor container that you can use to link other containers together with container:<name-or-ID>.
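
A sketch of that setup (image name and commands are placeholders):

```sh
# One "donor" container owns a shareable IPC namespace...
docker run -d --ipc=shareable --name ipc_donor my_ros2_image iox-roudi

# ...and the other containers join it, so they all see the same /dev/shm.
docker run --ipc=container:ipc_donor my_ros2_image ros2 run demo_nodes_cpp talker
docker run --ipc=container:ipc_donor my_ros2_image ros2 run demo_nodes_cpp listener
```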

1 Like

@ruffsl

that makes sense; providing isolation based on Linux namespaces would work.

thanks :+1:

Hello, I created a repo, ros2_shm_msgs, for zero copy with point cloud and image, and just finished a point cloud transport demo.

Feel free to open issues!

Thanks.

2 Likes

Communication between different Docker containers using shared memory is possible. /dev/shm has to be mounted in every Docker container, and then shared memory based communication should work. At least we made this work using eCAL’s IPC. For iceoryx we never checked it; it could be that RouDi needs an additional mechanism / configuration.
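
For the bind-mount variant described above, the sketch would be (image name is a placeholder):

```sh
# Mount the host's /dev/shm into every container so they all see the same
# shared memory segments; an alternative to sharing the IPC namespace.
docker run -v /dev/shm:/dev/shm my_ros2_image
```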

Awesome, thanks @ZhenshengLee