Minimising ROS2 Discovery Traffic

Hello @gavanderhoorn,

What we can easily provide are the throughput and RTT (latency = RTT/2) measurements for the zenoh master. The graph below shows the throughput we get on an AMD Ryzen workstation. As you can see, the peak throughput is around 60GBps. Please note that this is the throughput when going through the loopback interface over TCP/IP.

The two graphs show the throughput for the zenoh-net API and for the higher-level zenoh API. It may be insightful to look at the code of the examples we use for measuring throughput: zn_pub_thr.rs, zn_sub_thr.rs, z_sub_thr.rs and z_put_thr.rs, all on the master branch of eclipse-zenoh/zenoh on GitHub.

If you skim through the code, you’ll notice that the code used to test performance does not try to play tricks or take shortcuts. This is the code you’d write in your application after having looked at the Getting Started guide. In other words, we try to make performance as accessible as possible. If you wonder how we behave across the network: when we measure throughput over a 10Gbps network, the only difference we see from localhost is that the throughput saturates at 10Gbps. Otherwise the picture is the same: we saturate a 1Gbps network at 128 bytes payload and a 10Gbps network at about 1024 bytes. We are writing a blog post on performance where we’ll share all this data along with the performance of our zero copy. If you can wait a bit, we’ll share a pretty thorough analysis in one week or so.
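To put the saturation points in perspective, here is a rough back-of-the-envelope sketch (plain Python, not zenoh code) of the message rate needed to fill a link at a given payload size. The function name is made up for illustration, and it assumes protocol and framing overhead can be ignored, so the real rate on the wire will be somewhat lower.

```python
def messages_per_second(link_bits_per_second: float, payload_bytes: int) -> float:
    """Message rate needed to fill a link with payloads of the given size.

    Assumption: TCP/IP and zenoh framing overhead are ignored, so this is
    an upper bound on the rate actually required.
    """
    return link_bits_per_second / (payload_bytes * 8)


GBPS = 1e9  # bits per second

# The two saturation points quoted in the post.
rate_1g = messages_per_second(1 * GBPS, 128)     # 1Gbps link, 128-byte payload
rate_10g = messages_per_second(10 * GBPS, 1024)  # 10Gbps link, 1024-byte payload

print(f"1Gbps  @ 128 bytes:  {rate_1g:,.0f} msg/s")
print(f"10Gbps @ 1024 bytes: {rate_10g:,.0f} msg/s")
```

Under that assumption, both cases work out to roughly a million messages per second.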

As for Round Trip Time (RTT), we usually measure it for a fixed size, 64 bytes, and for increasing publication frequencies.

We think this is more relevant than the usual RTT test shown in performance evaluations, which is essentially the same as the inf point on our x-axis (meaning as fast as you can). In essence, by looking at latency at different publication periods you can see more clearly how caches will impact the actual latency experienced by your application. Also note that real applications rarely write as fast as possible.
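To make the methodology concrete, here is a minimal sketch of a ping-pong RTT test with a fixed 64-byte payload and a configurable pause between publications. It uses plain TCP sockets on loopback rather than zenoh, so the numbers it prints say nothing about zenoh itself; the helper names (recv_exact, echo_server, measure_rtt_us) and the chosen periods are purely illustrative.

```python
import socket
import statistics
import threading
import time

PAYLOAD_SIZE = 64  # bytes, the fixed size mentioned in the post


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since TCP may deliver them in pieces."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(listener: socket.socket) -> None:
    """Accept one client and echo every payload back until it disconnects."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        try:
            while True:
                conn.sendall(recv_exact(conn, PAYLOAD_SIZE))
        except ConnectionError:
            pass  # client is done


def measure_rtt_us(period_s: float, samples: int = 50) -> float:
    """Median RTT in microseconds, pausing `period_s` between pings.

    period_s == 0 corresponds to the "inf" point: publish as fast as you can.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()

    payload = b"x" * PAYLOAD_SIZE
    rtts = []
    with socket.create_connection(listener.getsockname()) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            start = time.perf_counter()
            client.sendall(payload)
            recv_exact(client, PAYLOAD_SIZE)
            rtts.append(time.perf_counter() - start)
            if period_s > 0:
                time.sleep(period_s)
    server.join(timeout=1)
    listener.close()
    return statistics.median(rtts) * 1e6


# Sweep from "as fast as possible" to one publication every 10 ms.
periods = (0.0, 0.001, 0.01)
results = {p: measure_rtt_us(p) for p in periods}
for period, rtt in results.items():
    print(f"period={period * 1e3:5.1f} ms  median RTT={rtt:8.1f} us  latency~{rtt / 2:8.1f} us")
```

The sleep between pings is what lets caches go cold between publications, which is the effect the back-to-back test hides.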

Anyway, as you can see from the graph, the RTT gets down to 40 microseconds quite rapidly, in other words 20 microseconds of latency.

In conclusion, these are the raw zenoh performance figures, which hopefully will give you an idea of the overhead zenoh may add when bridging DDS data over the network.

Let me know if you have further questions.

Take Care!
