Experiment to inhibit DDS and ROS2 child threads

y-okumura-isp · May 27, 2020, 2:51am

Abstract

In our previous post, we reported that DDS child threads and their scheduling policy affect wake-up latency.
In summary, the RR-TS setting gives better wake-up latency than RR-RR, where

RR-TS means that the main thread uses Round-Robin scheduling (real-time) and the child threads use Time-Sharing scheduling (non-real-time),
RR-RR means that both the main thread and the child threads use RR.
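
To make the RR-TS idea concrete, here is a minimal sketch of one way to get such a split on Linux: the calling thread asks for SCHED_RR together with the reset-on-fork flag, so that tasks it creates afterwards start with the default time-sharing policy. This is an assumption for illustration only and not necessarily how the benchmark in these posts configures its threads; the function name `try_set_rr_ts` is hypothetical. It needs real-time privileges and returns False instead of failing when they are missing.

```python
import os


def try_set_rr_ts(priority=98):
    """Try to run the calling thread under SCHED_RR with reset-on-fork.

    Returns True on success and False when the platform or the current
    privileges do not allow it. Raises ValueError for an out-of-range priority.
    """
    # The scheduling API is only exposed on some platforms (e.g. Linux).
    if not hasattr(os, "SCHED_RR") or not hasattr(os, "SCHED_RESET_ON_FORK"):
        return False
    lo = os.sched_get_priority_min(os.SCHED_RR)
    hi = os.sched_get_priority_max(os.SCHED_RR)
    if not lo <= priority <= hi:
        raise ValueError(f"SCHED_RR priority must be in [{lo}, {hi}]")
    try:
        # pid 0 means "the caller". Reset-on-fork asks the kernel to start
        # tasks created later by this thread with the default policy.
        os.sched_setscheduler(
            0, os.SCHED_RR | os.SCHED_RESET_ON_FORK, os.sched_param(priority)
        )
    except OSError:  # includes PermissionError without real-time privileges
        return False
    return True
```

Calling `try_set_rr_ts(98)` early in the main thread, before any middleware threads exist, would be the intended use in this sketch.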

At the end of that post, we said we would report on the effect of a higher-priority process on communication in the next post.

This post is that report.

Motivation
In RR-TS, the child threads stop running if another thread or process has a higher priority than the child threads.
If DDS is implemented so that child threads do part of the communication processing, DDS cannot send topics in such a situation.

Therefore, we ran an experiment in which a higher-priority process inhibits the child threads.

Experiment Condition

Our setup is as follows:

Hardware and software stack (see the previous post for details)
- Hardware: Raspberry Pi B+ with careful tuning
- OS: Ubuntu 18.04 with kernel 4.19.55-rt24-v7+
- ROS Distro: ROS 2 Eloquent Elusor
Benchmark program
- Ping-pong program.
  The ping sender wakes up periodically, publishes to the ping topic, and listens on the pong topic.
  The pong sender subscribes to the ping topic and replies on the pong topic.
- We implemented the ping-pong program in 3 patterns.
  - 1 process with 1 node (1e1n)
  - 1 process with 2 nodes (1e2n): a ping-sender node and a pong-sender node.
  - 2 processes (2e): a ping-sender process and a pong-sender process.
- All patterns use SingleThreadedExecutor.
Inhibition program
- A program that applies bitwise_not to a large image
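
The post does not say which library the inhibition program uses, so the following is only a stand-in sketch of the same kind of CPU-bound load: it repeatedly inverts every byte of a large in-memory buffer for a fixed duration. The names `bitwise_not` and `run_load` and the default image size are hypothetical.

```python
import time

# Lookup table that maps every byte value to its bitwise complement.
_NOT_TABLE = bytes(255 - b for b in range(256))


def bitwise_not(buf):
    """Return a copy of buf (bytes or bytearray) with every bit inverted."""
    return buf.translate(_NOT_TABLE)


def run_load(duration_s, width=1920, height=1080, channels=3):
    """Invert a width x height x channels buffer until duration_s has elapsed.

    Returns the number of full passes over the buffer.
    """
    image = bytearray(width * height * channels)
    deadline = time.monotonic() + duration_s
    passes = 0
    while time.monotonic() < deadline:
        image = bitwise_not(image)
        passes += 1
    return passes
```

In the experiment, a load like this would run for about 1 minute under RR priority 90 on the same core as the benchmark.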
Metrics: we measure following in benchmark program
- (M1) Time until benchmark program ends
- (M2) Time from wake-up to ping sent:
  This increases when publish is blocked,
  because rclcpp::Publisher::publish calls the blocking function rcl_publish.
- (M3) ping-pong RTT:
  If rcl_publish blocks, this metric gets worse too.
  If rcl_publish does not block but this metric is still bad, publishing the pong may be blocking.
- (M4) Difference between the last wake-up time and the current wake-up time:
  We check this to detect overly long sleeps.
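
As a sketch of how (M2) to (M4) could be derived from per-loop timestamps, assuming the ping sender records three timestamps per loop in microseconds. The record layout and the names below are hypothetical, and taking the RTT from "ping sent" to "pong received" is my assumption about where the clock starts.

```python
from dataclasses import dataclass


@dataclass
class LoopSample:
    wake_us: int       # time the ping sender woke up
    ping_sent_us: int  # time publish() returned for the ping
    pong_recv_us: int  # time the pong callback ran


def loop_metrics(samples):
    """Return one (m2, m3, m4) tuple per loop; m4 is None for the first loop."""
    rows = []
    prev_wake = None
    for s in samples:
        m2 = s.ping_sent_us - s.wake_us        # wake-up to ping sent
        m3 = s.pong_recv_us - s.ping_sent_us   # ping-pong RTT (assumed start point)
        m4 = None if prev_wake is None else s.wake_us - prev_wake
        rows.append((m2, m3, m4))
        prev_wake = s.wake_us
    return rows


def worst_case(rows):
    """Return the maximum of each metric, ignoring missing values."""
    columns = zip(*rows)
    return tuple(max(v for v in col if v is not None) for col in columns)
```

For a 10 ms period, a healthy run should show (M4) values close to 10,000 us and small (M2) values.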
Experimentation Protocol
- Run the benchmark program with the RR-TS schedule. The RR priority is 98.
  Immediately afterwards, run the inhibition program with the RR schedule on the same core. Its priority is 90.
  So the scheduling priority is “main thread” > “inhibition program” > “DDS child thread”.
- We tuned the benchmark program and the inhibition program so that each finishes within 1 minute,
  so if the child threads are completely stopped by the inhibition program, the benchmark program takes 2 minutes to finish.
- We measured 9 times under each condition and summarized the results.
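
The protocol above can be sketched as follows, assuming the standard `taskset` and `chrt` command-line tools are used to pin the inhibition program to a core and start it under SCHED_RR. The program path is hypothetical, and the original experiment may set the policy differently. The second helper just restates the 1 minute + 1 minute reasoning.

```python
def pinned_rr_command(cpu, priority, program_argv):
    """Build an argv that runs program_argv on one CPU under SCHED_RR."""
    if not 1 <= priority <= 99:
        raise ValueError("SCHED_RR priority must be between 1 and 99")
    return ["taskset", "-c", str(cpu), "chrt", "--rr", str(priority), *program_argv]


def expected_finish_s(benchmark_s, inhibition_s, children_fully_blocked):
    """Expected benchmark duration under the protocol's reasoning.

    If the DDS child threads cannot run at all while the inhibition program
    is active, the benchmark only makes progress after it has finished.
    """
    if children_fully_blocked:
        return benchmark_s + inhibition_s
    return benchmark_s


# Hypothetical program path, for illustration only.
inhibition_cmd = pinned_rr_command(0, 90, ["./inhibition_program"])
```

The resulting list can be passed to `subprocess.run` by a user with real-time privileges.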

I plan to publish the code of the benchmark program and the inhibition program.

Result Summary

Our results are as follows.
With 1 executor, both the 1-node and the 2-node patterns were unaffected.

number of executors	DDS	result
1	CycloneDDS	no effect by inhibition program
1	FastRTPS	no effect by inhibition program
2	CycloneDDS	no effect by inhibition program
2	FastRTPS	3 effect patterns

Details follow.

1 executor: CycloneDDS and FastRTPS

Result
- The benchmark program finished in about 1 minute.
Comment
- With 1 executor, topic communication appears to be implemented as intra-process communication.
  We captured the network devices with Wireshark and found that no data packets were exchanged.
  It seems that neither UNIX sockets nor shared memory are used; a “copy” or “move” may be used internally in DDS.

2 executors in CycloneDDS

Result
- The benchmark program finished in about 1 minute.
Comment
- The order in which the programs start is important.
- When we start the inhibition program before the benchmark program, the benchmark program blocks until the inhibition program finishes.
  (I think the child thread is used for negotiation or discovery.)

2 executors in FastRTPS

There are 3 patterns.

(1) ping blocks
- It takes 2 minutes to finish the benchmark program.
- More precisely, one specific ping takes about 1 minute to send.
(2) 2 minutes, but unknown reason
- It takes 2 minutes to finish the benchmark program.
- But the reason is unknown.
  (M2) The worst latency from wake-up to ping sent is on the order of 100 [us], which is almost the same as without the inhibition program.
  (M3) The worst ping-pong RTT is 3 ms, which is no problem.
  (M4) The worst value is 1 second, which is too large because the ping sender wakes up with a 10 ms period.
  But this worst value occurred only once, so it is not the cause.
(3) no effect
- The benchmark program finished in about 1 minute.
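
The three patterns can be told apart from (M1) and the worst (M2) alone. Below is a minimal sketch of such a classification; the function name, the labels, and both thresholds are my assumptions and were not part of the original analysis.

```python
def classify_run(total_s, worst_m2_us, slow_total_s=90.0, blocked_m2_us=30_000_000):
    """Classify one benchmark run into the three observed patterns.

    total_s      -- (M1) time until the benchmark program ends, in seconds
    worst_m2_us  -- worst (M2) wake-up-to-ping-sent latency, in microseconds
    The thresholds are illustrative: "about 2 minutes" is taken as >= 90 s,
    and a blocked ping as a single (M2) value of at least 30 s.
    """
    if total_s < slow_total_s:
        return "no effect"            # pattern (3)
    if worst_m2_us >= blocked_m2_us:
        return "ping blocks"          # pattern (1)
    return "slow, cause unknown"      # pattern (2)
```

A run that takes 2 minutes with a worst (M2) of roughly 1 minute falls into pattern (1), while a 2-minute run with a worst (M2) around 100 us falls into pattern (2).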

For (1), I plotted (M2), i.e. the time between wake-up and ping sent, with and without the inhibition program.
The y-axis is the value of (M2), and the x-axis is the loop count.
With the inhibition program, the maximum (M2) becomes 59,736,187 [us] at x ≈ 500, which is almost 1 minute.

[Figure: plots of (M2) per loop, without the inhibition program and with the inhibition program. File name shown with the images: tw_2exec_pub_ping_pong_1_rrts_without_task97.log]

Patterns (1), (2), and (3) occurred 2, 2, and 5 times, respectively.
My guesses are:

Child threads are used for control communication in FastRTPS.
- If child threads were used for data packets, (1) would happen every time.
So timing probably matters.
If the inhibition program and control communication run at the same time, pattern (1) or (2) happens.

Investigating the cause of (2) is future work.

Conclusion

We examined how a higher-priority process affects child threads under RR-TS.

If you use only 1 process, you don’t need to worry about this.
If you use multiple processes, don’t run a process or thread with higher priority than the child threads on the same CPU core.
For example, use RR(97) for the main thread, RR(97 or 96) for the child threads, and a lower priority or TS for other processes/threads.
- If the CPU cores are different, there is no effect (I did not cover this in this post).
I think MultiThreadedExecutor and other executors under development, such as the cbg or let executor, may use threads for callbacks and assign priorities to them (as I remember, the cbg executor uses priorities).
I don’t know whether such executors take DDS child thread scheduling, priority, and CPU core into account, but I hope our posts will be useful.
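
The rule of thumb above can be written down as a small check. This is only a sketch under my own assumptions: the `Task` record and the function names are hypothetical, TS tasks are treated as priority 0 (below every RR priority), and equal RR priorities are treated as safe because round-robin shares the core between them.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu: int
    policy: str        # "RR" (real-time) or "TS" (time-sharing)
    priority: int = 0  # RR priority 1..99; ignored for TS


def effective_priority(task):
    """RR tasks always run before TS tasks, so TS counts as 0."""
    return task.priority if task.policy == "RR" else 0


def tasks_that_can_starve(dds_child, others):
    """Names of tasks on the same core that outrank the DDS child thread."""
    return [
        t.name
        for t in others
        if t.cpu == dds_child.cpu
        and effective_priority(t) > effective_priority(dds_child)
    ]
```

Pass only other programs in `others`; the node’s own main thread is intentionally allowed to outrank its child threads in the RR-TS setup.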
