Can the number of threads in MultiThreadedExecutor be unlimited? What are the downsides of long-running callbacks?

Hi forum,

AI and heterogeneous computing (e.g. on GPUs) are now widely used in robots. End-to-end inference often takes a long time, but the calling thread can yield while it waits and so uses few CPU cycles.

Such long-running callbacks can occupy MultiThreadedExecutor threads for a relatively long time and may starve other callbacks that are ready to run.

Is there any way to cope with this, e.g. adding more executor threads or splitting “big” callbacks into “small” ones?

Intuitively, I would add more executor threads, because a thread that executes a long-running callback can yield and consume little CPU time.

However, I’m concerned that increasing the number of threads in MultiThreadedExecutor could significantly affect overall performance.

I would recommend you do not have long-running tasks in your ROS callbacks. Instead you should have your own threads for long running tasks. Increasing the number of threads in the executor is a poor solution to this because, as you feared, adding threads will not scale, and the long-running tasks could still starve other callbacks from running and cause the node to become unresponsive.

If you still want to use ROS callbacks, you can create a thread and a dedicated single-threaded executor to run just the long-running tasks. See: examples/rclcpp/executors/cbg_executor at master · ros2/examples · GitHub
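To make the "own thread" suggestion a bit more concrete, here is a minimal plain-C++ sketch with no ROS dependency. `LongTaskWorker` and `on_request` are hypothetical names invented for this illustration, not part of rclcpp, and the 100 ms sleep only stands in for slow inference. The point is just that the callback enqueues the work and returns at once, so the thread that ran the callback is free again.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper (not part of rclcpp): one dedicated thread that runs
// long jobs in FIFO order so that callbacks can return immediately.
class LongTaskWorker {
public:
  LongTaskWorker() : thread_([this] { run(); }) {}

  ~LongTaskWorker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_one();
    thread_.join();  // remaining queued jobs are run before this returns
  }

  // Called from a callback: enqueue the job and return right away.
  void submit(std::function<void()> job) {
    assert(job && "job must be callable");
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push(std::move(job));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // only reached once stopping_ is set
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // the long-running work happens here, off the callback thread
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> jobs_;
  bool stopping_ = false;
  std::thread thread_;  // declared last so the other members exist before it starts
};

// Hypothetical callback body: hand the slow part to the worker and return.
void on_request(LongTaskWorker & worker, int & result) {
  worker.submit([&result] {
    // Stand-in for slow inference (assumption for illustration only).
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    result = 42;
  });
}
```

In a real node the result would normally be published or handed back through some thread-safe channel rather than written to a reference; that part is left out here to keep the sketch short.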



Sorry, could you please explain in more detail what “node unresponsive” means?

That is, if all threads in the executor are busy running application callbacks for long-running tasks, there are no threads left in the executor to take events (topics, services, actions, etc.), even if they are ready to be taken. This eventually makes the node unresponsive.

What if I yield the thread that executes a long-running task and open enough threads?

This does not help either, because the executor's threads are created statically.
You can yield in the callback, but that does not dispatch a new thread to execute the new event.
When the executor thread is scheduled again, it goes back to running the same application callback.
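The effect can be reproduced without ROS using a toy fixed-size thread pool. This is only a sketch under stated assumptions: `FixedPool` and `short_callback_delay_ms` are made-up names, and the pool is a stand-in for "an executor with a fixed number of threads", not a model of how rclcpp is actually implemented. The long callbacks only sleep, so they use almost no CPU, yet each one still holds a pool thread until it returns.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an executor with a fixed number of threads (assumption
// for illustration; not the rclcpp implementation).
class FixedPool {
public:
  explicit FixedPool(int n) {
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
      workers_.emplace_back([this] { run(); });
    }
  }

  ~FixedPool() {
    {
      std::lock_guard<std::mutex> lock(m_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto & w : workers_) w.join();  // queue is drained before joining
  }

  void post(std::function<void()> f) {
    {
      std::lock_guard<std::mutex> lock(m_);
      q_.push(std::move(f));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> f;
      {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return stop_ || !q_.empty(); });
        if (q_.empty()) return;  // only reached once stop_ is set
        f = std::move(q_.front());
        q_.pop();
      }
      f();
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> q_;
  bool stop_ = false;
  std::vector<std::thread> workers_;
};

// Returns the time (ms) from the start of the experiment until a short
// callback begins to run, when `n_long` sleeping callbacks were queued
// ahead of it on a pool of `pool_size` threads.
long long short_callback_delay_ms(int pool_size, int n_long, int long_ms) {
  using Clock = std::chrono::steady_clock;
  Clock::time_point started;
  const Clock::time_point t0 = Clock::now();
  {
    FixedPool pool(pool_size);
    for (int i = 0; i < n_long; ++i) {
      pool.post([long_ms] {
        // Sleeping releases the CPU but still occupies this pool thread.
        std::this_thread::sleep_for(std::chrono::milliseconds(long_ms));
      });
    }
    pool.post([&started] { started = Clock::now(); });
  }  // destructor drains the queue and joins all threads
  return std::chrono::duration_cast<std::chrono::milliseconds>(started - t0).count();
}
```

When `n_long` is at least `pool_size`, the short callback cannot start until one of the sleeping callbacks has returned, so its delay is at least `long_ms`. With one spare thread it starts almost immediately, which is why "enough threads" only helps while the number of concurrent long callbacks stays below the pool size.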

Instead you should have your own threads for long running tasks.

Yielding in the executor threads does work in this case,
because it gives other threads (your own threads) a chance to be run by the scheduler.

CC: @wjwwood


Sorry, my words may be misleading.

I mean creating enough executor threads when instantiating MultiThreadedExecutor.

For long-running but not compute-intensive callbacks, such as launching a CNN operation on the GPU, I can yield or sleep the executor thread to release the CPU so that other executor threads can use the CPU time. As a result, the long-running callbacks don’t block others and CPU time is not wasted.

I’ve done some experiments to measure the overhead of yield, and it seems to be trivial. I’d like to know if there are any other factors that would make the approach above unworkable.
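For anyone who wants to repeat this kind of measurement, here is one possible sketch of how the CPU cost of waiting could be compared. Everything in it is an assumption for illustration: the GPU job is faked with `std::async` plus a sleep, and `launch_fake_gpu_job`, `WaitMode` and `cpu_ms_while_waiting` are hypothetical names. It uses `std::clock()`, which on POSIX systems reports CPU time consumed by the whole process. One thing it can show is the difference between blocking until the job is done and polling in a loop that calls `std::this_thread::yield()`: the polling thread stays runnable, so it can keep consuming CPU whenever no other thread wants the core.

```cpp
#include <chrono>
#include <ctime>
#include <future>
#include <thread>

// Simulated GPU job (assumption): completes after `ms` without using the CPU.
std::future<void> launch_fake_gpu_job(int ms) {
  return std::async(std::launch::async, [ms] {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  });
}

enum class WaitMode { Block, SpinYield };

// Returns the process CPU time (ms) consumed while waiting for the fake job.
double cpu_ms_while_waiting(WaitMode mode, int job_ms) {
  const std::clock_t c0 = std::clock();
  std::future<void> job = launch_fake_gpu_job(job_ms);
  if (mode == WaitMode::Block) {
    // The waiting thread is descheduled until the job completes.
    job.wait();
  } else {
    // Poll without blocking and yield between polls; the thread stays
    // runnable the whole time.
    while (job.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
      std::this_thread::yield();
    }
  }
  const std::clock_t c1 = std::clock();
  return 1000.0 * static_cast<double>(c1 - c0) / CLOCKS_PER_SEC;
}
```

Calling `cpu_ms_while_waiting(WaitMode::Block, 200)` and `cpu_ms_while_waiting(WaitMode::SpinYield, 200)` on the target machine gives a rough comparison. The numbers depend heavily on system load and on how the real GPU runtime waits, so treat them as indicative only.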

I am not really sure how the GPU interacts with the CPU from a system perspective. But if the CPU thread is just waiting for the GPU task, it makes sense to yield the CPU as you mentioned.
