Posted by @rushelby123:
Hi everyone,
First of all, thanks to @ItamarEliakim-R and @mxgrey: I’ve found the previous discussions on similar topics (#469, #521, #594) very insightful.
I’m relatively new to the RMF community and I’m currently working on a project that involves error handling and task recovery. I would like to open this discussion to better understand how such situations are typically managed within RMF-based systems and whether there are recommended practices or built-in mechanisms to deal with them.
To make things more concrete, let’s consider a simple example:
Suppose we have two robots available to perform a task. The dispatcher assigns the task to one of them, but during execution, that robot becomes unable to complete it (due to a fault, blockage, etc.).
A reasonable recovery plan might be to automatically reschedule the task and try to assign it to the other robot before cancelling it entirely.
From what I’ve read so far, RMF doesn’t provide this level of error recovery out-of-the-box, and it seems up to the user to implement such mechanisms externally.
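To make the idea concrete, here is a minimal sketch in Python of the fallback logic I have in mind. It does not use any real RMF API: `run_with_reassignment`, `TaskResult`, and the `execute` callback are hypothetical names I made up for illustration, and the robots are just strings. In a real deployment the callback would have to wrap whatever mechanism dispatches the task and reports its outcome.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TaskResult:
    """Outcome of a task after all reassignment attempts (hypothetical type)."""
    robot: Optional[str] = None                         # robot that completed the task, if any
    attempts: List[str] = field(default_factory=list)   # robots tried, in order
    cancelled: bool = False                              # True only if every robot failed


def run_with_reassignment(
    task_id: str,
    robots: List[str],
    execute: Callable[[str, str], bool],
) -> TaskResult:
    """Try each candidate robot in turn and cancel only when all have failed.

    `execute(task_id, robot)` is an assumed callback that returns True when the
    robot completes the task and False when it cannot (fault, blockage, etc.).
    """
    result = TaskResult()
    for robot in robots:
        result.attempts.append(robot)
        if execute(task_id, robot):
            result.robot = robot
            return result
    # No robot could complete the task, so it is cancelled as a last resort.
    result.cancelled = True
    return result


def fake_execute(task_id: str, robot: str) -> bool:
    """Stand-in for real execution: pretend robot_a always fails."""
    return robot != "robot_a"


outcome = run_with_reassignment("delivery_1", ["robot_a", "robot_b"], fake_execute)
print(outcome)
```

In this sketch the task is retried sequentially and cancelled only after every candidate has failed. A real system would probably also need to decide how to clean up the failed robot's partial progress before reassigning.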
Could anyone from the community share:
- Whether there are updates or ongoing work on this topic?
- How people are currently approaching task failure and reassignment?
- Whether there are any recommended patterns or extensions for this?
I believe this could be a valuable reference for many users dealing with fault tolerance in multi-robot deployments.
Looking forward to hearing your thoughts and experiences.
Edited by @rushelby123 at 2025-05-13T10:38:42Z