Error handling and contract between rmf core and fleet adapter (#203)

Posted by @cwrx777:

Hi,

What are the RobotCommandHandle APIs expected to return to RMF core if there is an error in the API execution? Or does the core always expect a successful request?

e.g. follow_new_path

  • when api.navigate keeps returning False, e.g. because of a network error?
  • in api.is_navigate_completed, when the robot is unable to perform the navigation request after a certain number of retries or a timeout, for whatever reason.
    – e.g. in MiR, the queued mission status is Aborted because of a blocked path.

Posted by @aaronchongth:

Hello @cwrx777, in these sorts of scenarios, an IssueTicket can be created for the specific failure or problem.

These issues will be reflected in the robot’s state (see rmf_api_msgs/rmf_api_msgs/schemas/robot_state.json on the main branch of open-rmf/rmf_api_msgs on GitHub) until they are dropped or resolved. The manner in which an issue gets resolved is up to the users themselves.
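As a rough illustration of this pattern (a sketch, not the actual rmf_fleet_adapter API): the function below retries a navigation request a bounded number of times and, if every attempt fails, reports an issue instead of failing silently. Both `navigate` and `create_issue` are hypothetical callables that stand in for your robot API and for whatever issue-reporting call your fleet adapter exposes. The tier, category and detail values are illustrative too.

```python
import time
from typing import Any, Callable, Dict, Optional


def navigate_with_escalation(
    navigate: Callable[[], bool],
    create_issue: Callable[[str, str, Dict[str, Any]], Any],
    max_attempts: int = 3,
    retry_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Try a navigation request; report an issue if every attempt fails.

    Returns None on success. On failure, returns whatever `create_issue`
    returned (the "ticket"), so the caller can resolve it later.
    """
    for attempt in range(1, max_attempts + 1):
        if navigate():
            return None
        # Wait before retrying, but not after the final attempt
        if attempt < max_attempts:
            sleep(retry_delay_s)
    # Hypothetical tier/category/detail values, chosen for illustration
    return create_issue(
        "error",
        "navigation",
        {"reason": "navigate request failed", "attempts": max_attempts},
    )


# Usage with stand-ins: a robot API that always fails, and a fake issue sink
reported = []


def fake_create_issue(tier: str, category: str, detail: Dict[str, Any]) -> int:
    reported.append((tier, category, detail))
    return len(reported) - 1  # pretend ticket handle


ticket = navigate_with_escalation(
    lambda: False, fake_create_issue, sleep=lambda _s: None
)
```

Keeping the returned ticket around matters in this sketch because the issue stays attached to the robot until something explicitly resolves or drops it.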

Posted by @mxgrey:

I’ll further elaborate that RMF is intentionally designed to not automatically cancel or abort tasks because RMF does not have enough information internally to decide whether a challenge that the robot is facing is severe enough to cancel or abort. RMF leaves it up to system integrators to decide the finer details of error handling, and whether tasks need to be aborted.

There are APIs that your fleet adapter can call to cancel tasks. If you are able to detect a situation where a task needs to be automatically canceled, then as a system integrator you can use those APIs in your fleet adapter or from a separate node that you add to your RMF system.

Posted by @cwrx777:

Hi @mxgrey,

Which topic can the fleet adapter subscribe to in order to know the task currently being executed?
How does the IssueTicket get handled by the core or internal fleet adapter implementation, and how does it affect the execution flow of the fleet adapter?

Posted by @mxgrey:

What kind of task information are you looking for? Only the ID of the task, or something about its state?

As of right now the simplest way for a fleet adapter to track the task states is to give a callback to set_update_listener. Your callback will receive fleet_state_update, task_state_update, fleet_log_update, and task_log_update messages as defined in the rmf_api_msgs schemas.

If you want something simpler, like querying the current task ID of a certain robot, then it should be possible to add an API for that to RobotUpdateHandle. It would help to understand what your motives and intentions are, and exactly what data you’re looking for.
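To make the listener approach more concrete, here is a minimal sketch of a callback that remembers the most recently reported task for each robot. It assumes each update arrives as a dict or a JSON string shaped like `{"type": ..., "data": ...}`. It also assumes a task state carries `booking.id`, `assigned_to.group`, `assigned_to.name` and `status`. That is my reading of the rmf_api_msgs schemas, so please check it against the schema files. `TaskTracker` itself is a hypothetical helper, not part of RMF.

```python
import json
from typing import Any, Dict, Tuple, Union


class TaskTracker:
    """Remembers the most recently reported task for each robot."""

    def __init__(self) -> None:
        # Maps (fleet_name, robot_name) -> {"task_id": ..., "status": ...}
        self.current: Dict[Tuple[Any, Any], Dict[str, Any]] = {}

    def on_update(self, message: Union[str, Dict[str, Any]]) -> None:
        # Accept either a JSON string or an already-parsed dict
        msg = json.loads(message) if isinstance(message, str) else message
        if msg.get("type") != "task_state_update":
            return  # ignore fleet state and log updates in this sketch
        data = msg.get("data", {})
        assigned = data.get("assigned_to")
        task_id = data.get("booking", {}).get("id")
        if not assigned or task_id is None:
            return  # task not assigned to a robot yet, or malformed
        key = (assigned.get("group"), assigned.get("name"))
        self.current[key] = {"task_id": task_id, "status": data.get("status")}


tracker = TaskTracker()
# Illustrative message only; the values are made up
tracker.on_update({
    "type": "task_state_update",
    "data": {
        "booking": {"id": "demo_task_1"},
        "assigned_to": {"group": "demo_fleet", "name": "robot_a"},
        "status": "underway",
    },
})
```

If the binding delivers messages in one of those two forms, `tracker.on_update` could be the callback handed to set_update_listener. Note that this records the last task update seen for a robot, which is not necessarily the task it is executing right now.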

> How does the IssueTicket get handled by the core or internal fleet adapter implementation

It gets added to the issues data inside the robot state. This state information will be visible to the human operators from the dashboard.

> how does it affect the execution flow of the fleet adapter?

It has no effect on any execution flow. The expectation is that the human operators can decide what kind of manual intervention is appropriate when they see the issue ticket.
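Since RMF itself does not act on issues, any automated reaction has to come from code the integrator writes. As a small sketch of the first step, the function below pulls open issues out of a fleet state update so they can be shown to operators or fed into a custom policy. It assumes the message looks like `{"type": "fleet_state_update", "data": {"robots": {name: robot_state}}}` with an `issues` list in each robot state. That is my understanding of the schemas and worth verifying. `collect_issues` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def collect_issues(message: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Return {robot_name: [issue, ...]} for robots that have open issues."""
    if message.get("type") != "fleet_state_update":
        return {}
    robots = message.get("data", {}).get("robots", {})
    # Keep only robots whose state has a non-empty "issues" list
    return {
        name: state["issues"]
        for name, state in robots.items()
        if state.get("issues")
    }


# Illustrative message only; robot names and issue contents are made up
sample = {
    "type": "fleet_state_update",
    "data": {
        "name": "demo_fleet",
        "robots": {
            "robot_a": {"issues": [{"category": "navigation", "detail": "path blocked"}]},
            "robot_b": {"issues": []},
        },
    },
}
open_issues = collect_issues(sample)
```

What happens next (notify someone, cancel a task, do nothing) is a decision for the integrator.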

Posted by @cwrx777:

Hi @mxgrey,

  • Any best practices on the data flow from users’ action (after seeing the issues) to fleet adapter? e.g. user action → SI’s custom message → fleet adapter → cancel/replan/interrupt, etc.?
  • Are there any examples/documentation on the fleet adapter APIs for the above?
  • What is the right platform to request content for the multirobotbook?

Posted by @mxgrey:

> Any best practices on the data flow from users’ action

The best practice will be highly situational, which is why it’s important to have a system integrator that can collect the requirements and make a judgment about what the flow should be like based on what technology is available, what kind of operations staff is available, and what the operational needs are. One goal of RMF is to not lock anyone into any particular flow because we recognize that there won’t be one right answer for all cases.

I would be happy to write out comprehensive documentation that goes over many different likely combinations of variables that system integrators may have to deal with and provides guidance on possible solutions, but we don’t have any funding source to support that currently. We take every opportunity we can to openly and freely publish the work that we do ourselves, but to comprehensively think through many hypothetical combinations of variables and write up guidance on them would be a huge undertaking, and wouldn’t fit into our margins without a sponsor for it.

> Are there any examples/documentation on the fleet adapter APIs for the above?

The rmf_demos_fleet_adapter is the best reference point that we currently have. It certainly doesn’t cover all possible use cases, but it should be a reasonable jumping-off point.

> What is the right platform to request content for the multirobotbook?

I’d suggest posting an issue in its repo.

Posted by @cwrx777:

> As of right now the simplest way for a fleet adapter to track the task states is to give a callback to set_update_listener. Your callback will receive fleet_state_update, task_state_update, fleet_log_update, and task_log_update messages as defined in the rmf_api_msgs schemas.

If server_uri is not specified when launching the fleet adapter, is it intentional that the update_listener callback does not get called?
And are task_state_update and task_log_update messages currently delivered to the update_listener?


Edited by @cwrx777 at 2022-11-08T08:22:45Z