[Nav2] Request for Comment -- Route Server

This work does not aim to support multiple robot systems for central or decentralized fleet management. This assumes we’re only caring about the behavior of a single robot. The technology in this field for multirobot planning / fleet management / traffic conflict resolution isn’t mature or generalizable enough for us to include into Nav2 at this time.

I’d like to selfishly push back on this limitation, since my whole motive here is to do exactly this: make the ROS 2 nav stack directly compatible with multi-agent planning frameworks. I believe the route server being discussed here can be an excellent foundation for that if we squeeze one simple interface requirement into it, which I’m about to describe.

I had mentioned supporting a variety of events (e.g. passing through automated doors and using elevators) as part of the navigation problem. In RMF, one such navigation event is “wait for traffic”. A multi-agent traffic planner can determine where a robot should wait to avoid traffic conflicts and then monitor the situation on the ground to signal when the robot is clear to proceed. This is conceptually not too different from telling a robot to wait in front of a door and then signalling to the robot when the door is open so the robot can proceed. If we can encode this idea of waiting on events into the nav stack in a generic, extensible way, then the nav stack could easily support optional multi-agent coordination in a way that is not intrusive at all. I’ll start with this very rough diagram that represents a specific chunk of the nav stack within my proposal [source]:
[Diagram: RouteServer]

I would propose this for the (minimal) output of the route server:

# nav2_msgs/Route.msg
nav_msgs/Path path
uint32[] checkpoints

The path field is obvious: a sequence of poses that the local planner should treat as goals. The checkpoints field would be an array of indices of path.poses where the robot should pause to wait for a signal that it is allowed to proceed. If checkpoints is empty then the robot can immediately traverse the whole path without stopping or waiting. The signal for permission to proceed at checkpoints could look like this:

# nav2_msgs/Clearance.msg
std_msgs/Header header
uint32 for_path
bool[] checkpoints

For example, if Route.checkpoints contains [1, 4, 7] then the robot may need to pause when it arrives at path.poses[1], path.poses[4], and path.poses[7], depending on the values in the latest Clearance.checkpoints:

  • [false, false, false]: Pause at 1
  • [true, false, false]: Proceed past 1 but pause at 4
  • [true, false, true]: Proceed past 1 but pause at 4
  • [true, true, true]: Proceed through all checkpoints
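The cases above can be reproduced with a small helper. This is only a minimal sketch in plain Python, not Nav2 code: the function name `next_pause_index` is hypothetical, and it assumes that a missing or wrongly sized clearance array is treated as fully false.

```python
from typing import Optional, Sequence


def next_pause_index(
    route_checkpoints: Sequence[int],
    clearance: Optional[Sequence[bool]],
) -> Optional[int]:
    """Return the index into path.poses where the robot must pause next.

    Returns None when every checkpoint is cleared (or there are none).
    """
    # No clearance yet, or a size mismatch: treat it as a fully false array.
    if clearance is None or len(clearance) != len(route_checkpoints):
        clearance = [False] * len(route_checkpoints)

    # The robot pauses at the first checkpoint that is not cleared,
    # even if a later checkpoint has already been cleared.
    for pose_index, cleared in zip(route_checkpoints, clearance):
        if not cleared:
            return pose_index
    return None


checkpoints = [1, 4, 7]
print(next_pause_index(checkpoints, [False, False, False]))  # 1
print(next_pause_index(checkpoints, [True, False, False]))   # 4
print(next_pause_index(checkpoints, [True, False, True]))    # 4
print(next_pause_index(checkpoints, [True, True, True]))     # None
```

The third case is the interesting one: clearance at 7 does not help while 4 is still uncleared.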

If a checkpoint’s clearance is true before the robot arrives then the robot does not need to pause or even come to a stop at that point. If no Clearance message has arrived then the controller must assume a fully false array. We would also have the following constraints on the content of the Clearance message:

  • Clearance.for_path must match Route.path.header.seq
  • Clearance.header.seq must increment for each subsequent update (but can reset to 0 for each new value of Clearance.for_path)
  • The size of Clearance.checkpoints must match the size of its corresponding Route.checkpoints or else it is treated as a fully false array
  • The elements inside Clearance.checkpoints must not revert from true values to false values in subsequent updates. That means the node determining clearance must not issue a true value for a checkpoint until the robot is permanently guaranteed to have clearance at that checkpoint.
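To make these constraints concrete, here is a rough sketch of how a consumer might enforce them. Everything in it is a plain-Python stand-in rather than a real Nav2 type: the `Clearance` dataclass mirrors the proposed message, `seq` stands in for Clearance.header.seq, and `ClearanceTracker` is a hypothetical name. Two choices are my own assumptions: "increment" is read as "strictly greater than the last accepted value", and an update that would revert a true value to false is rejected as a whole.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clearance:
    # Plain-Python stand-in for the proposed nav2_msgs/Clearance message.
    seq: int  # stands in for Clearance.header.seq
    for_path: int
    checkpoints: List[bool]


class ClearanceTracker:
    """Keeps the latest valid clearance state for one route."""

    def __init__(self, path_seq: int, num_checkpoints: int) -> None:
        self.path_seq = path_seq
        self.last_seq = -1
        # Until a valid message arrives, assume a fully false array.
        self.cleared = [False] * num_checkpoints

    def update(self, msg: Clearance) -> bool:
        """Apply msg if it satisfies the constraints; return True if applied."""
        if msg.for_path != self.path_seq:
            return False  # meant for a different route
        if msg.seq <= self.last_seq:
            return False  # stale or duplicate update
        incoming = list(msg.checkpoints)
        if len(incoming) != len(self.cleared):
            # Size mismatch: treated as a fully false array.
            incoming = [False] * len(self.cleared)
        # Assumption: reject any update that would revert True to False.
        if any(old and not new for old, new in zip(self.cleared, incoming)):
            return False
        self.last_seq = msg.seq
        self.cleared = incoming
        return True


tracker = ClearanceTracker(path_seq=12, num_checkpoints=3)
tracker.update(Clearance(seq=0, for_path=12, checkpoints=[True, False, False]))
# This update tries to revoke checkpoint 0, so it is rejected.
tracker.update(Clearance(seq=1, for_path=12, checkpoints=[False, False, False]))
print(tracker.cleared)  # [True, False, False]
```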

At the same time, the route server can publish a separate message describing the nature/purpose of those checkpoints so that some separate event handler node can watch the progress of the robot and handle relevant events. This separate message describing the checkpoints could be standardized, but it could also be a custom message determined by the user’s choice of a route server plugin. Example for a very generic message that could potentially be standardized:

# nav2_msgs/GenericCheckpoint.msg

# Key for how to interpret the description of this checkpoint,
# e.g. "door", "elevator", "traffic"
string category

# Description of the checkpoint which depends on the category, e.g.:
# * door: the name of the door
# * elevator: a json message describing the name of the elevator and floor of entry
# * traffic: a json message describing what other robots need to be waited on at this checkpoint
string description

# nav2_msgs/GenericCheckpoints.msg
uint32 for_path
GenericCheckpoint[] descriptions

The event handler node would listen for checkpoint description messages for the current route and track the progress of the robot along that route to determine:

  • What commands to send to doors, elevators, etc. (and when to send them)
  • When to update the checkpoint clearance for the controller
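As a rough idea of what the event handler's dispatch could look like, here is a sketch in plain Python. It is illustrative only: `GenericCheckpoint` is a dataclass stand-in for the proposed message, the handler functions and `plan_actions` are hypothetical names, and the JSON keys (`name`, `floor`, `wait_for`) are invented for this example since the proposal leaves the JSON layout open.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GenericCheckpoint:
    # Stand-in for the proposed nav2_msgs/GenericCheckpoint message.
    category: str
    description: str


def handle_door(description: str) -> str:
    # For doors the description is just the door name.
    return f"open door '{description}'"


def handle_elevator(description: str) -> str:
    # Hypothetical JSON keys: "name" and "floor".
    info = json.loads(description)
    return f"call elevator '{info['name']}' to floor {info['floor']}"


def handle_traffic(description: str) -> str:
    # Hypothetical JSON key: "wait_for", a list of robot names.
    info = json.loads(description)
    return "wait for robots: " + ", ".join(info["wait_for"])


HANDLERS: Dict[str, Callable[[str], str]] = {
    "door": handle_door,
    "elevator": handle_elevator,
    "traffic": handle_traffic,
}


def plan_actions(checkpoints: List[GenericCheckpoint]) -> List[str]:
    """Map each checkpoint to the action its category calls for."""
    actions = []
    for cp in checkpoints:
        handler = HANDLERS.get(cp.category)
        if handler is None:
            # Unknown categories are reported instead of silently cleared.
            actions.append(f"unhandled category '{cp.category}'")
        else:
            actions.append(handler(cp.description))
    return actions


route_checkpoints = [
    GenericCheckpoint("door", "lobby_door"),
    GenericCheckpoint("elevator", json.dumps({"name": "lift_1", "floor": "L2"})),
    GenericCheckpoint("traffic", json.dumps({"wait_for": ["robot_a", "robot_b"]})),
]
for action in plan_actions(route_checkpoints):
    print(action)
```

A real event handler would send these commands at the right moment along the route and publish a Clearance update once each one succeeds; the sketch covers only the per-category dispatch.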

Why are checkpoints handled by the controller server instead of being handled by the waypoint follower?

There’s some conceptual overlap between what I’m proposing for checkpoints and what already exists for following waypoints, but I think these should be handled separately for the following reasons:

  • The input to the Waypoint Follower is decided at the application layer whereas the checkpoints I’m proposing are inferred by the Route Server when finding a solution to incoming navigation requests
  • Checkpoints might require the robot to come to a stop or might not depending on exactly when clearance is given. For the smoothest possible behavior, the controller server itself should be aware of checkpoints and clearances so it can make quick decisions about what velocities to command.
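To illustrate the second point, here is one way a controller could use checkpoint awareness to avoid unnecessary stops. This is my own simplified illustration, not part of the proposal or of any existing Nav2 controller: `speed_limit` is a hypothetical helper that assumes constant deceleration and caps the commanded speed so the robot can still stop at the next uncleared checkpoint.

```python
import math
from typing import Optional


def speed_limit(
    distance_to_pause: Optional[float],
    max_speed: float,
    max_decel: float,
) -> float:
    """Cap speed so the robot can still stop at the next uncleared checkpoint.

    distance_to_pause is the remaining path length (in meters) to that
    checkpoint, or None when every checkpoint ahead is already cleared.
    """
    if distance_to_pause is None:
        return max_speed  # nothing ahead requires a stop
    # From v^2 = 2 * a * d: the fastest speed from which a stop within d is possible.
    stoppable = math.sqrt(2.0 * max_decel * max(distance_to_pause, 0.0))
    return min(max_speed, stoppable)


print(speed_limit(None, max_speed=1.0, max_decel=0.5))  # 1.0
print(speed_limit(0.25, max_speed=1.0, max_decel=0.5))  # 0.5
print(speed_limit(0.0, max_speed=1.0, max_decel=0.5))   # 0.0
```

If clearance arrives while the robot is still slowing down, the distance argument becomes None (or jumps to the next uncleared checkpoint) and the limit lifts immediately, so the robot never has to come to a full stop.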

Why aren’t these checkpoint events determined by a behavior tree?

My understanding (which I invite others to correct) is that the behavior trees in the Nav2 stack are written by humans based on the desired behaviors for their application. I believe that concern is orthogonal to determining when checkpoints are needed, since checkpoints are inferred while finding a solution to the navigation problem.

Perhaps the proposed Event Handler node could accept behavior trees that describe how to handle the different checkpoint categories, but ultimately I believe that the problem of figuring out when/where/what checkpoints need to exist has to be solved by the route server based on information provided by the navigation graph.

Is there a precedent for this proposal?

What I’m proposing aligns very nicely with the VDA5050 industry standard. While it’s true that a VDA5050 bridge was recently released for ROS 2, I believe that bridge could be significantly simplified and improved by incorporating this proposal into the Nav2 stack.

This proposal is also based heavily on my experience implementing traffic and event management in Open-RMF. The current implementation of Open-RMF suffers from a lot of unnecessary traffic stoppages that could be eliminated if the proposed checkpoint system were incorporated directly into the controller server.