At one point I took a serious stab at making the BT navigator more extensible out-of-the-box when I was adding the second navigator type (navigate through poses) to the system. Ultimately though, I found that it wasn’t trivially possible. Hopefully the discussion below will provide some useful context.
Suppose you have a use for a behavior tree that you’d like to call as a server from another application, but it doesn’t fit the API of the existing navigators (Nav2Pose / NavThroughPoses). The only “real” difference between those navigators is the Action Server interface used and how the contents of the request are translated onto the blackboard for your Behavior Tree to consume as inputs. The behavior tree system doesn’t really care - you could load any arbitrary tree to do any arbitrary thing into either Navigator type, but you’d either be lacking necessary inputs into the BT or lacking the necessary fields in the Action request message to populate them.
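To make that concrete, here is a minimal sketch of the translation step. Everything in it is a stand-in I made up for illustration: the goal structs replace the real ROS 2 action messages, the Blackboard struct replaces the BT library's blackboard, and the keys "goal" and "goals" are assumed names.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in types for this sketch only; the real goals are ROS 2 action messages.
struct Pose { double x; double y; };
struct ToPoseGoal { Pose pose; };
struct ThroughPosesGoal { std::vector<Pose> poses; };

// Toy blackboard with one map per value type (hypothetical, not the BT library's).
struct Blackboard
{
  std::map<std::string, Pose> poses;
  std::map<std::string, std::vector<Pose>> pose_lists;
};

// The two navigator types differ mainly in this step: which request fields
// exist and which blackboard entries they end up in.
void populateBlackboard(const ToPoseGoal & goal, Blackboard & bb)
{
  bb.poses["goal"] = goal.pose;
}

void populateBlackboard(const ThroughPosesGoal & goal, Blackboard & bb)
{
  bb.pose_lists["goals"] = goal.poses;
}

// A tree written for single-pose navigation expects "goal" to be present.
// Loaded under the other navigator, that input is simply missing.
bool singlePoseTreeHasInputs(const Blackboard & bb)
{
  return bb.poses.count("goal") > 0;
}
```

The same tree can be loaded in both cases; what changes is whether the inputs it expects were ever written.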
That’s why we have a nav2_behavior_tree::BtActionServer object, which nicely encapsulates the action server / behavior tree system and can be instantiated for multiple Action message types as a template. Within that, there are wrappers around the Action Server and the BT to make interacting with each easier as well. Indeed, each “navigator” type is really just an instance of that with the particular ActionT for the request, plus some event callbacks to interact with the ActionT’s fields (request, response, feedback). The BT navigator server is then just a dummy host which holds a couple of these navigator types, with some trivial muxing mechanics to make sure multiple aren’t running at once and interacting with the same task servers (e.g. Controller, Smoother, Planner).
So, to add a new navigator type, you’d basically need to create a new MyNavigator type using the BtActionServer (or I suppose you don’t need to, but for simplicity’s sake, let’s assume you want to) with your particular custom ActionT, along with implementations to process feedback and to populate the blackboard from the request fields.
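The shape of that is roughly the following. This is a self-contained sketch, not the real Nav2 API: ActionServerSketch only imitates the role BtActionServer plays (a class templated on ActionT that delegates the action-specific parts to callbacks), DockAction is a hypothetical action, and all callback and method names are my own.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy blackboard for the sketch: string keys to string values.
using Blackboard = std::map<std::string, std::string>;

// Hypothetical custom action, standing in for a ROS 2 action definition.
struct DockAction
{
  struct Goal { std::string dock_pose; };
  struct Feedback { double distance_remaining; };
  struct Result { bool success; };
};

// Stand-in for the templated server: generic over ActionT, with everything
// action-specific supplied as callbacks.
template<typename ActionT>
class ActionServerSketch
{
public:
  using GoalCallback =
    std::function<bool(const typename ActionT::Goal &, Blackboard &)>;
  using LoopCallback =
    std::function<typename ActionT::Feedback(const Blackboard &)>;

  ActionServerSketch(GoalCallback on_goal, LoopCallback on_loop)
  : on_goal_(std::move(on_goal)), on_loop_(std::move(on_loop)) {}

  // Accept a goal, run one "loop" iteration, and report a result.
  // A real server would tick a behavior tree here until it finishes.
  typename ActionT::Result execute(const typename ActionT::Goal & goal)
  {
    if (!on_goal_(goal, blackboard_)) {
      return typename ActionT::Result{false};
    }
    last_feedback_ = on_loop_(blackboard_);
    return typename ActionT::Result{true};
  }

  const Blackboard & blackboard() const {return blackboard_;}
  const typename ActionT::Feedback & lastFeedback() const {return last_feedback_;}

private:
  GoalCallback on_goal_;
  LoopCallback on_loop_;
  Blackboard blackboard_;
  typename ActionT::Feedback last_feedback_{};
};

// The custom "navigator" is just an instance of the template plus callbacks.
inline ActionServerSketch<DockAction> makeDockNavigator()
{
  return ActionServerSketch<DockAction>(
    [](const DockAction::Goal & goal, Blackboard & bb) -> bool {
      if (goal.dock_pose.empty()) {return false;}  // reject malformed requests
      bb["dock_pose"] = goal.dock_pose;            // request field -> blackboard
      return true;
    },
    [](const Blackboard &) -> DockAction::Feedback {
      return DockAction::Feedback{0.0};            // placeholder feedback
    });
}
```

The point of the design is that nothing in the template knows about docking; only the two callbacks do.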
There’s more in the details than you might think at first: you’d also need to establish conventions in your behavior tree configuration files as to which ports and which blackboard fields the Action request fields are populated into, so that your BT nodes can interact with them. If you have a dock_pose field, you’d need a standardized blackboard entry for it so that your BT Node plugins can grab it (or a set of parameters for these blackboard IDs). That convention also needs to be reflected in your BT Nodes - not just the BT XML files or your new navigator. There’s a level of coordination and convention there which I handle for you as a user of the Nav2Pose / NavThroughPoses (though still configurable if you look at the Nav2 docs). This kind of stuff is where it gets tricky to have people create their own navigators, and things get pretty messy fast if you go into it haphazardly.
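A small sketch of what that coordination amounts to, assuming a parameterized blackboard ID. The struct, the function names, and the key strings are all hypothetical; the only idea taken from above is that the navigator and the BT node must agree on the same key.

```cpp
#include <map>
#include <string>

// Toy blackboard for the sketch: string keys to string values.
using Blackboard = std::map<std::string, std::string>;

// The convention lives in one place; both sides receive it as configuration.
struct DockConventions
{
  std::string dock_pose_blackboard_id;
};

// Navigator side: writes the request field under the agreed key.
void onGoalReceived(
  const std::string & dock_pose, const DockConventions & conventions, Blackboard & bb)
{
  bb[conventions.dock_pose_blackboard_id] = dock_pose;
}

// BT node side: reads the same key. Returns false when the two sides
// disagree on the key, which is exactly the failure mode to avoid.
bool readDockPose(
  const Blackboard & bb, const DockConventions & conventions, std::string & out)
{
  const auto it = bb.find(conventions.dock_pose_blackboard_id);
  if (it == bb.end()) {return false;}
  out = it->second;
  return true;
}
```

If the navigator is configured with one ID and a node plugin with another, nothing fails at build time; the node just finds no input at runtime.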
However, tools like the BtActionServer are base tools that people can use to create their own BT Navigator servers and own the full pipeline for customization, rather than adding them to the default BT Navigator. I suppose my thinking on this was colored by early conversations and early developments in Nav2, where we literally did have multiple navigator types. Some veterans will remember the nav2_simple_navigator, or discussions about potentially adding an HSM-based navigator - which never came to fruition due to a lack of community developer interest in contributing it (but it’s still on the table). From that background of having > 1 navigator, I didn’t think it was prudent to make the BT navigator too messy if the intent was to have multiple parallel replacements available.
After all, the BT Navigator is unique from the other servers in Nav2 in a number of ways, but a preeminent one is that it doesn’t need to be a single server for any serious optimization reasons. We need multiple algorithms in the same server for Control/Planning/Smoothing/etc because they need to share common resources like TF buffers, low-latency access to a costmap, and similar. The Navigators have no such need, so you can have N navigator servers and that’s A-OK. The only thing you’d be missing is muxing, which to be fair is a tangible benefit of having the navigators under one roof, so to speak.
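For a sense of how little the muxing involves, here is a minimal sketch of the idea: at most one navigator may hold the mux at a time. The class and method names are my own for illustration and are not claimed to match Nav2's implementation.

```cpp
#include <string>

// Sketch of a navigator mux: at most one navigator may be active at a time.
class NavigatorMuxSketch
{
public:
  bool isNavigating() const {return !current_.empty();}

  // Returns true if the named navigator acquired the mux.
  bool startNavigating(const std::string & name)
  {
    if (name.empty() || isNavigating()) {return false;}
    current_ = name;
    return true;
  }

  // Only the navigator currently holding the mux can release it.
  bool stopNavigating(const std::string & name)
  {
    if (name.empty() || current_ != name) {return false;}
    current_.clear();
    return true;
  }

private:
  std::string current_;  // empty means no navigator is active
};
```

With N separate navigator servers there is no shared object like this, so something else in the system would have to guarantee that two of them never command the task servers at once.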
So with that context, do you still want to have them under 1 roof? If so, there’s little stopping us from doing that. The structure for plugin-izing navigators is largely there already. It’s a 1-2 day change.
I’ve always imagined things like that being left to the application system. The navigators are there to get you from A->B (or A->B->…->Z), but the selection of ‘A’ and ‘B’, or what happens after that, is up to your autonomy system. I would tier these things as follows: your Task Allocator (fleet manager, robot-thing-assigner, whatever), which tells you to pick up a thing, drop it off, and return to the staging zone; your Application System, which breaks down the task into constituent parts; and your Navigation System, which executes the current navigation component. The Application System is commonly modeled as an FSM, HSM, or BT, just like the navigation system is. Typically, I reject the idea of making the navigation BT “too smart”, so as not to complicate navigation system logic with application logic. I understand there may be some situations where you have to do this, but typically when I hear of people doing it, it’s not (to me) a good way to maintain quality separation of concerns. Modifying your navigation system is now tied into your task-description Behavior Trees, which makes for quite a headache, especially when handling all the failure cases: if you fail to make it to the first goal, what now? If it’s all 1 massive Behavior Tree, that’s going to spider web out massively.
So I would expect most users to need Nav2Pose / NavThroughPoses and then have an application BT / FSM / etc which breaks down the task into constituent components, where Navigation is just 1 of potentially many primitives (e.g. open door, drop off box, wait for user input). Then you have your application system which can semantically handle failure cases at the task level. It’s true that this doesn’t handle things like complete coverage tasks well, but that’s definitely a case where a new ActionT is required at the navigator level and probably also needs a new planner server interface. But that’s another conversation altogether.
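As a rough illustration of that layering, here is a minimal sketch of an application-level executor in which navigation is one primitive among several and failures are decided at the task level. The types, step names, and the retry-then-abort policy are all assumptions made for the example; a real application system would more likely be an FSM, HSM, or BT calling the navigation action.

```cpp
#include <functional>
#include <string>
#include <vector>

// One primitive in a task; navigation is just one kind among several.
struct Step
{
  std::string name;
  std::function<bool()> run;  // returns true on success
  int max_attempts;
};

struct TaskReport
{
  bool success;
  std::string failed_step;    // empty when the task succeeded
};

// Application-level executor: runs steps in order and decides, at the task
// level, what a failure means (here: retry up to a limit, then abort).
TaskReport runTask(const std::vector<Step> & steps)
{
  for (const auto & step : steps) {
    bool ok = false;
    for (int attempt = 0; attempt < step.max_attempts && !ok; ++attempt) {
      ok = step.run();
    }
    if (!ok) {
      return TaskReport{false, step.name};
    }
  }
  return TaskReport{true, ""};
}
```

The navigation step would wrap a call to Nav2Pose or NavThroughPoses, and the navigation BT itself never needs to know what the overall task is.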