Hi,
At one point I took a serious stab at making the BT navigator more extensible out-of-the-box when I was adding the second navigator type (navigate through poses) to the system. Ultimately though, I found that it wasn't trivially possible. Hopefully the discussion below will provide some useful context.
If you have a behavior tree use case that you'd like to call as a server from another application, and it does not fit the API of the existing navigators (Nav2Pose / NavThruPoses), note that the only "real" difference between those navigators is the Action Server interface used and how the contents of the request are translated into the blackboard for your Behavior Tree to consume as inputs. The behavior tree system doesn't really care - you could load any arbitrary tree to do any arbitrary thing into either navigator type, but you'd either be lacking necessary inputs to the BT or lacking the fields in the Action request message needed to populate them.
That's why we have a nav2_behavior_tree::BtActionServer object, which nicely encapsulates the action server / behavior tree system and is templated so it can be created for multiple types of Action messages. Within that, there are wrappers around the Action Server and the BT to make interacting with each easier as well. Indeed, each "navigator" type is really just an instance of that with the particular ActionT for the request, plus some event callbacks to interact with the ActionT's fields (request, response, feedback). The BT navigator server is then just a dummy host which has a couple of these navigator types with some trivial muxing mechanics to make sure multiple aren't running at once and interacting with the same task servers (e.g. Controller, Smoother, Planner).
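To make the "same wrapper, different ActionT and callbacks" idea concrete, here is a minimal, library-free sketch. It is not the Nav2 API: `BtActionServerSketch`, `PoseGoal`, `PosesGoal`, the blackboard keys, and the "tree" (a plain callable) are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, List, Tuple, TypeVar

GoalT = TypeVar("GoalT")
Pose = Tuple[float, float, float]  # (x, y, yaw), an assumption for this sketch

# A "tree" here is just a callable that reads the blackboard and reports success.
Tree = Callable[[Dict[str, Any]], bool]


class BtActionServerSketch(Generic[GoalT]):
    """Toy wrapper: one goal type, one tree, one goal-specific callback."""

    def __init__(self, tree: Tree,
                 on_goal_received: Callable[[GoalT, Dict[str, Any]], None]):
        self.tree = tree
        self.on_goal_received = on_goal_received
        self.blackboard: Dict[str, Any] = {}

    def handle_goal(self, goal: GoalT) -> bool:
        # The callback is the only step that knows about the goal type.
        self.blackboard.clear()
        self.on_goal_received(goal, self.blackboard)
        return self.tree(self.blackboard)


@dataclass
class PoseGoal:
    pose: Pose


@dataclass
class PosesGoal:
    poses: List[Pose]


def pose_to_blackboard(goal: PoseGoal, bb: Dict[str, Any]) -> None:
    bb["goal"] = goal.pose


def poses_to_blackboard(goal: PosesGoal, bb: Dict[str, Any]) -> None:
    bb["goals"] = list(goal.poses)


def single_goal_tree(bb: Dict[str, Any]) -> bool:
    # This tree expects a "goal" entry; without it the tree cannot do its job.
    return "goal" in bb


# Two "navigators": the same wrapper with different goal types and callbacks.
to_pose: BtActionServerSketch[PoseGoal] = BtActionServerSketch(
    single_goal_tree, pose_to_blackboard)
through_poses: BtActionServerSketch[PosesGoal] = BtActionServerSketch(
    single_goal_tree, poses_to_blackboard)
```

Loading `single_goal_tree` into `through_poses` runs, but the tree never finds the `"goal"` entry it expects, which is the "lacking necessary inputs" situation described above.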
So, to add a new navigator type, you'd basically need to create a new MyNavigator type using the BtActionServer (or I suppose you don't need to, but for simplicity's sake, let's assume you want to) with your particular custom ActionT and implementations to process feedback and populate the request fields into the blackboard.
There is perhaps more in the details than you might think at first: you'd also need to establish conventions in your behavior tree configuration files as to which ports and which blackboard fields the Action request fields are populated into, so that your BT nodes can interact with them. If you have a dock_pose field, you'd need a standardized blackboard entry for it so that your BT Node plugins can grab it (or a set of parameters for these blackboard IDs). That convention also needs to be reflected in your BT Nodes - not just the BT XML files or your new navigator. There's a level of coordination and convention there which I handle for you as a user of Nav2Pose / NavThroughPoses (though it is still configurable if you look at the Nav2 docs). This kind of stuff is where it gets tricky to have people create their own navigators, and things start to get pretty messy fast if you go into it haphazardly.
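The coordination problem can be sketched in a few lines. This is a toy illustration, not Nav2 or BehaviorTree.CPP code: `DockGoal`, `DockNavigatorSketch`, `ApproachDockNodeSketch`, and the `dock_pose_blackboard_id` parameter are hypothetical names, and the blackboard is a plain dictionary.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class DockGoal:
    # Hypothetical request of a custom action; dock_pose is (x, y, yaw).
    dock_pose: Tuple[float, float, float]


class DockNavigatorSketch:
    """Writes request fields to the blackboard under a configurable key."""

    def __init__(self, dock_pose_blackboard_id: str = "dock_pose"):
        self.dock_pose_blackboard_id = dock_pose_blackboard_id

    def goal_to_blackboard(self, goal: DockGoal,
                           blackboard: Dict[str, Any]) -> None:
        blackboard[self.dock_pose_blackboard_id] = goal.dock_pose


class ApproachDockNodeSketch:
    """Stand-in for a BT node plugin that must read the same key."""

    def __init__(self, dock_pose_blackboard_id: str = "dock_pose"):
        self.dock_pose_blackboard_id = dock_pose_blackboard_id

    def tick(self, blackboard: Dict[str, Any]) -> str:
        # Without the agreed-upon entry, the node has nothing to act on.
        if self.dock_pose_blackboard_id not in blackboard:
            return "FAILURE"
        return "SUCCESS"
```

If the navigator is configured to write to `"dock_goal"` while the node still reads `"dock_pose"`, the node fails even though the request was valid. That is why the key has to be agreed on in the navigator, the BT XML, and the node plugins together.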
However, tools like the BtActionServer are building blocks that people can use to create their own BT Navigator servers and own the full pipeline for customization, rather than adding them to the default BT Navigator. I suppose my thinking on this was colored by early conversations and early developments in Nav2, where we literally did have multiple navigator types. Some veterans will remember the nav2_simple_navigator or discussions about potentially adding in a HSM-based navigator - which never came to fruition due to a lack of community developer interest in contributing (but it's still on the table). From that background of having more than one navigator, I didn't think it was prudent to make the BT navigator too messy if the intent was to have multiple parallel replacements available.
After all, the BT Navigator is unique from the other servers in Nav2 in a number of ways. A preeminent one is the fact that the BT Navigator doesn't need to be a single server for any serious optimization reasons. We need multiple algorithms in the same server for Control/Planning/Smoothing/etc because they need to share common resources like TF buffers, low-latency access to a costmap, and similar. The Navigators have no such need, so you can have N navigator servers and that's A-OK. The only thing you'd be missing is muxing, which to be fair is a tangible benefit of having the navigators under one roof, so to speak.
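For a sense of what that muxing amounts to, here is a minimal sketch of an "only one navigator at a time" guard. `NavigatorMuxSketch` and its method names are hypothetical and written for this post; they are not a copy of the Nav2 implementation.

```python
from typing import Optional


class NavigatorMuxSketch:
    """Allows at most one navigator to be active at any moment."""

    def __init__(self) -> None:
        self._active: Optional[str] = None

    @property
    def active(self) -> Optional[str]:
        return self._active

    def try_start(self, navigator_name: str) -> bool:
        # Refuse a new request while any navigator is still running.
        if self._active is not None:
            return False
        self._active = navigator_name
        return True

    def stop(self, navigator_name: str) -> None:
        # Only the navigator that holds the slot may release it.
        if self._active == navigator_name:
            self._active = None
```

With N separate navigator servers, this guard would have to live somewhere else, for example in the application that sends the goals.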
So with that context, do you still want to have them under 1 roof? If so, there's little stopping us from doing that. The structure for plugin-izing navigators is largely there already. It's a 1-2 day change.
I've always imagined things like that being left to the application system. The navigators are there to get you from A->B (or A->B->...->Z), but the selection of "A" and "B" or what happens after that is up to your autonomy system. I would tier these things as follows: your Task Allocator (fleet manager, robot-thing-assigner, whatever), which tells you to pick up a thing, drop it off, and return to the staging zone; your Application System, which breaks down the task into constituent parts; and your Navigation System, which executes the current navigation component. The Application System is commonly modeled as an FSM, HSM, or BT, just like the navigation system is. Typically, I reject the idea of making the navigation BT "too smart", so as not to complicate navigation system logic with application logic. I understand there may be some situations where you have to do this, but typically when I hear of people doing it, it's not (to me) a good way to maintain quality separation of concerns. Modifying your navigation system is now tied into your task-description Behavior Trees, which makes for quite a headache, especially when handling all the failure cases: if you fail to make it to the first goal, what now? If it's all 1 massive Behavior Tree, that's going to spider web out massively.
So I would expect most users to need Nav2Pose / NavThroughPoses and then have an application BT / FSM / etc which breaks down the task into constituent components, where Navigation is just 1 of potentially many primitives (e.g. open door, drop off box, wait for user input). Then you have your application system, which can semantically handle failure cases at the task level. It's true that this doesn't handle things like complete coverage tasks well, but that's definitely a case where a new ActionT is required at the navigator level and probably also needs a new planner server interface. But that's another conversation altogether.
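As a rough illustration of that layering, the sketch below keeps navigation as one primitive among several and puts the failure policy (here, a simple retry count) in the application layer. Everything in it is hypothetical: `run_task`, the primitive names, and the retry policy are stand-ins, and a real system would call the navigator's action server rather than a local function.

```python
from typing import Callable, Dict, List, Tuple

# A primitive takes one argument (e.g. a named location) and reports success.
Primitive = Callable[[str], bool]


def run_task(steps: List[Tuple[str, str]],
             primitives: Dict[str, Primitive],
             max_retries: int = 1) -> Tuple[bool, List[str]]:
    """Run steps in order; the application layer decides what failure means."""
    log: List[str] = []
    for name, arg in steps:
        for _attempt in range(max_retries + 1):
            ok = primitives[name](arg)
            log.append(f"{name}({arg}) -> {'ok' if ok else 'failed'}")
            if ok:
                break
        else:
            # Retries exhausted: abort the task at the task level.
            return False, log
    return True, log


def make_flaky_navigate(failures: int) -> Primitive:
    """Stand-in for a navigation request that fails a fixed number of times."""
    remaining = {"count": failures}

    def navigate(_target: str) -> bool:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            return False
        return True

    return navigate


task = [("navigate", "shelf"), ("pick_up", "box"), ("navigate", "staging")]
primitives = {"navigate": make_flaky_navigate(1), "pick_up": lambda _item: True}
succeeded, history = run_task(task, primitives, max_retries=1)
```

The navigation primitive stays unaware of the task. Changing what happens after a failed first goal means editing `run_task` (or its FSM/BT equivalent), not the navigation behavior tree.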