Make bt_navigator extendible

The nav2_bt_navigator is missing extensibility. Other parts of the nav2 stack are very modular,
for example the controller and planner servers are plugin based. Meanwhile, the bt_navigator
ships with exactly two hard-coded navigators.

Unfortunately, the set of available navigators does not seem to be extendible
without forking the nav2 stack and rebuilding the package.
Is this a conscious design decision?

I need to attend to multiple different types of tasks with our service robot,
for example go to dock or deliver and return (with an action to complete between the two).
In the package I found no examples regarding the extension of the set of navigators or for
implementing something similar to the above use cases. The readme is really short as well.

I think (correct me if I’m wrong) the best solution would be to use different trees with different navigators for each task. A few reasons for this are:

  • Multiple NavigateToPose actions could be used with the logic implemented elsewhere,
    but the sequence of these atomic tasks makes perfect sense to be in the BT and not elsewhere.
  • The fact that different types of tasks necessitate different navigators seems to be the reason why the two already implemented navigators exist.

The question arises here how one should implement these. In the docs it says it isn’t necessary to modify the nav2_bt_navigator package, but it very clearly is.

Maybe the best design solution would be to create navigators as plugins, that could be loaded the same way as the BT Nodes in the tree. @smac What do you think, do the reasons listed make sense? If so, can you give some guidance on how to implement it?

Hi,

At one point I took a serious stab at making the BT navigator more extensible out-of-the-box when I was adding the second navigator type (navigate through poses) to the system. Ultimately though, I found that it wasn’t trivially possible. Hopefully the discussion below will provide some useful context.

If you have a behavior tree that you’d like to call as a server from another application, and it does not fit the API of the existing navigators (Nav2Pose / NavThruPoses), keep in mind that the only “real” difference between the two is the Action Server interface used and how the contents of the request are translated into the blackboard for your Behavior Tree to consume as inputs. The behavior tree system doesn’t really care - you could load any arbitrary tree to do any arbitrary thing into either Navigator type, but you’d either be lacking necessary inputs into the BT or lacking the fields in the Action request message needed to populate them.

That’s why we have a nav2_behavior_tree::BtActionServer object, which nicely encapsulates the action server / behavior tree system. It is a template, so it can be instantiated for multiple types of Action messages. Within that, there are wrappers around the Action Server and the BT to make interacting with each easier as well. Indeed, each “navigator” type is really just an instance of that with the particular ActionT for the request, plus some event callbacks to interact with the ActionT’s fields (request, response, feedback). The BT navigator server is then just a dummy host which holds a couple of these navigator types, with some trivial muxing mechanics to make sure multiple navigators aren’t running at once and interacting with the same task servers (e.g. Controller, Smoother, Planner).
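
To make that structure concrete, here is a minimal C++ sketch with no ROS dependencies. ToyBtActionServer, NavigatorMuxer, the goal structs and the string-map Blackboard are hypothetical stand-ins invented for this illustration; they are not the nav2 classes and do not mirror their signatures. It only shows the shape of the idea: one template instantiated per goal type, a per-type callback that copies the request into the blackboard, and a shared mux that rejects a goal while another navigator is busy.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the BT blackboard: a plain string-to-string map (assumption).
using Blackboard = std::map<std::string, std::string>;

// Hypothetical goal types, one per action interface.
struct ToPoseGoal { std::string pose; };
struct ThroughPosesGoal { std::vector<std::string> poses; };

// Tracks which navigator (if any) is currently active.
class NavigatorMuxer
{
public:
  bool isNavigating() const { return active_id_ >= 0; }
  void start(int id)
  {
    assert(id >= 0 && !isNavigating());
    active_id_ = id;
  }
  void stop(int id)
  {
    if (active_id_ == id) { active_id_ = -1; }
  }

private:
  int active_id_ = -1;
};

// Toy action-server-plus-tree wrapper, templated on the goal type.
template<typename GoalT>
class ToyBtActionServer
{
public:
  using GoalCallback = std::function<void(const GoalT &, Blackboard &)>;

  ToyBtActionServer(int id, NavigatorMuxer & muxer, GoalCallback on_goal)
  : id_(id), muxer_(muxer), on_goal_(std::move(on_goal)) {}

  // Returns false (goal rejected) if another navigator is already running.
  bool handleGoal(const GoalT & goal)
  {
    if (muxer_.isNavigating()) { return false; }
    muxer_.start(id_);
    on_goal_(goal, blackboard_);  // translate request fields into blackboard entries
    return true;
  }

  void finish() { muxer_.stop(id_); }
  const Blackboard & blackboard() const { return blackboard_; }

private:
  int id_;
  NavigatorMuxer & muxer_;
  GoalCallback on_goal_;
  Blackboard blackboard_;
};
```

In this toy version, a “navigator” is nothing more than a ToyBtActionServer<GoalT> constructed with the callback for its goal type, and two of them sharing one NavigatorMuxer cannot run at the same time.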

So, to add a new navigator type, you’d basically need to create a new MyNavigator type using the BtActionServer (or I suppose you don’t need to, but for simplicity’s sake, let’s assume you want to) with your particular custom ActionT, along with implementations to process feedback and populate the blackboard from the request fields.

It’s perhaps more in the details than you might think at first, but you’d also need to establish conventions in your behavior tree configuration files as to which ports and which blackboard fields the Action request fields are written to, so that your BT nodes can interact with them. If you have a dock_pose field, you’d need a standardized blackboard entry for it so that your BT Node plugins can grab it (or a set of parameters for these blackboard IDs). That convention also needs to be reflected in your BT Nodes - not just the BT XML files or your new navigator. There’s a level of coordination and convention there which I handle for you as a user of the Nav2Pose / NavThroughPoses (though it is still configurable if you look at the Nav2 docs). This kind of stuff is where it gets tricky to have people create their own navigators, and things start to get pretty messy fast if you go into it haphazardly.
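
A small sketch of why that convention matters, again in plain C++ with made-up types. DockGoal, DockNavigator, ApproachDockNode and the key names are all hypothetical, and the blackboard is reduced to a string map. The navigator writes the request field under a configurable key, and the BT node reads from a configurable key; the two only work together if both sides agree on the same ID.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for the BT blackboard (assumption: real entries are typed, not strings).
using Blackboard = std::map<std::string, std::string>;

// Hypothetical custom action goal with a single dock_pose field.
struct DockGoal { std::string dock_pose; };

// Navigator side: copies the goal field into the blackboard under a
// configurable ID, which would normally come from a parameter.
class DockNavigator
{
public:
  explicit DockNavigator(std::string dock_pose_key = "dock_pose")
  : dock_pose_key_(std::move(dock_pose_key)) {}

  void goalReceived(const DockGoal & goal, Blackboard & bb) const
  {
    assert(!dock_pose_key_.empty());
    bb[dock_pose_key_] = goal.dock_pose;
  }

private:
  std::string dock_pose_key_;
};

// BT node side: looks up its input under the key it was configured with.
class ApproachDockNode
{
public:
  explicit ApproachDockNode(std::string input_key = "dock_pose")
  : input_key_(std::move(input_key)) {}

  // Returns true and fills `target` only if the expected entry exists.
  bool readTarget(const Blackboard & bb, std::string & target) const
  {
    const auto it = bb.find(input_key_);
    if (it == bb.end()) { return false; }
    target = it->second;
    return true;
  }

private:
  std::string input_key_;
};
```

If the node were configured with a different key than the navigator (say “docking_pose”), readTarget would simply find nothing, which is the kind of silent mismatch that a documented convention is meant to prevent.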

However, tools like the BtActionServer are base tools that people can use to create their own BT Navigator servers and own the full pipeline for customization, rather than adding them to the default BT Navigator. I suppose my thinking on this was colored by early conversations and early developments in Nav2 where we literally did have multiple navigator types. Some veterans will remember the nav2_simple_navigator or discussions about potentially adding an HSM-based navigator - which never came to fruition due to a lack of community developer interest in contributing it (but it’s still on the table). From that background of having > 1 navigator, I didn’t think it was prudent to make the BT navigator too messy if the intent was to have multiple parallel replacements available. After all, the BT Navigator is unique among the servers in Nav2 in a number of ways. A preeminent one is that the BT Navigator doesn’t need to be a single server for any serious optimization reasons. We need multiple algorithms in the same server for Control/Planning/Smoothing/etc because they need to share common resources like TF buffers, low-latency access to a costmap, and similar. The Navigators have no such need, so you can have N navigator servers and that’s A-OK. The only thing you’d be missing is muxing, which to be fair is a tangible benefit of having the navigators under one roof, so to speak.

So with that context, do you still want to have them under 1 roof? If so, there’s little stopping us from doing that. The structure for plugin-izing navigators is largely there already. It’s a 1-2 day change.

I’ve always imagined things like that being left to the application system. The navigators are there to get you from A->B (or A->B->…->Z), but the selection of ‘A’ and ‘B’ or what happens after that is up to your autonomy system. I would tier these things as follows: your Task Allocator (fleet manager, robot-thing-assigner, whatever), which tells you to pick up a thing, drop it off, and return to the staging zone; your Application System, which breaks down the task into constituent parts; and your Navigation System, which executes the current navigation component. The Application System is commonly modeled as an FSM/HSM/BT, just like the navigation system is. Typically, I reject the idea of making the navigation BT “too smart”, so as not to complicate navigation system logic with application logic. However, I understand there may be some situations where you have to do this - but typically when I hear of people doing this, it’s not (to me) a good way to maintain quality separation of concerns. Now modifying your navigation system is tied into your task-description Behavior Trees, which makes for quite a headache, especially when handling all the failure cases: what happens if you fail to make it to the first goal, what now? If it’s all 1 massive Behavior Tree, that’s going to spider web out massively.

So I would expect most users to need Nav2Pose / NavThroughPoses and then have an application BT / FSM / etc which breaks down the task into constituent components, where Navigation is just 1 of potentially many primitives (e.g. open door, drop off box, wait for user input). Then you have your application system, which can semantically handle failure cases at the task level. It’s true that this doesn’t handle things like complete coverage tasks well, but that’s definitely a case where a new ActionT is required at the navigator level, and it probably also needs a new planner server interface. But that’s another conversation altogether.
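
As a rough illustration of that layering, the sketch below models an application-level task as an ordered list of primitives, with navigation as just one of them. Step, TaskResult and runTask are hypothetical names made up for this example; in a real system each navigation step would wrap a call to the navigation action, and the application layer could just as well be a BT or an FSM rather than a plain loop.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One primitive of an application-level task (navigate, open door, drop box, ...).
struct Step
{
  std::string name;
  std::function<bool()> run;  // returns true on success
};

struct TaskResult
{
  bool succeeded;
  std::string failed_step;  // empty when the whole task succeeded
};

// Runs the steps in order and stops at the first failure, reporting which
// step failed so the application layer can decide how to recover.
TaskResult runTask(const std::vector<Step> & steps)
{
  for (const auto & step : steps) {
    assert(step.run != nullptr);
    if (!step.run()) {
      return TaskResult{false, step.name};
    }
  }
  return TaskResult{true, ""};
}
```

A deliver-and-return task would then be three steps (navigate to drop-off, drop off box, navigate home), and what to do when a step fails is decided outside the navigation tree.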

With all that said, I will say that I’m not super happy with how the BT Navigator is set up right now. It’s on my backlog to refactor the entire thing, but I haven’t come to any final conclusions about what would be best. For right now, it’s very hierarchically organized, but lacking in some clarity. I feel that some things should be ‘binned’ differently.

But sounds like you’re just asking for plugin-izing Navigators. I’d like to hear your thoughts on this context and especially my final remarks in the comment above as to if the BT Navigator is really the right place for your application to place such logic.

I reject the idea of making the navigation BT “too smart” as to not complicate navigation system logic with application logic. However, I understand there may be some situations where you have to do this - but typically when I hear of people doing this, it’s not (to me) a good way to maintain quality separation of concerns.

I can just say that I totally agree with this. Probably we want to isolate the subsystems (navigation in this case) if possible.

A few reflections as someone who shares the OP’s view for the most part.

I have a use case where I need my robot to follow a very specific path. The path is supplied from outside the system. I want to leverage the BT to trigger logging when the path is begun, and also some autonomous driving to get to the start of the path if the robot is not located there. I wrote a custom planner for taking care of the path, and a couple of BT nodes to set up the behaviour. So far so good. But now, to start the navigation system, I have to give it a goal position. That is just not relevant for my use case. I ended up sending a dud goal, which is unused in the behaviour tree.

In my use case, it would make more sense to have a custom navigator - in particular one using a custom action type - to deal with my input. The benefit would be that my custom Planner would not need a bunch of services for dealing with the paths directly, but could instead get them via input. I would be happy if I could supply the action type used for the navigator - no need to make a whole custom engine, pipes and all. Currently, I don’t see why there should be two different navigators without the option to add more.

While I agree in general that the navigation system should navigate, the BT approach makes it seem like it can do a lot more. Perhaps it has nudged me to making a more complex behaviour tree and asking more of the navigator, rather than making a custom application level node which only uses navigation when it needs it. I will reconsider this for the future.

NB: I am on Foxy, and could not get NavThroughPoses to work properly, which led to this solution.

In my opinion, the behavior tree is more than fine with modelling behaviors more complex than
a simple NavToPose, and I’m not sure if I would call that application logic instead of part of the
navigation. On the contrary, the current implementation of the BTNavigator creates a bottleneck.

Really, for most problems all we need is to forward the parameters from the action request to the tree,
and get the results. This could be done through pluginization, and for each custom action type
everyone could simply write a navigator plugin to handle their specific parameters.
This would be fast and easy with the provided Navigator class, probably even more so than
having to write a whole “movement FSM” node.

Well, if there was already work put into that, we could go along with it. It wouldn’t hurt the existing
navigators and the user could decide if they want a separate movement manager node with
the existing navigators, or a new navigator with a more complex tree.

We’re not suggesting a BT isn’t the right mechanism for it, we’re suggesting that this BT isn’t the right place for it. You can have hierarchical BTs (e.g. an application BT which calls the Nav2Pose BT as a node of the application BT - which we supply for this very reason).

There’s more to it than that from a designer’s perspective, as I mention in detail above. There need to be conventions and consistency established to scale cleanly.

I’d like to understand specifically what you’re trying to accomplish for which the hierarchical BT approach mentioned isn’t the appropriate solution. Both Davide and I are in alignment on that.

I can appreciate that there are edge situations where this is appropriate, but I’m not yet convinced (1) that this is a large enough problem to support out of the box, with all of the complexities it is going to introduce in notion, convention, and documentation, (2) that this isn’t going to open up a black hole of users doing “bad” things and then blaming Nav2 for allowing them to do it so easily, given it’s against our express intent, and (3) that, for the few cases where meshing together navigation and application logic is sensible, there aren’t a number of other customizations to the Navigators that would require a professional software team to maintain their own Navigator server instance anyway. Most of the odd cases that exist only occur when building highly specialized products in highly professional teams, for which it’s not against our intent at all to “take what you need, leave what you don’t”.

But again, don’t read that as me not open to making these changes (or designing + reviewing a PR to do so), I’m just trying to clarify my position so you can have the benefit of knowing what I’m thinking. Some specificity in your application would be highly useful for motivating your need. I’m not incredibly happy with the setup of the BT Navigator and entourage, so feedback here is highly useful. Keep it coming.

My 2 cents as a navigation veteran who first used BTs with ROS1 and move_base_flex and was SO happy that it was chosen in nav2.

That’s exactly what we end up doing! And the nav2_behavior_tree::BtActionServer object is sufficiently general and very convenient to re-use.
I actually see NavigateToPose as a beautiful example/skeleton of how to use nav2_behavior_tree::BtActionServer, but then I believe it should be the user’s role to write their own custom nav action, potentially defining their own custom feedback, goal and result (though I admire the recent efforts in standardizing them).
I don’t see what more is needed without losing generality and becoming too specific. And I personally think that NavigateToPose is already a bit too much / too complex.

Totally agree. If it feels like something is missing from nav2, it is probably application specific. BTs can and should be used outside of (and free of) nav2.

I think we’re talking about two different problems here:

  1. the navigators being hard-coded and not extendible
  2. separating application logic from navigation logic.

1. The navigators being hard-coded and not extendible

I originally opened the thread because of the first one.

This would’ve been my main point, and to be honest I don’t really understand your response. Could you show me the part you’re referring to?

2. Separating application logic from navigation logic

This answer is a great example of why the two problems are different. Regardless of the separation of responsibilities, there could be strictly navigation use-cases for which the default (NavigateToPose) parameters are irrelevant and different ones might be needed. This is why pluginizing would be a decent solution.

As for the separation of concerns, I totally agree with you. Application logic should be handled separately, preferably in a Task Allocator or similar node. The real question here is: where is the dividing line between application and navigation? In the nav2 stack there is an IsBatteryLow tree node implemented, and I’m not sure it belongs in navigation. The same can be said about this discussion, whether the robot should initiate navigation autonomously.


I would like to move forward with extending the nav2_bt_navigator package to be able to use custom inputs for my behavior trees. Are you interested in pluginizing and collaborating on a PR or should I just create a new package for my own specific problem?

I think these conversations are related because we need a motivating reason to need to make the navigators more extensible which is plausibly a reasonable architectural choice for a mobile robotics application. There are lots of things we could expose but don’t because there’s not a clear reason why you’d want to do X at position Y, when you could do X at Z which is cleaner and more modular.

It’s application dependent. I could come up with a few general principles about where I feel that separation lies, but there are going to be edge-case applications that they don’t fully cover. In general, navigation covers things related to getting from A to B under some constraints, which might include poses, routes, obstacles, etc.: decision making and control to traverse an environment.

Applications are the selection of the “A” and “B” you’re interested in, the definition of the constraints you care about, and the concatenation of many sets of tasks to accomplish a larger, more useful goal. It’s the stuff that makes the robot useful.

I have no objection to this, I’m just trying to understand your use-case and whether this is really the right place for it. From your original post, it seems your interest is in putting application logic into the navigation system (e.g. multi-leg navigation tasks with actions in between like docking and delivering goods), which sounds like it is the whole application for your product / work. @facontidavide and I propose that you instead use a Navigation BT in the BT Navigator for handling navigation requests, and then a separate BT for application logic that uses provided BT nodes like NavigateToPose to navigate each individual leg, to better separate the concerns of your application.

Can you please respond to that discussion thread about why that would not be better or more appropriate? See this from my position: I want to make sure that anything we put in Nav2 is both useful and going to lead to well architected and functional navigation systems. I want to make sure that what you’re asking for is not going to become a thorn in my side, with users only using it for anti-patterns, creating a poor user experience that I then have to spend resources supporting. I’d like to reiterate that I have no issue with the concept of this work - the details do matter, however, about the motivation and whether it’s justifiable architecturally.

I feel like this is a very fair compromise (e.g. show me the utility and we can get it in there within a few weeks).

Our project consists of deliver and return, and a few tasks of similar complexity. I wanted to avoid having to write a TaskAllocator package (like nav2_bt_navigator) because it felt like overkill for such simple tasks. But you’re right, we could and probably will do that.

However, I still feel weird about this solution. There are a few use cases that are not really a fit for the two hard-coded navigators because the parameters they need don’t match. I mentioned them above:

Could you tell me your thoughts on these?

Hi,

Not every node we provide is meant solely for low-level navigation. The Nav2BehaviorTree package is a set of behavior tree nodes which are useful for navigation. That particular BT node (along with NavigateToPose and others) is more application focused, but it is still in the domain of mobile robot navigation and has general re-use, so it is included. The same goes for other things like OpenDoor or CallElevator or DockRobot, if we can abstract and generalize them for folks to have available as generic building blocks. But I think the IsBatteryLow BT node is a bit of an aside, not really super important to the discussion at hand. That package is just a set of building blocks for mobile robotics - they come in all shapes and sizes.

The use case that @peredwardsson addressed is a decent point, but I also don’t have a particular issue with folks not using every field in a message as long as the message can be used to facilitate the intended action. There are commonly fields of messages (like intensities in LaserScan or even behavior_tree in NavigateToPose) which are unpopulated by a given sensor driver or request where it is not provided or relevant.

But if instead there was information that you wanted to communicate that you could not with the current API (such as providing that pre-computed path to the Navigation request) that’s a very different story and would require an API change that I would support. That might be semantically more clean for your application as well @peredwardsson with some custom behavior trees to process it.

So all in all, I think that would be a use-case I can see as being of clear utility and necessity to justify plugin-izing the BT Navigator.


The only artificial(ish) restriction I would like to place on this is that these navigators must be behavior tree action servers (e.g. the base class Navigator should include that BtActionServer<> and the API is built around that assumption). I don’t want this to become an abstract Navigator server with other types of navigators added behind too abstract an interface. Additional types of navigators (hard-coded, FSM, etc) are definitely on the table for Nav2, but should be implemented in their own servers / packages to keep the architecture of the entire stack as “small, sharp tools”.

I’d be open to discussing the details of the design in a ticket if you wanted to file one @redvinaa

I would like to contribute but in reality it depends on how much work it would need.
Could you share with me the code that has already been done?

I also have some questions that might be relevant here.
First, looking at the implementation of the NavigateToPose BT node, it doesn’t look
finished. I think that’s just the boilerplate, but the actual action call is not
implemented. Am I missing something here?

Second, I’m curious how the feedback from the NavigateToPose BT node is propagated
through the Navigator, or more generally, how would it handle nested behavior tree
nodes with action feedback?
There seem to be a few options here, for example using the blackboard or
creating a new topic between the navigator and the BT node. But I suppose
the NavigateToPose BT node implementation would provide an answer to this.

What I mean by that is that we have an abstract navigator base class that the others derive from. That can be used as the foundation of the plugin class, and the derived classes would need registration as plugin instances and some updates to how they are handled in parameterization / initialization in the server. It’s largely done already in the architecture of the code. The only questions relate to naming and whether it is sensible to have another, lower-level base class to implement the plugin in case the current navigator base class is too specific.
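
For readers who want to picture that change, here is a minimal sketch of the pattern in plain C++. It deliberately does not use pluginlib: NavigatorRegistry is a hand-rolled stand-in for the class loader, and NavigatorBase, ToPoseNavigator, DockNavigator, loadNavigators and the action names are hypothetical illustrations, not the existing Nav2 interface. In a real ROS 2 package the registration and lookup would presumably go through pluginlib (PLUGINLIB_EXPORT_CLASS and pluginlib::ClassLoader) instead.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Abstract base class that every navigator plugin derives from (hypothetical).
class NavigatorBase
{
public:
  virtual ~NavigatorBase() = default;
  virtual std::string actionName() const = 0;
};

class ToPoseNavigator : public NavigatorBase
{
public:
  std::string actionName() const override { return "navigate_to_pose"; }
};

class DockNavigator : public NavigatorBase
{
public:
  std::string actionName() const override { return "dock_robot"; }
};

// Hand-rolled stand-in for a plugin class loader: maps a type name to a factory.
class NavigatorRegistry
{
public:
  using Factory = std::function<std::unique_ptr<NavigatorBase>()>;

  void add(const std::string & type, Factory factory)
  {
    assert(factory != nullptr);
    factories_[type] = std::move(factory);
  }

  std::unique_ptr<NavigatorBase> create(const std::string & type) const
  {
    const auto it = factories_.find(type);
    if (it == factories_.end()) {
      throw std::runtime_error("unknown navigator type: " + type);
    }
    return it->second();
  }

private:
  std::map<std::string, Factory> factories_;
};

// Server side: instantiate whichever navigator types the configuration names.
std::vector<std::unique_ptr<NavigatorBase>> loadNavigators(
  const NavigatorRegistry & registry, const std::vector<std::string> & types)
{
  std::vector<std::unique_ptr<NavigatorBase>> navigators;
  for (const auto & type : types) {
    navigators.push_back(registry.create(type));
  }
  return navigators;
}
```

The server then no longer names concrete navigator classes; it only iterates over a list of type names taken from its parameters.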

Yes, check out the header files: they derive from a BtActionNode base class that handles the boilerplate shared between the BT nodes that call the different task servers’ actions.

The feedback is computed in the navigator - see the navigator’s source code. The navigator may take information from the blackboard to do so; much of the feedback is actually computed locally, based on blackboard entries and other metadata. I’m always happy to add more feedback if there are things of value to a client, but what we have now is pretty decent.
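
As a hedged illustration of “computed locally based on blackboard entries”, the sketch below derives a distance-remaining value from a path and the robot position. Point and distanceRemaining are hypothetical names, and the assumption that the path would come from a blackboard entry is only one plausible reading of the description above; the actual feedback fields and their computation are in the navigator’s source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };

// Remaining path length, measured from the path point closest to the robot.
// Assumption: `path` stands in for a path that a BT node stored on the blackboard.
double distanceRemaining(const std::vector<Point> & path, const Point & robot)
{
  assert(std::isfinite(robot.x) && std::isfinite(robot.y));
  if (path.empty()) { return 0.0; }

  // Find the index of the path point nearest to the robot.
  std::size_t closest = 0;
  double best = std::hypot(path[0].x - robot.x, path[0].y - robot.y);
  for (std::size_t i = 1; i < path.size(); ++i) {
    const double d = std::hypot(path[i].x - robot.x, path[i].y - robot.y);
    if (d < best) {
      best = d;
      closest = i;
    }
  }

  // Sum the segment lengths from that point to the end of the path.
  double remaining = 0.0;
  for (std::size_t i = closest; i + 1 < path.size(); ++i) {
    remaining += std::hypot(path[i + 1].x - path[i].x, path[i + 1].y - path[i].y);
  }
  return remaining;
}
```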

Sorry for the late reply. The NavigateToPose BT plugin really works.

As for the pluginization, I opened this ticket to discuss implementation details.
