[URDF-NG] ROS2 URDF2 discussion

What does the next version of URDF look like?

There was a discussion in the google group around URDF and SDF in 2.0.

Later a live meeting was held. Summary notes are here.

The discussion so far seems to be leaning towards harmonizing URDF and SDF (using SDF as a foundation).

What are the next steps? What are the important details?

My bold position is that ROS could adopt SDF wholesale. Most, if not all, of the wishlist items for URDF2 are already captured in SDF. It would be a shame to reinvent the wheel.

I have a feeling this idea is somewhat controversial. If so, would someone be willing to speak to potential issues associated with ROS using SDF?

It would be a shame to reinvent the wheel.

Didn’t SDF do that, given that people were trying different ways to extend URDF already (mimic tags, srdf, etc.)? :stuck_out_tongue:

Jokes aside, I think SDF’s ability to have frames with multiple parents in the kinematic chains is a distinct and fundamental advantage over URDF, which happens to also be a requirement of sorts when working with physics engines. However, I do think it shows that there are appropriate situations in which yet another standard makes sense.

If so, would someone be willing to speak to potential issues associated with ROS using SDF?

In order to use SDF in ROS, there needs to be a way to do tf with graph-based (rather than tree-based) kinematics, i.e. a way to handle multi-parent reference frames with tf. This is brought up every time the switch is suggested, and though some ideas have been discussed in person, there’s never been a well-formed proposal on how to deal with it.

Specifically @jon and I have talked at length about spanning trees for the SDF graph and how they could be used to address this. Maybe @jon can speak to that point.

There may be other issues, but that’s the fundamental one that comes to mind for me.

Wheels are strange beasts, and like to be reinvented. Gazebo did have an XML format for describing robots and worlds before ROS existed. In fact, URDF looks surprisingly like Gazebo’s original format…

The graph issue is a good point. Adding something to SDF to break a graph into a tree sounds like a reasonable approach.

An unpopular position perhaps, but personally I would really like to see ROS2 use an established scene/robot description format if possible.

SDF is a good improvement over URDF already, but it’s still a custom format, with almost no uptake outside the ROS-Gazebo universe. I know of one (export only?) plugin for a (commercial) 3D modelling tool. Newer Gazebo versions have the model editor, but that is still limited.

With ROS2 using an established and industry grade middleware as the foundation to build its communication abstraction on top of, it would be great to see if we can do something similar for some other key parts, the robot description format being one of them.

So I have just consumed most, if not all, of the discussion so far (hurray for lazy Fridays) including the recording of the meeting. Here are a few observations:

All the formats seem to have arisen from fire-fighting small problems in small problem domains rather than intentional design. Everyone participating in the last year of discussions is representing their stakeholders and their requirements, but the conversation seems to be “How can we make the current robot description techniques less broken?” rather than “What set of stakeholders and requirements will give us the best foundation for the future?”

Given the number of people volunteering to do anything, moving to and incrementing on SDF seems like the best bet.

At the heart of a robot description, the most important thing is that the mental model, and any given element of that model, is documented in enough detail so that you, and I, and everyone else understand the context and agree on what the element is. After that we need a defined data representation, with a common, easy to implement format being nice to have.

I’m seeing a general confusion between configuration (robot A has a gripper and robot B does not) and the dynamic state of things (robot A picks up an object and we add a rigid constraint between the pose of the EE and object to represent that). There is nothing wrong with bootstrapping any dynamic models with the robot description, but they are a different thing.

There was some discussion about moving the robot_description parameter to a latching topic. Fine idea. Outside the scope of a robot description format.

The robot description is a storage/wire format. There is a point in time when all data retrieval and parsing is finished at which point there is a robot model in memory. What happens after that point is application specific. Declaring which end effectors are available is part of the robot description. Signaling the switching of tools or the mounting of a tool at t=0 is not part of a robot description.

There has been some discussion on composability, extensibility, and whether a core description (or well defined sub-descriptions) should be used. Dependency hell issues were raised with regard to plugins, URIs and the like.

I don’t have anything new to add to that discussion. The core description thing seems to be an artificial issue created by assumptions about the responsibility of the tools using the description to their downstream users. The robot description should not care about those third party consumers. Any parseable and coherent subset of the robot description should be fine.

Dependency hell is not a reason to disallow dependencies. If managing external connectivity or packaging of multiple elements is beyond your project, then you should keep everything in one file.

There was discussion about xacro, template engines, and GUI editors. I think these are outside the scope of the robot description, beyond indicating a format that has good libraries available.

One final observation is that while it appears that SDF is the easiest path, the SDF creators have a mental frame that constrains their thinking. Nothing wrong with that, but it’s worth taking a careful survey of others using rigid body robot models. People deploying robots in experiments or the field. People transferring CAD or optimizer designs. I think most types of users have been participating in the discussions so that’s good.

I’m really hitting a mental wall with why a graph topology is a problem. Which frames have multiple parents (or rather, how is it possible for a frame to have multiple parents)? Can you elaborate the problem?

Since it’s a rigid body model, it seems like any spanning tree would give you valid results. The default spanning tree, taken from the order in which the joints appear in the file, would be fine, no?

(EDITED to clarify the question)

I’m guessing @wjwwood is thinking of kinematic structures with cycles in them, like grippers with parallel / coupled links, delta robots, etc.

What formats do you have in mind?

Open source projects are difficult to track. Based on word of mouth and interactions with other people I can say that simulators, such as Moby and Drake, use SDF. Organizations like FIRST, Robocup, and NASA use SDF. There is also a SolidWorks to SDF exporter, and AutoCAD has expressed interest in developing their own exporter. None of this means SDF is good, but these are people who rely upon SDF.

Can you clarify what you mean by “custom format” and “established format”? To me it seems that a format created for a particular project (let’s say SDF and URDF) can become an established format through general acceptance. This acceptance is an indicator that a format is worth using, or improving upon. Whereas a format created by an organizing body (let’s say Collada) does not make the format good.

Yes, it is limited. The approach has been to develop and release incremental improvements, rather than wait until a feature-complete version is ready. We’d love to have help with the model editor.

Just to clarify, both SDF and URDF have been carefully crafted by groups of people. It’s difficult to foresee all potential use cases and problems. What may seem like fire-fighting is the normal process of incremental improvement.

Yes, tf requires a tree representation, but many robot configurations can have cycles. It’s also just a good idea to support representation of a graph as a tree.

I would say the only other one we should consider is Collada. This is because it is an Industry 4.0 standard.

If SDF is it, then we need to push to get it accepted as an Industry 4.0 standard as well.

What is involved in becoming an industry 4.0 standard?

I have not looked into it.

Paul Hvass or Shaun Edwards might know more about this.

Gjis?

Yes.

One strategy would be to pick an arbitrary spanning tree (first in order, last in order, random, etc.). However, you could also let the user specify the spanning tree, something that @jon was pretty interested in doing. Also, when you are dealing with a distributed tf system you can easily get into a situation where two spanning trees can disagree. This isn’t an issue when the description is being used in a simulation because the simulation ensures that the spanning trees agree (or at least it should) and it’s able to do so because it is not distributed and it has a “perfect” model.

So allowing the user to pick, with a reasonable default if they don’t care, seems best. But there are a lot of details about how to represent this and communicate it in the ROS graph and in the API.
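As a rough illustration of "let the user pick, with a default", here is a minimal Python sketch. It is not tied to any ROS or SDF API: `choose_tree`, the `cut` argument and the tuple layout are all made up for this example. It assumes joints are directed away from a single root link and that the user only cuts joints that sit on a loop; the default simply keeps the first joint in file order that names a given child link.

```python
def choose_tree(joints, cut=None):
    """Return ({child_link: parent_link}, joints_left_out) for a tf-style tree.

    joints: list of (name, parent_link, child_link) tuples in file order.
    cut: optional set of joint names the user wants kept out of the tree.
    Hypothetical helper: assumes a rooted description and sensible cuts.
    """
    cut = set(cut or ())
    parent_of, left_out = {}, []
    for name, parent, child in joints:
        if name in cut or child in parent_of:
            # Either the user excluded this joint, or the child link already
            # has a parent, so this joint would give it a second one.
            left_out.append(name)
        else:
            parent_of[child] = parent
    return parent_of, left_out


# A four-bar linkage: link "b" is reachable through "a" and through "c".
FOUR_BAR = [
    ("j1", "base", "a"),
    ("j2", "a", "b"),
    ("j3", "base", "c"),
    ("j4", "c", "b"),
]

default_tree, default_left_out = choose_tree(FOUR_BAR)
user_tree, user_left_out = choose_tree(FOUR_BAR, cut={"j2"})
```

With the default, `j4` is the joint left out of the tree; with `cut={"j2"}` the user forces link `b` to hang off `c` instead. How that choice would be communicated to every tf participant is exactly the open detail mentioned above.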

Both of the previous two points, by the way, are examples of why I have to disagree with your general sentiment that how the description is transmitted and used in context is out of scope. The description isn’t used in a vacuum and I don’t think it should be designed in one either. I do understand the desire for the format to be portable to different frameworks and therefore its design shouldn’t be unduly influenced by just one of those possible frameworks, but I think you can accomplish that while considering how it will be used if you’re conscious of that fact.

A spanning tree export is certainly possible, and something that I think we should explore. But a naive spanning tree computation will lead to significantly degraded performance of tools like tf. As mentioned, it needs to be consistent between all participants. For good performance the tree needs to be both deterministic and consistent over time. If the tree changes topology you lose the ability to interpolate between time updates. There’s also value in having the tree be similar in topology to the physical linkages, since traversing fewer joints to compute a transform will result in less error accumulation.
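One way to get a tree that every participant computes identically, sketched here in Python under stated assumptions: treat joints as undirected connections, do a breadth-first search from an agreed root, and visit neighbours in sorted order so the result does not depend on file order. Breadth-first search also keeps the number of joints between the root and each link as small as possible, which is only one reading of "fewer joints"; it says nothing about transforms between two arbitrary links. `canonical_tree` and the tuple layout are invented for this example, not an existing tf or SDF API.

```python
from collections import deque


def canonical_tree(joints, root):
    """Breadth-first spanning tree that does not depend on file order.

    joints: iterable of (name, link_a, link_b), treated as undirected.
    Returns {link: (parent_link, joint_name)} for every link reachable
    from root. Hypothetical helper for illustration only.
    """
    neighbours = {}
    for name, a, b in joints:
        neighbours.setdefault(a, []).append((b, name))
        neighbours.setdefault(b, []).append((a, name))

    parent_of = {}
    visited = {root}
    queue = deque([root])
    while queue:
        link = queue.popleft()
        # Sorting makes the visiting order a function of names only.
        for other, name in sorted(neighbours.get(link, [])):
            if other not in visited:
                visited.add(other)
                parent_of[other] = (link, name)
                queue.append(other)
    return parent_of


FOUR_BAR = [
    ("j1", "base", "a"),
    ("j2", "a", "b"),
    ("j3", "base", "c"),
    ("j4", "c", "b"),
]

tree = canonical_tree(FOUR_BAR, "base")
same_tree = canonical_tree(list(reversed(FOUR_BAR)), "base")
```

Here `tree` and `same_tree` are equal even though the joints were listed in opposite orders. Note that treating joints as undirected can flip a joint's parent/child sense relative to the file, which a real tool would have to account for.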

My opinion on what should be done in the short term has changed over the past few months. While I still find it annoying that Gazebo and ROS do not have a common format, I don’t believe that switching to SDF for all ROS applications (or vice versa) makes sense, because:

  • Switching existing tools that use URDF over to SDF would require a large effort. Tons of tools/libraries/etc use URDF, and switching would be non-trivial.
  • Switching to SDF doesn’t provide most of the things that I want out of a robot data exchange format.
  • As @wjwwood and @tfoote have pointed out, TF and associated tools need to have a tree in order to work. There are solutions (provide a way to specify a root node for the tree, or to specify the entire tree), but my feeling is that this is only the first of many edge cases that would be run into.
  • The existing tools for taking a robot described by a URDF, converting it to SDF, and spawning it in gazebo work “well enough” for me.
  • This isn’t just a question of urdf/sdf. On top of those I have an SRDF for moveit, yaml files for describing other robot-specific formats that aren’t in any spec yet, etc.

On top of these technical issues, there is the higher level decision of who decides what goes in the spec, and what the process is for that. One reason we have so many different robot description formats is that many of the interested parties prefer to have the flexibility to decide on their own what goes into a format. For TF/moveit/rviz/etc to share the same description format with gazebo (for example) people from all of those groups would need to have input into the spec. @nate_koenig would you be ok with changes to sdformat being chosen by a committee, where gazebo was only one of several participants?

Longer term, I do think that defining a format (or set of formats) that are broadly used in the robotics community is extremely important, but I’ll post some thoughts about that separately in [URDF-NG] Next-generation robot descriptions.

EDIT: Removed reference to “long term section” - I’m going to post those thoughts on the next gen robot description thread.

Do you mean the action of switching, or that there are a bunch of features/characteristics missing? If features, whatā€™s missing in SDF?

I believe switching to any new format would require significant effort.
libsdformat has conversion from URDF to SDF, and the reverse is partially
complete. This would make a transition less onerous.

What are the items that SDF is missing?

SDF elements are already chosen by committee. That committee happens to be
gazebo developers. Interested parties are welcome to propose changes, and
comment on pull requests. Are you talking about a more formal committee?

@hauptmech I mean the act of switching.

@Nate_Koenig yes - switching to any new format would require significant effort, which is why I’m suggesting not switching.

I do mean that the committee would have to be not just Gazebo people. Yes, anyone can propose changes and offer input, but for something as crucial as a robot description format, I think the makeup of the committee that has actual decision-making power should reflect the people who use it. So if URDF users switched to SDF, I would expect the committee with decision-making power for SDF to include developers of the various tools that now use URDF, as well as the developers of Gazebo.

To be clear I’m not asking for that - I’m bringing it up as an example of why I don’t think we should switch from URDF to SDF. I think that SDF is an excellent format created by engineers who know their stuff and who have a clear application in mind that guides their decisions.

I’m still really struggling to see how the graph causes issues. Does anyone have any example cases? This feels like it’s straightforward and a non-issue to me and given the people here I must be missing something.

Here are my assumptions:
Loading a graph into a tree with a hand-written parser can be done with brute-force loop detection in just a few lines of code, and given the typical number of loops there is no performance problem. So no penalty for the little guy.
If one is working with a large graph structure that would cause a performance impact, then they probably have the ability to write more elegant loop detection.
All spanning trees are equivalent and it never matters which one is given to TF.
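For concreteness, the brute-force loop detection in the first assumption could look roughly like the Python sketch below. It is only an illustration: `connected`, `find_loop_joints` and the `(name, parent, child)` tuple layout are made up here and do not come from any URDF or SDF parser. Joints are processed in file order, and a joint is flagged as loop-closing when its two links are already connected through the joints accepted so far.

```python
def connected(edges, a, b):
    """Brute-force search: is link b reachable from link a over the edges?"""
    seen, stack = {a}, [a]
    while stack:
        node = stack.pop()
        if node == b:
            return True
        for u, v in edges:
            # Edges are undirected, so try both orientations.
            for here, there in ((u, v), (v, u)):
                if here == node and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return False


def find_loop_joints(joints):
    """Return names of joints that would close a loop, in file order.

    joints: list of (name, parent_link, child_link) tuples.
    """
    accepted, loop_joints = [], []
    for name, parent, child in joints:
        if connected(accepted, parent, child):
            loop_joints.append(name)
        else:
            accepted.append((parent, child))
    return loop_joints


FOUR_BAR = [
    ("j1", "base", "a"),
    ("j2", "a", "b"),
    ("j3", "base", "c"),
    ("j4", "c", "b"),
]

loops = find_loop_joints(FOUR_BAR)
```

For the four-bar example only `j4` is flagged. The search rescans every accepted edge for each joint, which is fine for the handful of loops on a typical robot and is the part one would replace for very large graphs.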

Where am I going wrong?

How does the computation for breaking the loops in the graph interact with TF and change TF’s performance?

What naive algorithms for breaking a graph into a spanning tree are not deterministic and not constant in time? Can you come up with some code that gives non-deterministic results when applied successively to the same graph or any expected evolution of that graph? Is this just a theoretical problem? (I understand a tree topology change causes a real problem).

How much error are you really going to accumulate (or save) on any robots in current (or in URDF2’s future) use with TF, with a non-optimum spanning tree?

I think that the rigid body assumption (or the mathematical nature of a frame) means that there is no spanning tree that is better than any other from a topological view.

From a problem solving view there is a benefit if I can get the upstream tools to use the same spanning tree as my tools (maybe this is what you meant about matching topology to physical linkages?) and we have a solution for that in annotated graphs. I didn’t see any proposals, but I would expect that a ‘loop’ element, or something similar, would close the loop, and all tools reading ‘joint’ elements would just naturally see the spanning tree that is the ‘best’ from the user’s view.
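To make the ‘loop’ element idea concrete, here is a small Python sketch. The `loop` tag is purely hypothetical: it is not part of URDF or SDF today, and the markup below is a simplified URDF-like fragment (joint types and origins omitted). The point is only that a tool which reads `joint` elements and ignores everything else would see a tree, while a tool that understands `loop` could recover the full graph.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: 'loop' is an invented element used for illustration.
DESCRIPTION = """
<robot name="four_bar">
  <joint name="j1"><parent link="base"/><child link="a"/></joint>
  <joint name="j2"><parent link="a"/><child link="b"/></joint>
  <joint name="j3"><parent link="base"/><child link="c"/></joint>
  <loop name="j4"><parent link="c"/><child link="b"/></loop>
</robot>
"""


def read_connections(xml_text, tag):
    """Return (name, parent_link, child_link) for each element named tag."""
    root = ET.fromstring(xml_text)
    return [
        (
            element.get("name"),
            element.find("parent").get("link"),
            element.find("child").get("link"),
        )
        for element in root.findall(tag)
    ]


# A tree-only consumer (tf-style) reads just the joints.
tree_joints = read_connections(DESCRIPTION, "joint")
# A graph-aware consumer (simulator-style) also reads the loop closures.
loop_closures = read_connections(DESCRIPTION, "loop")
```

In this layout the author of the file decides which connection is the loop closure, so every reader sees the same spanning tree without having to compute one.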