Do you mean the action of switching, or that there are a bunch of features/characteristics missing? If features, what’s missing in SDF?

I believe switching to any new format would require significant effort.

libsdformat has conversion from URDF to SDF, and the reverse is partially complete. This would make a transition less onerous.

What are the items that SDF is missing?

SDF elements are already chosen by committee. That committee happens to be Gazebo developers. Interested parties are welcome to propose changes, and comment on pull requests. Are you talking about a more formal committee?

@hauptmech I mean the act of switching.

@Nate_Koenig yes - switching to any new format would require significant effort, which is why I’m suggesting not switching.

I do mean that the committee would have to be not just Gazebo people. Yes, anyone can propose changes and offer input, but for something as crucial as a robot description format, I think the makeup of the committee that has actual decision-making power should reflect the people who use the format. So if URDF users switched to SDF, I would expect the committee with decision-making power for SDF to include developers of the various tools that now use URDF, as well as the developers of Gazebo.

To be clear I’m not asking for that - I’m bringing it up as an example of why I don’t think we should switch from URDF to SDF. I think that SDF is an excellent format created by engineers who know their stuff and who have a clear application in mind that guides their decisions.

I’m still really struggling to see how the graph causes issues. Does anyone have any example cases? This feels like it’s straightforward and a non-issue to me and given the people here I must be missing something.

Here are my assumptions:

Loading a graph into a tree with a hand written parser can be done with a brute force loop detection with just a few lines of code, and given the typical number of loops there is no performance problem. So no penalty for the little guy.

If one is working with a graph structure large enough to cause a performance impact, then they probably have the ability to write more elegant loop detection.

All spanning trees are equivalent and it never matters which one is given to TF.
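To make the first assumption concrete, here is a minimal sketch of the kind of brute-force loop detection I have in mind. It is not taken from any existing parser: the function name `split_spanning_tree` and the joint list are hypothetical, and joints are treated as undirected (parent, child) pairs processed in file order.

```python
def split_spanning_tree(joints):
    """Split (parent, child) joints into tree joints and loop-closing joints.

    Joints are processed in the order given, so the result is deterministic
    for a given description file.
    """
    root_of = {}  # union-find pointers, one entry per link

    def find(link):
        root_of.setdefault(link, link)
        while root_of[link] != link:
            root_of[link] = root_of[root_of[link]]  # path halving
            link = root_of[link]
        return link

    tree, loop_closers = [], []
    for parent, child in joints:
        a, b = find(parent), find(child)
        if a == b:
            # Both links are already connected: this joint would close a cycle.
            loop_closers.append((parent, child))
        else:
            root_of[b] = a
            tree.append((parent, child))
    return tree, loop_closers


# Hypothetical four-bar linkage: the last joint listed closes the loop.
four_bar = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
tree, loop_closers = split_spanning_tree(four_bar)
print(tree)          # [('A', 'B'), ('B', 'C'), ('C', 'D')]
print(loop_closers)  # [('D', 'A')]
```

Which joint ends up as the loop closer depends only on the order of the joints in the file, so the same file always yields the same tree, while reordering the joints yields a different one.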

Where am I going wrong?

How does the computation for breaking the loops in the graph interact with TF and change TF’s performance?

What naive algorithms for breaking a graph into a spanning tree are not deterministic and not constant in time? Can you come up with some code that gives non-deterministic results when applied successively to the same graph or any expected evolution of that graph? Is this just a theoretical problem? (I understand a tree topology change causes a real problem).

How much error are you really going to accumulate (or save) on any robots in current (or in URDF2’s future) use with TF, with a non-optimum spanning tree?

I think that the rigid body assumption (or the mathematical nature of a frame) means that there is no spanning tree that is better than any other from a topological view.

From a problem solving view there is a benefit if I can get the upstream tools to use the same spanning tree as my tools (maybe this is what you meant about matching topology to physical linkages?) and we have a solution for that in annotated graphs. I didn’t see any proposals, but I would expect that a ‘loop’ element, or something similar, would close the loop, and all tools reading ‘joint’ elements would just naturally see the spanning tree that is the ‘best’ from the user’s view.

You are right. Thanks for keeping me honest here.

The following is meant only to close the loop on switching between URDF and SDF inside ROS and Gazebo.

This discussion is primarily from the perspective of ROS, as it should be. From the Gazebo side, we have no plans on switching away from SDF. That’s not to say we wouldn’t, but SDF is more than sufficient for Gazebo’s needs.

SDF may not be the best solution for ROS. If not, I’m sure converters will be generated to make both systems work together. The transport systems are also incompatible, and everything still works.

Long story short, ROS should pick the format that is most ideal for their users, and Jon’s suggestion of defining a set of specs sounds like a great idea. That may have been mentioned in a different thread.

Can you give an example of how this situation happens? It might be the missing piece of why I’m not seeing a problem with graphs and TF in practice.

There are two types of performance that can be affected. First, if you end up with a deeper graph, lookups will cost more. Second, the accuracy can change significantly if you accumulate across a long chain of uncertain links when there’s a much more accurate traverse across a small number of fixed links. In that case you can lose many significant figures of accuracy.

Well I’d suggest considering a graph that is a circle. Naively I can cut it anywhere. If I cut it between A and B the accuracy of the transform from A to B goes from the accuracy of that one measurement, to the accuracy of the N measurements from the rest of the circle compounded. An example where the topology might change entirely is if my heuristic would change the root element.
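As a rough illustration of the circle case, the sketch below assumes every link carries an independent, zero-mean error of the same size and that errors combine to first order by adding variances. The ring size and per-link sigma are made-up numbers, and `path_sigma` is a hypothetical helper, not part of TF.

```python
import math


def path_sigma(link_sigmas):
    """First-order standard deviation of a chain of independent measurements."""
    return math.sqrt(sum(s * s for s in link_sigmas))


n_links = 10   # hypothetical number of links in the circle
sigma = 1.0    # hypothetical per-link error, arbitrary units

direct = path_sigma([sigma])                  # A-B edge kept in the tree
detour = path_sigma([sigma] * (n_links - 1))  # A-B edge cut, walk the rest of the circle
print(direct, detour)  # 1.0 3.0
```

Under these assumptions, cutting the circle between A and B triples the uncertainty of the A to B transform for a ten-link ring, and the penalty grows with the square root of the number of links traversed.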

An example of this might be a 4 bar linkage with corners A, B, C and D where, in the absence of other information, corner A was arbitrarily selected as the root of the spanning tree. Suppose my algorithm picks the minimum spanning tree on the metric of spanning length, and I then add a connection between two other corners of the linkage, say B and C. The spanning calculation will now want to choose either B or C as the root, since they have direct connections to all other nodes, whereas A has to go through B or C to get to D.
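A small sketch of that root change, under the assumption that the linkage is wired A-B, B-D, D-C, C-A (so that A and D are opposite corners) and that "best root" means the root giving the shallowest tree. The helper names are hypothetical.

```python
from collections import deque


def eccentricity(adj, root):
    """Depth of the breadth-first tree rooted at `root`, in hops."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return max(depth.values())


def shallowest_roots(edges):
    """Return the nodes that give the shallowest spanning tree when used as root."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    ecc = {node: eccentricity(adj, node) for node in adj}
    best = min(ecc.values())
    return sorted(node for node, e in ecc.items() if e == best)


# Assumed wiring of the 4 bar linkage: A and D are opposite corners.
four_bar = [("A", "B"), ("B", "D"), ("D", "C"), ("C", "A")]
before = shallowest_roots(four_bar)
after = shallowest_roots(four_bar + [("B", "C")])
print(before)  # ['A', 'B', 'C', 'D']
print(after)   # ['B', 'C']
```

Before the extra connection every corner is an equally good root, so A is a valid arbitrary pick. After it, A is no longer among the preferred roots, and a heuristic like this one would re-root the tree.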

Computing a consistent spanning tree also requires full knowledge of the whole graph. One of the great things about trees in a distributed system is that as long as you have information about the part of the tree that you are on, you do not need to know about other parts of the tree. If you compute the spanning tree of the graph based on partial knowledge, you can easily get very different results. Thus, if you rely on computing the spanning tree from the graph, you are required to have a fully consistent state of the graph across all actors to get the same result.

I don’t know how to answer this. It really depends on your system. You can get errors which are the difference between a functional system and a non-functional system.

Making something up: if you have a 512 count shaft encoder on a 1m lever arm, one tick of error can give 12mm of error. If you have two arms you could get to 24mm of error. If you traverse either of these encoders due to a non-optimal tree, you gain many orders of magnitude of error instead of the usual floating point rounding errors for a known fixed transform. This might happen, for example, in the above reconfiguration of the 4 bar linkage if you walk around the linkage instead of taking the fixed transform between the two grounded corners.
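The arithmetic behind those numbers, as a quick check. It assumes one tick is one count of a full revolution and uses the small-angle arc length at the end of the arm; the variable names are just for illustration.

```python
import math

counts_per_rev = 512
arm_length_mm = 1000.0  # 1 m lever arm

tick_rad = 2 * math.pi / counts_per_rev   # angle of one encoder tick
one_arm_mm = tick_rad * arm_length_mm     # arc length swept at the end of the arm
two_arm_mm = 2 * one_arm_mm               # worst case: both errors add
print(round(one_arm_mm, 2), round(two_arm_mm, 2))  # 12.27 24.54
```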

If you’re doing symbolic math sure all spanning trees are equivalent. But this assumption breaks down in the presence of uncertainty.

Take the example of a simple triangle. Suppose I measured the 3 angles and the leg lengths. If I then drew each leg sequentially using the offset angle and the leg length, I would not get back to exactly the same position as the starting point. If the different angles are measured with different levels of uncertainty, I would get different amounts of error depending on where I started to traverse the triangle and in which direction I traversed. Since the data is redundant, if I skipped the least accurate measurement I’d likely get the best result. If we have different levels of uncertainty in measurements, the results can be notably different.
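Here is a toy version of the triangle, assuming an equilateral triangle with unit legs and a made-up 2 degree error in exactly one of the three turning angles. The function `closure_gap` is hypothetical and only measures how far the drawn path ends from where it started.

```python
import math


def closure_gap(sides, turns_deg):
    """Draw each leg in turn and return the distance from the end point to the start."""
    x = y = heading = 0.0
    for side, turn in zip(sides, turns_deg):
        x += side * math.cos(heading)
        y += side * math.sin(heading)
        heading += math.radians(turn)  # turn at the corner before the next leg
    return math.hypot(x, y)


sides = [1.0, 1.0, 1.0]
angle_error_deg = 2.0  # hypothetical error on one measured angle

gaps = []
for bad_index in range(3):
    turns = [120.0, 120.0, 120.0]
    turns[bad_index] += angle_error_deg
    gaps.append(closure_gap(sides, turns))
print(gaps)
```

When the bad angle is the last one in the traversal it is never used to draw a leg, so the gap is only floating point noise. When it comes first or second, the path misses the start by a few percent of a leg length.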

Clearly it would be possible to compute a better solution by taking advantage of the inside angles formula for triangles and estimating a better solution before trying to draw it. Then there would only be the inaccuracy of drawing it. But that requires me to know a lot more semantic information about the system, including models of uncertainty, to be able to solve for the converged solution. This is something that a simulator can take advantage of in its physics solver, since it knows all the parameters of all the joints and the physics engine is designed to resolve these loop inconsistencies. However, without full knowledge of the system, like a simulator has, this sort of problem cannot be resolved generically.

@tfoote Thanks for the elaboration. So far I have exactly the same understanding as you.

I feel a little more confident in saying:

It really feels like this is an imagined problem. As soon as you introduce the actual robot topologies that all current and most future users will have, and you add determinism created by the combination of any given loop detector code and any given description file (in which the joints will be processed in a known order and that order won’t change on the time scale of an experiment or operations session) then there is not really any problem.

I agree, though ‘many significant figures’ is a bit hand-wavy. So for all the robots, current and future, what’s the worst case depth? And if you support 95% of them, then what’s the worst depth? I don’t know all the robots, but I bet it’s not that deep.

I think we are not talking about this type of error (call it measurement error?). Nor are we talking about the error created by a solver failing to converge. We are concerned only with the influence of where kinematic loops are broken on the numerical error accumulated by TF calculations.

Can anyone actually come up with a real scenario where a real problem occurs? Bonus points if the device/mechanism has been built. Temporarily off limits are very large reconfigurable systems (let’s say >100 links) and tethered systems incorrectly modeled as rigid body approximations.

It would be a shame if you guys got hung up here (or wasted time on overly complex solutions) if there wasn’t really a problem.

@hauptmech measurement error accumulates based on the choice of kinematic chain, so I’m not sure why you don’t think this example applies. There are a huge number of robots in the world that use ROS, and it is amazing how even the smallest software change upstream will break things for downstream users. Usually this happens in ways you don’t even expect, but in this case we have an example of how we know things could be worse. But I do agree that this isn’t necessarily a show stopper. Adding one or more elements to allow specifying the spanning tree for TF to use could solve this.

@hauptmech measurement error accumulates based on the choice of kinematic chain, so I’m not sure why you don’t think this example applies.

I could be wrong. Here’s my thinking.

TF only works with a tree topology. For a rigid body mechanism with a graph topology, TF can only ever be a reporter of transforms; it can never substitute as a model for the mechanism itself. The model of the mechanism with its kinematic loops, and the solvers used to work with that mechanism, will be somewhere else. Measurements will feed into the solvers, which will work with the graph, and any minimization of how measurement error propagates through will be up to the solvers. After any sub-phase of calculations, some chunk of code will take the best estimate of the current state of the frames and copy them into TF.

Thus TF is not dealing with any measurement error in a parallel system.

Am I missing something?

Ah, I see what you mean. In the case where the entire loop is solved for by one solver and the transforms are then published to TF, you’re right. In most robots that I’ve dealt with, however, there are many nodes publishing transforms for different sets of joints, and the job of stringing together a chain of transforms is done by TF in the client (or the buffer_server if you’re using that in TF2). In that case, when presented with transforms for a loop/cycle of joints and links, if TF is queried for a transform that has multiple hops, it has to choose how to traverse the graph and accumulate transform. If different joints have different encoders (or the same encoders but different errors because the encoders are on the motors and the gear ratios to the joints are different) the overall error will depend on the path TF chooses to traverse the graph.
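To illustrate how the traversal choice matters, here is a sketch that picks the path with the least accumulated variance through a small loop. It is not how TF works today (TF only accepts trees); the frame names, the per-joint variances, and the function `least_variance_path` are all hypothetical, and errors are assumed independent so that variances add.

```python
import heapq


def least_variance_path(edges, start, goal):
    """Dijkstra over (frame_a, frame_b, variance) edges; returns (variance, path)."""
    adj = {}
    for a, b, var in edges:
        adj.setdefault(a, []).append((b, var))
        adj.setdefault(b, []).append((a, var))
    best = {start: 0.0}
    heap = [(0.0, start, [start])]
    while heap:
        var, node, path = heapq.heappop(heap)
        if node == goal:
            return var, path
        if var > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, edge_var in adj.get(node, []):
            total = var + edge_var
            if total < best.get(nxt, float("inf")):
                best[nxt] = total
                heapq.heappush(heap, (total, nxt, path + [nxt]))
    return None


# Hypothetical loop: the base-lower joint has a much coarser encoder than the rest.
loop = [
    ("base", "upper", 0.25),
    ("upper", "tool", 0.25),
    ("lower", "tool", 0.25),
    ("base", "lower", 9.0),
]
variance, path = least_variance_path(loop, "base", "lower")
print(variance, path)  # 0.75 ['base', 'upper', 'tool', 'lower']
```

In this made-up loop the one-hop path from base to lower is the worst choice, and the three-hop path around the loop is twelve times better in variance. A tree that happens to keep the coarse joint forces every query across it.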

Using a dedicated solver node that handles loops in a smart way is something some users will want to set up, but being able to just tell TF “use this spanning tree, and add it up for me” is also something that I’d want.

Sorry, I completely skipped a step in my example. Various nodes publish various subsets of joints, but the transforms are (on many robots) all published by the robot_state_publisher. It reads the URDF and the joint_states topic, and computes the transforms. But since robot_state_publisher doesn’t know anything about which encoders are best, and since it isn’t doing any fancy optimization, it just recursively computes the transforms out through the tree defined by the URDF. It is the robot_state_publisher which needs to know how to resolve cycles, unless you want to run a separate node that does the transform computation for all joints involved in the cycle. This is where I think it would be good to have the option to specify the spanning tree explicitly.

COLLADA - Is there anyone that has had success with this format in a robotics context? Anything beyond storing meshes?

The pro I’ve seen for Collada is that it’s a standard that belongs to a bigger, external community.

I keep seeing Collada get mentioned but then not receive more support. It may be worth detailing why it’s not a good choice so that it stays buried.

OpenRAVE uses COLLADA extensively, and there are converters between URDF and COLLADA.

Does anyone know of any real-world hand creation/editing of COLLADA files? Or is it for all practical purposes a machine format? In which case the use of OpenRAVE XML, rather than COLLADA, is what we are really talking about.

I tried crafting a simple single face mesh for unit tests. It was pretty harrowing, and finally I resorted to exporting one from Blender and simplifying the result instead. It is not a very friendly format for hand-editing in my opinion.

/edit: Note that I was trying to hand-craft a mesh, which isn’t reasonable anyway. It may be that describing kinematic structures (let’s not say chains) is much more human friendly.