Proposed standard interface for aerial vehicles in ROS

Continuing the discussion of developing standard messages for flying vehicles, I have iterated on our last discussion: !topic/ros-sig-mav/f12m3mnqYl4

And have put together two REP pull-requests that I would like feedback on.

Update to REP 105

The first one is an extension to REP 105 which defines conventions for interacting in larger spaces.
Thanks to @joq for the early feedback on that PR.

REP 147 A Standard interface for Aerial Vehicles

The second is a draft REP 147 trying to pull together a standard set of messages.

I’ve done my best to survey the messages that already exist and tried to reuse existing messages whenever possible.

If you have high level discussion topics we can discuss them here. For anything specific please feel free to comment directly on the pull request.


Thanks, Tully, for your attempt to standardize interfaces for aerial vehicles!

I added some comments to the pull request directly, mainly about the addition of a velocity control layer and optional fields in pose and other commands.

Another important aspect that should IMHO be covered in the REP in one way or the other is the exact meaning of altitude/height/elevation as specified in commands and state feedback. Depending on how altitude is measured and controlled the results can be quite different if there is no standardized way to specify the altitude mode explicitly whenever a z position is involved:

The difference between exact altitudes and pressure altitudes might be out of scope in this document and the height above ground could be covered by range measurement messages like sensor_msgs/Range, but I think the question of the reference level (MSL/WGS84 vs. local reference frame) should be somehow covered, e.g. by defining different names for the map frame depending on whether its origin is at a local reference altitude (the default), MSL or WGS84.

(I had to obfuscate the links because I cannot add more than 5 in discourse).
(edited to add links in. Once your trust level goes up the link limit will increase.-- @tfoote)

+1 for a clear, consistent definition. I recommend “meters above the WGS-84 ellipsoid”. Note that sensor_msgs/NavSatFix explicitly specifies altitude that way.

We had a similar discussion a few years back during that message review. The consensus was that we needed a consistent altitude definition. Various systems might need to convert that canonical value to and from other representations (above MSL, above ground level, different ellipsoids, etc.), but it would just be too confusing for multiple packages and nodes to figure out which type of value the message intends.

While a different answer could be crafted specifically for aerial vehicles, that seems unwise.

@johannesmeyer @joq thanks for the feedback.

As you both mentioned, the definition of altitude needs to be clear. My initial reaction was that it’s not necessary to discuss, since we’re talking about tf coordinate frames with everything measured in meters. However, I think what we’re actually talking about is standards for defining map frames. Possibly we could extend our effort to develop standard map frame naming conventions to convey some of this information. We probably also want to recommend a standard to use for these common global reference frames. @joq suggests picking the WGS-84 ellipsoid as the default earth-referenced map zero altitude. That seems fine by me; does anyone else have a different suggestion?

For example when setting up map frames they could be named or labeled to clearly indicate their external references if different from the default.

For pressure altitude there are two ways it could be approached. The simplest would be to ignore it and make your best estimate of the true altitude. However, it does have a long track record as a very good way to coordinate flying vehicles, since barometers are much more precise than they are accurate. With that said, and since the pressure altitude is approximately a fixed offset based on the uncertainty in the atmospheric barometric pressure, it actually fits well into the model we have developed with the map and odom frames for ground navigation. You can navigate locally in the precise frame (pressure altitude/odom) while continuously running a background process to estimate the drift of the precise frame relative to the externally referenced frame (map/ground).
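To make the odom/map analogy concrete, here is a minimal sketch of such a background process. It assumes the drift can be treated as a slowly varying scalar offset and uses a simple exponential filter; the class name and the `gain` parameter are hypothetical and not part of either REP.

```python
class AltitudeOffsetEstimator:
    """Tracks the slowly varying offset between pressure altitude
    (precise, but drifting) and an externally referenced altitude."""

    def __init__(self, gain=0.01):
        self.gain = gain    # hypothetical smoothing gain, 0 < gain <= 1
        self.offset = None  # reference altitude minus pressure altitude [m]

    def update(self, pressure_alt_m, reference_alt_m):
        """Fold one pair of simultaneous measurements into the estimate."""
        measured = reference_alt_m - pressure_alt_m
        if self.offset is None:
            self.offset = measured
        else:
            self.offset += self.gain * (measured - self.offset)
        return self.offset

    def to_reference(self, pressure_alt_m):
        """Express a pressure altitude in the externally referenced frame."""
        if self.offset is None:
            raise ValueError("no reference measurement received yet")
        return pressure_alt_m + self.offset
```

A controller would keep flying on the raw pressure altitude and only apply `to_reference` when it needs an altitude in the map frame, much like the map to odom transform is applied for ground robots.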

For the height above ground I think that’s getting into an obstacle map representation and out of scope for this discussion.

I would not recommend the WGS84 ellipsoid altitude. What is used in reality is always AMSL.

Thanks for all the feedback.

REP 105 PR

I’ve made notable updates to the REP 105 pull request to address topics such as transitioning between maps, clarifying map coordinate frame conventions, and clarifying potential intermediate coordinate frames such as pressure altitude between odom and map.

These clarifications I think resolve most of the comments from reviewers and make this close to ready.

REP 147 Draft

I’ve also updated the draft of REP 147 based on comments here and inline in the pull request.

I also fleshed out the datatypes in the document for greater clarity.

I see two comments for which there’s not a clear resolution.
First is the choice of using vehicle native commands (attitude and yaw, pitch, and roll rates) vs. the very highly abstracted 1st and 2nd derivatives of position only (comments here). Slightly related to that is the choice of naming, which parallels the derivatives of position but does not match the datatypes in the current proposal.

If you have thoughts please follow up here on this thread. I plan to merge the pull request still in draft form so that people can see the graphics rendered when reviewing without downloading it and generating their own content. Once we have more direction for the next steps we can open a new PR to modify the draft.

Is that really true? My observations are:

  1. Most GPS units seem to give the altitude above the ellipsoid and also have the geoid correction available. Since the geoid correction can vary in quality, it is the ellipsoid value that is probably most accurate.
  2. Available altitude data specifies which geoid it uses. For example, KML uses EGM96. But the data available in Sweden uses RH2000, which uses the geoid SWEN08_RH2000, which is not exactly the same as the one used in EGM96.

REP 147

From the feedback in the REP 147 pull request 118 we’ve iterated.

I have merged it in draft status so that people can read it more easily and especially see the diagrams easily. Please take a read through it here: REP 147 -- A Standard interface for Aerial Vehicles

REP 105

I think we’re close to a consensus here. There’s one question about UTM representations and another about the altitude representation.

UTM connection

I’m not sure there’s a need to directly mention this connection in the REP. I suspect it will be a common use case that we implement tools to help with but I’d propose not to call it out in the REP.

Altitude representations

During the review we switched to standardize on MSL by default.

Certainly most GPS units use WGS84 as their datum as that’s the native units of the GPS system.
I found a great list of datums on NOAA’s website: NOAA/NOS's VDatum: A tutorial on datums

If you’re not using GPS however the main reference is MSL. And that is what the bodies like the FAA use for altitudes. If you don’t consider GPS to be your primary

There’s a very detailed paper about accuracy of different representations from the FAA here

There’s a good representation on page 9 of the geoid with gravitational anomalies which change MSL.

In that you can see that the differences can range from -100m to +60m.

Here the geoid represents MSL, and the ellipsoid is the datum, like WGS84. There’s a good illustration on page 8.

With this understanding I think that using MSL will make a lot more sense for applications that are near the sea level or on the ground. It’s not very intuitive if you take the example of a boat on the ocean to display it at -100m or +60m in altitude.

And for use cases where you are not using GPS as a source and dead reckoning based on maps and a known starting position, possibly with a barometer, using your known starting altitude as a reference is much easier.

That said, since we know that most vehicles will be using GPS, we should make sure that converting from the WGS84 datum is easy/built in. And I think that won’t be too much of a problem as we’ll already be converting GPS’s native polar coordinates to local Euclidean approximations anyway.
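For reference, the conversion from GPS coordinates to a local Euclidean frame can be sketched with the standard WGS84 geodetic to ECEF to ENU formulas. This is only an illustration: the function names are hypothetical, and the height passed in is the height above the WGS84 ellipsoid, so an MSL altitude would first need the geoid correction applied.

```python
import math

WGS84_A = 6378137.0                    # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h_m):
    """Latitude, longitude and ellipsoidal height to ECEF x, y, z in meters."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h_m) * math.sin(lat)
    return x, y, z


def geodetic_to_enu(lat_deg, lon_deg, h_m, lat0_deg, lon0_deg, h0_m):
    """East, north, up offsets of a point relative to a local origin."""
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h_m)
    x0, y0, z0 = geodetic_to_ecef(lat0_deg, lon0_deg, h0_m)
    dx, dy, dz = x - x0, y - y0, z - z0
    lat0, lon0 = math.radians(lat0_deg), math.radians(lon0_deg)
    sl, cl = math.sin(lat0), math.cos(lat0)
    so, co = math.sin(lon0), math.cos(lon0)
    east = -so * dx + co * dy
    north = -sl * co * dx - sl * so * dy + cl * dz
    up = cl * co * dx + cl * so * dy + sl * dz
    return east, north, up
```

The ENU frame produced this way is a tangent-plane approximation, so it is only suitable for map frames that stay reasonably close to the chosen origin.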

Thus I’d suggest we stick with using MSL as the default altitude representation.

What do you mean by MSL? Do you mean the WGS84 geoid which uses EGM96?

Can you really do the conversion from WGS84 ellipsoid to WGS84 geoid accurately and quickly on computers with limited resources? What we use now for this requires us to download big data files to do the conversion. The conversion done by GPS units uses a small table stored in the GPS and is probably not as accurate.

Well, so long as it is well specified what is used and all the drivers for sensors like GPS output the same unit, I suppose it does not matter so much what is used.

Will the definition of GeoPoint and NavSatFix be changed then or how will that be handled?

Velocity and Acceleration Interface


Following @jack-oquin and @meyerj’s comments above I’ve revised it to use the very simple cmd_vel and a new very similar Twist for acceleration.

I would vote for using the stamped message variants for the rate and acceleration interface, too.

Rationale: Even in a real-time context and without querying tf a vehicle can check the frame_id and interpret the commands accordingly. A velocity command in the real body frame is quite unusual for multirotors because you do not want to climb or descend while flying forward only, independent of the current roll and pitch angle. You could argue that in this case the velocity command could be given in a fixed frame, but the vehicle’s yaw angle should still be respected to determine what “forward” is, without the need to update the command continuously to align with the latest yaw estimation (which might not even be available). This control mode would correspond to standard flight modes like Loiter (ArduPilot) or P-mode (DJI). Having a header frame_id would be one good way to determine the desired control mode, next to a dedicated MAV message with an enum-like field for this purpose. I don’t think a separate control mode switch interface would be wise - the command messages should have a clearly defined meaning and be self-contained (e.g. for playing back from recorded data). In Hector we used to call the frame with the same origin as base_link but roll and pitch set to zero the base_stabilized frame, but I do not really like this name and would prefer something like base_upright.
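As an illustration of how a vehicle might interpret the `frame_id` without querying tf, here is a small sketch. It assumes the vehicle knows its own yaw, uses `base_upright` for the yaw-aligned, roll- and pitch-free frame described above, and uses `odom` as a stand-in for any fixed frame; the function name and the set of supported frames are hypothetical.

```python
import math


def command_to_fixed_frame(frame_id, velocity, yaw_rad):
    """Express a commanded velocity (vx, vy, vz) in the fixed frame.

    "base_upright" commands are rotated by the current yaw only, so a pure
    forward command never makes the vehicle climb or descend, whatever its
    current roll and pitch.
    """
    vx, vy, vz = velocity
    if frame_id == "base_upright":
        c, s = math.cos(yaw_rad), math.sin(yaw_rad)
        return (c * vx - s * vy, s * vx + c * vy, vz)
    if frame_id == "odom":
        return (vx, vy, vz)  # already expressed in the fixed frame
    raise ValueError("unsupported frame_id: " + frame_id)
```

For example, a 1 m/s forward command in `base_upright` with the vehicle yawed 90 degrees to the left becomes a 1 m/s command along the fixed frame's y axis, with the vertical component unchanged.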

The header timestamp might be useful to measure delays in the command and control link and a vehicle should ignore commands that have a non-zero timestamp but are too old.
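A possible shape for that staleness check, as a sketch only: the `max_age_s` threshold is a hypothetical parameter, and a zero stamp is treated as "no timestamp given" so that unstamped commands are still accepted.

```python
def is_command_fresh(stamp_s, now_s, max_age_s=0.5):
    """Return True if a command should be executed.

    stamp_s and now_s are times in seconds on the same clock. A zero stamp
    means the sender did not set one, so the command is accepted as-is.
    """
    if stamp_s == 0.0:
        return True
    return (now_s - stamp_s) <= max_age_s
```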

For fixed-wing aircraft it might be more appropriate to command the forward velocity in body frame.

+1 for using the more abstract standard messages. I think it is quite straight-forward to translate them to less abstract messages like MAVLink as advocated by @LorenzMaier, but the conversion function might already depend on the exact type of vehicle and/or auto-pilot in use.

UTM connection and map frame

In my opinion it is nice that REP 105 recommends that the map frame is aligned with north, east and mean sea level if such global reference information is available, but this should not be enforced or even identified with a UTM frame. Afaik most mapping frameworks assume that the map frame is under their control and the robot always starts at the frame’s origin with z set to 0. Also, visualization tools like rviz would require some usability patches if robots spawn at positions far away from the origin and the map frame were the only appropriate drift-free reference frame. The current draft of REP-105 also states that there might be other guidelines for defining the map frame, like aligning it with structures in the environment.

What I would propose is to introduce another frame between earth and map, which could be UTM or any other local ENU frame. It would be the task of a node like the navsat_transform_node to choose an appropriate local_enu frame and publish a static transform from earth to local_enu and a correction transform from local_enu to map (or another intermediate frame), so that the GNSS antenna’s frame is at the measured coordinates in the map. Unfortunately this transform is not fully defined by a GNSS measurement alone, because it does not carry absolute yaw information and the map frame could rotate freely around the GNSS antenna. In order to solve this we would need yet another frame, e.g. map_enu, which shares the origin with the map frame and enforces ENU alignment:

  /earth
  -- (static) --> /local_enu (could be UTM)
  -- (from GNSS and/or barometric altitude measurement of /base_link or one of its children) --> /map_enu
  -- (from magnetic heading of /base_link or one of its children - provides yaw correction only) --> /map
  -- (from mapping) --> /odom
  -- (from odometry) --> /base_link

Some of those frames could be identified with each other if the respective information is unknown, e.g. /map_enu and /map if the vehicle has no absolute orientation information. In this case, viewed from an earth-fixed frame like /earth or /local_enu, the map frame would rotate around the robot while the robot is moving in the map frame and the GNSS node tries to keep it at the measured GNSS position. It might also be possible to come up with a node that estimates the fixed /map_enu to /map rotation from the robot motion and its location in an earth-fixed frame without the need for a calibrated magnetometer.
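To illustrate how the proposed chain composes, here is a simplified sketch that represents each transform as a translation plus a yaw rotation only. Roll and pitch are ignored, and the tuple layout and function names are assumptions made for this example rather than anything tf provides.

```python
import math
from functools import reduce


def compose(parent_to_child, child_to_grandchild):
    """Chain two (x, y, z, yaw) transforms into parent -> grandchild."""
    ax, ay, az, ayaw = parent_to_child
    bx, by, bz, byaw = child_to_grandchild
    c, s = math.cos(ayaw), math.sin(ayaw)
    return (ax + c * bx - s * by, ay + s * bx + c * by, az + bz, ayaw + byaw)


def chain(transforms):
    """Compose a list of transforms ordered from the root frame outward."""
    return reduce(compose, transforms, (0.0, 0.0, 0.0, 0.0))


# Made-up example values for the frames listed above.
local_enu_to_map_enu = (100.0, 200.0, 50.0, 0.0)   # position and altitude
map_enu_to_map = (0.0, 0.0, 0.0, math.pi / 2)      # yaw correction only
map_to_odom = (1.0, 0.0, 0.0, 0.0)                 # from mapping
odom_to_base_link = (2.0, 0.0, 0.0, 0.0)           # from odometry

local_enu_to_base_link = chain(
    [local_enu_to_map_enu, map_enu_to_map, map_to_odom, odom_to_base_link]
)
```

With these values only the first transform carries the absolute altitude, while the yaw-only correction turns the 3 m travelled along the map's x axis into 3 m of northing in the local ENU frame.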

Altitude representation

I also share the concerns of using AMSL as an altitude reference and tended to vote for the WGS84 geoid, mainly for the reasons already mentioned above:

  • consistency with sensor_msgs/NavSatFix (@joq)
  • eventual problems with poor quality of built-in geoid correction in GNSS receivers and/or costly calculations (@Tommy_Persson)
  • No other sensor can directly measure the altitude in MSL. Barometric altimeters need an external pressure reference anyway, which could either be the local QNH (referenced to MSL), the known elevation at startup (MSL or WGS84) or GNSS (MSL or WGS84). The altitude representation is simply determined by what is used as a reference. A fusion algorithm could take care of the geoid correction and publish the result in WGS84.

On the other hand compatibility with MAVLink (see mavlink/mavlink#298) is also a good argument, but in this case it should be considered to change the definition of the sensor_msgs/NavSatFix message, too. Is this possible without breaking the MD5 sum? This would enforce all GNSS drivers to either publish AMSL altitude as given by the receiver or do the geoid correction themselves. I think the current implementation of nmea_navsat_driver is handling altitude incorrectly because it adds the geoid height (field 11 of the NMEA GGA message) to the altitude already converted to AMSL (field 9 of the GGA message) according to the NMEA specs found at

I think @tfoote’s boat on the ocean visualization example is not relevant here because even MSL is still a theoretical reference. It refers to the mean sea level and does not take into account tidal changes. The /map or the /map_enu frame as proposed in the previous section would solve this problem by providing a local earth-fixed reference frame which can be at any desirable altitude, depending on whether the /base_link has a non-zero altitude relative to /map or not. Only the /local_enu to /base_link transform would contain the absolute altitude relative to either WGS84 or MSL.

Hmm, probably there is no right and wrong answer to this question. +1 for MSL.

Will the definition of GeoPoint and NavSatFix be changed then or how will that be handled?

Anyone can open a review to change them, if desired.

I personally will oppose that proposal. Those definitions have been in general use for years without complaint, until now. They were agreed to after more review than this discussion (so far).

Changing the meanings of the data could break existing systems in subtle and undetectable ways. If we really want different semantics, we should define new messages, provide tools to translate back and forth, and then convince the ROS community to adopt them.

Sorry I misinterpreted your previous comment to be suggesting the WGS84 Ellipsoid. With some more reading I think that the WGS84 geoid using EGM1996 is our best standard to approximate MSL.

The best reference for EGM1996 I found: has a table of corrections and a short reference fortran implementation for interpreting them. That could easily be converted or wrapped to make available for easy use.

For the computational complexity, there are 15-minute grids which for the whole world are about 9MB when uncompressed and expressed as ASCII in 64 thousand lines. With just text compression it drops to 5MB, and I think with a binary representation it would drop significantly. I see what looks to be a binary format which is only 2MB.
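As a rough check on those sizes, and to show how cheap a lookup is once the grid is in memory, here is a sketch of the grid dimensions and a bilinear interpolation. The layout (row 0 at +90 degrees latitude, column 0 at 0 degrees longitude, rows running south and columns running east, no duplicated column at 360 degrees) is an assumption made for the example and should be checked against whichever grid file is actually used.

```python
def grid_shape(step_deg):
    """Rows and columns of a global grid that includes both poles."""
    rows = int(round(180.0 / step_deg)) + 1
    cols = int(round(360.0 / step_deg))
    return rows, cols


def interpolate(grid, step_deg, lat_deg, lon_deg):
    """Bilinearly interpolate a value (e.g. geoid height in meters)."""
    rows, cols = len(grid), len(grid[0])
    r = (90.0 - lat_deg) / step_deg
    c = (lon_deg % 360.0) / step_deg
    r0 = min(int(r), rows - 2)   # keep r0 + 1 inside the grid at the south pole
    c0 = int(c) % cols
    c1 = (c0 + 1) % cols         # wrap around in longitude
    fr, fc = r - r0, c - int(c)
    top = grid[r0][c0] * (1.0 - fc) + grid[r0][c1] * fc
    bottom = grid[r0 + 1][c0] * (1.0 - fc) + grid[r0 + 1][c1] * fc
    return top * (1.0 - fr) + bottom * fr


rows, cols = grid_shape(0.25)        # 15-minute spacing
size_bytes = rows * cols * 2         # assuming one 16-bit value per node
```

Under these assumptions a 15-minute grid has 721 by 1440 nodes and takes about 2MB at 16 bits per node, which is in line with the binary size mentioned above, and each lookup costs only four reads and a handful of multiplications.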

I’ll update the REP to call out EGM1996 as the reference altitude. And we should make sure to import the reference conversion methods.

Yeah, the header is valuable for being self-contained, and does give a lot of flexibility. If the program is unable to do the transform it can reject/skip the goal, and hopefully have good fallback behavior. The nav stack has had this sort of behavior where it would not accept a goal in the robot frame as it would never be achievable. (Like a rabbit with a carrot tied to a stick on its back.) In this case we could have different semantics.

I agree that having these intermediate frames is likely going to be valuable in usage. But I’ve shied away from making them part of the standard and just called out that there may be intermediate frames that are implementation specific, in the paragraph Extra Intermediate Frames. Since there are so many potential intermediate approaches, I’d like to suggest deferring trying to standardize that until we have more experience.

For this I agree with Jack that we shouldn’t change these. In particular they are in lat/lon/altitude, and we should make sure that we have good standard conversions going from that into our Euclidean map frame. But if you’re working in pure GPS data, avoiding the overhead of the corrections, raised as a concern above, is valuable. The WGS84 ellipsoid is the most native datatype for a GPS fix.

Edit: REP 105 update: new commit in the ongoing PR
REP 147 update: new pull request

I think you meant WGS84 ellipsoid here.

I was actually suggesting to use the WGS84 Ellipsoid since that is what is used in NavSatFix and GeoPoint.

In our UAV system I have gone back and forth between what to use. First I used the WGS84 Geoid as the altitude if nothing else was specified. But then I started to use GeoPoint, and then I switched to WGS84 Ellipsoid as the default. And I thought that worked better.

Also, I think MavLink is outputting the wrong altitude in the NavSatFix message. So I am not sure that the documented semantics of the NavSatFix message are respected everywhere.

For applications with multiple UAVs that need to tell each other about their position we have standardized on using GeoPoint and GeoPose.

I am sure there are several examples that do not follow the standard. I even know of one device that outputs NAD-83 when receiving a differential correction, but switches to WGS-84 otherwise. That kind of complexity is better left to the device driver so other downstream ROS components can ignore it.

So, the driver you mention probably has a bug which should be reported as an issue. For many applications those differences are not that significant, but eventually someone will notice and care about it.

That is why we chose to use a minimal number of representations for those messages. They are not always optimal, but they make sharing between systems and components much cleaner.

I am not sufficiently experienced with aerial vehicles to understand the reasons for picking that over the simpler WGS-84 ellipsoid.

So, please add a rationale, plus an explanation of the difference between ellipsoids and geoids (many readers probably won’t understand that distinction).

Ok, I guess we’re kind of trying to determine whether the slightly more complex default is better and whether it will lead to easier to understand frames.

However, thinking about it now, this default is basically going to be used in the case that there’s no prior data, and thus the simpler version (Ellipsoid) makes sense. Picking the simpler solution will mean that more applications will choose to use a custom zero level appropriate for their application, whether it be sea level or, most likely, ground level at their starting point or home base.

As this is just a decision for the default w/o any other data then sticking to the simplest solution seems better actually. Does this logic make sense to everyone else? Assuming it does I can update the draft again as such.

We’re kind of deep ending on the map frames. Are there any other topics that people want to raise about the proposals?

I am not sufficiently experienced with aerial vehicles to understand the reasons for picking that over the simpler WGS-84 ellipsoid.
So, please add a rationale, plus an explanation of the difference between ellipsoids and geoids (many readers probably won’t understand that distinction).

I would not call myself an expert, but I will try:

  • The WGS-84 ellipsoid is mainly used as a reference in satellite navigation systems because it simplifies the description of the satellites’ orbits and the trilateration process in the receiver.
  • The height above the WGS84/EGM96 geoid, or MSL, is used for practically all other use cases:
      • Elevation information in printed maps or Digital Elevation Models
      • Services like the Google Maps Elevation API or Google Earth and related exchange formats (KML)
      • Pressure altitudes derived from barometric pressure in combination with the local QNH and the International Standard Atmosphere (altimeter)
      • Aviation data like restricted airspaces (unless specified above ground level AGL or as a flight level)
      • MAVLink, a data exchange protocol and library that is widely used by open-source and commercial autopilots and ground stations for aerial vehicles

As this is just a decision for the default w/o any other data then sticking to the simplest solution seems better actually. Does this logic make sense to everyone else? Assuming it does I can update the draft again as such.

I never had a strong opinion on that point. I agree that most applications will probably not care about the difference and will define their own custom zero level. But the ellipsoid (WGS84) is not necessarily simpler in the sense of being more carefree or less computationally expensive. It depends on the source of the data and on which other altitude sources the data is linked to or fused with.

Perhaps a recommendation to use MSL according to the EGM96 geoid as a reference whenever applicable, and the requirement to clearly specify the reference (like in sensor_msgs/NavSatFix) whenever a node publishes absolute altitude, is enough for REP-105? Subscribers who care about the difference should provide a parameter to specify whether the altitude input is relative to the geoid or the ellipsoid and do the conversion themselves (or by using a library like GDAL or GeographicLib). Documentation and raising awareness are probably more important here than full interoperability and enforcing a potentially costly and complex conversion on the publisher’s side. A future extended version of the NavSatFix message could contain the geoid height as an optional field, following the pattern of the NMEA GGA message, which simplifies the conversion on the subscriber’s side if the source (GNSS receiver) provides both.
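To show why carrying the geoid height alongside the altitude makes the subscriber’s job easy, here is a sketch that reads both values from a GGA sentence. It relies on the GGA layout in which field 9 is the altitude above mean sea level and field 11 is the geoid separation, so that ellipsoidal height is their sum. The function name and dictionary keys are hypothetical, and checksum validation is left out.

```python
def parse_gga_altitude(sentence):
    """Extract altitude information from an NMEA GGA sentence.

    Returns the altitude above mean sea level, the geoid separation and the
    resulting height above the WGS84 ellipsoid, all in meters.
    """
    fields = sentence.split("*")[0].split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    amsl_m = float(fields[9])         # altitude above mean sea level
    separation_m = float(fields[11])  # geoid height above the ellipsoid
    return {
        "amsl_m": amsl_m,
        "geoid_separation_m": separation_m,
        "ellipsoid_m": amsl_m + separation_m,
    }


example = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
altitude = parse_gga_altitude(example)
```

A subscriber with a parameter selecting geoid or ellipsoid input could then pick `amsl_m` or `ellipsoid_m` directly, without needing its own geoid model.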

We’re kind of deep ending on the map frames. Are there any other topics that people want to raise about the proposals?

I am happy with the current draft of REP-105, which leaves it open whether the map frame itself is directly geo-referenced and/or aligned, or whether there are other intermediate frame(s) between earth and map, so that map would be only indirectly geo-referenced if the transformation is known. Or is “Map Conventions” meant as a subsection specific to the map frame?

In the near future it is probably unrealistic and too disruptive to adapt common mapping/localization packages like GMapping, AMCL or hector_slam and odometry drivers in a way that they publish a map->odom->base_link transform that takes into account the geo-referenced altitude and orientation, even if it might be known from GPS, magnetometers or other sensors. As a consequence, the map frame as it is known today will stay at more or less the robot’s elevation and has an arbitrary but constant orientation, which also avoids the need to update visualization tools to cope with large vertical offsets between the map frame and the robot.

Thanks for the helpful clarifications.

When I described the WGS-84 ellipsoid as “simpler”, I meant having fewer terms in the equations. That seems helpful, especially for converting to and from ECEF (and hence the earth frame).