RFC: REP-2011 Evolving Message Types

Hello everyone!

I’m excited to finally ask folks for comments on a REP that several others and I have been working on for more than a year.
REP-2011 is all about how to handle messages (and services and actions) which change over time.

I’m sure that many of you have encountered the need to change a message in your project from time to time, and it is the same in the core of ROS.
Though uncommon, changing message types can be very disruptive.
The hope is that these new tools will make it easier to handle evolving types over time, both in ROS 2 core itself and in your projects.

This REP proposes some new tools we can build to help:

  • detect when types have changed
  • empower the user to describe how to handle conversions between versions of messages
  • actually do the conversions when necessary

The REP also describes the changes we need to make to ROS 2 in order to support these new tools, including:

  • a way to track the version of messages and warn when they do not match
  • a way to get the description of a type from remote nodes
  • a way to create publishers and subscriptions (and interpret the messages you receive) with only the description of the type at run-time

Some of these features are also useful in other tools, such as ros2 topic and ros2 bag.
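As a rough illustration of the "track the version of messages and warn when they do not match" idea, here is a minimal sketch. It assumes a type can be described as a plain dictionary and that a version can be derived by hashing that description; the dictionary layout, the `example_msgs` type name, and both function names are hypothetical and are not taken from the REP.

```python
import hashlib
import json


def type_version_hash(type_description: dict) -> str:
    """Return a stable hash of a (hypothetical) type description."""
    # Canonical JSON (sorted keys, fixed separators) so that equal
    # descriptions always produce the same hash. List order is preserved,
    # so reordering fields changes the hash.
    canonical = json.dumps(type_description, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def versions_match(local: dict, remote: dict) -> bool:
    """Compare the local description with one obtained from a remote node."""
    return type_version_hash(local) == type_version_hash(remote)


# Two hypothetical versions of the same message type.
temperature_v1 = {
    "name": "example_msgs/msg/Temperature",
    "fields": [{"name": "value", "type": "int32"}],
}
temperature_v2 = {
    "name": "example_msgs/msg/Temperature",
    "fields": [{"name": "value", "type": "int64"}],
}

if not versions_match(temperature_v1, temperature_v2):
    print("warning: type versions differ for", temperature_v1["name"])
```

Comparing a short hash rather than the full description keeps the check cheap; the full description would only need to be fetched from the remote node when the hashes disagree.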

The REP is not completely finished yet, and is not ready for voting on, but the core ideas are developed enough to solicit feedback from the community and for us to start the reference implementation of the various parts.

So please take a look:

And leave any feedback you like, and we’ll get to it as soon as we can.

Also, we have a ROSCon talk accepted for the upcoming event in Kyoto, so watch there for more details.

Thanks to @methylDragon, @tanyouliang, and the folks at Apex.AI who have helped out on the REP!


This is super valuable! One thing that I was immediately interested in is what a ‘generic transfer function’ could realistically support - I don’t think this is documented. Do you have any ideas for what those could look like? Could we end up with a protobuf-like schema evolution approach?

We haven’t expanded on that because we want to work on the reference implementation for it first to see what is possible. We already do something like this with the ros1_bridge, which smooths over subtle differences in naming and types: ros1_bridge/index.rst at master · ros2/ros1_bridge · GitHub

Likely the things falling into the category of “generic transfer function” will include things like:

  • a field’s changed type can be implicitly converted into the target type, e.g. int32 -> int64
  • (maybe) reordering of types, e.g. {foo: int, bar: int} -> {bar: int, foo: int} (not sure this would be safe in all cases)
  • removing a field, i.e. the target type has fewer fields, but they are still a subset of the original type’s fields

That’s just off the top of my head, but presumably, anything we could conceivably do safely and automatically we will.
I think this list falls along the lines of what protobuf (for example) supports out-of-the-box when updating a proto3 file: Language Guide (proto3)  |  Protocol Buffers  |  Google Developers
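To make the three cases listed above concrete, here is a minimal sketch of what a generic transfer function might do. It assumes messages are plain dictionaries and types are mappings from field name to type name; the `SAFE_WIDENINGS` table and the `generic_transfer` function are hypothetical names for illustration only, not part of the REP or of any ROS 2 API.

```python
# Conversions assumed to be lossless for this sketch (hypothetical table).
SAFE_WIDENINGS = {
    ("int8", "int16"),
    ("int16", "int32"),
    ("int32", "int64"),
    ("float32", "float64"),
}


def generic_transfer(message: dict, source_fields: dict, target_fields: dict) -> dict:
    """Convert `message` from the source type to the target type.

    Raises TypeError when the change cannot be handled automatically,
    which is where a user-defined transfer function would be needed.
    """
    result = {}
    # Iterating over the target fields handles reordering and drops any
    # fields that were removed in the target type.
    for name, target_type in target_fields.items():
        if name not in source_fields:
            raise TypeError(f"field '{name}' does not exist in the source type")
        source_type = source_fields[name]
        if source_type != target_type and (source_type, target_type) not in SAFE_WIDENINGS:
            raise TypeError(
                f"field '{name}': cannot implicitly convert {source_type} to {target_type}"
            )
        # Python integers are unbounded, so a widened value is copied as-is.
        result[name] = message[name]
    return result


old_type = {"foo": "int32", "bar": "int32", "debug": "string"}
new_type = {"bar": "int64", "foo": "int32"}  # reordered, widened, one field removed

converted = generic_transfer({"foo": 1, "bar": 2, "debug": "x"}, old_type, new_type)
print(converted)
```

Anything outside these cases (an added field with no default, a narrowing conversion, a renamed field) raises an error in this sketch rather than guessing.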

I’m not familiar with any feature that’s formally called “schema evolution” in protobuf, but maybe you mean the “Updating A Message Type” rules I linked to above?
If not, please send a link, I’d be very interested to read about it.

But either way, the adaptability of types in things like protobuf and thrift would fall into the category of “features of the serialization technology that you can use and that this REP shouldn’t get in the way of”, but I don’t think it’s anything we can rely on, since not all implementations will support them.
I think one of the main strengths of this REP is that it gives a reasonable way to change types over time which doesn’t impose lots of very specific requirements on the rmw implementation, but which also should not prevent you from using things like optional fields, inheritance, etc. with things like XCDR or protobuf if that ever becomes popular.


Echoing @Paul_Bovbel’s reaction, I also believe this is a great addition to ROS 2 and a good REP proposal. I dropped some comments in the PR, but let me also say that I wish we could’ve had this long ago, as it would have saved much trouble in past projects.

What comes to mind particularly are past efforts regarding interoperability within the ROS 2 message-passing infrastructure. We bumped into a significant number of these issues while developing ROS-based modular hardware parts (H-ROS) in the past, and soon realised while speaking to robot part manufacturers that each one was “slightly modifying” ROS type definitions to best capture their sensor/actuator needs or capabilities. This of course led to a lack of interoperability among similar cameras or end-effectors (to mention some).

Our answer at the time was HRIM, which proposed an information model and followed an MDE approach to generate types. Though I’d argue there’s still value in MDE for many use cases (statically generated types give you more control over various aspects including latency, conversion flaws and/or security issues, etc.), it requires (human) coordination. What’s proposed here does not (at least not necessarily, since someone else could provide the transfer functions for you). I believe this is fantastic from a community perspective, and has a higher chance of addressing the interoperability issues than HRIM did in the past.

@mrobinson this is probably of interest for the HIWG. If I recall correctly, in one of the past HIWG meetings someone suggested REP-2007 to deal with interoperability issues. I think this feature is much better suited for that, especially since it can also operate across network interactions (intra- and inter-network, i.e. within the same network or across multiple ones). This is all reasonably new, but the way I understand it, Type Adaptation (REP-2007) is great for saving cycles in intra- and inter-process interactions today, and it’s reasonably lean performance-wise. The way it’s proposed, Evolving Message Types (REP-2011) will be able to address these interoperability issues across all process and networking interactions; however, it feels like it will impose a heavier performance burden due to all the abstractions involved. Can you comment on this @wjwwood? Are there any recommendations you can share now that could be used as part of the discussions in the WG?