Rosbag2 has a metadata object for its bagfiles. By default it is stored as serialized YAML in a metadata.yaml file. This is fine, and fairly easy to serialize/deserialize using yaml-cpp. However, as rosbag2 evolves we would also like to add new fields and update the format or contents of existing fields within this metadata. The actual serialization format is not that important: it could be JSON, XML, or pickled Python objects… whatever.
I have a number of questions I’m having a hard time answering:
What’s a manageable way to maintain parsing of this metadata while still being able to parse older versions? The “if version < 4, then use an empty/default value for the field” logic is becoming unwieldy.
Is it reasonable to change the contents of existing fields? Or is this unacceptable because older versions of rosbag2 won’t be able to read them correctly (e.g. a bag recorded in Humble couldn’t be played by Galactic)?
If unacceptable, could this be made acceptable by backporting parsing logic to live distros? (This can be done without breaking the API, since it’s all in the implementation.) Or should we consider these metadata fields, once used, to be set in stone forever, to make sure all bags can be used to some extent by all versions of rosbag2?
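To make the first question concrete, here is a minimal sketch of one alternative to scattered version checks: a chain of small migration functions, each upgrading the metadata by exactly one version, so the main parser only ever handles the latest layout. It is written in Python on plain dicts for brevity rather than against yaml-cpp. The version numbers, field names, and changes are invented for illustration and do not reflect rosbag2’s actual history.

```python
from typing import Any, Callable, Dict

Metadata = Dict[str, Any]

LATEST_VERSION = 3  # hypothetical latest metadata version


def _v1_to_v2(meta: Metadata) -> Metadata:
    # Hypothetical change: version 2 added a "compression_format" field.
    out = dict(meta)
    out.setdefault("compression_format", "")
    out["version"] = 2
    return out


def _v2_to_v3(meta: Metadata) -> Metadata:
    # Hypothetical change: version 3 turned "duration" from a bare integer
    # into a mapping with an explicit unit.
    out = dict(meta)
    out["duration"] = {"nanoseconds": int(meta["duration"])}
    out["version"] = 3
    return out


# One entry per source version; each function upgrades by exactly one step.
MIGRATIONS: Dict[int, Callable[[Metadata], Metadata]] = {
    1: _v1_to_v2,
    2: _v2_to_v3,
}


def upgrade_to_latest(meta: Metadata) -> Metadata:
    """Return a copy of `meta` upgraded step by step to LATEST_VERSION."""
    version = int(meta.get("version", 1))
    if version > LATEST_VERSION:
        raise ValueError(
            f"metadata version {version} is newer than supported {LATEST_VERSION}"
        )
    current = dict(meta)
    while version < LATEST_VERSION:
        current = MIGRATIONS[version](current)
        version = current["version"]
    return current
```

With this shape, supporting a new version means adding one function and one table entry, and the knowledge about old layouts stays out of the main parsing code.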
I think it is reasonable to expect that a bag recorded with version Y of rosbag2 cannot be played on version X of rosbag2, where Y > X. I would expect my old bags to continue being playable as I upgrade ROS distributions but I wouldn’t expect the reverse.
If message schema evolution is important (as in your use case), a protocol like Google Protocol Buffers (protobuf) is certainly a good choice. It supports forward and backward compatibility, provided some rules are respected when changing the message schema. The disadvantage is that it is not human readable, but it can at least be easily converted to JSON for introspection.
I think the big argument for forwards and backwards compatibility is a strong desire for easily exchanging data. We frequently interoperate with other organizations that are running different versions of ROS 1, and it’s been very convenient that ROS 1’s bag format has been stable for so long that it makes exchanging data pretty painless. I suspect that this would also be a big benefit for the academic community.
That being said, I also recognize that it’s probably going to lead to a much more complicated parser and maintenance overhead. It would be nice if field definitions did not change, so that older versions could always correctly parse the fields they know about, even if they came from a newer version.
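As a rough illustration of what I mean by older versions parsing the fields they know about, here is a minimal “tolerant reader” sketch in Python: missing fields fall back to defaults, and unrecognised fields are set aside instead of causing a failure. The BagMetadata class and its field names are hypothetical, not the real rosbag2 structure, and it only works if the meaning of existing fields never changes.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Tuple


@dataclass
class BagMetadata:
    # Hypothetical fields known to this (older) reader; defaults cover
    # files written before a field existed.
    version: int = 1
    storage_identifier: str = ""
    message_count: int = 0


def parse_tolerant(raw: Dict[str, Any]) -> Tuple[BagMetadata, Dict[str, Any]]:
    """Build BagMetadata from the known keys and return the unknown keys separately."""
    known = {f.name for f in fields(BagMetadata)}
    recognised = {k: v for k, v in raw.items() if k in known}
    unknown = {k: v for k, v in raw.items() if k not in known}
    return BagMetadata(**recognised), unknown
```

A reader could log the unknown keys as a warning and still play the bag using the fields it understands.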
Edit:
Would a reasonable compromise be to guarantee that all currently supported versions of ROS 2 can read each other’s bags? So Foxy and Galactic would be guaranteed to read each other’s bags, but once Foxy eventually goes EOL we wouldn’t worry about maintaining compatibility with it.