In a previous post (Evaluation of robotics data recording file formats) I made the case for a new recording file format designed for robotics that takes learnings from rosbag v1/v2, the sqlite backend for rosbag2, other robotics recording formats such as px4 ulog, and media and big data container formats such as mkv, avro, hdf5, parquet.
After several months of spec discussion via ROS tooling working group meetings and development on GitHub, the beta version of the file format and language implementations in C++, Python, Go, and TypeScript are available at https://github.com/foxglove/mcap!
This is still early in the development cycle, but if you’re curious about file formats and robotics recording I encourage you to take a look. Development of the ROS2 MCAP storage plugin will take place at https://github.com/ros-tooling/rosbag2_storage_mcap and should be ready to test later this month.
Is there a (white)paper (or similar) that discusses the advantages of one log format over the other (e.g. performance, size, robustness, …)? It would be useful to know when to use which format. E.g. one format might perform better for smaller messages (joint states, transformations, …) than for large messages (images, point clouds, …).
How is this format supposed to relate to ROS2 rosbags in the final version? Will these be two entirely separate storage formats with unique tooling or will rosbag eventually become a wrapper around mcap?
@christian in my previous post (Evaluation of robotics data recording file formats) I included my writeup on various container formats and their applicability to robotics recording. It is based on features and requirements rather than performance, though. There is a benchmark suite for the C++ implementation of MCAP that is still in the early phases. The next step there is to add various mixtures of real-world robotics data that mimic different recording requirements.
@msmcconnell the ROS2 recording approach is to provide a common API with pluggable storage (serialization/deserialization) backends. The current default storage plugin is based on SQLite, and work is underway to create a storage plugin based on MCAP. There will be an evaluation period while users can opt into the MCAP backend and evaluate it. If there is enough demand for the MCAP backend over SQLite (that is the hope), it may become the default backend in the future.
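To make the "common API with pluggable storage backends" idea concrete, here is a minimal Python sketch. It is only an illustration: `StorageBackend`, `SqliteBackend`, `AppendOnlyFileBackend`, `open_storage`, and the storage IDs are hypothetical names, not the actual rosbag2 plugin interface (which is C++), and the toy append-only record layout is not the MCAP format.

```python
import sqlite3
import struct
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical common recording API; not the real rosbag2 interface."""

    @abstractmethod
    def write(self, topic: str, timestamp_ns: int, data: bytes) -> None:
        ...

    @abstractmethod
    def close(self) -> None:
        ...


class SqliteBackend(StorageBackend):
    """Stores each message as a row in a SQLite table."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(topic TEXT, timestamp INTEGER, data BLOB)"
        )

    def write(self, topic: str, timestamp_ns: int, data: bytes) -> None:
        self._db.execute(
            "INSERT INTO messages VALUES (?, ?, ?)", (topic, timestamp_ns, data)
        )

    def close(self) -> None:
        self._db.commit()
        self._db.close()


class AppendOnlyFileBackend(StorageBackend):
    """Toy append-only log. The record layout is made up, NOT MCAP."""

    def __init__(self, path: str) -> None:
        self._file = open(path, "ab")

    def write(self, topic: str, timestamp_ns: int, data: bytes) -> None:
        topic_bytes = topic.encode("utf-8")
        # 14-byte header: topic length, timestamp, payload length.
        header = struct.pack("<HQI", len(topic_bytes), timestamp_ns, len(data))
        self._file.write(header + topic_bytes + data)

    def close(self) -> None:
        self._file.close()


# Hypothetical registry: the recorder only sees a storage ID string.
BACKENDS = {"sqlite3": SqliteBackend, "append_log": AppendOnlyFileBackend}


def open_storage(storage_id: str, path: str) -> StorageBackend:
    """Pick a backend by ID; calling code never depends on the concrete class."""
    return BACKENDS[storage_id](path)
```

The recording code only calls `open_storage(...)`, `write(...)`, and `close()`, so switching the default backend would amount to changing the storage ID rather than the callers.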
I saw that document but it only gives an overview of storage formats. Once MCAP has been implemented as a rosbag2 storage plugin, it would be very useful to benchmark the read/write performance, size and robustness against other rosbag2 storage plugins (sqlite, …) and also against the good old ros1 bag format.
Eventually, the default storage format could be selected by the technical committee in the same way the default RMW implementation is chosen per ROS2 release.
We’ve formally written up a public announcement for the MCAP release below – the post also links to the most recent specs and documentation for the file format.
I find the messaging around this somewhat confusing, and honestly my initial reaction was similar to this xkcd comic. The reason is that some of the highlighted deficiencies of the rosbag2 + sqlite backend, such as missing message schemas and the lack of embedded metadata or attachments, could be solved by adding features rather than by creating an entirely new format. On top of that, some of the highlighted weaknesses, e.g. W3C standardization, aren’t addressed by the new format either.
That being said, the two things that stand out to me about MCAP are that it supports seeking/streaming and promises to have better write/append performance (benchmarks required).
Some problems, like the examples you gave (message definitions, attachments), could be solved. But seeking and performance are two things you can’t get from sqlite over a remote connection. We’re in agreement on that front, and may well be saying the same thing.
We’re not opposed to providing benchmarks, but we will say that appending bytes to a file is typically faster than writing to a SQL database.
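Anyone who wants a rough feel for that claim on their own machine could start from a sketch like the one below. It uses only the Python standard library and is not an MCAP or rosbag2 benchmark: the function names, message count, and payload size are arbitrary assumptions, the SQLite inserts run in a single transaction, and nothing is flushed to disk with fsync, so the numbers say little about real recording workloads.

```python
import os
import sqlite3
import tempfile
import time


def time_file_append(path: str, payload: bytes, count: int) -> float:
    """Append `count` copies of `payload` to a plain file; return seconds."""
    start = time.perf_counter()
    with open(path, "ab") as f:
        for _ in range(count):
            f.write(payload)
    return time.perf_counter() - start


def time_sqlite_insert(path: str, payload: bytes, count: int) -> float:
    """Insert `count` rows holding `payload` into SQLite; return seconds."""
    start = time.perf_counter()
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE messages (timestamp INTEGER, data BLOB)")
    for i in range(count):
        db.execute("INSERT INTO messages VALUES (?, ?)", (i, payload))
    db.commit()  # one transaction for all rows
    db.close()
    return time.perf_counter() - start


def run_comparison(count: int = 10_000, size: int = 1024) -> dict:
    """Time both write paths with the same random payload."""
    payload = os.urandom(size)
    with tempfile.TemporaryDirectory() as tmp_dir:
        file_s = time_file_append(os.path.join(tmp_dir, "log.bin"), payload, count)
        sqlite_s = time_sqlite_insert(os.path.join(tmp_dir, "log.db"), payload, count)
    return {"file_append_s": file_s, "sqlite_insert_s": sqlite_s}


if __name__ == "__main__":
    print(run_comparison())
```

Results will vary with disk, filesystem, and message size, which is why a proper benchmark should use realistic mixtures of robotics data.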
We’ll continue refining this beta release over the next few months, so we’d be happy to look further at any other gaps / weaknesses in the spec that people bring up as they start playing around with it.
(And yes, we’ve gotten that xkcd comic a bunch. While there are many serialization formats, there is no real competitor for a robotics data container format apart from ROS bag files, which are limited to the ROS ecosystem and have their own flaws. We’ve encountered countless teams who just end up writing their own format, which boxes them into their own ecosystem and makes it difficult to integrate with other tools. Foxglove Studio and Foxglove Data Platform are just the beginning: PlotJuggler’s Davide Faconti has mentioned that he wants to support MCAP in PlotJuggler as well.)
I will reiterate that MCAP is beta and completely optional. We’re making it available for people who want to test it, or people who are unhappy with SQLite recording and seeking performance. The file format is stable, but the libraries are very new.
MCAP is very similar to ROS 1 bags, with added support for CDR (DDS / ROS 2), plus support for other encodings that are commonly used outside of the ROS ecosystem (protobuf, H.264, etc.).
Once the tooling has stabilized, we will work on a paper to compare and benchmark it against alternatives. Please reach out if you are interested in collaborating. Until then, it’s just here as an option for anyone curious.