First of all, I love rosbags. Recording things mostly works well and is incredibly useful!
Fast forward to the issue (possibly I am just lacking knowledge on how to do this properly). Whenever I analyze rosbags, I seem to end up writing one-off pre-processing scripts that convert the data into jpg / csv files. Sometimes, because of this additional step, I never get around to analyzing the data, and I end up with a graveyard of soon-to-be-analyzed rosbags.
Again, forgive me if I am naive, but why is all data stored in a single entity anyway? Would it also work to have a file per data stream, or would this cause synchronization issues?
The current project I’m working on has 100+ topics. So, sure, you could have a file per topic, but that would become a lot of files very quickly.
That is another issue, yes: let’s say I’m doing playback, or exploring a bag file with static code. I may want to look at two topics at the same point in time (a ‘bounding boxes’ topic and a ‘labels’ topic, for example). But there’s another synchronization issue you may not be considering, which is that things in ROS don’t happen at the same time. The CSV (or dataframe) approach doesn’t really capture the fact that messages are published on time scales that are totally independent of each other. There’s no guarantee that any two topics will publish at the same time or at the same rate (and in fact, they almost certainly won’t). So instead, we have a file format that simply captures each message event as it happens.
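To make the point about independent time scales concrete, here is a minimal sketch in plain Python. It is not the rosbag API: the topic names, rates, and the `latest_at` helper are made up for illustration, and the "bag" is just a toy time-ordered list of message events.

```python
from bisect import bisect_right

# Toy "bag": one time-ordered list of (timestamp_s, topic, payload) events.
# Topic names, rates, and payloads are hypothetical.
boxes = [(round(0.10 * i, 2), "/bounding_boxes", f"boxes_{i}") for i in range(10)]  # 10 Hz
labels = [(round(0.25 * i + 0.03, 2), "/labels", f"labels_{i}") for i in range(4)]  # 4 Hz, offset
bag = sorted(boxes + labels)  # interleaved by timestamp, like a recording


def latest_at(events, topic, t):
    """Return the last (timestamp, payload) on `topic` at or before time `t`, or None."""
    stamps = [ts for ts, name, _ in events if name == topic]
    payloads = [p for _, name, p in events if name == topic]
    idx = bisect_right(stamps, t) - 1
    if idx < 0:
        return None  # nothing has been published on this topic yet
    return stamps[idx], payloads[idx]


# "Two topics at the same point in time" really means: for each topic, the most
# recent message at that time. The stamps themselves rarely line up.
t = 0.5
print(latest_at(bag, "/bounding_boxes", t))
print(latest_at(bag, "/labels", t))
```

At `t = 0.5` the two lookups return messages stamped at different times, which is exactly the information a row-per-timestamp CSV tends to hide.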
As a final thought: it’s certainly valid to record rosbags for the sake of converting and later analysis, but if you’re doing this it’s possible you’re not making full use of the tools available to you. For example, I really like using Foxglove to explore rosbag files, since it lets you scrub through time, play and view whatever sections you’re interested in seeing, etc. I’ll also make use of utility nodes to handle a lot of the processing in real time: for example, if I know I’m recording a rosbag so I can compare two topics, I’ll just write a quick Python ROS node that does that and publishes the result. That way I can capture that result with the context of the data that produced it, or I can even just write it to a CSV directly. And (as you’ve found) the Python rosbag bindings are pretty handy, too.
grepros (see its ROS Wiki page) can export ROS bag data in various formats, including CSV and SQLite, and output can be filtered in many ways, including message time and contents, and detailed conditions like “read topic A only while topic B has value X”.
It may be relevant to your use case. It can be used via the command line or via its Python API.
Wow! Thank you for all the responses. I was definitely unaware of the post-processing capabilities of PlotJuggler and the playback functionality of Foxglove.
Thank you for shining some light on this fact, it does make a lot of sense.
Maybe for clarification. I am mostly using rosbags to record data in frankly unstructured environments. Couple of robots / cameras / sensors + multiple teams working on different tasks + little time. The goal is not debugging, it is capturing, analysis and publication. Loads of post-processing. ROS makes the capturing happen, I can treat everyone’s stuff as black-box and rely on a clean interface, it is just the analysis I am struggling with.
I am well aware that I am blind to the complexity, but it would be awesome to have some pandas-like API, where one could vaguely synchronize streams, delete / modify, search etc. I just don’t see how this could be done on a rosbag. It gets to the point where I write utility nodes, replay, and save as something else. It just feels odd. Like why would I have to go through the DDS layer to achieve that?
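For what it's worth, once the data is in per-topic tables, pandas itself can do the "vaguely synchronize" part. The following is only a sketch under assumptions: the two DataFrames, their column names, and the 50 ms tolerance are invented for illustration, and getting from a bag to such tables is a separate step that is not shown here.

```python
import pandas as pd

# Hypothetical per-topic tables, e.g. produced by an earlier export step.
# Timestamps are in seconds; names and values are illustrative only.
camera = pd.DataFrame({"t": [0.00, 0.10, 0.20, 0.30, 0.40], "frame_id": [0, 1, 2, 3, 4]})
imu = pd.DataFrame({"t": [0.02, 0.07, 0.12, 0.31, 0.90], "gyro_z": [0.1, 0.2, 0.3, 0.4, 0.5]})
imu["t_imu"] = imu["t"]  # keep the original IMU stamp visible after the merge

# For each camera frame, take the nearest IMU sample within 50 ms.
# Frames without a close enough sample get NaN instead of a misleading match.
synced = pd.merge_asof(camera, imu, on="t", direction="nearest", tolerance=0.05)
print(synced)

# Ordinary pandas operations then cover the search / delete part.
matched = synced.dropna(subset=["gyro_z"])
print(matched[matched["gyro_z"] > 0.2])
```

Both tables have to be sorted by the key column for `merge_asof` to work, which is naturally the case for recorded timestamps.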
It seems one has to do the post-processing during capture. Again, my fault. Sometimes I just don’t know everything in advance. E.g. I was relying on a static camera, but then the camera started drifting. So now I have to correct for this additional transform, yada yada yada. Maybe complex things are just complex.
So rosbag is a storage API specific to ROS. It can record in different formats internally, at least SQLite and MCAP in ROS 2. MCAP is just a binary file that contains messages, but you can store ROS 1, ROS 2, Protobuf and JSON messages in it. So MCAP is a bit more general than rosbag. However, if you want to use ROS-specific features such as replaying a bag in ROS, you need to use rosbag.
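As a rough illustration of the SQLite case: such a bag is an ordinary SQLite database, so its metadata can be inspected with Python's standard `sqlite3` module, without a running ROS graph. The sketch below assumes a `topics` table and a `messages` table with the columns shown, which is my understanding of what the ROS 2 SQLite storage writes; check your own file's schema before relying on it. The `topic_summary` helper and the sample rows are hypothetical, and the message payloads remain serialized blobs that need the message definitions to decode (not shown).

```python
import sqlite3


def topic_summary(conn):
    """Return {topic_name: (message_count, first_stamp_ns, last_stamp_ns)}."""
    rows = conn.execute(
        """
        SELECT topics.name, COUNT(messages.id),
               MIN(messages.timestamp), MAX(messages.timestamp)
        FROM topics LEFT JOIN messages ON messages.topic_id = topics.id
        GROUP BY topics.id
        ORDER BY topics.name
        """
    ).fetchall()
    return {name: (count, first, last) for name, count, first, last in rows}


# Stand-in for a real bag: an in-memory database with the assumed layout.
# For a real recording, connect to the bag's .db3 file instead.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topics(id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE messages(id INTEGER PRIMARY KEY, topic_id INTEGER,
                          timestamp INTEGER, data BLOB);
    """
)
conn.executemany(
    "INSERT INTO topics VALUES (?, ?, ?)",
    [(1, "/camera/image", "sensor_msgs/msg/Image"), (2, "/imu", "sensor_msgs/msg/Imu")],
)
conn.executemany(
    "INSERT INTO messages(topic_id, timestamp, data) VALUES (?, ?, ?)",
    [(1, 100, b"\x00"), (2, 110, b"\x00"), (2, 120, b"\x00"), (1, 200, b"\x00")],
)
print(topic_summary(conn))
```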