Smaller Rosbags with hashed strings

Hi,

As we know, rosbags can be quite big. In many real-world scenarios a considerable amount of space is used by strings, since ROS messages make extensive use of names as identifiers (for example, joint names or frame_id).

LZ4 compression does a very good job in terms of speed vs. size reduction.

I wonder whether raw rosbags could use hashed strings to reduce the serialized size. It could be done quite easily, I think, using a library like this one: https://github.com/foonathan/string_id
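To make the idea concrete, here is a minimal Python sketch, not an actual rosbag patch. It assumes the ROS 1 wire format for strings (a 4-byte little-endian length followed by the bytes) and uses CRC32 as a stand-in hash. The `StringTable` class and both `serialize_*` helpers are hypothetical names made up for illustration. Note that a hash alone cannot be reversed, so the table mapping ids back to strings would have to be stored once somewhere in the bag.

```python
import struct
import zlib


class StringTable:
    """Hypothetical helper: maps each distinct string to a 32-bit hash id."""

    def __init__(self):
        self.by_id = {}

    def encode(self, text):
        sid = zlib.crc32(text.encode("utf-8"))
        known = self.by_id.setdefault(sid, text)
        if known != text:
            # Two different strings with the same hash cannot share an id.
            raise ValueError("hash collision: %r vs %r" % (known, text))
        return sid

    def decode(self, sid):
        return self.by_id[sid]


def serialize_raw(names):
    """ROS 1 style: uint32 length prefix followed by the UTF-8 bytes."""
    out = bytearray()
    for name in names:
        data = name.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return bytes(out)


def serialize_hashed(names, table):
    """Proposed style: a fixed 4-byte id per string, table stored separately."""
    return b"".join(struct.pack("<I", table.encode(name)) for name in names)


joint_names = ["shoulder_pan_joint", "shoulder_lift_joint", "elbow_joint"]
stream = joint_names * 1000  # the same names repeated in every message

table = StringTable()
raw = serialize_raw(stream)
hashed = serialize_hashed(stream, table)
print("raw bytes:   ", len(raw))
print("hashed bytes:", len(hashed))
```

The saving grows with the length of the names and the number of repetitions. The cost is a lookup table that every reader of the bag needs in order to recover the original strings.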

What do you think?

If the community considers this upgrade useful, I can try implementing it myself :smiley: and open a pull request.

Davide

Rosbags already support compression. On the command line, run rosbag compress foo.bag to try it out. The only supported compression algorithm is BZ2, which is really slow. Perhaps there are people who would appreciate having LZ4 as an alternative option?

LZ4 is already supported. It’s actually a requirement for our 3D data because BZ2 is so slow.

I would find it surprising if such a technique had a significant impact on already compressed bags. Have you done any experiments? Or do you need uncompressed bags for some reason? LZ4 seems fast enough.
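One rough way to run such an experiment is sketched below in Python. It is only a starting point under loud assumptions: the data is synthetic and consists of nothing but repeated frame names, CRC32 stands in for the hash, and `bz2` from the standard library stands in for the compressor because LZ4 is not in the standard library. Real bags mix strings with numeric payloads, so the numbers this prints say nothing definitive about real recordings.

```python
import bz2
import struct
import zlib


def serialize_raw(names):
    """ROS 1 style strings: uint32 length prefix followed by the bytes."""
    out = bytearray()
    for name in names:
        data = name.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return bytes(out)


def serialize_hashed(names):
    """Each string replaced by a 4-byte CRC32 id (lookup table not counted)."""
    return b"".join(
        struct.pack("<I", zlib.crc32(name.encode("utf-8"))) for name in names
    )


# Synthetic stream: the same three frame names repeated many times.
names = ["base_link", "odom", "camera_depth_optical_frame"] * 5000

raw = serialize_raw(names)
hashed = serialize_hashed(names)

sizes = {
    "raw": len(raw),
    "hashed": len(hashed),
    "raw + bz2": len(bz2.compress(raw)),
    "hashed + bz2": len(bz2.compress(hashed)),
}

for label, size in sizes.items():
    print("%-14s %8d bytes" % (label, size))
```

Comparing the two compressed rows against the two uncompressed rows shows how much of the saving from hashing survives once a compressor has already removed the repetition.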

Yes, maybe I didn’t explain myself well.

LZ4, as @damonkohler mentioned, is already supported and works pretty well (I personally believe it should be the default, rather than BZ2, which is indeed very slow).

My proposal concerns rosbags that are NOT compressed (i.e. “raw”).

But since LZ4 is already very fast, maybe there is no need to improve uncompressed rosbags.