Rosbag sharing place

Hello ROS Users,

I have a small idea in mind, a rosbag sharing place.

At work, we often struggle to choose hardware, and we always save .bag files for proof-of-concept development and algorithm demonstrations.
I think it would be pretty cool to be able to grab .bag files recorded on hardware we don’t own, so we can test it, experiment with other test environments, or create easy tutorials.

But this idea can only work if people share their bags :slight_smile:

I would like to know your thoughts:

  1. Is a rosbag sharing place a good idea?
  2. Would you download or share .bag files?
  3. Do you see any problems with sharing .bag files?
  4. Do you know a good way to upload and share .bag files (since they can be very large)?

This place could be the ROS wiki, a GitHub org, etc., as long as the community can be involved.

Thanks for your time,
Best regards,

This talk from ROSCon 2016 about BagBunker might interest you:


The software has apparently moved to here:

That’s correct, thanks Geoff. Let me know if you run into problems running it. It does exactly what you need and also lets you conveniently visualize the content and search for bags.

Thanks a lot for the link, I totally missed this talk,

This software is great, it does the job I was looking for (and more).
I will begin with this.

But I’m still open to other solutions if somebody has any.

Best regards,

Marv is great, but it doesn’t address your suggestion of having some infrastructure that allows ‘everyone’ to share bags - unless someone sets up a public instance with enough storage somewhere.

I’d definitely be interested in that, but that would need (quite) some coordination I believe.

github for data.

recently picked up using the Open Science Framework (OSF) for group-internal logfile sharing (nothing fancy, small datasets). that should work as a public bag dump as is, even for large amounts of data.

maybe osf could be talked into running marv within their site if the repository contains .bag files, just like a wiki comes with every repo on osf or github.

osf site: https://osf.io/
example repo: https://osf.io/re5wx/

I also believe that MARV could be a great interface to a public bag exchange. It is under active development, so there are more features to come that are not in the community edition (yet). We used their enterprise edition at my previous job, and I believe the developers might be open to supporting publicly hosted instances that make the data available to everyone.

For completeness, there is also https://github.com/swri-robotics/bag-database, which is also freely available and has a similar goal to MARV, but with (I believe) a subset of features.

Also, a cloud bag storage project, BotBags, was announced a while back, but apart from that announcement there has not been much news yet (Announcing BotBags, the cloud rosbag storage service).

As for a central place to share bags, we would need to ensure funding for the storage and bandwidth. It seems like OSF could be the right place for that (this is the first time I have heard of it); however, we would need to talk to them. I guess it would only make sense if MARV were running on their servers as well. While in principle it could be possible to run it somewhere else and access the data via their API, that would probably be way too slow.

Also, from an initial look at their offerings, it seems that the storage is not restricted currently, but the maximum file size is (“Individual files must be 5GB or less to be uploaded to OSF Storage”), which might be a problem (although splitting bags is of course possible and supported by MARV).
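
To make the splitting idea concrete, here is a rough sketch of how one could plan the split of an existing recording so that every part stays under the 5GB limit quoted above. It is plain Python and does not touch the rosbag API: `plan_chunks`, the `overhead` safety margin, and the list of per-message sizes are all assumptions made for illustration, and actually writing the parts would still be done with the usual rosbag tools.

```python
from typing import Iterable, List

OSF_MAX_FILE_BYTES = 5 * 1024**3  # the 5GB per-file limit quoted above


def plan_chunks(message_sizes: Iterable[int],
                max_bytes: int = OSF_MAX_FILE_BYTES,
                overhead: float = 0.05) -> List[List[int]]:
    """Group message indices into consecutive chunks that fit under max_bytes.

    `overhead` reserves a fraction of each chunk for container overhead.
    It is an assumed safety margin, not a measured value.
    """
    budget = int(max_bytes * (1.0 - overhead))
    chunks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(message_sizes):
        if size > budget:
            raise ValueError(f"message {index} ({size} bytes) exceeds the chunk budget")
        # Close the current chunk before it would overflow the budget.
        if current and used + size > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        chunks.append(current)
    return chunks
```

Messages stay in recording order, so each chunk is a contiguous time slice and the parts can be treated as one split recording again later.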

If we figure out where and how to host it, I’m willing to help set up and maintain MARV for that use case.

Maybe we could also start by hosting a public instance at some university or OSRF with limited storage to get a feel for what would be required. But if you want to get serious, that is a huge task (backups, availability, scaling storage and bandwidth, funding, …).

I would love to have some site to share ROS bags on. At RoboCup@Home, I collect bag files from all teams in all challenges.

There is some interesting data in there but it’s a pity to only WeTransfer that data to the teams.

The OSF seems like the correct option for bags that are related to research results and papers. If they could be convinced to set up an instance of MARV, that would be awesome.

Nice idea with OSF, it seems like the right place for .bag files. They also have an API for connecting external applications, and they can even redirect the project main page to an external website: http://help.osf.io/m/addons/l/524148-connect-add-ons#External-links

GitHub is not the right place for storing large datasets; GitHub recommends keeping each repo under 1GB: https://help.github.com/articles/working-with-large-files/

Would it make sense to start out with links to files hosted in various cloud storage places (google drive, dropbox or any others that aren’t going to charge by the download)? The links could be on a page within the existing ros wiki. answers.ros.org could have requests for bags from particular sensors or robots, or conversations here on discourse could drive new uploads.

If an organization with money later wants to make sure those links don’t go dead then all the files could be moved to a centralized location (maybe still google drive or dropbox, but all in one account). Licensing of the files would ideally already be in place so they could be freely copied without issue.

I don’t know if that grows into MARV integration very well though, which sounds like it would be a substantial increase in IT maintenance and hosting fees since it wouldn’t leverage the free/low-cost but more constrained hosting services.

It would be great if every visual sensor could also have a link to a video (e.g. on youtube, though again issues with links going dead later if someone takes their youtube page down) that shows off the entirety of what is in the bag so it can be quickly previewed without downloading gigabytes and incurring bandwidth costs.

Hi,

I wanted to let you know that I contacted BotBags; no answer yet (but it has only been 3 days, so no problem).

I also contacted OSF. It’s OK to upload bags as long as we don’t upload them all at once (“If uploading to OSF Storage, all we ask is that you upload your rosbag gradually and not all at once. If too much data is uploaded in a short amount of time, it will cause us problems.”).
I opened a test project: https://osf.io/7jp2y/ (I will remove it later, so don’t store important bags there).

I set up a small MARV instance on a private server, marv.cocyb.org, where I uploaded some bags for testing (most have since been removed; only an SR300 bag for find_object_2d is available).

There are a lot of problems with MARV at the moment:

  • We can’t remove bags (will be fixed soon)
  • We can’t upload bags directly from the website
  • We need to limit the amount of data displayed (the website displays every image from the image topic, which takes too much space)
  • Tags don’t seem to work
  • We need categories for bags (Sensor/Robot/Camera/?)
  • We need a way to make MARV point to remote storage (OSF/Drive/Dropbox/?)
  • We need to display more data from the bag (video/map/tf/?)

It may look bad but it’s not a big deal. :slight_smile:

OSF has a cool API for uploading, so a MARV server could act as a proxy: take a bag, analyze it, upload it to OSF, and keep only the download link. All the data would live on OSF, the server would keep only the metadata, and the server would throttle uploads to OSF to avoid sending too much data at once.
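
A rough sketch of the throttling part of that proxy idea, under stated assumptions: `UploadThrottle` and `proxy_upload` are hypothetical names, and `upload` stands in for whatever call actually pushes a bag to OSF and returns its download link (no real OSF API call is shown here). The throttle only spaces uploads out so the average rate stays below a configured number of bytes per second.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class UploadThrottle:
    """Paces uploads so the average rate stays at or below bytes_per_second."""

    def __init__(self, bytes_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if bytes_per_second <= 0:
            raise ValueError("bytes_per_second must be positive")
        self.rate = bytes_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def wait_for(self, size_bytes: int) -> float:
        """Block until an upload of size_bytes may start; return seconds waited."""
        now = self.clock()
        delay = max(0.0, self.next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        # Reserve the time slot this upload "costs" at the configured rate.
        self.next_allowed = max(now, self.next_allowed) + size_bytes / self.rate
        return delay


def proxy_upload(bags: Iterable[Tuple[str, int]],
                 upload: Callable[[str], str],
                 throttle: UploadThrottle) -> Dict[str, str]:
    """Upload each (name, size) pair and keep only the name -> link mapping."""
    links: Dict[str, str] = {}
    for name, size in bags:
        throttle.wait_for(size)
        # `upload` is a hypothetical callable that returns a download link.
        links[name] = upload(name)
    return links
```

The clock and sleep functions are injectable so the pacing can be checked without real waiting; on a real server the defaults would be used.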

I will continue to play a bit with MARV and the OSF API. At some point I will fork MARV to connect it to OSF and fix some bugs, so that everyone interested can participate.

Cool, awesome that you are getting started. Your proposal might actually be a good solution: use OSF for long-term storage, but process the data first on the server where MARV is running. We might need to apply some small patches to make that work with MARV, but it should not be a big deal.

Most of the issues with MARV are known problems of the current beta community edition and are already addressed in more recent development versions. The cool thing about MARV is that it is quite flexible, i.e. if we need a new property such as “category”, it can easily be added (though I’m not sure whether that would be better covered by tags…). Things like video are already there, along with some more visualizations such as a trajectory player, which is also embedded in a map viewer for things like NavSatFix.

Rather than forking, we should work with Ternaris (the MARV developers) to make sure we don’t duplicate efforts and can directly benefit from future updates, so don’t put too much effort into that just yet. I will ask them how far they want to support this idea.

Well that is exactly the kind of thing that MARV is designed to do for you.

Just a heads up, BotBags is still alive and under development. Look out for more news here very soon!

So I talked to the guys from Ternaris and they are very positive towards this community effort and are willing to support it, at the very least with a free license for the EE features for a community hosted service.

After a couple of beta iterations they are also converging towards a stable version, which we should use as the basis (instead of the current public beta from last year).

We are also currently looking into possible solutions for hosting.

More on all fronts shortly.

Hi,

and thank you very much for your thoughts on this!

We are happy to announce that https://marvhub.com is online and we are
ready to accept datasets for publication. marvhub.com is and will remain
free of charge for public datasets.

For the time being we’ll use a manual workflow and are relying on
external hosting of bag files.

To publish your datasets at marvhub, please follow these steps:

  1. Put all files of your dataset online, e.g. at OSF
  2. Create a yaml file to describe your dataset, use template
  3. Put that yaml file next to your dataset files
  4. Post a link to your dataset here

We are starting with bag files. Do you have other file formats that
you would like to publish at marvhub.com?

best regards
Florian

I’m getting a login page when I open marvhub. Is that intended?

yes, once we have the first dataset published, the login will be removed and all public data will then be visible without login.

Just for even more completeness,

ros.org generously hosts large files. As documented, it’s primarily for files used in testing, so .bag is a perfect example. It has none of the fancy features discussed in this thread (don’t get me wrong, I’m giving those a thumbs up), but I assume it’s safe to expect it to last as long as ros.org does.

While shared bag files are useful for people working on projects (i.e. temporarily), another important use case is testing, such as continuous integration, which can go on indefinitely. So a sustainable hosting server, or features that enable server portability (e.g. persistent URLs, as @lucasw mentioned), would be very much appreciated IMO.
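
For the CI use case, a minimal sketch of what a test helper could look like, assuming the bag sits behind some stable URL and its SHA-256 checksum is recorded next to the test. `fetch_bag`, `sha256_of`, and the cache layout are hypothetical names chosen for illustration; only the Python standard library is used. Pinning the checksum means the hosting location can change without the tests silently running on different data.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def fetch_bag(url: str, expected_sha256: str, cache_dir: Path) -> Path:
    """Download a bag once, verify its checksum, and reuse the cached copy."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{expected_sha256}.bag"
    if target.exists() and sha256_of(target) == expected_sha256:
        return target  # cache hit: no network access needed
    partial = target.with_suffix(".part")
    with urllib.request.urlopen(url) as response, partial.open("wb") as out:
        shutil.copyfileobj(response, out)
    if sha256_of(partial) != expected_sha256:
        partial.unlink()
        raise ValueError(f"checksum mismatch for {url}")
    partial.replace(target)
    return target
```

A CI job would call `fetch_bag` with the persistent URL and the recorded checksum, then hand the returned path to the test that plays the bag.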

Awesome, thanks Flo!

I put together a corresponding yaml for one of our datasets that also has bag files: http://vmcremers8.informatik.tu-muenchen.de/lsd/OmniDataset/marvhub.yaml. It’s not super exciting, since the bags contain only one image topic each, but it’s a start.

Immediate observations:

  • For this dataset each sequence has 2 bag files (one with the original images, one with the rectified ones). For MARV it would probably make sense to combine them, but it could also just be our task to provide a bag with both. The question is whether it makes sense for marvhub to allow combining multiple bags into one MARV “dataset”, like it already does for split bags.
  • For each sequence there is additional information, like text files with camera calibration and ground truth, as well as the same image data in different formats. Again, one could argue that we should provide a bag file with both calibration and ground truth as ROS topics, but I guess there will always be data that doesn’t natively fit in a bag file. So the question is whether marvhub should support additional per-bag links / files / metadata.