
[Nav2] A Comparison of Modern General-Purpose Visual SLAM Approaches

Hi all,

It's your friendly neighborhood navigator here. I wanted to tell you about some work spearheaded by my colleague @amerzlyakov comparing modern VSLAM techniques for service robotics. The paper was accepted to IROS 2021, but you can find our pre-print version on arXiv.

If you’re doing VSLAM work, or are interested in it, please take a look, and we're always happy if you cite it when you find this analysis useful :wink: . We conclude that, of the modern and openly available VSLAM techniques, OpenVSLAM is the best overall option based on general service-robot requirements. If you have hard time-synchronized IMUs, ORB-SLAM3 is also worth evaluating, but even in that situation OpenVSLAM showed additional robustness in the edge cases you’d more commonly find in an industrial setting.

VSLAM still has some way to go. Few studies have looked at long-term deployment to analyze stability under changes over weeks, months, or years. That is the biggest remaining hurdle to removing LIDARs from our robots – or reducing them to safety sensors with significantly reduced range and cost. OpenVSLAM may be able to handle such situations, but that is left to future work.

There are ongoing efforts in the Nav2 working group to develop a functional VSLAM solution for mobile robot navigation with tight integration into the ROS Nav2 stack. This would not replace support for 2D SLAM in Nav2; it would be offered in addition, with equal support and reliability. If this interests you, please reach out. We are working on this as a cross-collaboration between Samsung Research and LP-Research, and we could always use another set of hands or some smart minds interested in making this technology “really work” today.
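To make “tight integration” a bit more concrete: the contract a SLAM system fulfills for Nav2 is essentially to provide the map->odom transform (per REP 105) and a map for the costmaps, just like the existing 2D SLAM packages do. Below is a minimal, hypothetical rclpy sketch of that interface only; node, topic, and frame names are illustrative assumptions, not the actual working-group implementation.

```python
# Hypothetical sketch (not the Nav2 working-group code): a wrapper node that
# exposes a VSLAM backend to Nav2 through the same interface 2D SLAM uses --
# the map->odom transform plus an occupancy grid on /map.
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import TransformStamped
from nav_msgs.msg import OccupancyGrid
from tf2_ros import TransformBroadcaster


class VSlamNav2Bridge(Node):
    def __init__(self):
        super().__init__('vslam_nav2_bridge')  # illustrative node name
        self.tf_broadcaster = TransformBroadcaster(self)
        # A real implementation would publish the reconstructed occupancy grid
        # here so Nav2's costmaps and localization can consume it.
        self.map_pub = self.create_publisher(OccupancyGrid, 'map', 1)
        # Publish the latest estimate periodically; in practice this would be
        # driven by the SLAM backend's own pose-graph update callback.
        self.timer = self.create_timer(0.05, self.publish_estimate)

    def publish_estimate(self):
        t = TransformStamped()
        t.header.stamp = self.get_clock().now().to_msg()
        t.header.frame_id = 'map'
        t.child_frame_id = 'odom'
        # Identity as a placeholder; a VSLAM backend would supply the
        # drift-corrected map->odom transform derived from its optimization.
        t.transform.rotation.w = 1.0
        self.tf_broadcaster.sendTransform(t)


def main():
    rclpy.init()
    rclpy.spin(VSlamNav2Bridge())
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```

The point of the sketch is just that any VSLAM backend slotted in this way stays interchangeable with slam_toolbox and friends from Nav2's perspective.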

Happy SLAMing,

Steve


Thanks for the paper! I’m looking forward to reading it in depth soon! I noticed you were comparing SVO; just last week, I believe, SVO 2.0 was publicly released: GitHub - uzh-rpg/rpg_svo_pro_open. According to the README it now fuses IMU data and has loop closure. I wonder how it would compare now.

Very nice work! If you ever decide to broaden the evaluation to more systems, make sure to include VINS-Fusion. Its predecessor, VINS-Mono, did very well in a benchmark of 7 visual SLAM approaches, and in my personal experience it’s pretty easy to set up, “just works”, and auto-calibrates if you have a less-than-ideal sensor setup (rolling-shutter cameras, out-of-focus/blurry images, imperfect IMU-camera / camera-camera / IMU noise calibration, non-fisheye lenses) where other algorithms fail.

Most of the standard Visual SLAM datasets (EuRoC MAV, KITTI) don’t display any of these problems, so it’s worthwhile to do some comparisons on your own hardware.

If you look at the table, loop closure is one of a few features SVO was missing to be considered as a direct replacement for existing reliable lidar SLAM systems (e.g. pure localization is the big one missing, but not having RGB-D support is also a bit annoying – though much more readily solvable).

If you look in the document, we have a nod to that. It does meet almost every need, but it under-performs another method that was benchmarked. Had that not been corroborated by other studies, we would have included it in our analysis. VINS was not overlooked :wink:

Some attention should be paid to VINS-Fusion, which met all of these requirements - and more - except for support for RGB-D sensors. While it would have been possible to compare this technique with the limited sensor support that it contains, it has already been shown to under-perform ORB-SLAM3 in several domains of interest in this study [2]. Thus, this method is not chosen for comparison.

Hi all! From the announcement above it might look like I did this alone, but in fairness it is worth noting that without Steve’s great contributions to this paper, it would never have had such a thorough and deep analysis, and it would not look as attractive (in my opinion) as it does today. Thanks for the productive collaboration!


Yes, I’ve seen that, but I didn’t want to quote your own paper back to you. I wasn’t criticizing the paper, by the way - VINS-Fusion has already been benchmarked on those datasets, so there’s no sense in repeating the experiment.

All I’m saying is if you’re going to get some hands-on experience with some of those systems for practical use in Nav2, consider giving VINS-Fusion a try. Purely based on my own anecdotal evidence, in a very specific use case that I had, it was the only one that worked, when DSO, ROVIO, ORB-SLAM, and ORB-SLAM2 all failed.


Thanks for the clarification!