6D Pose Estimators for unseen objects: seeking suggestion on a useable pipeline for moving robots

azmyin12 · October 11, 2024, 2:57pm

Dear ROS Community,

I hope this message finds you well. I am a VSLAM researcher currently working on a metric-semantic SLAM framework.

I have been following the research on 6D pose estimators for some time now, but I have found that the majority of the frameworks do not allow, or do not give instructions on how, to add new objects from visual data recorded by a moving robot. Please note that there is no access to 2D/3D point cloud lidar.

Can you kindly suggest a pipeline/framework whose network I can train to work on new objects based on RGB-D images?

Thank you for your time.

With best,
Azmyin

Katherine_Scott · October 11, 2024, 5:31pm

Can you clarify what you are asking for here? I don’t understand what you mean by:

Please note that there is no access to 2D/3D point cloud lidar

If you are referencing existing work you need to provide links / references to that work.

My experience with models like this is that you need to collect and label your own dataset, and then train on it. That data collection pipeline varies widely based on the application domain. Lots of researchers use existing datasets and conveniently skip over the messy data collection problem.

azmyin12 · October 18, 2024, 7:00am

Hi,

Sorry for the late reply.

To clarify my statement regarding 2D/3D lidar: my initial approach to this problem was to use a bounding box regressor like the one in this paper by Mousavian et al. The dataset this family of papers used was the KITTI 3D Object Detection dataset. However, at that time, the annotation tools needed to create the ground-truth (GT) bounding box poses required a 3D point cloud, which to my knowledge requires access to a 3D lidar. The UGVs that we have access to do not come with such a sensor.
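As a side note for readers: a point cloud can also be computed from an RGB-D depth image by back-projecting each pixel through the pinhole camera model. Below is a minimal sketch of that idea, assuming a depth image already in meters and known camera intrinsics. The function name `depth_to_points` and the parameters `fx`, `fy`, `cx`, `cy` are placeholders for illustration, and whether a given annotation tool accepts such a cloud is not something this sketch addresses.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth image (meters) into 3D points in the camera frame.

    depth[v][u] is the depth at pixel column u, row v.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Pixels with non-positive or NaN depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if not (z > 0.0):  # also rejects NaN
                continue
            # Pinhole model: u = fx * x / z + cx, v = fy * y / z + cy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

In practice the intrinsics would come from the camera calibration (for example a `sensor_msgs/CameraInfo` message), and a vectorized NumPy version would be preferable for full-resolution images.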

I then attempted to find a neural network pipeline that uses dense pixel-level correspondences to estimate 6D object poses with or without a CAD model. A promising candidate was Park et al.'s LatentFusion. However, beyond the LINEMOD dataset and the MOPED dataset (introduced in the same paper), I could not find a reliable tool or instructions that explain how to add new objects in a way that is compliant with these formats.

I have recently found LabelImg3d, which may solve the problem of annotating a new dataset compliant with the KITTI 3D pose estimation dataset. I will post my experience with this tool in this thread.
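For readers unfamiliar with the target format: each line of a KITTI 3D object label file holds 15 whitespace-separated fields (object type, truncation, occlusion, observation angle, 2D box, 3D dimensions, 3D location in the camera frame, and yaw about the camera Y axis). The following is a minimal parsing sketch. The names `KittiLabel` and `parse_kitti_label_line` are hypothetical and are not part of LabelImg3d or the KITTI devkit.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class KittiLabel:
    type: str                                # e.g. "Car", "Pedestrian"
    truncated: float                         # 0.0 (fully visible) to 1.0
    occluded: int                            # 0..3 occlusion state
    alpha: float                             # observation angle, radians
    bbox: Tuple[float, float, float, float]  # left, top, right, bottom (px)
    dimensions: Tuple[float, float, float]   # height, width, length (m)
    location: Tuple[float, float, float]     # x, y, z in camera frame (m)
    rotation_y: float                        # yaw about camera Y axis, radians


def parse_kitti_label_line(line: str) -> KittiLabel:
    """Parse one ground-truth line of a KITTI 3D object label file."""
    f = line.split()
    if len(f) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(f)}")
    return KittiLabel(
        type=f[0],
        truncated=float(f[1]),
        occluded=int(f[2]),
        alpha=float(f[3]),
        bbox=(float(f[4]), float(f[5]), float(f[6]), float(f[7])),
        dimensions=(float(f[8]), float(f[9]), float(f[10])),
        location=(float(f[11]), float(f[12]), float(f[13])),
        rotation_y=float(f[14]),
    )


label = parse_kitti_label_line(
    "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
    "1.89 0.48 1.20 1.84 1.47 8.41 0.01"
)
print(label.type, label.location, label.rotation_y)
```

Note that this format stores only a single rotation angle, so it describes a 3D box with yaw rather than a full 6D pose.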

Tuebel · October 26, 2024, 5:51am

This topic is still very much in a research stage. You can find a good overview of the most recent and best performing methods in the BOP challenge: https://bop.felk.cvut.cz/home/

azmyin12 · October 31, 2024, 12:29pm

Hi @Tuebel, sorry for the very late reply. Yes, I looked into the BOP website, but as @Katherine_Scott was saying, most of those SOTA methods rely on pre-built datasets. In my case, I need to train the 6D pose estimator for my own environment.

I will write back in this thread if the two tools I found do indeed help me train a 6D pose estimator on a custom dataset.
