Since ROSCon 2018 I’ve been thinking about whether it makes sense to include a common interface for visual localization that allows the perception step to be decoupled from state estimation through message abstraction. This would allow us to push the output from localization pipelines (feature tracking, marker tracking, ICP tracking) to a filter in a standardized way.
Here is a tentative set of messages for exchanging registration pulses and correspondences:
```
# Registration.msg
std_msgs/Header header    # Camera frame/id, time at which the image was taken
float64 shutter_delay     # Latency between the shutter fire and the registration timestamp

# Landmark.msg
uint64 feature_id                   # Feature ID
geometry_msgs/Vector3 camera_coord  # Position in the camera frame
geometry_msgs/Vector3 parent_coord  # Position in the parent frame

# Landmarks.msg
std_msgs/Header header    # Camera frame/id, time at which the landmarks were calculated
uint8 type                # Camera type
uint8 TYPE_RGB = 0
uint8 TYPE_DEPTH = 1
Landmark[] landmarks
```
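To make the intended decoupling concrete, here is a minimal Python sketch of a tracker bundling correspondences into a `Landmarks`-style message that any filter could consume. The dataclasses are plain-Python stand-ins for the generated ROS message classes, and `track_frame` is a hypothetical helper, not part of any existing package:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Plain-Python stand-ins for the proposed messages; in a real package these
# classes would be generated from the .msg definitions.

@dataclass
class Landmark:
    feature_id: int                            # feature ID
    camera_coord: Tuple[float, float, float]   # position in the camera frame
    parent_coord: Tuple[float, float, float]   # position in the parent frame

TYPE_RGB, TYPE_DEPTH = 0, 1                    # camera type constants

@dataclass
class Landmarks:
    stamp: float                               # Header: time landmarks were calculated
    frame_id: str                              # Header: camera frame
    type: int                                  # TYPE_RGB or TYPE_DEPTH
    landmarks: List[Landmark] = field(default_factory=list)

def track_frame(stamp, frame_id, observations):
    """Bundle raw (id, camera_coord, parent_coord) tuples into one message."""
    msg = Landmarks(stamp=stamp, frame_id=frame_id, type=TYPE_RGB)
    for fid, cam, parent in observations:
        msg.landmarks.append(Landmark(fid, cam, parent))
    return msg

# The filter side only iterates correspondences; it never sees the tracker:
msg = track_frame(12.5, "camera_optical", [(7, (0.1, 0.0, 1.2), (3.0, 1.0, 0.5))])
for lm in msg.landmarks:
    pass  # feed (lm.camera_coord, lm.parent_coord) pairs to the update step
```

The point of the shape is that the filter update only depends on the message layout, so swapping a marker tracker for an ICP tracker changes nothing downstream.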
Additionally, we could add some general messages to store features and maps of features:
```
# Feature.msg
uint64 feature_id           # Feature identifier
geometry_msgs/Pose pose     # Position / pose of the feature in the world frame (optional)
byte[] descriptor           # Feature descriptor

# FeatureMap.msg
std_msgs/Header header      # Time, frame in which features are described
string keypoint_algorithm   # Keypoint detection algorithm
string descriptor_algorithm # Descriptor algorithm
string identifier
Feature[] features          # Features
```
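As a sketch of how a consumer might use these, here is a small Python example (again with hypothetical dataclass stand-ins for the generated messages) that builds a `FeatureMap` and indexes it by `feature_id` for correspondence lookup:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-ins for the proposed Feature / FeatureMap messages.

@dataclass
class Feature:
    feature_id: int                              # feature identifier
    pose: Optional[Tuple[float, float, float]]   # world-frame position (optional)
    descriptor: bytes                            # feature descriptor

@dataclass
class FeatureMap:
    frame_id: str                                # frame in which features are described
    keypoint_algorithm: str                      # e.g. "FAST"
    descriptor_algorithm: str                    # e.g. "ORB"
    identifier: str                              # name of this map
    features: List[Feature] = field(default_factory=list)

def index_by_id(fmap: FeatureMap) -> Dict[int, Feature]:
    """Build a feature_id -> Feature lookup, as a matcher or filter would."""
    return {f.feature_id: f for f in fmap.features}

fmap = FeatureMap("map", "FAST", "ORB", "lab_map_v1",
                  [Feature(1, (0.0, 0.0, 0.0), b"\x01"),
                   Feature(2, None, b"\x02")])
index = index_by_id(fmap)  # observed feature_ids resolve to map features
```

Recording the keypoint and descriptor algorithms in the map itself lets a consumer reject maps whose descriptors it cannot match against.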
I don’t know if anybody else would find this useful, or where these messages should live (perception_msgs? vision_msgs?). Comments warmly welcomed!