Proposal - New Computer Vision Message Standards

I am forwarding a comment from a colleague, Jordi Pages:

"
In general the messages included so far look pretty well IMHO.

We could point to our own vision messages in https://github.com/pal-robotics/pal_msgs so you may have some new ideas.

For example, we could check how some specific detections like faces (along with person name/ID, emotions and facial expressions), actions, planar objects or legs (even though the latter is not really vision-based detection but laser-based) can be encoded in their messages or whether some additional fields might be required.

You include in some messages an optional field of type sensor_msgs/Image I suppose for debugging, monitorization of what part of the input image has been processed or for post-processing purposes. Depending on the main purpose you might consider using instead a sensor_msgs/CompressedImage (or both) to not penalizing remote subscribers as the frequency of the topic might reduce for them. The inclusion of an image is not mandatory because if the timestamp of the Header is set equal to the timestamp of the input image a TimeSyncrhonizer can later be used to get the same information using the BoundingBox2D. But I also like to have it here to save the burden of using a TimeSyncrhonizer…

An additional field I found useful in some cases is a geometry_msgs/TransformStamped expressing the pose of the camera in the moment that the processed image was acquired.

Regards
"