I’m writing this topic to introduce my latest work:
depth_yolact_ros, a ROS wrapper for YOLACT that extends the already existing wrapper by using a depth image to generate 3D bounding boxes and point clouds of the detected objects.
Last week, I was working on a task to detect and localize people in 3D, with the requirement that it run in real time (around 10 fps). Knowing about YOLACT, I thought it would be the best fit for my case, and I found an already well-developed ROS wrapper for it.
depth_yolact_ros takes the detection boxes and their associated masks, crops the depth image, extracts the masked pixels, and converts them to a point cloud using the camera_info. It then filters out mislabeled pixels in the mask in two stages: first with k-means clustering, then with a Gaussian model that rejects outliers along the depth axis. Each detected instance is processed on its own thread, and all results are published on a MarkerArray topic and a point cloud topic.
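To give a rough idea of the core of that pipeline, here is a minimal sketch of the back-projection and the Gaussian depth filter. This is not the package's actual code: the function names and the `depth_scale` parameter are my own for illustration, I use plain pinhole intrinsics (`fx`, `fy`, `cx`, `cy` as they would come from camera_info), and I leave out the k-means stage and the ROS publishing.

```python
import numpy as np

def mask_to_points(depth, mask, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project masked depth pixels to 3D points using a pinhole model.

    depth: HxW depth image (e.g. uint16 in millimeters)
    mask:  HxW boolean instance mask from the detector
    depth_scale: factor converting raw depth units to meters (assumed)
    """
    v, u = np.nonzero(mask)                      # pixel coords inside the mask
    z = depth[v, u].astype(np.float64) * depth_scale
    valid = z > 0                                # drop missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx                        # standard pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)           # Nx3 point cloud

def reject_depth_outliers(points, n_sigma=2.0):
    """Keep points whose depth lies within n_sigma of the mean depth,
    i.e. a simple Gaussian model along the z axis."""
    z = points[:, 2]
    mu, sigma = z.mean(), z.std()
    if sigma == 0:
        return points
    return points[np.abs(z - mu) <= n_sigma * sigma]
```

A 2-sigma cut like this mostly removes background pixels that leak into the mask at the object boundary, which is why the package pre-clusters with k-means first: the Gaussian fit is only meaningful once the bulk of the points belong to one object.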
There are a lot of modifications that could make the package much faster; I have listed them in a "what's next?" section in the GitHub repo. If anyone is willing to contribute, please feel free to start, or contact me if you have any questions!
Here are some demo videos:
- I would like to thank Kingdom Technologies for this task.
- If you have seen my last topic here or on LinkedIn about swerve_steering_controller, I haven’t abandoned it. I have managed to improve the quality of the odometry a little, and I will be finishing the rostests and gtests and opening a pull request soon, hopefully!