Combine 2D ML algorithms and 3D lidar data with Ouster Python SDK

We are continuing to improve our SDK here at Ouster to help more engineers build and test with lidar data faster. To demonstrate its expanding capabilities, we wrote this blog post on how easy it is to run a powerful 2D computer vision algorithm (YOLOv5) on our digital lidar data to build a social distancing app.
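
To give a feel for the workflow, here is a minimal sketch of the idea. It assumes the older `ouster.client` / `ouster.pcap` module layout (newer SDK releases expose these under `ouster.sdk`) and uses placeholder file names; the maintained version lives in the demo repo linked below.

```python
import numpy as np
import torch
from ouster import client, pcap  # newer SDKs: from ouster.sdk import client, pcap

# Placeholder paths -- substitute your own recording and its metadata file.
PCAP_PATH = "sample.pcap"
META_PATH = "sample.json"

with open(META_PATH, "r") as f:
    info = client.SensorInfo(f.read())

source = pcap.Pcap(PCAP_PATH, info)
scan = next(iter(client.Scans(source)))

# Grab the signal (near-IR intensity) channel and destagger it so that all
# rows line up in azimuth.
signal = client.destagger(info, scan.field(client.ChanField.SIGNAL))

# Scale to 8 bits and stack to 3 channels so a stock 2D detector accepts it.
scale = max(float(np.percentile(signal, 99)), 1.0)
img = (255 * np.clip(signal / scale, 0, 1)).astype(np.uint8)
img_rgb = np.dstack([img, img, img])

# Off-the-shelf YOLOv5 from torch.hub; any 2D detector would work the same way.
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
results = model(img_rgb)
results.print()  # detections in pseudoimage pixel coordinates
```

Since the detections come back in pseudoimage pixel coordinates, you can look up the corresponding range pixels to recover 3D positions for the distance check.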

Feel free to try other algorithms on our sample data library. If you have any feedback, drop us a line on our GitHub.

For those of you interested in ROS2 drivers for our sensors, please visit this GitHub page. Thanks!

Updated YOLOv5 Sample code with Google Colab: GitHub - ouster-lidar/ouster-yolov5-demo

I meet a lot of colleagues who are misled by the Ouster pseudoimage output looking like a 2D image, and it always takes some time to explain to them that it is not an image and should not be treated as one, unless only very approximate results are wanted (which, admittedly, is often the case):

  1. On mobile robots, when the robot rotates, the resulting “pseudoimage” does not correspond to anything physical. Each column of the scan contains some data, but if the robot was rotating, the angular spacing between columns varies, making it impossible to map the columns to real-world directions (it is, of course, quite possible with the 3D point cloud data plus IMU). Objects also get distorted, much like with a rolling-shutter camera with a 100 ms exposure time.
  2. Even in stationary applications, the angular resolution of the pseudoimage is limited to 360/1024 (or 360/2048) degrees, which is coarser than what the encoder can actually resolve. Unfortunately, each row of the scan also has an angular offset from the surrounding rows, which the provided destaggering procedure can only partly correct: the offsets are not multiples of 360/2048 degrees, so the limited angular resolution affects the capture angle of some rows much more than others (see the sketch after this list).
  3. Since the usual mental model is 128 lasers stacked directly above one another and pointing in the same direction, people are quite surprised that wall corners and other vertical lines look zig-zaggy in the scan, even after destaggering. This is not specifically a problem of the pseudoimage representation - it affects 3D point clouds as well - but it is much more visible in the pseudoimages. It is simply a consequence of the construction and the horizontal offsets between the lasers: since the offsets are not multiples of 360/2048 degrees, two neighboring lasers will most likely not measure in coinciding horizontal world directions, so one can nearly hit a corner while its neighbor nearly misses it. The effect gets worse the closer the object is. At this year's ICRA, I talked to about three presenters who tried to do lidar-camera calibration on these pseudoimages, and all of them were surprised that straight lines do not actually look straight, even though their calibration required them to.
  4. Standard 2D images used for ML (RGB images) do not have invalid values, but lidar pseudoimages do (as do all other depth images). People usually just ignore them; depending on how many there are, that can work well, but it can also ruin the algorithms or force much more retraining (see the masking sketch after this list).
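
To make points 2 and 4 more concrete, here is a small sketch using the Python SDK (assuming the older `ouster.client` module path and an `info` / `scan` pair obtained as in the YOLOv5 sketch above): it prints the per-beam azimuth offsets that destaggering cannot fully compensate and builds an explicit validity mask from the range channel.

```python
import numpy as np
from ouster import client  # newer SDKs: from ouster.sdk import client

# `info` is a client.SensorInfo and `scan` a client.LidarScan, obtained e.g.
# as in the YOLOv5 sketch above.

# Per-beam azimuth offsets in degrees. They are generally *not* multiples of
# 360/2048, which is why destaggering (a whole-column shift per row) can only
# partly align the rows and a sub-column residual error remains.
print(np.asarray(info.beam_azimuth_angles))

# Destagger the range channel so the rows are approximately aligned in azimuth.
rng = client.destagger(info, scan.field(client.ChanField.RANGE))

# A range of zero means "no return" -- an invalid pixel, not a measurement.
valid = rng > 0
print(f"invalid pixels: {100 * (1 - valid.mean()):.1f}%")

# Make the gaps explicit (NaN) instead of letting zeros pass as real values.
rng_f = rng.astype(np.float32)
rng_f[~valid] = np.nan
```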

This is not a rant at Ouster lidars themselves - they're great! It's maybe just a little rant about presenting pseudoimages as images without mentioning these caveats.

In case anyone needs a visual aid for the staggered laser scan lines, I found an old video I made of an OS1-64 with a NIR camera and an appropriate exposure time / frame rate. The FoV is limited to a front-facing slice, and you can see the staggered layers towards each extreme of the FoV.

Very informative post @peci1.

Adding a visual example: on the stop sign pole you can observe the jagged pattern from the LIDAR.

When performing calibration, if the scene is static (as above), LIDAR-to-camera synchronization matters little. However, on a robot in motion, the LIDAR points need motion compensation.

The above example illustrates this with the camera and LIDAR rotating on a rigid assembly. Since the LIDAR captures at 10 Hz and the camera at 30 Hz (global shutter in this case), there is movement reflected in the LIDAR points relative to the camera image (note the misalignment on the dead-end sign).

Since there are three camera captures for every LIDAR capture, the optimal camera capture can be used: by selecting the camera frame taken while the LIDAR sweep crosses the center of the camera's field of view, motion effects can be reduced. Further reduction can come from visual odometry or an IMU, for objects that are static.
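
For reference, a sketch of that frame selection, assuming the per-column timestamps exposed by the SDK's `LidarScan.timestamp` array and a hypothetical list of camera frame timestamps already on the same clock; `camera_axis_col` is likewise a hypothetical, calibration-dependent constant.

```python
import numpy as np

def pick_camera_frame(scan, camera_axis_col, camera_times_ns):
    """Pick the camera frame captured closest to the moment the LIDAR sweep
    crossed the camera's optical axis.

    scan            -- an ouster client.LidarScan (per-column timestamps in ns)
    camera_axis_col -- pseudoimage column aligned with the camera's optical
                       axis (calibration-dependent, hypothetical here)
    camera_times_ns -- camera frame timestamps on the same clock as the LIDAR
    """
    t_cross = int(scan.timestamp[camera_axis_col])
    camera_times_ns = np.asarray(camera_times_ns, dtype=np.int64)
    return int(np.argmin(np.abs(camera_times_ns - t_cross)))
```

With a 10 Hz LIDAR and a 30 Hz camera, the residual offset is then at most half a camera period (roughly 17 ms); what remains can be handled with the IMU or visual-odometry compensation mentioned above.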

+1 on removing invalid values, which can cause systematic errors during training; this is visible in the above example on the retro-reflective road paint with incorrect returns. Alternatively, remove problematic captures from the dataset entirely, with an understanding of the test hole created by the missing data.
