Image_transport_plugins vs hardware accelerated H.264 de|compression

Data-driven development of perception functions requires capturing sensor data on the robot for offline processing: DNN training, DNN testing, and other CV functions. Sustained recording from high-frame-rate sensors produces data rates that often exceed the write speed of storage, even with a fast interface such as an M.2 NVMe SSD.

Compressing camera data can reduce both the required throughput and the storage footprint at rest by >=10x. Existing implementations use image_transport_plugins to compress the data on the CPU, which limits the number of frames per second that can be compressed.
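To put rough numbers on this, here is a minimal sketch of the arithmetic. The camera setup (four 1080p RGB8 cameras at 30 fps) is an assumption chosen for illustration, not a configuration from this post; only the 10x factor comes from the text above.

```python
def raw_rate_bytes_per_s(width, height, bytes_per_pixel, fps, cameras=1):
    """Uncompressed data rate for a set of identical cameras."""
    return width * height * bytes_per_pixel * fps * cameras

# Assumed setup (hypothetical, not from the post): four 1080p RGB8 cameras at 30 fps.
raw = raw_rate_bytes_per_s(1920, 1080, 3, 30, cameras=4)

# The >=10x reduction quoted above, taken at its lower bound.
compressed = raw / 10

print(f"raw:        {raw / 1e6:.1f} MB/s")         # raw:        746.5 MB/s
print(f"compressed: {compressed / 1e6:.1f} MB/s")  # compressed: 74.6 MB/s
```

Under these assumptions, the raw stream alone needs roughly 750 MB/s of sustained writes, while the compressed stream needs about a tenth of that.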

We aim to offload this compute-heavy task from the CPU and leverage hardware that is already common in high-performance compute platforms.

The design of image_transport_plugins has a limitation: the plugins assume the image to be compressed resides in CPU memory. Before Type Adaptation (REP-2007) was added, this was true. With hardware acceleration for camera processing, however, image data can reside in memory local to the compression hardware, avoiding CPU copies entirely.

To solve this, we provide the isaac_ros_compression package with h264_encoder and h264_decoder nodes, which perform the same function as image_transport_plugins and use type adaptation to improve performance. As a result, camera data can remain in hardware-accelerated memory from capture through compression, and the encoded bitstream is then available on the CPU for writing to disk. Conversely, when decoding a compressed bitstream, the decompressed image resides in hardware-accelerated memory, ready for processing. Both the encoder and decoder nodes are compatible with existing CPU nodes through type adaptation.

| Package | ROS software (CPU), Humble, Jetson Orin | Isaac ROS, Humble, Jetson AGX Orin | Isaac ROS DP2, Humble, Jetson Orin Nano (8GB) | Isaac ROS DP2, Humble, RTX3060TI + core i7 11th gen |
|---|---|---|---|---|
| Image compression, H.264 I-Frame only (1080p) | 20 fps, 49 ms | 170 fps, 17.4 ms | N/A | N/A |
| Image decompression, H.264 I-Frame only (1080p) | 102 fps (core i7), 29 ms | N/A | N/A | 400 fps, 2.3 ms |

Hardware acceleration provides significant improvements in compression and decompression rates with lower latency than what is provided by image_transport_plugins running on the CPU.
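The size of the improvement can be read off the table with a few divisions. The sketch below only uses the figures listed above. Note that the hardware numbers come from different platforms (compression on Jetson AGX Orin, decompression on RTX3060TI + core i7), so the ratios are indicative rather than a like-for-like benchmark.

```python
# Figures copied from the table above (1080p, H.264 I-frame only).
# Each entry is (frames per second, latency in ms).
results = {
    "compression": {"cpu": (20, 49.0), "hw": (170, 17.4)},
    "decompression": {"cpu": (102, 29.0), "hw": (400, 2.3)},
}

def speedups(cpu, hw):
    """Return (throughput gain, latency reduction) of hw over cpu."""
    (cpu_fps, cpu_ms), (hw_fps, hw_ms) = cpu, hw
    return hw_fps / cpu_fps, cpu_ms / hw_ms

for name, r in results.items():
    fps_gain, latency_gain = speedups(r["cpu"], r["hw"])
    print(f"{name}: {fps_gain:.1f}x throughput, {latency_gain:.1f}x lower latency")
```

This works out to about 8.5x throughput and 2.8x lower latency for compression, and about 3.9x throughput and 12.6x lower latency for decompression.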

We would like feedback: is this solution good for developers, or does image_transport_plugins need to be redesigned in ROS 2 to work with type adaptation?

Happy Halloween :jack_o_lantern: :ghost:


I wouldn’t use H.264 to record an ML training dataset. The artifacts in the encoded video are worse in some sense than JPEG artifacts. Did you also consider using the JPEG accelerators (either on CPUs or on GPUs)?

A great comparison is lossless versus lossy: does lossy compression impact perception accuracy when a model is trained on lossy data and run on lossless data?

We’ve collected tens of petabytes of raw camera data worldwide using lossless video compression for DNN development of AV perception functions, where we need close to 24 bits of dynamic range in a single image. We used this data to compare DNNs trained on lossy and lossless video compression of the same datasets, with inference testing on lossless data, and concluded that the impact of high-quality, high-bitrate lossy video compression on our perception results was negligible.

As a result of this work, data campaigns moved to primarily lossy video compression, with occasional lossless capture for ongoing comparison. The net result is that the image quality loss from H.264 does not affect our functional safety DNN perception functions, while we see substantial savings in transfer time and data lake storage costs.

The H.264 compression provided is high-bitrate, high-quality, and I-frame only, which makes it very similar to JPEG.

No, we have not, as far as I’m aware.

Are there studies that show the benefit of JPEG vs H.264 with similar bitrates?

JPEG compression with the right tuning may be better than H.264, but I don’t have evidence of this. H.264 is sufficient for our needs: we use it for data collection when developing our own DNNs for robotics.

CPUs cannot sustain the data rates our sensor configurations need for real-time capture, hence we depend on hardware acceleration.

We are open to providing JPEG hardware acceleration for image compression and decompression if there is demand. We provide H.264 because we depend on it for training our own DNNs with real data, in addition to synthetic data.

Thanks


Ah, okay, now I get it. So it’s basically something like H.264-based HEIF? In that case, I can imagine the effect on ML is similar to JPEG: negligible (also confirmed by our observations). I had in mind the case with interpolated/predicted frames, where I know (again, just vaguely) that the quality can drop rapidly compared to the I-frames.

JPEG HW acceleration would just bring this closer to the easy path people are used to, as JPEG is the default supported image transport in ROS.


We will scope adding JPEG hardware acceleration for compatibility.

The open question remains: image_transport_plugins are not compatible with type adaptation. To provide efficient hardware-accelerated JPEG or H.264 de|compression, we provide a node to do the work in place of a plugin.

Does ROS 2 need to be updated to support type adaptation with image_transport_plugins?

Thanks
