ROS 2 Humble includes new features for hardware acceleration, developed jointly by Open Robotics and NVIDIA. Type adaptation (REP-2007) and type negotiation (REP-2009) were completed and implemented in Humble.
Type adaptation (REP-2007) is a hardware-agnostic feature that allows ROS topics to be adapted to a format better suited for hardware acceleration. Type adaptation is native to ROS, compatible with existing nodes, and open to all types of hardware accelerators, including GPUs, DSPs, NN accelerators, and other hardware blocks.
(Example graph of nodes using hardware acceleration in Foxy (top graph) compared with the use of type adaptation in Humble (bottom graph). Type adaptation reduces copies from CPU to GPU in a pipeline of nodes, while increasing concurrency between the CPU and GPU.)
A node using an adapted type can publish and/or receive the adapted type. Nodes using an adapted type need to provide functions to convert from the standard type to the adapted type, and vice versa. This enables a graph of nodes to use an adapted type, which can improve concurrency between the CPU and the hardware accelerator, offload compute tasks from the CPU, and eliminate memory copies between the CPU and the hardware accelerator.
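To make the pair of conversion functions concrete, here is a minimal sketch in plain C++. It mirrors the shape of an `rclcpp::TypeAdapter` specialization (`custom_type`, `ros_message_type`, `convert_to_ros_message`, `convert_to_custom`) but deliberately does not depend on rclcpp or CUDA. `StandardImage`, `AdaptedImage`, `ImageAdapter`, and `round_trip` are hypothetical names invented for illustration, and a host vector stands in for what would be a device buffer on a real accelerator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a standard message type (for example, an image message).
struct StandardImage {
  std::uint32_t width = 0;
  std::uint32_t height = 0;
  std::vector<std::uint8_t> data;  // row-major, one byte per pixel
};

// Hypothetical adapted type. A real one would own accelerator memory;
// a host vector stands in here so the sketch builds without CUDA.
struct AdaptedImage {
  std::uint32_t width = 0;
  std::uint32_t height = 0;
  std::vector<std::uint8_t> device_buffer;
};

// Mirrors the layout of an rclcpp::TypeAdapter specialization (assumption:
// this struct is illustrative only and is not registered with rclcpp).
struct ImageAdapter {
  using custom_type = AdaptedImage;
  using ros_message_type = StandardImage;

  // Adapted type -> standard type (would be a device-to-host copy).
  static void convert_to_ros_message(const custom_type & source, ros_message_type & destination) {
    destination.width = source.width;
    destination.height = source.height;
    destination.data = source.device_buffer;
  }

  // Standard type -> adapted type (would be a host-to-device copy).
  static void convert_to_custom(const ros_message_type & source, custom_type & destination) {
    assert(source.data.size() == static_cast<std::size_t>(source.width) * source.height);
    destination.width = source.width;
    destination.height = source.height;
    destination.device_buffer = source.data;
  }
};

// Converts to the adapted type and back, as happens when an adapted
// publisher talks to a subscriber that only knows the standard type.
inline StandardImage round_trip(const StandardImage & input) {
  AdaptedImage adapted;
  ImageAdapter::convert_to_custom(input, adapted);
  StandardImage output;
  ImageAdapter::convert_to_ros_message(adapted, output);
  return output;
}
```

The point of the design is that these conversions only run at the boundary with nodes that use the standard type. Between two nodes that both use the adapted type, the data can stay in the accelerator-friendly format.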
We are open sourcing type adaptation examples.
The Juliaset example provides a visually interesting complex number crunching function which iterates on an image.
(sample animation of Juliaset iterative function)
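For readers new to Julia sets, the sketch below shows the classic escape-time iteration z -> z*z + c on the CPU in plain C++. This is the textbook formulation, not necessarily the exact function the Juliaset example runs; `julia_escape_count`, `render_julia`, and the [-2, 2] x [-2, 2] viewing window are assumptions made for illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts iterations of z -> z*z + c until |z| exceeds 2, capped at max_iterations.
// std::norm returns |z| squared, so the comparison is against 4.
inline std::uint32_t julia_escape_count(
  std::complex<double> z, std::complex<double> c, std::uint32_t max_iterations)
{
  std::uint32_t n = 0;
  while (n < max_iterations && std::norm(z) <= 4.0) {
    z = z * z + c;
    ++n;
  }
  return n;
}

// Evaluates the escape count for every pixel of a width x height grid
// that spans [-2, 2] on both the real and imaginary axes.
inline std::vector<std::uint32_t> render_julia(
  std::size_t width, std::size_t height, std::complex<double> c, std::uint32_t max_iterations)
{
  assert(width > 1 && height > 1);
  std::vector<std::uint32_t> image(width * height);
  for (std::size_t y = 0; y < height; ++y) {
    for (std::size_t x = 0; x < width; ++x) {
      const double re = -2.0 + 4.0 * static_cast<double>(x) / static_cast<double>(width - 1);
      const double im = -2.0 + 4.0 * static_cast<double>(y) / static_cast<double>(height - 1);
      image[y * width + x] = julia_escape_count({re, im}, c, max_iterations);
    }
  }
  return image;
}
```

Every pixel is computed independently of its neighbors, which is why this kind of workload maps well onto a GPU or other parallel accelerator.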
The simple_increment example takes an image and adds 1 to each pixel. It was used during the development of type adaptation to optimize software overhead in rclcpp.
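As a CPU reference for what such a workload computes, here is a minimal sketch in plain C++. `increment_pixels` is a hypothetical name, the 8-bit pixel format is an assumption, and so is the wrap-around from 255 to 0, since the overflow behavior of the example is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adds 1 to each 8-bit pixel in place.
// Assumption: values wrap around, so 255 becomes 0.
inline void increment_pixels(std::uint8_t * pixels, std::size_t count)
{
  assert(pixels != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    pixels[i] = static_cast<std::uint8_t>(pixels[i] + 1);
  }
}
```

Because the computation itself is trivial, almost all of the measured time in a pipeline built on it comes from message handling and memory copies, which makes it a useful probe for framework overhead.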
These type adaptation examples can be ported from CUDA to other hardware accelerators as a starting point for developing type adaptation on your platform. The examples are instrumented with profiling hooks for Nsight Systems to measure the “before” and “after” of using type adaptation; the hooks can be replaced with those of a profiler of your choice.
Happy type adapting.