(I’m trying to phrase this not as a support request, more of an observation of a general problem.)
I have been using an Nvidia Jetson TX2 on an autonomous vehicle that uses a camera to follow a target. The benefit of the Jetson platform, of course, is that you have access to CUDA for acceleration.
I haven’t settled on whether I will write my own node to actually track the target or use a community-developed package, but either way the solution will probably end up using OpenCV. Already I am using the spinnaker_sdk_camera_driver for the camera, which in turn uses cv_bridge, both of which require OpenCV.
The conundrum is how to actually make use of the GPU compute capability of the Jetson.
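On the Jetson, which does not support OpenCL (as of 2019), OpenCV’s OpenCL-based transparent API (cv::UMat) is not available, so the only option is to use the explicit CUDA API (cv::cuda::GpuMat and the cv::cuda functions), which is an invasive code change.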
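To make "invasive" concrete, here is a minimal sketch of the same grayscale conversion written against the CPU API and against the CUDA API in OpenCV's Python bindings. It assumes an OpenCV build that includes the CUDA modules; the function names (gray_cpu, gray_cuda, cuda_available, to_gray) are made up for illustration, and cv2 is imported inside the functions only so the snippet still loads on a machine without OpenCV.

```python
def gray_cpu(frame_bgr):
    """CPU path: one call on a NumPy image."""
    import cv2
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)


def gray_cuda(frame_bgr):
    """CUDA path: different container type, explicit upload and download."""
    import cv2
    gpu_frame = cv2.cuda_GpuMat()
    gpu_frame.upload(frame_bgr)            # host -> device copy
    gpu_gray = cv2.cuda.cvtColor(gpu_frame, cv2.COLOR_BGR2GRAY)
    return gpu_gray.download()             # device -> host copy


def cuda_available():
    """Return True only if OpenCV is installed and reports a CUDA device."""
    try:
        import cv2
        return cv2.cuda.getCudaEnabledDeviceCount() > 0
    except Exception:
        # No OpenCV, no cv2.cuda module, or a driver problem: treat as "no CUDA".
        return False


def to_gray(frame_bgr):
    """Hypothetical dispatch wrapper: every accelerated step needs one of these."""
    return gray_cuda(frame_bgr) if cuda_available() else gray_cpu(frame_bgr)
```

Even for a one-line operation, the CUDA version changes the data type, adds two copies, and needs a runtime check if the same node is meant to run on machines without a GPU.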
(It is a bit of an exercise even to install OpenCV with CUDA, as Nvidia’s published package does not include the CUDA module that they themselves contributed…?)
The consequence of this is that an accelerated OpenCV-based computer vision pipeline in ROS would basically require maintaining parallel CPU and CUDA implementations of several different nodes. (And the more message passing involved in the pipeline, the less you realize the benefit of acceleration, since each hop between nodes moves the image back through host memory.)
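The message-passing point can be put into rough numbers. The following is only a back-of-the-envelope model: the stage names, the per-stage timings, and the upload/download costs are all invented for illustration, not measurements from a TX2, and it ignores ROS serialization and transport costs entirely.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    cpu_ms: float   # hypothetical time for the CPU implementation
    gpu_ms: float   # hypothetical time for the CUDA implementation


def cpu_pipeline_ms(stages):
    """All stages on the CPU, no transfers needed."""
    return sum(s.cpu_ms for s in stages)


def gpu_separate_nodes_ms(stages, upload_ms, download_ms):
    """One ROS node per stage: each node uploads, computes, and downloads."""
    return sum(upload_ms + s.gpu_ms + download_ms for s in stages)


def gpu_single_node_ms(stages, upload_ms, download_ms):
    """All stages in one node: upload once, keep data on the GPU, download once."""
    return upload_ms + sum(s.gpu_ms for s in stages) + download_ms


# Hypothetical three-stage vision pipeline (all numbers are made up).
stages = [
    Stage("debayer", cpu_ms=4.0, gpu_ms=0.5),
    Stage("rectify", cpu_ms=6.0, gpu_ms=1.0),
    Stage("track", cpu_ms=10.0, gpu_ms=2.0),
]

cpu_total = cpu_pipeline_ms(stages)
split_total = gpu_separate_nodes_ms(stages, upload_ms=1.5, download_ms=1.5)
fused_total = gpu_single_node_ms(stages, upload_ms=1.5, download_ms=1.5)
print(cpu_total, split_total, fused_total)
```

With these made-up numbers the CPU pipeline costs 20.0 ms, the GPU pipeline split across three nodes costs 12.5 ms, and the same GPU work fused into one node costs 6.5 ms, so most of the potential speedup is spent on copies at node boundaries.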
This problem probably generalizes to other types of accelerated computing, but an ML model, say, is less likely to be spread across multiple nodes, so I think it especially impacts vision pipelines.
What can the community do, if anything, to support increasingly popular accelerated computing platforms like the Jetson, which often require different APIs, without creating fragmentation?