Description
The vision_darknet_detect node that Autoware currently uses for object detection is written
using the Darknet framework, which optionally uses CUDA for acceleration.
Performance is diminished on drive and development platforms without CUDA support enabled,
which limits the node's adoption in real-time applications.
A unified vision_detector package can be written in the ArmNN framework, which accepts pre-trained
models and targets a wide variety of processors and accelerators at compile time. This would allow
the package to take immediate advantage of recent and future advances in compute technology.
Once the proof of concept has been developed and performance metrics are available, there are two
possible integration paths: (i) the CUDA-accelerated vision_darknet_detect and the unified
vision_detector co-exist and remain modular implementations of object detection; or (ii)
vision_detector absorbs the darknet-CUDA backend as a compile-time option. The community can decide
on a suitable course of action at that point.
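As an illustration of path (ii) only, a compile-time option could look roughly like the sketch below. The macro `VISION_DETECTOR_WITH_DARKNET_CUDA`, the `Backend` enum, and both function names are hypothetical; nothing here is taken from existing Autoware or ArmNN code.

```cpp
#include <string>

// Hypothetical set of backends the unified node could be built with.
enum class Backend { ArmNN, DarknetCuda };

// VISION_DETECTOR_WITH_DARKNET_CUDA is an assumed build flag (for example,
// one set by a CMake option). When it is not defined, the node falls back
// to the ArmNN backend.
constexpr Backend default_backend() {
#ifdef VISION_DETECTOR_WITH_DARKNET_CUDA
  return Backend::DarknetCuda;
#else
  return Backend::ArmNN;
#endif
}

// Human-readable name, e.g. for logging at node start-up.
inline std::string backend_name(Backend backend) {
  switch (backend) {
    case Backend::DarknetCuda:
      return "darknet-cuda";
    case Backend::ArmNN:
      return "armnn";
  }
  return "unknown";
}
```

Compiling with the flag defined switches the default backend without touching the rest of the node; path (i) would instead keep the two implementations in separate packages.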
Implementation considerations
An ArmNN implementation of vision_detector would:
- Exhibit suitable characteristics for real-time applications, namely:
  - low processing delay;
  - minimal delay jitter;
  - deterministic message dropping and timeouts.
- Maintain black-box, abstraction, and compatibility guarantees:
  - use a subset (contravariance) of input topics;
  - publish a superset (covariance) of output topics;
  - present the same retry / queueing / timeout semantics.
- Be suitable for upstream adoption in production systems:
  - use a framework written in C++;
  - provide a variety of accelerator backends;
  - support model export and import from a variety of training frameworks.
- Present a drop-in, reference implementation of a vision_detector node.
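To make the "deterministic message dropping and timeouts" requirement above more concrete, here is a minimal sketch of one possible policy: a bounded queue that drops the oldest frame when full and discards frames older than a fixed age on read. The class `BoundedFrameQueue` and its parameters are hypothetical and are not part of Autoware, ROS, or ArmNN; the actual semantics would have to match those of vision_darknet_detect.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical helper with two deterministic drop rules:
//   1. when the queue is full, the oldest frame is dropped to admit the newest;
//   2. on pop, frames older than max_age are discarded instead of processed.
template <typename Frame>
class BoundedFrameQueue {
public:
  using TimePoint = std::chrono::steady_clock::time_point;

  // A capacity of zero is clamped to one so that push() always succeeds.
  BoundedFrameQueue(std::size_t capacity, std::chrono::milliseconds max_age)
  : capacity_(capacity == 0 ? 1 : capacity), max_age_(max_age) {}

  // Enqueues a frame; returns true if an older frame was dropped to make room.
  bool push(Frame frame, TimePoint stamp) {
    bool dropped = false;
    if (queue_.size() >= capacity_) {
      queue_.pop_front();
      ++dropped_count_;
      dropped = true;
    }
    queue_.emplace_back(stamp, std::move(frame));
    return dropped;
  }

  // Writes the oldest frame that is still within max_age at time `now` to
  // `out` and returns true; returns false if every queued frame timed out.
  bool pop(TimePoint now, Frame& out) {
    while (!queue_.empty()) {
      std::pair<TimePoint, Frame> entry = std::move(queue_.front());
      queue_.pop_front();
      if (now - entry.first <= max_age_) {
        out = std::move(entry.second);
        return true;
      }
      ++dropped_count_;  // frame exceeded its deadline
    }
    return false;
  }

  std::size_t dropped_count() const { return dropped_count_; }
  std::size_t size() const { return queue_.size(); }

private:
  std::size_t capacity_;
  std::chrono::milliseconds max_age_;
  std::deque<std::pair<TimePoint, Frame>> queue_;
  std::size_t dropped_count_ = 0;
};
```

Passing timestamps in explicitly, rather than reading the clock inside the queue, keeps the drop behaviour reproducible in tests. Whether dropping the oldest or the newest frame is appropriate is a design choice that this sketch does not settle.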
Alternatives
- Develop separate accelerator nodes targeting each supported platform
  A set of object-detecting nodes could be developed: one for vector-instruction acceleration,
  one for GPU acceleration, one for ML ASIC acceleration, and so on. A unified node that uses a
  high-level framework (e.g. ArmNN) and pluggable backends would reduce maintainer workload.
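The "unified node with pluggable backends" idea can be sketched as a single interface with a preference-ordered list of implementations. This is an illustrative outline under assumed names: `Detection`, `DetectorBackend`, `StubBackend`, and `select_backend` are hypothetical and do not correspond to ArmNN or Autoware types. In ArmNN itself, backend selection is handled by the framework.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; they are not Autoware or ArmNN APIs.
struct Detection {
  int class_id;
  float score;
};

// Common interface that every backend implements, so the node itself
// contains no platform-specific code.
class DetectorBackend {
public:
  virtual ~DetectorBackend() = default;
  virtual std::string name() const = 0;
  virtual bool available() const = 0;
  virtual std::vector<Detection> infer(const std::vector<float>& image) = 0;
};

// Stand-in backend used only to make the sketch runnable; a real backend
// would wrap an inference runtime here.
class StubBackend : public DetectorBackend {
public:
  StubBackend(std::string name, bool available)
  : name_(std::move(name)), available_(available) {}

  std::string name() const override { return name_; }
  bool available() const override { return available_; }

  std::vector<Detection> infer(const std::vector<float>& image) override {
    // Report a single placeholder detection for any non-empty image.
    if (image.empty()) {
      return {};
    }
    return {Detection{0, 1.0f}};
  }

private:
  std::string name_;
  bool available_;
};

using BackendList = std::vector<std::unique_ptr<DetectorBackend>>;

// Returns the first available backend in preference order, or nullptr
// if none can run on this platform.
DetectorBackend* select_backend(const BackendList& preferred) {
  for (const auto& backend : preferred) {
    if (backend && backend->available()) {
      return backend.get();
    }
  }
  return nullptr;
}
```

With one interface, supporting a new accelerator means adding one backend class rather than a whole new node, which is the maintenance saving argued for above.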
Additional Information
Proposed Steps
- Develop a proof-of-concept vision_detector node targeting the ArmNN reference CPU backend.
- Extend target platform support to include:
  a. SIMD extensions: NEON
  b. General-purpose compute: OpenCL
  c. Purpose-built accelerators
- Present findings and discuss the next step regarding the vision_detector architecture:
  a. separate nodes, or
  b. a unified node with build options.
- Test to ensure conflict-free operation when run as part of the Autoware stack.
- Integrate into the upstream development branch for release.
More about ArmNN
ArmNN is a portable C++ framework with pluggable backend support. Targets include CPU (reference),
NEON, OpenCL, and external accelerators (e.g. USB or PCIe solutions).
Disclaimer
I work for Arm, and it is in my interest to develop a proof-of-concept using the ArmNN framework.