
Unified ML Inference in Autoware - A proposal

State of ML in Autoware

| Node | Model | File Format | Inference Engine |
| --- | --- | --- | --- |
| lidar_apollo_cnn_seg_detect | Unknown | caffe | caffe |
| lidar_point_pillars | PointPillars | onnx | TensorRT |
| trafficlight_recognizer/region_tlr_mxnet | Unknown | MXNet | MXNet |
| trafficlight_recognizer/region_tlr_ssd | SSD (unknown variant) | caffe | ssdcaffe |
| vision_darknet_detect | YOLOv2/v3 | darknet | darknet |
| vision_segment_enet_detect | ENet | caffe | caffe for ENet |
| vision_ssd_detect | SSD | caffe | ssdcaffe |

Problem with current approach

  • The nodes use a varied range of model formats and inference frameworks.
    • This presents challenges for deployment:
      • Varying degrees of hardware-acceleration support.
      • Lock-in to one accelerator: it is difficult to port 7 frameworks to a different ML accelerator.
  • Some frameworks are forks or are no longer actively maintained.
    • This presents challenges for long-term support and future updates of these nodes.
  • Some frameworks are proprietary and require special licenses and sign-ups to use.
  • Documentation and trained weights are lacking.
    • This makes it difficult even to compile the nodes.
    • It is also difficult to re-train the models on custom data sets.

Solution Proposal

Unify all ML workload deployment in Autoware with a single workflow and a single ML inference framework. Organize and document all pre-trained models used in Autoware in a Model Zoo.


TVM is a compiler-based inference framework that compiles models into machine code ahead of time. TVM has a number of advantages:

  1. A single framework that supports virtually all model file formats.
  2. Support for a wide range of hardware backends.
  3. Open-source governance, with active contributions from many companies in the Autoware Foundation.
  4. State-of-the-art performance compared with other inference frameworks.
  5. Models are compiled and optimised ahead of time, so runtime requirements are reduced.
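
As an illustration, the ahead-of-time compilation step with TVM's Python API might look roughly like the sketch below. The model path, input name, and shapes are placeholders, and the snippet assumes the `tvm` and `onnx` packages are installed; it is not the actual tooling proposed here.

```python
# Sketch: ahead-of-time compilation of an ONNX model with TVM.
# "model.onnx", the input name, and the shape below are placeholders.
try:
    import onnx
    import tvm
    from tvm import relay
    HAVE_TVM = True
except ImportError:  # tvm/onnx not installed in this environment
    HAVE_TVM = False

def compile_onnx_model(model_path, input_shapes, target="llvm", opt_level=3):
    """Compile an ONNX model into a deployable shared library."""
    if not HAVE_TVM:
        raise RuntimeError("tvm and onnx are required for compilation")
    onnx_model = onnx.load(model_path)
    # Import the model into TVM's Relay IR.
    mod, params = relay.frontend.from_onnx(onnx_model, shape=input_shapes)
    # Optimise and compile ahead of time (opt_level=3 enables aggressive passes).
    with tvm.transform.PassContext(opt_level=opt_level):
        lib = relay.build(mod, target=target, params=params)
    out_path = model_path.replace(".onnx", ".so")
    lib.export_library(out_path)  # machine code, loadable by the TVM runtime
    return out_path

if __name__ == "__main__" and HAVE_TVM:
    print(compile_onnx_model("model.onnx", {"input": (1, 3, 224, 224)}))
```

Because the heavy lifting happens at compile time, the deployed node only needs the lightweight TVM runtime, which is what advantage 5 above refers to.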

Autoware Model Zoo

The Model Zoo proposal was presented to the TSC. Here is a summary:

  • A place for organizing, documenting and sharing neural networks that are used in Autoware
  • Organize neural networks by AD tasks.
  • The model zoo would have benefits to a variety of audiences, including new users, benchmarkers,
    prototypers, and contributors.
  • The model zoo would allow us to track the provenance of models, follow the state-of-the-art, and
    provide a peer review process for bringing in new or improved models.

Unified Workflow

  1. Clone the model zoo repo without the large files
  2. Pull the specific model binary from the model zoo
  3. Compile the model using the TVM CLI script
  4. The TVM CLI tool generates a config file for the inference pipeline
  5. Include the config file in the build process of the inference node and build the node
  6. Install all relevant files into the install folder
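
To make step 4 concrete, the CLI tool could emit a small C++ header that the inference node includes at build time. The sketch below shows one hypothetical shape for that output; the macro names and metadata fields are assumptions for illustration, not the actual CLI design.

```python
# Sketch: generate a C++ config header for the inference node (workflow step 4).
# The TVM_* macro names and the metadata fields are hypothetical.

def generate_config_header(network_name, module_path, input_name, input_shape):
    """Render compiled-model metadata as C++ #define macros."""
    shape_str = ", ".join(str(d) for d in input_shape)
    lines = [
        "#pragma once",
        f'#define TVM_NETWORK_NAME "{network_name}"',
        f'#define TVM_NETWORK_MODULE "{module_path}"',
        f'#define TVM_INPUT_NAME "{input_name}"',
        f"#define TVM_INPUT_SHAPE {{{shape_str}}}",
    ]
    return "\n".join(lines) + "\n"

header = generate_config_header(
    "lidar_point_pillars", "deploy_lib.so", "pillars", [1, 64, 12000, 100]
)
print(header)
```

Because the header is generated per model and per target, the node's C++ code stays generic while the build (step 5) bakes in the right module path and tensor shapes.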

How to get started

Arm would like to propose modifying the lidar_point_pillars node to use TVM as a way to explore the workflow. With this learning, we hope to build an improved ML facility as a second step.

To that end, we propose the following work to be carried out:

  1. A CLI tool for TVM (Arm)
    1. Python CLI tool for compiling models with TVM
    2. Support for multiple input formats and output architectures
    3. Generate a macro file to be used in C++ inference
    4. Contribute it to the Model Zoo
  2. TVM runtime library installed in the system (Arm)
    1. Add TVM as part of the build environment for Autoware
      1. TVM runtime and TVM Python bindings
      2. Needs to be built from source, similar to how Eigen is added to Autoware
    2. Update the wiki documentation
    3. Update the docker images on Docker Hub
  3. Generic TVM pipeline utilities (Arm)
    1. A set of utility functions that will be re-used when building TVM pipelines in multiple nodes,
      similar to the lanelet2 extension library
  4. Use TVM inference in lidar_point_pillars node (Arm)
  5. Contribute PointPillars to Model Zoo (Need help from TierIV)
    1. Add metadata and documentation about how to recreate, train, transfer learn and infer.
    2. Does not block the critical path but is still essential to the whole story


Arm is committed to delivering the initial contribution. We would love to collaborate with people and companies who have aspirations in this space. Please comment below with any feedback on this proposal and any ways you would like to contribute.


I’m very interested in this work and also want to contribute. We have experience using TVM.

About the PointPillars models: did you try quantization and pruning on the pre-trained model? And what is your inference target device?

I am interested as well.

Nice to meet you. I’m Yukihiro from TierIV.
I’m very interested in this work.

We recently had a proposal from TierIV for a new architecture for Autoware.
The following table shows the proposed new architecture, which is currently all unified on TensorRT. However, the architecture proposal does not mandate TensorRT for inference, so I thought I’d have to think about this a bit more.
You’re right that TensorRT is tied to NVIDIA’s license, which is against OSS policy, so I think TVM is a very good option!

Empirically, lidar_apollo_instance_segmentation is more accurate than lidar_point_pillars, so what do you think about working from lidar_apollo_instance_segmentation?
We are currently working on transfer learning and fine-tuning for lidar_apollo_instance_segmentation, and it would be great to be able to tie that in with this activity.

| Node | File Format | Inference Engine |
| --- | --- | --- |
| lidar_apollo_instance_segmentation | caffe | TensorRT |
| tensorrt_yolo3 | caffe | TensorRT |
| traffic_light_fine_detector | onnx | TensorRT |

Hi Cheng,

I compiled the models with optimisation level 3. That should include pruning by default.

I have not tried quantisation, but I have heard of people running into problems when quantizing PointPillars. The feature-extraction part may have special operators, but the detection part should be well understood and easily quantized.

That is great. I think once we put most of the TVM infrastructure code in place, it would be a small change to swap lidar_point_pillars for apollo_instance_segmentation.

The main reason I chose the lidar_point_pillars node is that it is more straightforward to obtain the trained weights that TierIV has already open-sourced. PointPillars is also a well-known model architecture with a lot of online tutorials and helper code. This will enable us to get something working more quickly as a proof of concept. Then, when we come to the implementation of autoware.core, we can look at accuracy.

apollo_instance_segmentation, on the other hand, is slightly more opaque; I cannot find much information on it. It is great that you are working on transfer learning and fine-tuning. That knowledge would be a great contribution to the model zoo, enabling other people to use the model with their own data. I think we can collaborate on that first.

Hi, I am Su Chang from AutoCore. As discussed at the meeting last week, we are now doing PointPillars model training, and we would like to provide trained models to the model zoo, plus a related docker image and documentation, to help Autoware users with PointPillars training in the future. As you mentioned, the variety of frameworks imposes a high extra learning cost on users, so I am really looking forward to seeing the inference code using TVM. :grinning:

Hi Su Chang,

Thank you very much for this work. I think the work on training is going to be instrumental in the end-to-end ML story for Autoware.