Unified ML Inference in Autoware - A proposal

State of ML in Autoware

| Node | Model | File Format | Inference Engine |
| --- | --- | --- | --- |
| lidar_apollo_cnn_seg_detect | Unknown | caffe | caffe |
| lidar_point_pillars | PointPillars | onnx | TensorRT |
| trafficlight_recognizer/region_tlr_mxnet | Unknown | MxNet | MxNet |
| trafficlight_recognizer/region_tlr_ssd | SSD (unknown variant) | caffe | ssdcaffe |
| vision_darknet_detect | YOLOv2/YOLOv3 | darknet | darknet |
| vision_segment_enet_detect | ENet | caffe | caffe for ENet |
| vision_ssd_detect | SSD | caffe | ssdcaffe |

Problems with the current approach

  • Use of a varied range of model formats and frameworks.
    • This presents challenges for deployment:
      • Varying degrees of hardware acceleration support.
      • Lock-in to one hardware accelerator; porting 7 different frameworks to a new ML accelerator is difficult.
  • Use of frameworks that are forks or are no longer actively maintained.
    • This presents challenges for the long-term support and future updates of these nodes.
  • Use of frameworks that are proprietary and require special licenses and sign-ups to use.
  • Lack of documentation and trained weights.
    • This makes it difficult to even compile some of the nodes.
    • It also makes it difficult to re-train the models on custom data-sets.

Solution Proposal

Unify all ML workload deployment in Autoware with a single workflow and a single ML inference framework. Organize and document all pre-trained models used in Autoware in a Model Zoo.

TVM

TVM is a compiler-based inference framework that compiles models into machine code ahead of time. TVM has a number of advantages:

  1. A single framework that supports virtually all model file formats.
  2. Support for a wide range of hardware backends.
  3. Open-source governance with active contributions from many companies in the Autoware Foundation.
  4. State-of-the-art performance compared to other frameworks.
  5. Models are compiled and optimised ahead of time, so runtime requirements are reduced; a minimal compilation sketch follows this list.
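
To make point 5 concrete, here is a minimal sketch of ahead-of-time compilation with TVM's Python API. The model file name, input name, and shape are placeholders, and exact API names vary slightly between TVM releases:

```python
import onnx
import tvm
from tvm import relay

# Load a model in ONNX format (file name, input name and shape are illustrative)
onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 224, 224)})

# Compile ahead of time for a chosen hardware target, e.g. a 64-bit Arm CPU
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm -mtriple=aarch64-linux-gnu", params=params)

# Export the compiled artefact; only the lightweight TVM runtime is needed
# to load and run it on the deployment machine
lib.export_library("deploy_lib.so")
```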

Autoware Model Zoo

A Model Zoo proposal was presented to the TSC. Here is a summary:

  • A place for organizing, documenting and sharing neural networks that are used in Autoware.
  • Organize neural networks by AD task.
  • The model zoo would benefit a variety of audiences, including new users, benchmarkers,
    prototypers, and contributors.
  • The model zoo would allow us to track the provenance of models, follow the state of the art, and
    provide a peer-review process for bringing in new or improved models.

Unified Workflow

  1. Clone the model zoo repo without the large files
  2. Pull in the specific model binary from the model zoo
  3. Compile the model using the TVM CLI script
  4. The TVM CLI tool generates a config file for the inference pipeline
  5. Include the config file in the build process of the inference node and build the node
  6. Install all relevant files into the install folder (a runtime sketch follows this list)
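
As an illustration of what the compiled artefact looks like at runtime, the sketch below loads it through TVM's Python bindings; the C++ runtime API used by an actual inference node is analogous. File and tensor names are placeholders, and the executor module was called graph_runtime in older TVM releases:

```python
import numpy as np
import tvm
from tvm.contrib import graph_executor  # graph_runtime in older TVM releases

# Load the artefact produced at compile time (file and input names are illustrative)
lib = tvm.runtime.load_module("deploy_lib.so")
dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))

# Run inference on a dummy input whose shape matches the compiled model
module.set_input("input", tvm.nd.array(np.zeros((1, 3, 224, 224), dtype="float32")))
module.run()
output = module.get_output(0).asnumpy()
```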

How to get started

Arm would like to propose modifying the lidar_point_pillars node in autoware.ai to use TVM, as a way to explore the workflow. With what we learn, we hope to build an improved ML facility in autoware.auto as a second step.

To that end, we propose the following work to be carried out:

  1. A CLI tool for TVM (Arm)
    1. A Python CLI tool for compiling models with TVM (a rough sketch follows this list)
    2. Support for multiple input formats and output architectures
    3. Generate a macro file to be used in C++ inference
    4. Contribute to the Model Zoo
  2. TVM runtime library installed in the system (Arm)
    1. Add TVM as part of the build environment for Autoware
      1. TVM runtime and TVM Python bindings
      2. Needs to be built from source, similar to how Eigen is added to Autoware
    2. Update the wiki documentation
    3. Update the docker images on Docker Hub
  3. Generic TVM Pipeline Utilities (Arm)
    1. A set of utility functions that will be re-used when building TVM pipelines in multiple nodes,
      similar to the lanelet2 extension library
  4. Use TVM inference in the lidar_point_pillars node (Arm)
  5. Contribute PointPillars to the Model Zoo (need help from TierIV)
    1. Add metadata and documentation about how to recreate, train, transfer-learn and infer.
    2. Does not block the critical path but is still essential to the whole story.
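
As a starting point for discussion of item 1, here is a hypothetical sketch of what the CLI tool might look like. The flag names, the macro file layout, and the output file names are all illustrative, not the final design:

```python
#!/usr/bin/env python3
"""Hypothetical sketch of the proposed TVM CLI; all names and flags are illustrative."""
import argparse

import onnx
import tvm
from tvm import relay

# Template for the macro file consumed by the C++ inference node (layout is a guess)
HEADER_TEMPLATE = """\
// Auto-generated -- do not edit.
#pragma once
#define TVM_NETWORK_MODULE_PATH "{module_path}"
#define TVM_NETWORK_INPUT_NAME "{input_name}"
"""


def main():
    parser = argparse.ArgumentParser(
        description="Compile a model with TVM and emit a C++ config header")
    parser.add_argument("--model", required=True, help="path to an ONNX model file")
    parser.add_argument("--target", default="llvm", help="TVM target string")
    parser.add_argument("--input-name", default="input")
    parser.add_argument("--input-shape", default="1,3,224,224",
                        help="comma-separated input dimensions")
    parser.add_argument("--output-dir", default=".")
    args = parser.parse_args()

    # Import the model into Relay, TVM's intermediate representation
    shape = tuple(int(dim) for dim in args.input_shape.split(","))
    mod, params = relay.frontend.from_onnx(onnx.load(args.model),
                                           shape={args.input_name: shape})

    # Compile and write the deployable module
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=args.target, params=params)
    module_path = f"{args.output_dir}/deploy_lib.so"
    lib.export_library(module_path)

    # Emit the macro file that the inference node's build will include
    with open(f"{args.output_dir}/tvm_config.hpp", "w") as header:
        header.write(HEADER_TEMPLATE.format(module_path=module_path,
                                            input_name=args.input_name))


if __name__ == "__main__":
    main()
```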

Collaboration

Arm is committed to delivering the initial contribution. We would love to collaborate with people and companies who have aspirations in this space. Please comment below with any feedback on this proposal and any ways you would like to contribute.

I’m very interested in this work and also want to contribute. We have experience of using TVM.

Regarding the PointPillars models: did you try quantization and pruning on the pre-trained model? And what is your inference target device?

I am interested as well.

Nice to meet you. I’m yukihiro from TierIV.
I’m very interested in this work.

We recently had a proposal from TierIV for a new architecture for Autoware.
The following table shows the ML nodes in the proposed new architecture; they are currently all unified on TensorRT. However, we are not proposing to use TensorRT for inference in the architecture proposal, so I thought I would have to think about this a bit more.
You’re right that TensorRT is tied to NVIDIA’s license, which is against OSS policy, so I think TVM is a very good option!

Empirically, lidar_apollo_instance_segmentation is more accurate than lidar_point_pillars, so what do you think about working from lidar_apollo_instance_segmentation?
We are currently working on transfer learning and fine-tuning for lidar_apollo_instance_segmentation, and it would be great to be able to connect that work with this activity.

| Node | File Format | Inference Engine |
| --- | --- | --- |
| lidar_apollo_instance_segmentation | caffe | TensorRT |
| tensorrt_yolo3 | caffe | TensorRT |
| traffic_light_fine_detector | onnx | TensorRT |

Hi Cheng,

I compiled the models from https://github.com/k0suke-murakami/kitti_pretrained_point_pillars with optimisation level 3. That should include pruning by default.

I have not tried quantisation, but I have heard of people running into problems when quantizing PointPillars. Maybe the feature-extraction part has special operators, but the detection part should be well understood and easily quantized.

That is great. I think once we put most of the TVM infrastructure code in place, it would be a small change to swap PointPillars for apollo_instance_segmentation.

The main reason I chose the lidar_point_pillars node is that it is more straightforward to obtain the trained weights that TierIV has already open-sourced. PointPillars is also a well-known model architecture with a lot of online tutorials and helper code. This will enable us to get something working more quickly as a proof of concept. Then, when we come to the implementation of autoware.core, we can look at accuracy.

apollo_instance_segmentation, on the other hand, is slightly more opaque; I cannot find much information on it. It is great that you are working on transfer learning and fine-tuning. That knowledge would be a great contribution to the model zoo, enabling other people to use the model on their own data. I think we can collaborate on that first.

Hi, I am Su Chang from Autocore. As discussed at the meeting last week, we are now training PointPillars models, and we would like to provide trained models to the model zoo, along with a related Docker image and documentation, to help Autoware users with PointPillars training in the future. As you mentioned, the variety of frameworks costs users a lot of extra effort to learn, so I am really looking forward to seeing the inference code using TVM. :grinning:

Hi Su Chang,

Thank you very much for the work. I think the work on training is going to be instrumental to the end-to-end ML story for Autoware.

I have just pushed an initial commit to set up the basic structure for the Autoware Model Zoo. It will be followed by a series of pull requests that implement the unified inference workflow. If you are interested, please take a look.

@joespeed @cheng.chen @yukkysaito @chang.su @YangZ

Dear @LiyouZhou, this is Arun from Tata Consultancy Services. We have recently joined the Autoware Foundation. I lead one of the TCS services in the area of data annotation, and I would like to contribute to the Unified ML Inference work that you have proposed. I would like to get more information on this and the best way to get started.

Hi @ArunPrasad, feel free to send me an email at liyou.zhou@arm.com. We can set up a meeting to have a quick chat. I will be delivering a workshop at https://www.autoware.org/iv2020 to introduce this work. You are very welcome to attend.

Hi @LiyouZhou!
I’m an MLOps engineer at TierIV and new to Autoware.
I am very interested in this unified inference and TVM!
To catch up, how are the models trained and built in the current Autoware?
Is there a training platform deployed or are the models trained on some private GPU server?

Currently, there isn’t a unified story for training models in Autoware. Everything is trained offline and in different ways.

We are looking for help in building single-click training Docker images and contributing them to the model zoo, if you are interested.

Hi, I’m from the Tier4 perception team.
As in the conversation on https://github.com/autowarefoundation/modelzoo/pull/3, we have tried implementing a TVM version of lidar obstacle detection: https://github.com/tier4/lidar_instance_segmentation_tvm
We would appreciate it if you could give us feedback.

The model is trained with PyTorch using nuScenes data and then converted to TVM.


I’m also interested in a unified story for training models.