Generalized framework for deploying neural nets in ROS

One thing that has been increasingly annoying is that there is insanely good academic research in neural nets but 99% of the time the code is written in the most haphazard, undeployable fashion possible. Typically you have to clone some git repo, download some zip file of weights from a Dropbox, deal with grad students having used Windows case-insensitive filenames, put JPEGs in some directory with some file naming format, and run silly shell scripts like “run_inference.sh” rather than a nicely-built, deployable class that you can pip install and import into another project.

Of course, as a PhD myself, I completely understand where the academics are coming from. Most of them are good at math, not software engineering.

For production use though, one thing I’ve been thinking about is a more generalizable neural net packaging framework. To switch from one neural net to another, you would only have to point your execution framework at a different package, without reading any documentation.

I’ve created this prototype:


which implements semantic segmentation in a way that is agnostic to both the network and the neural net framework used. I’ve included a couple of examples that are TensorFlow-based and one that is PyTorch-based, and the only thing you have to do to switch between them is change a single ROS parameter.

I’d love feedback on this general concept, or on how we can propagate such a standard through the community. I’d also love for it to be generalized beyond semantic segmentation. Perhaps a hierarchy of classes: a NeuralNet parent class, with SemanticSegmentation, ObjectDetection, and SpeechRecognition classes (and so forth) that inherit from it. To implement a deployable, say, ObjectDetection class, all you have to do as the researcher is override the required methods, and you can do so using whatever code you wish. So for example, MaskRCNN overrides a couple of methods and inherits from InstanceSegmentation, which inherits from NeuralNet.
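To make the hierarchy idea concrete, here is a minimal sketch in Python. The method names (`load`, `infer`), the `DummyMaskRCNN` stand-in, and the return format are all assumptions for illustration; none of them are taken from the prototype.

```python
from abc import ABC, abstractmethod


class NeuralNet(ABC):
    """Root of the hypothetical hierarchy: every net can load weights and run inference."""

    @abstractmethod
    def load(self, model_dir: str) -> None:
        """Load weights and config from an unpacked model directory."""

    @abstractmethod
    def infer(self, data):
        """Run the net on one input and return the task-specific output."""


class InstanceSegmentation(NeuralNet):
    """Task-level class: pins down what infer() must return for this task."""

    @abstractmethod
    def infer(self, image):
        """Return a list of (label, mask) pairs, one per detected instance."""


class DummyMaskRCNN(InstanceSegmentation):
    """Stand-in for a researcher's class; a real one would call TensorFlow or PyTorch."""

    def load(self, model_dir: str) -> None:
        # A real implementation would read the weights here.
        self.model_dir = model_dir

    def infer(self, image):
        # Placeholder result: one instance whose mask covers the whole image.
        height, width = len(image), len(image[0])
        return [("object", [[1] * width for _ in range(height)])]
```

The researcher only touches the leaf class. Anything that consumes an InstanceSegmentation can then use it without knowing which framework sits underneath.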

Thanks for sharing! Looks great from my point of view (nice replacement for cv_bridge). I have done a similar exercise multiple times (NNs in ROS), and it would be great to have a community baseline.

Maybe one thing: neural network output is not always an image; usually it’s not. I’m not sure how to generalize that, though. We are using custom messages. As you mentioned, maybe there should be some community-supported NN classes that also have their own message formats. The ROS community could perhaps also provide end-to-end packages with pretrained models (e.g. based on keras-applications).

Thanks, Roma

Haha yep I had to steer clear of OpenCV since the cv2.so provided with ROS doesn’t support python3 … (can someone please do this for the next deb package release? It’s easy to compile OpenCV with both python2 and python3 bindings, the more bindings the merrier).

Yeah, actually I was kind of referring to two ideas. One is a generalized neural network hierarchy, where each class defines exactly what it should output, e.g. a SemanticSegmentation class should output a uint8 image or whatever. An ObjectDetection class would have a strictly defined output format that every class claiming to inherit from it must implement. Likewise for speech or any other network.
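As a rough sketch of what a "strictly defined format" could look like, the parent class can own the public entry point and check whatever the subclass returns. The `Detection` fields, the `_detect` hook, and `FixedBoxDetector` are hypothetical names chosen for this example, not an existing standard.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """Hypothetical fixed output record shared by every object detector."""
    label: str
    score: float  # confidence, expected to lie in [0, 1]
    x_min: int
    y_min: int
    x_max: int
    y_max: int


class ObjectDetection(ABC):
    @abstractmethod
    def _detect(self, image) -> List[Detection]:
        """Implemented by the researcher with whatever code they wish."""

    def infer(self, image) -> List[Detection]:
        """Public entry point: runs the net, then enforces the output contract."""
        detections = self._detect(image)
        for det in detections:
            if not isinstance(det, Detection):
                raise TypeError("detectors must return Detection records")
            if not 0.0 <= det.score <= 1.0:
                raise ValueError("score must be in [0, 1]")
            if det.x_min > det.x_max or det.y_min > det.y_max:
                raise ValueError("box corners are out of order")
        return detections


class FixedBoxDetector(ObjectDetection):
    """Toy detector that always reports one box; stands in for a real network."""

    def _detect(self, image):
        return [Detection("person", 0.9, 0, 0, 10, 20)]
```

Putting the check in the parent means a malformed output fails loudly at the class boundary instead of somewhere downstream in the node that consumes it.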

The other is the ROS semantic segmentation package I made, which was more a prototype of a basic framework for swapping between networks without needing to fiddle with grad-student-quality inference shell scripts. You can unzip a model, as if from an app store (or maybe I should make even that step unnecessary: just drop in a tarball and go), and the framework knows how to use it. If there were indeed a more generalized framework for all neural nets, then that package would restrict itself to classes that inherit from SemanticSegmentation.
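One way the "point at a different package" step could work is sketched below, using only the Python standard library. It assumes every unpacked model directory contains a `model.py` that defines a class named `Model`; that layout and the `load_model` helper are assumptions for illustration and may not match how the prototype does it.

```python
import importlib.util
import os


def load_model(model_dir: str):
    """Import <model_dir>/model.py and return an instance of its Model class."""
    module_path = os.path.join(model_dir, "model.py")
    if not os.path.isfile(module_path):
        raise FileNotFoundError("no model.py in " + model_dir)

    # Give each package its own module name so two models can be loaded side by side.
    module_name = "nn_package_" + os.path.basename(os.path.normpath(model_dir))
    spec = importlib.util.spec_from_file_location(module_name, module_path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot import " + module_path)

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.Model()
```

In a ROS node, `model_dir` would be the one value read from a parameter, so switching networks means changing that string and nothing else.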