Deploy models (almost) anywhere with ONNX.ai

Written by Terry McCann | Jul 20, 2020 11:00:00 PM

We often talk about productionisation of Machine Learning being the hardest problem in Data Science (so much so, we made a podcast about it). But times are changing: what was once difficult is steadily being made simpler. One approach driving that change is Machine Learning runtimes.

If you look at the multitude of different ways you can build a machine learning model today, you would be right to feel overwhelmed. You can build models in R, Python, ML.NET and more. Once you have decided which language to build in, the range of different frameworks only adds to the problem: you could build a model in XGBoost, scikit-learn, TensorFlow, PyTorch, Theano, CNTK, Spark or something else entirely. Deploying each of those means deploying in the format required by that particular model flavour/framework. Unfortunately, that typically means you are either tightly coupled to a single environment, or you have to run multiple environments side by side and handle the awkward interoperability between your application code and your Machine Learning code.

This is where a runtime can help. A runtime is a common execution environment that can be called from multiple languages: you build your model in whichever language best fits your problem, then save it in the runtime's portable format. We have seen runtimes before; MLeap is one which works for Apache Spark, scikit-learn and TensorFlow. MLeap has had a pretty good reception across the industry and is widely used. The problem with MLeap is that it does not support deep learning frameworks beyond TensorFlow.

Let me introduce you to ONNX. ONNX, the Open Neural Network eXchange, is an open model format: you take a model you have trained in PyTorch or TensorFlow, convert it into ONNX, and execute it on anything running the ONNX Runtime. A model trained in Python can be deployed in an ML.NET application with no need for integration coding.
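
To make that concrete, here is a minimal sketch of exporting a PyTorch model to ONNX. The toy model, the file name "model.onnx" and the tensor names are placeholders of our choosing, not anything ONNX mandates:

```python
import torch
import torch.nn as nn

# A stand-in model; any trained torch.nn.Module exports the same way.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# A dummy input with the shape the model expects; the exporter
# traces the model with it to capture the computation graph.
dummy_input = torch.randn(1, 4)

# Serialise the traced graph and weights into the ONNX format.
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],    # our chosen name, referenced at inference time
    output_names=["output"],
)
```

TensorFlow and scikit-learn have equivalent converters (tf2onnx and skl2onnx); whichever you use, the result is the same portable .onnx file.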

We have spent a huge amount of time creating different Docker containers for different types of models: a TensorFlow container, a PyTorch container, a container running our model in Spark, and so the list goes on. That way of working is becoming obsolete. ONNX replaces it with a single standard runtime, letting you deploy one model into multiple environments and ensure it runs in your database, on your website, on your mobile device and at the edge.
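
And consuming that one model is the same everywhere. Below is a minimal sketch of scoring the exported file with the ONNX Runtime from Python; it assumes the "model.onnx" file and "input" tensor name from the export sketch above, and the same calls have counterparts in the runtime's C#, Java and JavaScript packages:

```python
import numpy as np
import onnxruntime as ort

# Load the exported model into an inference session (CPU here;
# other execution providers cover GPU and edge hardware).
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Feed a NumPy array under the input name chosen at export time.
sample = np.random.randn(1, 4).astype(np.float32)
outputs = session.run(None, {"input": sample})  # None = return all outputs

print(outputs[0])  # the model's predictions
```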

Below is a video which walks you through the end-to-end deployment of an ONNX model from Python to ML.NET.