OctoML now supports inference optimization for classical ML models! This capability helps enterprises reduce costs and choose the best-suited hardware when deploying classical ML in the cloud.

Users who train models using sklearn (e.g. random forests, gradient boosting classifiers, SVCs) can upload their model files to OctoML after converting the files to .onnx format. Users can then benchmark the models to determine the best hardware target for their classical ML models and deploy an OctoML-optimized version.

Future releases will add support for K-Nearest Neighbors models and other classical ML architectures, as well as additional inference engines to further reduce latency.

The list of supported classical ML model architectures is here. Follow this tutorial to convert your model to ONNX format before uploading it to OctoML.
