The OctoML Platform can package most models for deployment directly in their native training framework.

The Platform can also accelerate a wide range of public model architectures across computer vision and natural language processing use cases, including:

  • Object detection and image classification models (e.g., R-CNN, YOLO, SSD, MobileNet, ResNet)

  • NLP and other transformer-based models (e.g., BERT, GPT-2)

  • LSTM models

  • Style transfer models

  • Recommendation models (e.g., DLRM)

This list is not exhaustive, but it gives a sense of the general types of models that the engines in the product can accelerate, across data types such as FP32 and pre-quantized INT8.

Whether a model accelerates successfully, either through the TVM autotuning process or through other engines such as ONNX-RT, TensorRT, or OpenVINO, typically depends on the specific operators it uses. Proprietary models adapted from public model backbones may sometimes include unsupported operators.

  • See here for a list of operators supported in TVM

  • See here for a list of operators supported in ONNX-RT

  • See here for a list of operators supported in TensorRT

If a model you would like to accelerate uses an unsupported operator (including operators with convolutional dynamism), you can begin the process of adding the operator to TVM yourself, or contact us to have it added to our roadmap.
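Because support depends on the operators a model uses, it can help to list those operators and compare them against an engine's supported set before attempting acceleration. The following is a minimal sketch of that comparison, not part of the OctoML Platform: the `SUPPORTED_OPS` set, the `find_unsupported_ops` helper, and the example operator names are all hypothetical, and the real supported set should come from the operator lists linked above.

```python
from collections import Counter
from typing import Dict, Iterable, Set

# Hypothetical, deliberately incomplete allow-list for illustration only.
# Fill it in from the operator list of the engine you are targeting.
SUPPORTED_OPS: Set[str] = {"Conv", "Relu", "MaxPool", "Gemm", "Add", "Softmax"}


def find_unsupported_ops(op_types: Iterable[str], supported: Set[str]) -> Dict[str, int]:
    """Return each operator type missing from `supported`, with its node count."""
    counts = Counter(op_types)
    return {op: n for op, n in sorted(counts.items()) if op not in supported}


# Operator types as they might appear in a model graph (made-up example).
# With the `onnx` package installed, they could be read from a file with:
#   [node.op_type for node in onnx.load("model.onnx").graph.node]
example_ops = [
    "Conv", "Relu", "Conv", "Relu",
    "NonMaxSuppression", "Gemm", "MyCustomOp", "MyCustomOp",
]

unsupported = find_unsupported_ops(example_ops, SUPPORTED_OPS)
for op, n in unsupported.items():
    print(f"{op}: {n} node(s)")
```

An empty result only means that every operator name appears in the set you supplied. It does not guarantee that a given engine supports every attribute or input shape of those operators.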
