The OctoML Platform can package most models for deployment directly in their native training framework.
The Platform can also accelerate a wide range of public model architectures across computer vision and natural language processing use cases, including:
Object detection and image classification models (e.g., R-CNN, YOLO, SSD, MobileNet, ResNet)
NLP and other transformer-based models (e.g., BERT, GPT-2)
LSTM models
Style transfer models
Recommendation models (e.g., DLRM)
This list is not exhaustive, but it gives a sense of the general types of models that the various engines in the product can accelerate, across datatypes such as FP32 and pre-quantized INT8.
Whether a model accelerates successfully, either through the TVM autotuning process or through other engines such as ONNX Runtime, TensorRT, or OpenVINO, typically depends on the specific operators it uses. Proprietary models adapted from public model backbones sometimes include unsupported operators.
If a model you would like to accelerate has an unsupported operator (including those with convolutional dynamism), you can begin the process of adding the operator to TVM yourself, or contact us to have it added to our roadmap.
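Because acceleration depends on a model's operators, it can help to list them before submitting a model. The sketch below is one possible way to do that, not a feature of the Platform: `find_unsupported_ops`, `onnx_op_types`, and the `EXAMPLE_SUPPORTED` set are hypothetical names introduced for illustration, and the allowlist is made up rather than taken from any engine's real support matrix. The ONNX helper assumes the `onnx` Python package is installed and only inspects the top-level graph.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def find_unsupported_ops(model_ops: Iterable[str], supported_ops: set[str]) -> dict[str, int]:
    """Return each operator type that is not in supported_ops, with its node count."""
    counts = Counter(model_ops)
    return {op: n for op, n in sorted(counts.items()) if op not in supported_ops}


def onnx_op_types(path: str) -> list[str]:
    """List the operator type of every node in an ONNX model's top-level graph."""
    # Imported lazily so find_unsupported_ops works without onnx installed.
    import onnx

    model = onnx.load(path)
    return [node.op_type for node in model.graph.node]


# Hypothetical allowlist for illustration only; replace it with the operator
# list documented for the engine you are targeting.
EXAMPLE_SUPPORTED = {"Conv", "Relu", "MaxPool", "Gemm", "Softmax"}

if __name__ == "__main__":
    # Operator names typed in by hand; for a real model, use onnx_op_types(path).
    ops = ["Conv", "Relu", "Conv", "MyCustomOp", "Softmax", "MyCustomOp"]
    print(find_unsupported_ops(ops, EXAMPLE_SUPPORTED))
```

Any operator reported by a check like this is a candidate for the process described above: adding it to TVM or asking for it to be put on the roadmap.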