Welcome to the OctoML Platform! Users often find it easiest to get started in the User Interface (UI) and then move to the SDK once they have a feel for the logic of the system. Here are step-by-step instructions for packaging your first model:
Confirm that we support your model's general architecture by checking this list.
Log in to the OctoML Platform at https://app.octoml.ai/.
If your model(s) are in TensorFlow, TFLite, or ONNX format, you may upload them directly. If your model is in PyTorch format, convert it to TorchScript or ONNX format first. See general tutorials here. Alternatively, select a model from our model hub at the top of the application home page.
Click "New Project", name your project, select your framework, and upload one or more model files.
Click on the model file you wish to deploy; then click "Package".
Confirm that input shapes have been correctly inferred after upload. If necessary, use a third-party tool such as Netron to determine the correct input shapes.
Select your desired hardware targets under "Add hardware".
Choose "Extended acceleration" if you are willing to wait longer while the system explores a wider range of optimizations, which can result in better performance.
Click "Package" and wait for the system to notify you when the optimization processes finish. These can take anywhere from minutes to several hours, depending on whether you selected Extended acceleration and on the size and complexity of the model.
Head back to your Project and view the results.
If you are happy with the performance on any of the hardware targets, you can download the packaged model file and drop it into your inferencing environment to test it out.