We recently launched a new and improved OctoML Command Line Interface, which allows you to put a fast, trained model into production anywhere. Package your model on your local machine and get a Docker container ready for inference on any Kubernetes-based platform.
You can also upload your model to our SaaS platform for acceleration. We’ll run it against every cutting-edge acceleration library, including Apache TVM, NVIDIA’s TensorRT, Intel’s OpenVINO, and ONNX Runtime’s Execution Providers, and return the fastest possible version of your model. If you don’t have platform access, you can request it here:
More information on using the CLI to cut cloud costs, developer time, or application latency with a production-ready workload that runs anywhere is available here:
Finally, we are seeking your feedback! If you fill out one of the surveys below or contact us through the messenger to tell us what you think, we'll send you some sweet OctoML swag.