The OctoML Platform currently offers packaging for deployment and acceleration for the following hardware architectures**:
Intel Broadwell
Intel Cascade Lake
Intel Skylake
Intel Ice Lake
AMD EPYC Milan
AMD EPYC Rome
NVIDIA Tesla V100
NVIDIA Tesla K80
NVIDIA Tesla T4
Arm Graviton2
Arm Graviton3
** Some users also have access to private hardware targets not listed here.
For server-based targets, users can select from the instance types offered by Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), including various core counts for CPUs.
Some GCP instances appear in the product multiple times because GCP lets users choose a specific architecture for that instance type (for example, n1-standard-4 may have a Skylake CPU or a T4 GPU attached to it). Each combination is listed separately in the hardware menu along with its associated architecture.
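As an illustration of why the same instance name can appear more than once, the following minimal sketch models each menu entry as an (instance, architecture) pair. The `HardwareTarget` class, the `MENU` list, and the `architectures_for` helper are hypothetical names chosen for this example and are not part of the OctoML Platform API; the only data used is the n1-standard-4 example from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardwareTarget:
    """Hypothetical menu entry: an instance name paired with one architecture."""
    provider: str
    instance: str
    architecture: str

    def label(self) -> str:
        # Show the architecture next to the instance name so that
        # entries sharing an instance name stay distinguishable.
        return f"{self.instance} ({self.architecture})"


# Assumed example data, taken only from the n1-standard-4 case described above.
MENU = [
    HardwareTarget("GCP", "n1-standard-4", "Intel Skylake"),
    HardwareTarget("GCP", "n1-standard-4", "NVIDIA Tesla T4"),
]


def architectures_for(instance: str, menu: List[HardwareTarget] = MENU) -> List[str]:
    """Return every architecture listed for the given instance name."""
    return [t.architecture for t in menu if t.instance == instance]


labels = [t.label() for t in MENU]
```

Keying each entry on the pair rather than on the instance name alone means the two n1-standard-4 entries remain distinct targets.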
Not all instances available in the market are shown here, in part because public cloud providers do not make all instances available with a specific CPU architecture. In these cases, it is possible that a model accelerated on one architecture (e.g., Cascade Lake) could be deployed or benchmarked on a different architecture (e.g., Skylake) even though the instance name has not changed. For this reason, we do not make instances with multiple possible architectures available to users.
AWS Instances (Detailed specs here):
c5n.xlarge, c5n.2xlarge, c5n.9xlarge, c5n.18xlarge
c5.12xlarge, c5.24xlarge
c6i.xlarge*, c6i.2xlarge*, c6i.4xlarge*, c6i.8xlarge*, c6i.12xlarge*, c6i.16xlarge*, c6i.24xlarge*
m6i.xlarge, m6i.2xlarge, m6i.4xlarge, m6i.8xlarge, m6i.12xlarge, m6i.16xlarge, m6i.24xlarge, m6i.32xlarge
m6g.xlarge, m6g.4xlarge, m6g.8xlarge, m6g.12xlarge, m6g.16xlarge
t4g.large, t4g.xlarge, t4g.2xlarge
p2.xlarge, g4dn.xlarge, p3.2xlarge, g5.xlarge*
r6a.large, r6a.xlarge, r6a.2xlarge, r6a.4xlarge, r6a.8xlarge, r6a.12xlarge, r6a.16xlarge, r6a.24xlarge, r6a.32xlarge, r6a.48xlarge
c7g.xlarge, c7g.2xlarge, c7g.4xlarge, c7g.8xlarge, c7g.16xlarge
g4dn.2xlarge*, g4dn.4xlarge*, g4dn.8xlarge*, g4dn.12xlarge*, g4dn.16xlarge*
g5.2xlarge*, g5.4xlarge*, g5.8xlarge*, g5.16xlarge*, g5.24xlarge*, g5.48xlarge*

GCP Instances (Detailed specs here):
n1-standard-2, n1-standard-4, n1-standard-8, n1-standard-16, n1-standard-32
n1-standard-2, n1-standard-4, n1-standard-8, n1-standard-16, n1-standard-32
n2-standard-4, n2-standard-4, n2-standard-16, n2-standard-48, n2-standard-80
c2-standard-8, c2-standard-16, c2-standard-30, c2-standard-60
n2-standard-4, n2-standard-16, n2-standard-48, n2-standard-96
n2-highmem-4, n2-highmem-16, n2-highmem-48, n2-highmem-96
n2d-standard-4, n2d-standard-16, n2d-standard-48, n2d-standard-80, n2d-standard-96
c2d-standard-4, c2d-highcpu-8, c2d-highcpu-16, c2d-highcpu-32, c2d-highcpu-56, c2d-highcpu-112, c2d-highmem-2
n2d-standard-4, n2d-standard-16, n2d-standard-48, n2d-standard-96
n2d-highmem-4, n2d-highmem-16, n2d-highmem-48, n2d-highmem-96
t2d-standard-4, t2d-standard-8, t2d-standard-16, t2d-standard-32, t2d-standard-48, t2d-standard-60
n1-standard-4, n1-standard-4, custom-12-77824, n1-standard-4, custom-12-77824

Azure Instances (Detailed specs here):
Standard_D4_v4*, Standard_D8_v4*, Standard_D16_v4*, Standard_D32_v4*, Standard_D48_v4*, Standard_D64_v4*
Standard_D4_v5, Standard_D8_v5, Standard_D16_v5, Standard_D32_v5, Standard_D48_v5, Standard_D64_v5
standard_d2as_v5, standard_d4as_v5, standard_d8as_v5, standard_d16as_v5, standard_d32as_v5, standard_d48as_v5, standard_d64as_v5
standard_d2ps_v5, standard_d4ps_v5, standard_d8ps_v5, standard_d16ps_v5, standard_d32ps_v5, standard_d48ps_v5, standard_d64ps_v5, standard_e20ps_v5

*These instances are available on an account-by-account basis.