Auto-scheduler - Next-Generation TVM

  • We have incorporated a new TVM acceleration technique, known in the Apache TVM community as Ansor or Auto-scheduler, into the OctoML Platform. This is now the default TVM technique used in the platform, replacing TVM's first-generation capability known as AutoTVM.

  • Auto-scheduler can improve model performance significantly over AutoTVM, by as much as 25 to 50 percent, across a wide variety of model types and hardware targets.

  • Users can still select AutoTVM as the acceleration technique in "TVM Options", because in some use cases, for example certain model/GPU target combinations, AutoTVM outperforms Auto-scheduler.

  • Read more about Auto-scheduler here.

Accessing Auto-scheduler via the SDK

  • In the SDK, users can now select either AutoTVM or Auto-scheduler as their TVM acceleration engine.

  • If the user specifies neither the tuning option nor any advanced parameters, the system defaults to Auto-scheduler. If the user does not specify the tuning option but DOES pass explicit advanced parameters (i.e. kernel_trials or early_stopping_threshold), the system invokes AutoTVM.

  • To select Auto-scheduler, use the call below. Note that the example settings are for testing only; for best performance, set trials_per_kernel=1000 and early_stopping_threshold=250.

    octomize_workflow = model_variant.octomize(
        platform,
        tuning_options=AutoschedulerOptions(
            trials_per_kernel=3,
            early_stopping_threshold=1,
        ),
    )

  • To select AutoTVM, use the call below. Note that the example settings are for testing only; for best performance, set kernel_trials=2000 and early_stopping_threshold=500.

    octomize_workflow = model_variant.octomize(
        platform,
        tuning_options=AutoTVMOptions(
            kernel_trials=3,
            early_stopping_threshold=1,
        ),
    )
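
The default-selection rule described in this section (Auto-scheduler when nothing is specified, AutoTVM when only explicit advanced parameters are passed) can be sketched roughly as follows. This is an illustration only and not the SDK's implementation: the function resolve_engine and the two dataclasses are hypothetical local stand-ins, and their default values are simply the "best performance" settings quoted above.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Local stand-ins for illustration only; these are NOT the SDK's classes.
@dataclass
class AutoschedulerOptions:
    trials_per_kernel: int = 1000
    early_stopping_threshold: int = 250


@dataclass
class AutoTVMOptions:
    kernel_trials: int = 2000
    early_stopping_threshold: int = 500


def resolve_engine(
    tuning_options: Optional[Union[AutoschedulerOptions, AutoTVMOptions]] = None,
    kernel_trials: Optional[int] = None,
    early_stopping_threshold: Optional[int] = None,
) -> str:
    """Hypothetical helper: return the engine the rule above would pick."""
    # An explicit tuning option always decides the engine.
    if isinstance(tuning_options, AutoschedulerOptions):
        return "auto-scheduler"
    if isinstance(tuning_options, AutoTVMOptions):
        return "autotvm"
    # No tuning option, but explicit advanced parameters: AutoTVM.
    if kernel_trials is not None or early_stopping_threshold is not None:
        return "autotvm"
    # Nothing specified at all: Auto-scheduler is the default.
    return "auto-scheduler"
```

Under these assumptions, resolve_engine() picks Auto-scheduler, while resolve_engine(kernel_trials=2000) picks AutoTVM. Passing an options object explicitly avoids relying on the fallback behavior.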