The OctoML Platform's default settings for the TVM acceleration engine suit the typical use case: achieving high model performance within a reasonable timeframe. For certain models or scenarios, however, users may be willing to spend more tuning time exploring additional optimizations, which can occasionally push performance above what the defaults achieve. Exploring these configurations is the subject of this short tutorial.

Improving Model Performance

Once TVM has conducted graph-level optimizations (fusing operators, unrolling loops, and managing control flow, among other activities), the remaining work is divided into a fixed set of tuning tasks. In many, but not all, cases these tasks map to operators in the model.

  • For each task, TVM takes the compute operation and explores the schedule space, which represents how the computation is laid out across memory and compute resources on a given hardware target.

  • To prevent extraordinarily long tuning times, TVM uses an early stopping threshold: the search for a task stops once a set number of consecutive trials has produced no further performance improvement.

  • TVM also reuses data from past model optimizations as part of its search, which reduces total tuning time and increases model performance.
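The interaction of early stopping and the reuse of past tuning data can be sketched in miniature. This is an illustrative sketch, not TVM's actual tuner: `measure` stands in for compiling and benchmarking a candidate schedule, and `seeds` stands in for records reused from previous model optimizations.

```python
def tune_task(candidates, measure, early_stopping, seeds=()):
    """Search a task's schedule space, trying any seeds from past
    optimizations first, and stop once `early_stopping` consecutive
    trials have failed to improve on the best result so far."""
    order = list(seeds) + [c for c in candidates if c not in seeds]
    best, best_cost = None, float("inf")
    unimproved = 0
    trials = 0
    for cand in order:
        trials += 1
        cost = measure(cand)            # stand-in for compile + benchmark
        if cost < best_cost:
            best, best_cost = cand, cost
            unimproved = 0              # reset the early-stopping counter
        else:
            unimproved += 1
            if unimproved >= early_stopping:
                break                   # budget of unimproved trials exhausted
    return best, best_cost, trials

# Toy schedule space: candidates are tile sizes, cost is lowest at 10.
candidates = list(range(100))
measure = lambda tile: abs(tile - 10)
best, cost, trials = tune_task(candidates, measure, early_stopping=5)
```

Seeding the search with a good past record (for example, `seeds=[10]`) finds the optimum on the first trial, and early stopping then ends the task after only a few more trials, which is how reuse of past data cuts total tuning time.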

Modifying the following defaults will increase tuning time and may improve model performance. These modifications are available only via the SDK, not the web UI, and are documented here:

https://app.octoml.ai/docs/api/octomizer/octomizer.html#module-octomizer.model_variant

  • AutoTVM and AutoScheduler

    • TVM currently has two different search methodologies, each with its own rules for kernel generation and task allocation. Users looking for maximum performance should therefore create a workflow under each method and compare the results to identify the best possible performance.

  • Kernel Trials

    • This is the number of search trials the product conducts for each task (usually an operator) in a model. Where possible, the product reuses data from previous model optimizations to guide this search; if no past data is available, it conducts a new search from scratch.

  • Exploration Trials

    • By setting this above zero, the user can force the product to conduct a new search beyond any previous model optimization data used in Kernel Trials. This new search is based on the best performance identified in past model optimizations, if there are any. Because the system does not know until optimization time whether data exists from past searches, users should think of Exploration Trials as a way to set a minimum number of new searches to run. This new search occasionally identifies additional performance improvements.

  • Early Stopping Threshold

    • For any new searches, this sets the number of consecutive trials without improvement after which the product stops searching a task. Setting it higher, or equal to the number of Kernel or Exploration Trials, forces the product to keep searching, which can occasionally produce an additional performance increase.

  • Random Trials

    • As an additional configuration on Exploration Trials, Random Trials will select randomly from previous model optimizations to seed a new search. As with standard Exploration Trials, this new search approach occasionally identifies additional performance improvements.
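Why randomized starting points help can be seen with a toy local search. This is an illustrative sketch, not the product's tuner: `hill_climb` stands in for a search seeded from a past optimization record, and the cost function is deliberately built with a local optimum that traps it.

```python
def hill_climb(cost, start, lo, hi):
    """Greedy local search over integers: move to a cheaper neighbor
    until none exists. Stands in for a search seeded from past data."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        better = [n for n in neighbors if cost(n) < cost(x)]
        if not better:
            return x
        x = min(better, key=cost)

def with_restarts(cost, starts, lo, hi):
    """Run the same local search from several starting points and keep
    the best result: the idea behind Random Trials."""
    return min((hill_climb(cost, s, lo, hi) for s in starts), key=cost)

# Two-valley cost: a local optimum at 5 (cost 2), the global one at 20 (cost 0).
cost = lambda x: min(abs(x - 5) + 2, abs(x - 20))

stuck = hill_climb(cost, 4, 0, 30)        # seeded near the wrong valley -> 5
found = with_restarts(cost, [4, 18], 0, 30)  # an extra random start -> 20
```

A search seeded at 4 stalls at the local optimum 5; adding a randomized starting point in the other valley lets the search reach the global optimum 20. Random Trials applies the same idea to schedule search.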

Suggested Changes to Search Defaults:

  • Increase Kernel Trials (AutoTVM) or Trials Per Kernel (AutoScheduler) to 3,000

    • Rationale: The majority of tasks see no improvements after 1,000 to 2,000 trials, and even fewer see improvements after 3,000. However, users looking for maximal performance should increase Kernel Trials to confirm that no further performance gains remain.

  • Increase Early Stopping to 100%

    • Rationale: On certain occasions, the product finds a performance improvement only after many unimproved trials. Raising early stopping to 100% forces the product to search all the way through the specified number of Kernel Trials; it increases tuning time considerably because it applies to every tuning task.

  • Increase Exploration Trials to 3,000 (AutoTVM only)

    • Rationale: It is possible for Kernel Trials to reuse less-than-optimal data from previous model optimizations. Setting a high Exploration Trials target forces the product to search new schedules, at the cost of considerably longer tuning time.

  • Increase Random Trials to 500 (AutoTVM only)

    • Rationale: In cases where a search seeded from a previous model optimization is stuck in a local optimum, this setting provides randomized starting points that allow the product to find the global optimum, if one exists. Use it in conjunction with Exploration Trials to ensure it takes effect.
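The suggested changes above can be collected in one place. The dictionary keys below are descriptive labels for the settings discussed in this article, not the SDK's exact parameter names; consult the model_variant module reference linked earlier for the real fields and defaults.

```python
# Aggressive-search settings suggested in this article.
# Keys are illustrative labels, NOT the SDK's actual parameter names.
suggested_autotvm = {
    "kernel_trials": 3000,       # up from the default per-task budget
    "early_stopping": 3000,      # 100% of trials: never stop a task early
    "exploration_trials": 3000,  # force new searches beyond reused data
    "random_trials": 500,        # randomly seeded searches (use with exploration)
}
suggested_autoscheduler = {
    "trials_per_kernel": 3000,   # AutoScheduler's counterpart to Kernel Trials
    "early_stopping": 3000,      # again 100% of the trial budget
}
```

Because Exploration Trials and Random Trials are AutoTVM-only, the AutoScheduler workflow changes only the trial budget and early stopping.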
