Model Config Search

Model Analyzer's profile subcommand supports automatic and manual sweeping through different configurations for Triton models.

Automatic Configuration Search

Automatic configuration search is the default behavior when running Model Analyzer. This mode is enabled when no parameters are specified in the model_config_parameters section of the Model Analyzer Config. In this mode, Model Analyzer automatically searches over instance_group and enables dynamic_batching.

An example Model Analyzer Config that performs automatic config search is shown below:

model_repository: /path/to/model/repository/

profile_models:
  - model_1
  - model_2

In the default mode, automatic config search tries concurrency values from 1 to 1024, increasing exponentially (i.e. 1, 2, 4, 8, ...). The maximum value can be configured using the run_config_search_max_concurrency key in the Model Analyzer Config. For instance_group, Model Analyzer tries instance counts from 1 to 5. This maximum can be changed using the run_config_search_max_instance_count key in the Model Analyzer Config. Model Analyzer always enables dynamic batching during automatic configuration search.

An example config that limits the search space used by Model Analyzer is shown below:

model_repository: /path/to/model/repository/

run_config_search_max_instance_count: 3
run_config_search_max_concurrency: 8
profile_models:
  - model_1
  - model_2
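
As a rough illustration of how these two limits shrink the search space, the sketch below enumerates the values described above. The function names are hypothetical and this is not Model Analyzer's actual code; it also assumes that a maximum concurrency which is not a power of two simply stops the sequence at the largest power of two below it, which the text above does not specify.

```python
def concurrency_values(max_concurrency=1024):
    """Powers of two from 1 up to max_concurrency, inclusive (hypothetical helper)."""
    values = []
    c = 1
    while c <= max_concurrency:
        values.append(c)
        c *= 2  # exponential growth: 1, 2, 4, 8, ...
    return values


def instance_counts(max_instance_count=5):
    """Instance counts from 1 up to max_instance_count, inclusive (hypothetical helper)."""
    return list(range(1, max_instance_count + 1))


# Defaults described in the text: concurrency up to 1024, instance count up to 5.
default_concurrency = concurrency_values()
default_instances = instance_counts()

# Limits from the config above: max concurrency 8, max instance count 3.
limited_concurrency = concurrency_values(8)
limited_instances = instance_counts(3)

print(limited_concurrency)
print(limited_instances)
```

With the limits from the config above, the concurrency list has four entries and the instance count list has three, compared with eleven and five under the defaults.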

If either concurrency or model_config_parameters is specified for one of the models, automatic config search is disabled for the parameter provided and still runs for the other.

For example, the config below specifies concurrency explicitly, so Model Analyzer automatically sweeps only through the model_config_parameters described above:

model_repository: /path/to/model/repository/

profile_models:
  model_1:
    concurrency: 1,2,3,128

The config below specifies instance_group explicitly, so Model Analyzer automatically sweeps only through different values for concurrency:

model_repository: /path/to/model/repository/

profile_models:
  model_1:
    model_config_parameters:
        instance_group:
        -
            kind: KIND_GPU
            count: [1, 2]

If both concurrency and model_config_parameters are specified, automatic config search will be disabled.
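
The rules above can be summarized as a small decision function. This is only a sketch of the behavior as described in this section; the function name and the dictionary representation of a model's entry are assumptions made for illustration, not part of Model Analyzer.

```python
def auto_searched(model_spec):
    """Return which dimensions are still searched automatically.

    model_spec is a hypothetical dict mirroring one model's entry under
    profile_models, e.g. {"concurrency": [1, 2, 3, 128]}.
    """
    searched = []
    # A dimension is searched automatically only if the user did not specify it.
    if "model_config_parameters" not in model_spec:
        searched.append("model_config_parameters")
    if "concurrency" not in model_spec:
        searched.append("concurrency")
    return searched


# Nothing specified: both dimensions are searched automatically.
print(auto_searched({}))
# Only concurrency specified: model config parameters are still searched.
print(auto_searched({"concurrency": [1, 2, 3, 128]}))
# Both specified: automatic search is effectively disabled.
print(auto_searched({"concurrency": [1], "model_config_parameters": {}}))
```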

Important Note about Remote Mode

In the remote mode, model_config_parameters are always ignored because Model Analyzer has no way of accessing the model repository of the remote Triton Server. In this mode, only concurrency values can be swept.

Manual Configuration Search

In addition to the automatic config search, Model Analyzer supports a manual config search mode. To enable this mode, provide the --run-config-search-disable flag on the CLI or set run_config_search_disable: True in the Model Analyzer Config.

In this mode, values for both concurrency and model_config_parameters need to be specified. If no value for concurrency is specified, the default value of 1 is used. Unlike the automatic mode, this mode is not limited to the dynamic_batching and instance_group config parameters. Using manual config search, you can create custom sweeps for every parameter that can be specified in the model configuration. Model Analyzer only checks the syntax of the specified model_config_parameters and cannot guarantee that the generated configuration is loadable by Triton.

An example Model Analyzer Config that performs manual sweeping is shown below:

model_repository: /path/to/model/repository/

run_config_search_disable: True
profile_models:
  model_1:
    model_config_parameters:
        max_batch_size: [6, 8]
        dynamic_batching:
            max_queue_delay_microseconds: [200, 300]
        instance_group:
        -
            kind: KIND_GPU
            count: [1, 2]

In this mode, Model Analyzer can sweep through every Triton model configuration parameter available. For a complete list of parameters allowed under model_config_parameters, refer to the Triton Model Configuration. It is your responsibility to make sure that the sweep configuration specified works with your model. For example, changing the max_batch_size range in the above config from [6, 8] to [1] would no longer produce a valid Triton Model Configuration.

The configuration sweep described above covers 8 configs = (2 max_batch_size values) * (2 max_queue_delay_microseconds values) * (2 instance_group count values).
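
The count follows from taking every combination of the swept values. The sketch below reproduces that arithmetic for the manual sweep config above using a Cartesian product; the flat dictionary and the key name instance_count are simplifications chosen for illustration, not how Model Analyzer represents the sweep internally.

```python
from itertools import product

# Swept values taken from the manual sweep config above.
# The flat layout and the "instance_count" key are illustrative assumptions.
sweep = {
    "max_batch_size": [6, 8],
    "max_queue_delay_microseconds": [200, 300],
    "instance_count": [1, 2],
}

names = list(sweep)
# One dict per combination of swept values.
configs = [dict(zip(names, combo)) for combo in product(*sweep.values())]

print(len(configs))
for config in configs:
    print(config)
```

Adding a third value to any one of the lists would raise the total from 8 to 12, so the number of generated configs grows multiplicatively with each swept parameter.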

Examples of Additional Model Config Parameters

As mentioned in the previous section, manual configuration search allows you to sweep over every parameter that can be specified in the Triton model configuration. In this section, we describe some of the parameters that might be of interest for a manual sweep: