How to Configure Algorithms
This page describes hyperparameter (HP) tuning algorithms that Katib supports and how to configure them.
HP Tuning Algorithms
Katib currently supports several search algorithms for NAS:
- Grid Search
- Random Search
- Bayesian Optimization
- Hyperband
- Tree of Parzen Estimators
- Multivariate TPE
- Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
- Sobol Quasirandom Sequence
- Population Based Training
Grid Search
The algorithm name in Katib is grid
.
Grid sampling is useful when all variables are discrete (as opposed to continuous) and the number of possibilities is low. A grid search performs an exhaustive combinatorial search over all possibilities, making the search process extremely long even for medium sized problems.
Katib uses the Optuna optimization framework for its grid search.
Setting name | Description | Example |
---|---|---|
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
Random search
The algorithm name in Katib is random
.
Random sampling is an alternative to grid search and is used when the number of discrete variables to optimize is large and the time required for each evaluation is long. When all parameters are discrete, random search performs sampling without replacement. Random search is therefore the best algorithm to use when combinatorial exploration is not possible. If the number of continuous variables is high, you should use quasi random sampling instead.
Katib uses the Hyperopt, Goptuna or Optuna optimization framework for its random search.
Katib supports the following algorithm settings:
Setting name | Description | Example |
---|---|---|
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
Bayesian optimization
The algorithm name in Katib is bayesianoptimization
.
The Bayesian optimization method uses Gaussian process regression to model the search space. This technique calculates an estimate of the loss function and the uncertainty of that estimate at every point in the search space. The method is suitable when the number of dimensions in the search space is low. Since the method models both the expected loss and the uncertainty, the search algorithm converges in a few steps, making it a good choice when the time to complete the evaluation of a parameter configuration is long.
Katib uses the
Scikit-Optimize optimization framework
for its Bayesian search. Scikit-Optimize is also known as skopt
.
Katib supports the following algorithm settings:
Setting Name | Description | Example |
---|---|---|
base_estimator | [“GP”, “RF”, “ET”, “GBRT” or sklearn regressor, default=“GP”]:
Should inherit from sklearn.base.RegressorMixin .
The predict method should have an optional
return_std argument, which returns
std(Y | x) along with E[Y | x] . If
base_estimator is one of
[“GP”, “RF”, “ET”, “GBRT”], the system uses a default surrogate model
of the corresponding type. Learn more information in the
skopt
documentation. | GP |
n_initial_points | [int, default=10]: Number of evaluations of func with
initialization points before approximating it with
base_estimator . Points provided as x0 count
as initialization points.
If len(x0) < n_initial_points , the
system samples additional points at random. Learn more information in the
skopt
documentation. | 10 |
acq_func | [string, default="gp_hedge" ]: The function to
minimize over the posterior distribution. Learn more information in the
skopt
documentation. | gp_hedge |
acq_optimizer | [string, “sampling” or “lbfgs”, default=“auto”]: The method to
minimize the acquisition function. The system updates the fit model
with the optimal value obtained by optimizing acq_func
with acq_optimizer . Learn more information in the
skopt
documentation. | auto |
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
Hyperband
The algorithm name in Katib is hyperband
.
Katib supports the Hyperband optimization framework. Instead of using Bayesian optimization to select configurations, Hyperband focuses on early stopping as a strategy for optimizing resource allocation and thus for maximizing the number of configurations that it can evaluate. Hyperband also focuses on the speed of the search.
Tree of Parzen Estimators (TPE)
The algorithm name in Katib is tpe
.
Katib uses the Hyperopt, Goptuna or Optuna optimization framework for its TPE search.
This method provides a forward and reverse gradient-based search.
Katib supports the following algorithm settings:
Setting name | Description | Example |
---|---|---|
n_EI_candidates | [int]: Number of candidate samples used to calculate the expected improvement. | 25 |
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
gamma | [float]: The threshold to split between l(x) and g(x), check equation 2 in this Paper. Value must be in (0, 1) range. | 0.25 |
prior_weight | [float]: Smoothing factor for counts, to avoid having 0 probability. Value must be > 0. | 1.1 |
Multivariate TPE
The algorithm name in Katib is multivariate-tpe
.
Katib uses the Optuna optimization framework for its Multivariate TPE search.
Multivariate TPE is improved version of independent (default) TPE. This method finds dependencies among hyperparameters in search space.
Katib supports the following algorithm settings:
Setting name | Description | Example |
---|---|---|
n_ei_candidates | [int]: Number of Trials used to calculate the expected improvement. | 25 |
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
n_startup_trials | [int]: Number of initial Trials for which the random search algorithm generates hyperparameters. | 5 |
Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
The algorithm name in Katib is cmaes
.
Katib uses the Goptuna or Optuna optimization framework for its CMA-ES search.
The Covariance Matrix Adaptation Evolution Strategy is a stochastic derivative-free numerical optimization algorithm for optimization problems in continuous search spaces. You can also use IPOP-CMA-ES and BIPOP-CMA-ES, variant algorithms for restarting optimization when converges to local minimum.
Katib supports the following algorithm settings:
Setting name | Description | Example |
---|---|---|
random_state | [int]: Set random_state to something other than None
for reproducible results. | 10 |
sigma | [float]: Initial standard deviation of CMA-ES. | 0.001 |
restart_strategy | [string, "none", "ipop", or "bipop", default="none"]: Strategy for restarting CMA-ES optimization when converges to a local minimum. | "ipop" |
Sobol Quasirandom Sequence
The algorithm name in Katib is sobol
.
Katib uses the Goptuna optimization framework for its Sobol’s quasirandom search.
The Sobol’s quasirandom sequence is a low-discrepancy sequence. And it is known that Sobol’s quasirandom sequence can provide better uniformity properties.
Population Based Training
The algorithm name in Katib is pbt
.
Review the population based training paper for more details about the algorithm.
The PBT service requires a Persistent Volume Claim with RWX access mode to share resources between Suggestion and Trials. Currently, Katib Experiments should have resumePolicy: FromVolume
to run the PBT algorithm. Learn more about resume policies in this guide.
Katib supports the following algorithm settings:
Setting name | Description | Example |
---|---|---|
suggestion_trial_dir | The location within the Trial container where checkpoints are saved | /var/log/katib/checkpoints/ |
n_population | Number of Trial seeds per generation | 40 |
resample_probability | null (default): perturbs the hyperparameter by 0.8 or 1.2. 0-1: resamples the original distribution by the specified probability | 0.3 |
truncation_threshold | Exploit threshold for pruning low performing seeds | 0.4 |
Use Custom Algorithm in Katib
You can add an HP tuning algorithm to Katib yourself. The design of Katib follows the ask-and-tell
pattern:
- Ask for a new set of parameters
- Walk to the Experiment and program in the new parameters
- Observe the outcome of running the Experiment
- Walk back to your laptop and tell the optimizer about the outcome 1. go to step 1
When an Experiment is created, one algorithm service as Kubernetes Deployment will be created.
Then Katib asks for new sets of parameters via GetSuggestions
gRPC call. After that, Katib
creates new Trials according to the sets and observe the outcome. When the Trials are finished,
Katib tells the metrics of the finished Trials to the algorithm, and ask another new sets.
Create a new Algorithm Service
The new algorithm needs to implement Suggestion service defined in api.proto.
One sample algorithm looks like:
from pkg.apis.manager.v1beta1.python import api_pb2
from pkg.apis.manager.v1beta1.python import api_pb2_grpc
from pkg.suggestion.v1beta1.internal.search_space import HyperParameter, HyperParameterSearchSpace
from pkg.suggestion.v1beta1.internal.trial import Trial, Assignment
from pkg.suggestion.v1beta1.hyperopt.base_service import BaseHyperoptService
from pkg.suggestion.v1beta1.internal.base_health_service import HealthServicer
# Inherit SuggestionServicer and implement GetSuggestions.
class HyperoptService(
api_pb2_grpc.SuggestionServicer, HealthServicer):
def ValidateAlgorithmSettings(self, request, context):
# Optional, it is used to validate algorithm settings defined by users.
pass
def GetSuggestions(self, request, context):
# Convert the Experiment in GRPC request to the search space.
# search_space example:
# HyperParameterSearchSpace(
# goal: MAXIMIZE,
# params: [HyperParameter(name: param-1, type: INTEGER, min: 1, max: 5, step: 0),
# HyperParameter(name: param-2, type: CATEGORICAL, list: cat1, cat2, cat3),
# HyperParameter(name: param-3, type: DISCRETE, list: 3, 2, 6),
# HyperParameter(name: param-4, type: DOUBLE, min: 1, max: 5, step: )]
# )
search_space = HyperParameterSearchSpace.convert(request.experiment)
# Convert the Trials in GRPC request to the Trials in algorithm side.
# Trials example:
# [Trial(
# assignment: [Assignment(name=param-1, value=2),
# Assignment(name=param-2, value=cat1),
# Assignment(name=param-3, value=2),
# Assignment(name=param-4, value=3.44)],
# target_metric: Metric(name="metric-2" value="5643"),
# additional_metrics: [Metric(name=metric-1, value=435),
# Metric(name=metric-3, value=5643)],
# Trial(
# assignment: [Assignment(name=param-1, value=3),
# Assignment(name=param-2, value=cat2),
# Assignment(name=param-3, value=6),
# Assignment(name=param-4, value=4.44)],
# target_metric: Metric(name="metric-2" value="3242"),
# additional_metrics: [Metric(name=metric=1, value=123),
# Metric(name=metric-3, value=543)],
trials = Trial.convert(request.trials)
#--------------------------------------------------------------
# Your code here
# Implement the logic to generate new assignments for the given current request number.
# For example, if request.current_request_number is 2, you should return:
# [
# [Assignment(name=param-1, value=3),
# Assignment(name=param-2, value=cat2),
# Assignment(name=param-3, value=3),
# Assignment(name=param-4, value=3.22)
# ],
# [Assignment(name=param-1, value=4),
# Assignment(name=param-2, value=cat4),
# Assignment(name=param-3, value=2),
# Assignment(name=param-4, value=4.32)
# ],
# ]
list_of_assignments = your_logic(search_space, trials, request.current_request_number)
#--------------------------------------------------------------
# Convert list_of_assignments to
return api_pb2.GetSuggestionsReply(
trials=Assignment.generate(list_of_assignments)
)
Build Docker Image for Algorithm Service
You should build Docker image for your Algorithm service, for that add a new Docker image under
cmd/suggestion
, for example: cmd/suggestion/hyperopt.
The new gRPC server should serve in port 6789.
After that you can build Docker image from your algorithm:
docker build . -f cmd/suggestion/<PATH_TO_DOCKER> -t <DOCKER_IMAGE>
Update the Katib Config with
Update the Katib config with the new algorithm entity:
runtime:
suggestions:
- algorithmName: random
image: docker.io/kubeflowkatib/suggestion-hyperopt:$(KATIB_VERSION)
- algorithmName: tpe
image: docker.io/kubeflowkatib/suggestion-hyperopt:$(KATIB_VERSION)
+ - algorithmName: <new-algorithm-name>
+ image: <DOCKER_IMAGE>
Contribute the Algorithm to Katib
If you want to contribute the algorithm to Katib, you could add unit test and/or e2e test for it in the CI and submit a PR.
Add Unit Tests for the Algorithm
Here is an example test_hyperopt_service.py:
import grpc
import grpc_testing
import unittest
from pkg.apis.manager.v1beta1.python import api_pb2_grpc
from pkg.apis.manager.v1beta1.python import api_pb2
from pkg.suggestion.v1beta1.hyperopt.service import HyperoptService
class TestHyperopt(unittest.TestCase):
def setUp(self):
servicers = {
api_pb2.DESCRIPTOR.services_by_name['Suggestion']: HyperoptService()
}
self.test_server = grpc_testing.server_from_dictionary(
servicers, grpc_testing.strict_real_time())
if __name__ == '__main__':
unittest.main()
You can setup the gRPC server using grpc_testing
, then define your own test cases.
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.