In machine learning, while training an algorithm, different parameters need to
be passed to get the best fit and accuracy from the model. Selecting the values
of the parameters involved in training the model, such as n_estimators and
max_depth, is called hyperparameter tuning.
For example, a decision tree exposes multiple tunable parameters, as shown
below:
tree.DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=None,
    min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0,
    max_features=None, random_state=None, max_leaf_nodes=None,
    min_impurity_decrease=0.0, min_impurity_split=None,
    class_weight=None, presort=False)
We need to choose the combination of these parameters that gives the best
accuracy for the model. There are two common methods of hyperparameter tuning:
1) Grid Search
This is one of the most basic methods of hyperparameter tuning: every possible
combination of the candidate parameter values is evaluated, and the best
combination is chosen for model building.
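The exhaustive search described above can be sketched with scikit-learn's GridSearchCV; the parameter values in the grid below are illustrative choices, not recommendations, and the Iris dataset is used only as a stand-in:

```python
# A minimal sketch of grid search, assuming scikit-learn is installed.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate values for each parameter (illustrative, not tuned recommendations).
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [2, 4, 6],
    "min_samples_split": [2, 5],
}

# GridSearchCV fits the model on every combination in the grid
# (2 * 3 * 2 = 12 here), each evaluated with 5-fold cross-validation.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # combination with the highest cross-validated score
print(search.best_score_)
```

After fitting, `best_params_` holds the winning combination and `best_estimator_` is a model refit on the full data with those values.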
2) Random Search
In this method, randomly sampled values are passed for the different
parameters, and the best of the sampled combinations is chosen.
This method consumes less time than grid search because, instead of trying
every possible combination, only a fixed number of randomly drawn combinations
is evaluated.
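The random-sampling approach can be sketched with scikit-learn's RandomizedSearchCV; again, the distributions below and the Iris dataset are illustrative assumptions:

```python
# A minimal sketch of random search, assuming scikit-learn (and its scipy
# dependency) are installed.
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Distributions to sample from; randint(low, high) draws integers in [low, high).
param_distributions = {
    "criterion": ["gini", "entropy"],
    "max_depth": randint(2, 10),
    "min_samples_split": randint(2, 10),
}

# Only n_iter randomly sampled combinations are tried, not the full grid,
# which is why random search is cheaper than grid search.
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=5,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
```

Here 10 combinations are evaluated regardless of how large the underlying search space is; increasing `n_iter` trades time for a more thorough search.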