Friday, 3 July 2020

How Random Forest Works?


Random forest is a supervised learning algorithm that works for both classification and regression problems.

Random forest is an ensemble classifier built from multiple decision trees. Ensemble models combine the results of several individual models to produce a more robust prediction.
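As a quick illustration, here is a minimal classification sketch using scikit-learn's RandomForestClassifier. The Iris dataset and the hyperparameter values are illustrative choices, not prescriptions.

    # Minimal random forest classification example (scikit-learn).
    # The dataset and hyperparameter values are illustrative.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # n_estimators is the number of decision trees in the forest
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)
    print("Test accuracy:", clf.score(X_test, y_test))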

Applications
  • Credit card fraud detection
  • Consumer finance surveys
  • Identification of disease in patients using classification
  • Identifying customer churn
How Does Random Forest Work?
  1. Randomly select n features out of the total N features, where n << N
  2. For node d, calculate the best split point among the n selected features
  3. Split the node into two daughter nodes using the best split
  4. Repeat the first three steps until the tree is fully grown (or a stopping criterion, such as a maximum number of nodes, is reached)
  5. Build the forest by repeating steps 1 to 4 D times, where D is the number of trees to be constructed (a code sketch follows this list)
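The steps above can be sketched in code. The snippet below is a simplified illustration, not a production implementation: it leans on scikit-learn's DecisionTreeClassifier, whose max_features option performs the per-node random feature selection of steps 1 and 2, and adds the usual bootstrap sampling and majority vote. The helper names build_forest and predict_forest, and the default values for D and n, are hypothetical.

    import numpy as np
    from scipy import stats
    from sklearn.tree import DecisionTreeClassifier

    def build_forest(X, y, D=50, n="sqrt", seed=0):
        """Grow D trees; each splits on a random subset of n features per node."""
        rng = np.random.default_rng(seed)
        forest = []
        for _ in range(D):
            # Bootstrap sample: draw len(X) rows with replacement
            idx = rng.integers(0, len(X), size=len(X))
            # max_features=n -> steps 1-2: best split among n random features
            tree = DecisionTreeClassifier(max_features=n)
            tree.fit(X[idx], y[idx])  # steps 3-4: grow the tree fully
            forest.append(tree)
        return forest

    def predict_forest(forest, X):
        """Majority vote across all trees' predictions."""
        votes = np.array([tree.predict(X) for tree in forest])  # shape (D, len(X))
        return stats.mode(votes, axis=0, keepdims=False).mode

In practice you would simply use sklearn.ensemble.RandomForestClassifier, which implements the same idea with many refinements; the sketch only makes the per-tree loop explicit.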
Advantages
  • Reduces overfitting compared to a single decision tree, which helps improve accuracy
  • Works on both classification and regression problems (see the regression sketch after this list)
  • Works for both continuous and categorical data
  • Automatically handles missing values in the data
  • No need to normalize the data, since tree splits compare feature thresholds rather than distances
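To illustrate the regression side and the lack of a normalization requirement, here is a small sketch with RandomForestRegressor on synthetic data; the dataset and parameter values are again illustrative.

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    # Synthetic regression data; features are deliberately left unscaled,
    # since tree-based splits use thresholds rather than distances.
    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    reg = RandomForestRegressor(n_estimators=100, random_state=0)
    reg.fit(X_train, y_train)
    print("R^2 on test set:", reg.score(X_test, y_test))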
Disadvantages
  • Requires high computational power, since multiple trees are built during training
  • Training time is higher compared to a single decision tree
