Gradient boosting

Gradient boosting algorithms, like Random Forest, are built on decision trees. However, gradient boosting takes a different approach to constructing trees than the Random Forest algorithm.

The basic idea behind gradient boosting is to build trees sequentially rather than independently: each tree is grown to correct the errors of its predecessor. First, a simple model is used to predict the target variable. The residuals (differences between the observed values and the predicted values) are then computed. For binary outcomes, we actually have “pseudo-residuals”, which are the differences between the observed outcome and the predicted probability of the positive class (i.e. the predicted probability that the outcome = 1). The next tree is fit to predict the errors made by the previous model. The predictions from this new tree are scaled by a factor (the learning rate) and added to the existing model’s predictions. This process is like taking a step in the direction that minimizes prediction error, hence “gradient” boosting.
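
To make this sequence of steps concrete, here is a minimal sketch of the boosting loop for a continuous outcome, using small regression trees from the rpart package fit to residuals. The simulated data, the learning rate, and the number of trees are illustrative assumptions, not values taken from the code templates.

```r
library(rpart)

# Illustrative simulated data
set.seed(1)
n <- 200
x <- runif(n)
dat <- data.frame(x = x, y = sin(2 * pi * x) + rnorm(n, sd = 0.2))

learning_rate <- 0.1   # shrinkage factor applied to each tree's predictions
n_trees <- 50

# Step 1: start from a simple model (here, the mean of the outcome)
pred <- rep(mean(dat$y), n)

for (m in seq_len(n_trees)) {
  # Step 2: compute residuals of the current combined model
  dat$resid <- dat$y - pred
  # Step 3: fit a small tree to the residuals
  tree_m <- rpart(resid ~ x, data = dat, control = rpart.control(maxdepth = 2))
  # Step 4: add the scaled predictions of the new tree to the ensemble
  pred <- pred + learning_rate * predict(tree_m, dat)
}

mean((dat$y - pred)^2)  # training error shrinks as trees are added
```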

These steps are repeated multiple times. Each new tree is fit to the residuals of the current combined ensemble of previous trees. As trees are added, the model becomes a weighted sum of all the trees. To prevent overfitting, gradient boosting introduces “regularization.” As we saw in Lasso, regularization is a technique that adds some form of penalty to the model, discouraging it from fitting too closely to the noise in the training data (overfitting). One common form of regularization is “shrinkage”, where each tree’s contribution is reduced by multiplying it by a small learning rate. Gradient boosting requires careful tuning of parameters such as tree depth, learning rate, and the number of trees.
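
As a rough preview of the XGBoost implementation introduced below, here is how these tuning knobs map onto its parameter names. The values shown are common starting points for illustration, not the settings used by the code templates.

```r
# Main tuning parameters, using the xgboost package's parameter names
params <- list(
  eta = 0.1,                     # learning rate (shrinkage applied to each tree)
  max_depth = 3,                 # maximum depth of each tree
  objective = "binary:logistic"  # pseudo-residuals for a binary outcome
)
nrounds <- 200                   # number of trees (boosting rounds)
```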

XGBoost (Extreme Gradient Boosting)

The code templates you will use rely on a particular gradient boosting implementation called XGBoost. Here are its distinctive features:

Advantages of XGBoost

Disadvantages of XGBoost

Implementation of XGBoost in R

This method is implemented with the [XGBoost](https://cran.r-project.org/web/packages/xgboost/xgboost.pdf) package in R. The code templates will do this for you.
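
For reference, here is a minimal, illustrative fit with the xgboost package for a binary outcome. The example data (the built-in mtcars set) and the parameter values are assumptions made for this sketch; the code templates handle this setup for you.

```r
library(xgboost)

# Example data: predict a binary outcome (am: 0 = automatic, 1 = manual)
X <- as.matrix(mtcars[, c("mpg", "wt", "hp")])
y <- mtcars$am

dtrain <- xgb.DMatrix(data = X, label = y)

fit <- xgb.train(
  params = list(
    eta = 0.1,                     # learning rate (shrinkage)
    max_depth = 3,                 # depth of each tree
    objective = "binary:logistic"  # binary outcome
  ),
  data = dtrain,
  nrounds = 100                    # number of trees
)

# Predicted probabilities that the outcome = 1
head(predict(fit, X))
```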
