Stanford Practical Machine Learning-Boosting

This chapter covers Boosting, another model-combination approach. The idea is to train multiple weak models sequentially and combine them into one strong model; the main goal is to reduce bias through the combined decisions of multiple models. Weak models -> strong model.

Boosting

Note the contrast with the earlier Bagging: here the models are trained sequentially, and each new model specifically targets the examples the current ensemble predicts poorly. This makes boosting hard to parallelize, so large base models are at a disadvantage.

  • Learn n weak learners sequentially, combine to reduce model bias
  • At step t, repeat:
    • Evaluate the existing learners’ errors $\epsilon_t$
    • Train a weak learner $\hat{f}_t$ that focuses on the wrongly predicted examples
      • AdaBoost: re-sample data according to $\epsilon_t$ (see the sketch after this list)
      • Gradient boosting: train the learner to predict $\epsilon_t$
    • Additively combine the existing weak learners with $\hat{f}_t$
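
A minimal sketch of the AdaBoost idea, using re-weighting (the common way to implement "re-sample according to $\epsilon_t$") and assuming scikit-learn's DecisionTreeClassifier as the weak learner; the function names and numeric details here are illustrative, not from the slides:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_learners=50):
    """y: NumPy array of labels in {-1, +1}."""
    m = len(y)
    w = np.full(m, 1.0 / m)  # start with uniform example weights
    learners, alphas = [], []
    for _ in range(n_learners):
        stump = DecisionTreeClassifier(max_depth=1)  # weak learner
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()                     # weighted error (epsilon_t)
        alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
        w *= np.exp(-alpha * y * pred)               # up-weight the mistakes
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    # Weighted vote of all weak learners
    scores = sum(a * h.predict(X) for a, h in zip(alphas, learners))
    return np.sign(scores)
```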

Gradient Boosting

In effect, the overall model grows by one new model per iteration: each round fits a learner to what the current ensemble still gets wrong.

  • Supports arbitrary differentiable loss
  • $H_t(x)$: output of the combined model at timestep t, with $H_1(x) = 0$
  • For each step t, repeat:
    • Train a new learner $\hat{f}_t$ on the residuals: $\{(x_i,\, y_i - H_t(x_i))\}_{i = 1,\dots,m}$
    • Combine: $H_{t+1}(x) = H_t(x) + \eta \hat{f}_t(x)$
      • The learning rate $\eta$ is the shrinkage parameter for regularization
  • For the MSE loss $L = \frac{1}{2}(H(x) - y)^2$, the residual equals the negative gradient: $y - H(x) = -\frac{\partial L}{\partial H(x)}$ (worked out below)
    • Other boosting algorithms (e.g. AdaBoost) can also be derived this way, as gradient descent in function space with the residual replaced by the negative gradient of the corresponding loss
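
Spelling out that differentiation step for the MSE loss:

$$
\frac{\partial L}{\partial H(x)} = \frac{\partial}{\partial H(x)} \left[ \frac{1}{2}\big(H(x) - y\big)^2 \right] = H(x) - y
\quad\Longrightarrow\quad
-\frac{\partial L}{\partial H(x)} = y - H(x)
$$

So fitting $\hat{f}_t$ to the residuals is exactly one gradient-descent step on $L$, taken in the space of functions rather than parameters.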

Gradient Boosting Code

```python
import numpy as np
from sklearn.base import clone

class GradientBoosting:
    def __init__(self, base_learner, n_learners, learning_rate):
        self.learners = [clone(base_learner) for _ in range(n_learners)]
        self.lr = learning_rate

    def fit(self, X, y):
        # Work on a float copy so the in-place update below is valid
        residual = np.array(y, dtype=float)
        for learner in self.learners:
            # Each learner fits what the ensemble so far still gets wrong,
            # and its contribution is shrunk by the learning rate.
            learner.fit(X, residual)
            residual -= self.lr * learner.predict(X)

    def predict(self, X):
        # Ensemble output: shrunken sum of all weak learners' predictions
        preds = [learner.predict(X) for learner in self.learners]
        return np.array(preds).sum(axis=0) * self.lr
```
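
A quick usage sketch on synthetic data, assuming scikit-learn's DecisionTreeRegressor as the base learner (both the data and the hyperparameters are illustrative choices, not from the slides):

```python
from sklearn.tree import DecisionTreeRegressor

X = np.random.randn(200, 5)
y = 2.0 * X[:, 0] + np.sin(X[:, 1])  # synthetic regression target

gb = GradientBoosting(DecisionTreeRegressor(max_depth=2),
                      n_learners=100, learning_rate=0.1)
gb.fit(X, y)
print(np.mean((gb.predict(X) - y) ** 2))  # training MSE
```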

Gradient Boosting Decision Trees (GBDT)

Note that boosting requires weak models: a strong model would fit the training data right away, leaving nothing for later rounds to improve on. How do we make a learner weak? For example, use a decision tree with a small depth; an unrestricted decision tree is itself a strong model.

  • Use decision tree as the weak learner
    • Regularize by a small max_depth and by randomly sampling features (see the sketch after this list)
  • Sequentially constructing trees runs slow
    • Popular libraries use accelerated algorithms, e.g. XGBoost, LightGBM
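
A minimal sketch of those two regularizers, shown here with scikit-learn's GradientBoostingRegressor for illustration (XGBoost and LightGBM expose equivalent knobs under different names):

```python
from sklearn.ensemble import GradientBoostingRegressor

# Shallow trees keep each learner weak; sampling a fraction of the
# features at each split decorrelates the trees.
gbdt = GradientBoostingRegressor(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,       # weak learner: shallow tree
    max_features=0.5,  # consider a random half of the features per split
)
gbdt.fit(X, y)  # reusing X, y from the example above
```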
