What is AdaBoost?

AdaBoost (Adaptive Boosting) is an ensemble classification algorithm in machine learning that combines many weak learners into one strong classifier. The weak learners are usually decision trees restricted to a single level, called decision stumps: each stump splits the data on just one feature. We train a sequence of stumps, each one trying to correct the mistakes of the previous ones.

Stump tree

Overview

In AdaBoost, we first assign an equal weight to every data point in the dataset. After training a model, the points it misclassifies are assigned a higher weight, so the next model gives those points more importance. We keep training models this way until the ensemble no longer misclassifies any points (or a maximum number of models is reached). Have a look at the flow diagram of AdaBoost.

Flow of AdaBoost

This is how one model's weakness (its misclassifications) is passed on to the next model, which then gives more importance to the misclassified points.

Now let us discuss the algorithm in detail with an example.

Working of the algorithm

Before diving into the depth of the working of the algorithm, let us consider the following dataset.

| No. | Gender | Age | Income (In dollars) | Sickness |
|-----|--------|-----|---------------------|----------|
| 1   | Male   | 41  | 400                 | 1        |
| 2   | Male   | 54  | 300                 | 0        |
| 3   | Female | 42  | 250                 | 0        |
| 4   | Female | 40  | 600                 | 1        |
| 5   | Male   | 46  | 500                 | 1        |

We have selected a small dataset because it makes the working of the algorithm easy to follow. The steps of the algorithm are as follows.

  • In the first step, we assign an equal weight to each data point. The formula for the sample weight is as follows:

  $$w_i = \frac{1}{N}$$

  Here $N = 5$, so the sample weight in this case is $\frac{1}{5}$.
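This initialization step can be sketched in a few lines of Python (a minimal illustration of the formula, not code from the original article):

```python
# Assign equal sample weights to a dataset of N points.
N = 5  # number of rows in our toy dataset
sample_weights = [1 / N] * N

print(sample_weights)  # every point starts with weight 1/5 = 0.2
```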

  • Next, we need a criterion for choosing the split. For this, we compute the Gini index (the probability of a feature classifying a point incorrectly) for each feature in the dataset. Our first stump is built on the feature with the lowest Gini index value.

  • Let us say that the Gini index for gender is the smallest compared to age and income. So our first stump splits on gender.
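As a sketch of how such a Gini index is computed, here is the weighted Gini impurity for the gender split on our toy data (the helper names `gini` and `gini_for_split` are our own, not from the article):

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)  # fraction of class 1
    return 1 - p1 ** 2 - (1 - p1) ** 2

def gini_for_split(groups):
    """Weighted Gini index over the groups produced by a split."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * gini(g) for g in groups)

# Splitting the toy dataset on gender:
male_sickness = [1, 0, 1]   # sickness labels of rows 1, 2, 5
female_sickness = [0, 1]    # sickness labels of rows 3, 4
print(round(gini_for_split([male_sickness, female_sickness]), 4))
```

The same function applied to candidate splits on age and income would let us pick the stump with the lowest value.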

  • We can calculate the importance (the amount of say) of the classifier, $\alpha$, with the help of the following equation:

  $$\alpha = \frac{1}{2}\ln\left(\frac{1 - \text{error}}{\text{error}}\right)$$

  • Let us say that the model misclassified one data point. The error is therefore $\frac{1}{5}$, and $\alpha$ is calculated as:

  $$\alpha = \frac{1}{2}\ln\left(\frac{1 - \frac{1}{5}}{\frac{1}{5}}\right) = \frac{1}{2}\ln(4) \approx 0.69$$

  The value of $\alpha$ is positive whenever the error is below $0.5$, and it grows as the error shrinks; here it is about $0.69$.
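The calculation of $\alpha$ can be verified numerically (a small sketch, not code from the article):

```python
import math

error = 1 / 5  # one of five points misclassified
alpha = 0.5 * math.log((1 - error) / error)  # amount of say

print(round(alpha, 2))  # about 0.69
```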

  • It is necessary to calculate the error and the value of $\alpha$ because we use them to update the weights for the next model, so that it does not repeat the same mistakes.

  • Weights in AdaBoost are updated with the help of the equation given below:

  $$w_{\text{new}} = w_{\text{old}} \times e^{\pm\alpha}$$

  • The exponent is $+\alpha$ if the point was misclassified and $-\alpha$ if it was classified correctly, so misclassified points gain weight and correctly classified points lose weight.

  • Putting in the previous weight of $\frac{1}{5}$ and the calculated $\alpha \approx 0.69$, a misclassified point gets a weight of $0.2 \times e^{0.69} \approx 0.3988$, while a correctly classified point gets $0.2 \times e^{-0.69} \approx 0.1004$.

  • The updated weights are given as:

| No. | Gender | Age | Income (In dollars) | Sickness | Previous sample weight | Updated sample weight |
|-----|--------|-----|---------------------|----------|------------------------|-----------------------|
| 1   | Male   | 41  | 400                 | 1        | 1/5                    | 0.1004                |
| 2   | Male   | 54  | 300                 | 0        | 1/5                    | 0.1004                |
| 3   | Female | 42  | 250                 | 0        | 1/5                    | 0.1004                |
| 4   | Female | 40  | 600                 | 1        | 1/5                    | 0.3988                |
| 5   | Male   | 46  | 500                 | 1        | 1/5                    | 0.1004                |
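The weight update can be reproduced numerically; this sketch uses the rounded $\alpha = 0.69$ from the text, so the results match the table up to rounding:

```python
import math

alpha = 0.69   # amount of say of the first stump (rounded)
w_old = 1 / 5  # previous sample weight of every point

w_wrong = w_old * math.exp(alpha)    # misclassified point (row 4) gains weight
w_right = w_old * math.exp(-alpha)   # correctly classified points lose weight

# Close to the 0.3988 and 0.1004 in the table; tiny differences
# come from rounding alpha before exponentiating.
print(round(w_wrong, 4), round(w_right, 4))
```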

  • We need to normalize the sample weights so that their sum is 1. Each weight is divided by the total, which here is $4 \times 0.1004 + 0.3988 = 0.8004$.

| No. | Gender | Age | Income (In dollars) | Sickness | Previous sample weight | Normalized sample weight |
|-----|--------|-----|---------------------|----------|------------------------|--------------------------|
| 1   | Male   | 41  | 400                 | 1        | 1/5                    | 0.1254                   |
| 2   | Male   | 54  | 300                 | 0        | 1/5                    | 0.1254                   |
| 3   | Female | 42  | 250                 | 0        | 1/5                    | 0.1254                   |
| 4   | Female | 40  | 600                 | 1        | 1/5                    | 0.4982                   |
| 5   | Male   | 46  | 500                 | 1        | 1/5                    | 0.1254                   |
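The normalization step can also be checked in a few lines (again a sketch using the rounded $\alpha = 0.69$, so the values match the table up to rounding):

```python
import math

alpha = 0.69
# Updated weights: rows 1-3 and 5 were correct, row 4 was misclassified.
w = [0.2 * math.exp(-alpha)] * 3 + [0.2 * math.exp(alpha), 0.2 * math.exp(-alpha)]

total = sum(w)                          # about 0.80
normalized = [wi / total for wi in w]   # divide each weight by the total

print([round(wi, 4) for wi in normalized])
print(round(sum(normalized), 4))  # the normalized weights sum to 1
```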

  • We will pass this information to the next model, which will train based on the updated sample weights.

  • We will keep iterating these steps — train a stump, compute $\alpha$, update and normalize the weights — and we stop when the ensemble can classify all the points correctly (or a set number of stumps has been trained).
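Putting it all together: the article itself shows no code, but as a practical sketch, scikit-learn's `AdaBoostClassifier` implements this boosting procedure with decision stumps as the default weak learners. Here it is applied to the toy dataset, with gender encoded numerically (an encoding we chose for illustration):

```python
from sklearn.ensemble import AdaBoostClassifier

# Toy dataset from the article: gender (Male=0, Female=1), age, income.
X = [[0, 41, 400],
     [0, 54, 300],
     [1, 42, 250],
     [1, 40, 600],
     [0, 46, 500]]
y = [1, 0, 0, 1, 1]  # sickness labels

# By default, AdaBoostClassifier boosts one-level decision trees (stumps).
model = AdaBoostClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

print(model.score(X, y))  # training accuracy on the toy data
```

On this tiny dataset income alone separates the classes, so the ensemble fits the training data perfectly; real datasets need a train/test split to judge the model.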

Conclusion

AdaBoost is a powerful model for binary classification. Combining many weak learners reduces the risk of underfitting, and keeping each learner a simple stump limits the complexity of any single model, though the ensemble can still overfit noisy data. If you need to reinforce your understanding of AdaBoost, revise the concepts of decision trees and random forests.

Q: How does AdaBoost assign weights to the training instances?

A) It assigns equal weights to all instances.

B) It assigns higher weights to instances that are misclassified.

C) It assigns higher weights to instances that are correctly classified.

D) It assigns weights based on the feature importance.


Copyright ©2025 Educative, Inc. All rights reserved