Decision Tree Algorithm (ID3)

MLMath.io
4 min read · Feb 20, 2019


This is the 2nd part of the decision tree tutorial. In the last part we talked about the introduction of decision trees, impurity measures, and the CART algorithm for generating the tree.

Part I: Decision Tree CART Algorithm

This tutorial is about another important algorithm used in generating decision trees, known as ID3, an acronym for Iterative Dichotomiser 3.

Formula

Two quantities, entropy and information gain, are what drive the tree building.
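For a data set S with class probabilities p_i, and an attribute A that splits S into subsets S_v, the two standard ID3 formulas are:

Entropy(S) = -Σ p_i · log2(p_i)

Gain(S, A) = Entropy(S) - Σ (|S_v| / |S|) · Entropy(S_v)

where the first sum runs over the classes i and the second over the values v of attribute A.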

So let's get started!

We will use the same weather data set we used to explain the CART algorithm in the previous story.

It has four attributes: outlook, temperature, humidity, and wind.

At each step, the attribute with the highest information gain is selected as a node.

ID3

  1. It uses entropy as its impurity metric.
  2. It is used only for classification problems.

Root node

From the data set, we have:

  1. Number of observations = 14
  2. Number of observations having Decision ‘Yes’ = 9
  3. Probability of ‘Yes’: p(Yes) = 9/14
  4. Number of observations having Decision ‘No’ = 5
  5. Probability of ‘No’: p(No) = 5/14
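Plugging these probabilities into the entropy formula gives the entropy of the whole data set:

Entropy(Decision) = -(9/14)·log2(9/14) - (5/14)·log2(5/14) ≈ 0.940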

As we have four attributes (outlook, temperature, humidity, and wind), we compute the information gain for each of them.

Information Gain on Sunny outlook factor

  1. Number of instances with outlook = sunny is 5
  2. Decision = ‘Yes’: prob(Decision = ‘Yes’ | outlook = sunny) = 2/5
  3. Decision = ‘No’: prob(Decision = ‘No’ | outlook = sunny) = 3/5
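These probabilities give the entropy of the sunny subset:

Entropy(sunny) = -(2/5)·log2(2/5) - (3/5)·log2(3/5) ≈ 0.971

Combining this with the other two outlook subsets (Entropy(overcast) = 0 and Entropy(rainfall) ≈ 0.971, both derived later in this post) yields the gain for outlook:

Gain(Decision, outlook) = 0.940 - (5/14)·0.971 - (4/14)·0 - (5/14)·0.971 ≈ 0.247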

Summary of information gain for all the attributes

  1. Gain(Decision, outlook) = 0.247
  2. Gain(Decision, wind) = 0.048
  3. Gain(Decision, temperature) = 0.029
  4. Gain(Decision, humidity) = 0.151

Outlook has the highest information gain, so it is selected as the first node (the root node).
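These numbers can be reproduced with a short Python sketch. This is a minimal implementation of entropy and information gain, assuming the rows below match the 14-row weather data set from Part I (the classic play-tennis data); `data`, `entropy`, and `gain` are names introduced here, not from the original post:

```python
from collections import Counter
from math import log2

# The classic 14-row weather ("play tennis") data set, assumed to match Part I.
# Columns: outlook, temperature, humidity, wind, decision
data = [
    ("sunny",    "hot",  "high",   "weak",   "no"),
    ("sunny",    "hot",  "high",   "strong", "no"),
    ("overcast", "hot",  "high",   "weak",   "yes"),
    ("rainfall", "mild", "high",   "weak",   "yes"),
    ("rainfall", "cool", "normal", "weak",   "yes"),
    ("rainfall", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny",    "mild", "high",   "weak",   "no"),
    ("sunny",    "cool", "normal", "weak",   "yes"),
    ("rainfall", "mild", "normal", "weak",   "yes"),
    ("sunny",    "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high",   "strong", "yes"),
    ("overcast", "hot",  "normal", "weak",   "yes"),
    ("rainfall", "mild", "high",   "strong", "no"),
]
ATTRS = {"outlook": 0, "temperature": 1, "humidity": 2, "wind": 3}

def entropy(rows):
    """Shannon entropy of the decision labels (last column of each row)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def gain(rows, attr):
    """Information gain from splitting `rows` on attribute `attr`."""
    idx = ATTRS[attr]
    total = len(rows)
    weighted = 0.0
    for value in {row[idx] for row in rows}:
        subset = [row for row in rows if row[idx] == value]
        weighted += (len(subset) / total) * entropy(subset)
    return entropy(rows) - weighted

for attr in ATTRS:
    print(f"Gain(Decision, {attr}) = {gain(data, attr):.3f}")
# outlook ≈ 0.247, temperature ≈ 0.029, humidity ≈ 0.152, wind ≈ 0.048
```

The printed values match the summary above up to rounding. The same two functions also reproduce the per-branch numbers below, e.g. `gain([r for r in data if r[0] == "sunny"], "humidity")` returns ≈ 0.971.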

Information Gain on Temperature under Sunny outlook factor

The same computation is now repeated on the five sunny instances, this time for temperature, humidity, and wind.

Summary of information gain for all attributes under the sunny outlook factor

  1. Gain(sunny, temp) = 0.57
  2. Gain(sunny, humidity) = 0.97
  3. Gain(sunny, wind) = 0.019

We can see that the information gain of the ‘humidity’ attribute is higher than the others, so it becomes the next node under the sunny outlook factor.
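The humidity split is perfect: of the 5 sunny instances, the ‘high’ branch (3 instances) contains only ‘No’ decisions and the ‘normal’ branch (2 instances) only ‘Yes’ decisions, so

Gain(sunny, humidity) = Entropy(sunny) - (3/5)·0 - (2/5)·0 = 0.971 ≈ 0.97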

Information Gain on overcast outlook factor

  1. Total number of observations = 4
  2. prob(Decision = ‘Yes’ | outlook = overcast) = 1
  3. prob(Decision = ‘No’ | outlook = overcast) = 0

Since all the decisions are ‘Yes’:

Entropy(Decision, overcast) = 0
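This follows directly from the entropy formula, with the usual convention 0·log2(0) = 0:

Entropy(Decision, overcast) = -1·log2(1) - 0·log2(0) = 0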

So, if outlook = overcast, then the decision is ‘Yes’. It is a leaf node and cannot be divided further.

Humidity under the Sunny outlook factor

Humidity takes two values: normal and high.

From both tables, you can infer that whenever humidity is high, the decision is ‘No’,

and whenever humidity is normal, the decision is ‘Yes’. So both humidity branches end in leaf nodes.

Information Gain on Rainfall outlook factor

  1. Number of instances with outlook = rainfall is 5
  2. Decision = ‘Yes’: prob(Decision = ‘Yes’ | outlook = rainfall) = 3/5
  3. Decision = ‘No’: prob(Decision = ‘No’ | outlook = rainfall) = 2/5
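These probabilities give the entropy of the rainfall subset:

Entropy(rainfall) = -(3/5)·log2(3/5) - (2/5)·log2(2/5) ≈ 0.971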

Information Gain on Wind under rainfall outlook factor

Splitting the rainfall instances on wind separates the two classes perfectly, so the weighted entropy after the split is zero:

Entropy(rainfall, wind) = 0

Gain(rainfall, wind) = 0.97
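Spelled out, with both wind branches pure (the 3 ‘weak’ instances are all ‘Yes’ and the 2 ‘strong’ instances are all ‘No’):

Gain(rainfall, wind) = 0.971 - (3/5)·0 - (2/5)·0 = 0.971 ≈ 0.97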

Summary of information gain for all attributes under the rainfall outlook factor

  1. Gain(rainfall, wind) = 0.97
  2. Gain(rainfall, temp) = 0.019

So, the wind attribute has the highest information gain, and it is selected as the next node under the rainfall outlook factor.

Wind under the rainfall outlook factor

Wind takes two values: weak and strong.

From the two tables, the decision is always ‘No’ when wind is ‘strong’ and always ‘Yes’ when wind is ‘weak’. So both wind branches are also leaf nodes.

With that, we have explained the generation of an ID3 decision tree step by step.
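Putting the three splits together, here is the finished tree restated as a small Python function (a hypothetical `predict` helper, written only to summarize the splits derived above, not code from the original post):

```python
def predict(outlook, temperature, humidity, wind):
    """The tree derived above:

        outlook?
        ├── sunny    -> humidity? (high -> 'no', normal -> 'yes')
        ├── overcast -> 'yes'
        └── rainfall -> wind?     (strong -> 'no', weak -> 'yes')
    """
    if outlook == "overcast":
        return "yes"  # pure leaf: every overcast observation was 'yes'
    if outlook == "sunny":
        return "yes" if humidity == "normal" else "no"
    if outlook == "rainfall":
        return "yes" if wind == "weak" else "no"
    raise ValueError(f"unknown outlook value: {outlook!r}")
```

Note that temperature is accepted but never consulted: its information gain was never the highest at any split, so ID3 never selected it.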

That’s it. Thank you!
