Thank you for your support.

1 min readFeb 8, 2020

Suppose we have a training set with an attribute “age” which contains following values.

Age : 10, 11, 16, 18, 20, 35

Now at a node, the algorithm will consider following possible splitting

Age <=10 & Age>10
Age <=11 & Age>11
Age <=16 & Age>16
Age <=18 & Age>18
Age <=20 & Age>20

You can see that if there are N possible values, we would have to consider N-1 possible splits.
And note that we do not choose the mid-point between values as the splitting threshold.

other approach like Binning is also used.lets understand from above example.

sort the values
Decide the number of bin. here consider numbers of bin =3

bin1 = [10,11], bin2 = [16,18], bin3 = [20,35]

basically we are categorising a real value data to look like a categorical data. But it is not at all categorisation as it changes after every time you perform splitting and values also changes.

Written by MLMath.io

Responses (1)