Misclassification Error
Gini Index
Entropy
P(i|t): the fraction of observations at node t that are in class "i"
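The three measures above can be sketched in a few lines of Python. This is only a minimal illustration, assuming each node is described by a list of class counts; the function names are made up for this note and do not come from the slides.

```python
import math

def class_fractions(counts):
    """Return P(i|t) for each class i, given the class counts at node t."""
    total = sum(counts)
    return [c / total for c in counts]

def misclassification_error(counts):
    """1 minus the fraction of the majority class."""
    return 1.0 - max(class_fractions(counts))

def gini(counts):
    """1 minus the sum of squared class fractions."""
    return 1.0 - sum(p * p for p in class_fractions(counts))

def entropy(counts):
    """Sum of -p * log2(p); classes with p = 0 are skipped (0 * log2(0) is taken as 0)."""
    return sum(-p * math.log2(p) for p in class_fractions(counts) if p > 0)
```

All three are 0 for a pure node and largest when the classes are evenly mixed.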
Slide 13
MC_a = 25%
MC_b = 40%
MC_c = 50%; either way we misclassify 50% (*note: take 25% from each branch*)
First split: we split on A because it has the lowest misclassification error.
Gini Index
Gini_a1 = 0.3444
Gini_a2 = 0.489
T = 2+, 3-:  1 - (2/5)^2 - (3/5)^2 = 0.48,  weighted: 5/9 * 0.48 = 0.2667
F = 2+, 2-:  1 - (2/4)^2 - (2/4)^2 = 0.5,   weighted: 4/9 * 0.5 = 0.2222
Sum: 0.2667 + 0.2222 = 0.489 (Gini_a2)
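The weighted Gini arithmetic for the T/F branches can be checked with a short script. This is a sketch only: the helper names are hypothetical, and the counts are the ones written in the notes (T = 2+, 3- and F = 2+, 2-).

```python
def gini(counts):
    """Gini index of one node: 1 minus the sum of squared class fractions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def weighted_gini(branches):
    """Gini of a split: each branch's Gini weighted by its share of the observations."""
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * gini(b) for b in branches)

# Counts from the notes: T branch has 2+ and 3-, F branch has 2+ and 2-.
print(round(weighted_gini([[2, 3], [2, 2]]), 3))  # 0.489
```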
The Gini index is the better criterion on the first split, whereas misclassification error gives the same result on the first split.
Entropy
log_2 P(j|t) = log P(j|t) / log 2
+:4 +:0
-:3 -:3
7/10 * ( -(4/7)*log_2(4/7) - (3/7)*log_2(3/7) )
+
3/10 * ( -0*log_2(0) - 1*log_2(1) )      (0*log_2(0) is taken as 0)
---------------------------------------
0.69
Split on A: Entropy = 0.69, InfoGain: 0.28
Split on B: Entropy = 0.71, InfoGain: 0.26
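The 0.69 and 0.28 figures can be reproduced with a small sketch. It assumes the parent node holds 4+ and 6- (the sum of the two branches listed in the notes); the function names are invented for illustration.

```python
import math

def entropy(counts):
    """Entropy in bits; empty classes are skipped (0 * log2(0) is taken as 0)."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Return (gain, weighted child entropy) for a split of `parent` into `children`."""
    n = sum(parent)
    weighted = sum(sum(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted, weighted

# Branch counts from the notes: (4+, 3-) and (0+, 3-); parent assumed to be their sum.
gain, weighted = information_gain([4, 6], [[4, 3], [0, 3]])
print(round(weighted, 2), round(gain, 2))  # 0.69 0.28
```

Information gain is just the parent's entropy minus the weighted entropy of the branches, so the split with the lowest weighted entropy also has the highest gain.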