Consider the data set given below.
Suppose we have to classify a new data point:
D15 (O=Overcast, T=Cool, H=High, W=Strong)
Then the Naive Bayes score for P(No | Overcast, Cool, High, Strong) is
P(No) * P(Overcast|No) * P(Cool|No) * P(High|No) * P(Strong|No)
= (5/14) * 0 * (1/5) * (4/5) * (3/5)
which evaluates to 0.
I have read that this situation calls for smoothing, but I couldn't figure out why we need to smooth this data or how to do it.
Also, does smoothing give better predictions?
Could you please explain to me how Laplace smoothing works in this case?
I found some articles on Google, but none of them explained it in a plain, simple manner that a beginner like me could easily understand.
Best Answer
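Actually, the above answer is slightly incorrect: when we add 1 to a zero count in the numerator, we should also add 1 to the denominator, dividing by count(Y)+1, where count(Y) is the number of training rows in class Y (5 for No). Applying the same adjustment to every factor gives: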
$\frac{5}{14} \cdot \frac{0+1}{5+1} \cdot \frac{1+1}{5+1} \cdot \frac{4+1}{5+1} \cdot \frac{3+1}{5+1} \approx 0.011$
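To make the arithmetic concrete, here is a minimal Python sketch of the same product. It is only an illustration: the counts are read off the factors in the question, and the names `score`, `COUNTS_NO` and `N_VALUES` are made up for this example. It also shows the more common textbook form of add-one smoothing, which adds the number of distinct values of each feature to the denominator instead of 1. The per-feature value counts in `N_VALUES` are an assumption, since the data set itself is not shown here.

```python
from fractions import Fraction

# Counts for class "No", read off the factors in the question:
# 5 of 14 rows are "No"; among those 5 rows the observed values
# Overcast, Cool, High, Strong occur 0, 1, 4 and 3 times.
N_ROWS = 14
N_NO = 5
COUNTS_NO = {"Overcast": 0, "Cool": 1, "High": 4, "Strong": 3}

# ASSUMPTION (not stated in the post): how many distinct values the
# feature behind each observed value can take, e.g. Outlook has 3.
N_VALUES = {"Overcast": 3, "Cool": 3, "High": 2, "Strong": 2}


def score(alpha, denom_add):
    """Unnormalized P(No) * product of P(value | No) with additive smoothing.

    alpha     -- pseudo-count added to every numerator
    denom_add -- function: observed value -> amount added to the denominator
    """
    result = Fraction(N_NO, N_ROWS)
    for value, count in COUNTS_NO.items():
        result *= Fraction(count + alpha, N_NO + denom_add(value))
    return result


unsmoothed = score(0, lambda v: 0)            # the product from the question
answer_variant = score(1, lambda v: 1)        # +1 on top, +1 on the bottom
textbook = score(1, lambda v: N_VALUES[v])    # +1 on top, +k on the bottom

print(float(unsmoothed))                 # 0.0
print(round(float(answer_variant), 3))   # 0.011
print(round(float(textbook), 4))         # 0.0046
```

Either way, the point of smoothing is the same: without it, a single value never seen with class No forces the whole product to 0, no matter what the other three features say. With it, the score for No is small but nonzero, so it can still be compared against the score for Yes.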