I'm reading the paper "Gibbs Sampling for the Uninitiated".
In this paper, the authors use Gibbs sampling for a Bayesian Naive Bayes model, which they formalize as a graphical model on page 8. In their running example, the goal is to predict the emotion (sentiment) of a document.
What I don't understand is their claim that, without the label $L$, Gibbs sampling can still sample all the parameters needed, including $L$ itself. I'm not sure how to interpret this. Without training labels it's essentially a clustering problem, so how should we interpret the learned label $L$?
Thanks in advance.
Too long for a comment.
On page 7, Section 2, the authors clearly establish the labels "1" and "0" for the classes of their dataset. So, say "1" is "happy" and "0" is "sad"; there you have your sentiment analysis.
Since they chose Naive Bayes as the classifier, there are parameters and hyperparameters to compute in the Bayesian formulation. Such parameters are usually obtained by integrating over all possible values (see Section 2.4.3). The point of this paper, I think, is to show that you can get away without computing those difficult integrals and instead draw from the conditional distributions using Gibbs sampling (see Section 2.5.2).
At least from what I have been able to see, they use the labels to get an approximation of the joint distribution via Gibbs sampling; an unobserved label can itself be treated as one more unknown and resampled at each sweep, which is why the method still "works" without training labels (the resulting clusters just have no inherent meaning until you name them).
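To make concrete how $L$ can be sampled without any observed labels, here is a minimal toy sketch (my own example, not the paper's code; the corpus, hyperparameter values, and variable names are all assumptions) of a collapsed Gibbs sweep for a two-class Naive Bayes model: $\pi$ and the class word distributions are integrated out, and each document's label is resampled from its full conditional given all the other labels.

```python
import math
import random

random.seed(0)

# Toy corpus of word-count dictionaries; the first three documents lean
# "happy", the last three lean "sad" (entirely made up for illustration).
docs = [
    {"joy": 3, "smile": 2},
    {"joy": 2, "smile": 3, "tears": 1},
    {"smile": 4, "joy": 1},
    {"tears": 3, "cry": 2},
    {"cry": 3, "tears": 2, "joy": 1},
    {"tears": 4, "cry": 1},
]
vocab = sorted({w for d in docs for w in d})
V = len(vocab)
a, b = 1.0, 0.5  # assumed hyperparameters: Beta prior on pi, symmetric Dirichlet on word probs

# Random initial labels, plus the sufficient statistics per class.
L = [random.randrange(2) for _ in docs]
n_class = [L.count(0), L.count(1)]
word_counts = [{w: 0 for w in vocab} for _ in range(2)]
total = [0, 0]
for j, d in enumerate(docs):
    for w, n in d.items():
        word_counts[L[j]][w] += n
        total[L[j]] += n

def log_pred(d, c):
    """Log Dirichlet-multinomial predictive of doc d under class c's current counts."""
    n_d = sum(d.values())
    lp = math.lgamma(total[c] + V * b) - math.lgamma(total[c] + n_d + V * b)
    for w, n in d.items():
        lp += math.lgamma(word_counts[c][w] + n + b) - math.lgamma(word_counts[c][w] + b)
    return lp

for sweep in range(50):
    for j, d in enumerate(docs):
        # Remove doc j from its current class's counts.
        c_old = L[j]
        n_class[c_old] -= 1
        for w, n in d.items():
            word_counts[c_old][w] -= n
            total[c_old] -= n
        # Full conditional for L_j: prior term (collapsed Beta) times likelihood term.
        logp = [math.log(n_class[c] + a) + log_pred(d, c) for c in (0, 1)]
        m = max(logp)
        p1 = math.exp(logp[1] - m) / (math.exp(logp[0] - m) + math.exp(logp[1] - m))
        c_new = 1 if random.random() < p1 else 0
        # Add doc j back under its freshly sampled label.
        L[j] = c_new
        n_class[c_new] += 1
        for w, n in d.items():
            word_counts[c_new][w] += n
            total[c_new] += n

print(L)
```

After a few sweeps the labels settle into two coherent clusters, but note the point of your question shows up directly: nothing forces cluster "1" to mean "happy". The sampler only recovers a partition; you have to look at the clusters (or a handful of labeled documents) to decide which one to call which.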