I've been reviewing Bayesian literature in an attempt to utilize Bayesian inference for hypothesis testing when I have very well established priors, but there's one thing I cannot get my head around:
Why is the normalizing constant unimportant in determining the posterior when using MCMC methods? I understand that the evidence does not depend upon the parameters due to integration, but how is your posterior a valid probability distribution if it does not integrate to one (which as I understand it is the function of the normalizing constant)? If it isn't a valid probability distribution (since it is merely proportional to likelihood × prior), then how is it useful?
I really need someone to explain this to me as if I were a 7 year old, or possibly a chimp of some sort because I'm having a terrible time understanding it.
Best Answer
Not all MCMC methods avoid the need for the normalising constant. However, many of them do (such as the Metropolis-Hastings algorithm), since the iteration process is based on the ratio $R(\theta_1,\theta_2)=\dfrac{\pi(\theta_1\vert x)}{\pi(\theta_2\vert x)}$, where
$$\pi(\theta\vert x) = \dfrac{\pi(x\vert \theta)\pi(\theta)}{\int \pi(x\vert \theta)\pi(\theta)\, d\theta} = \dfrac{\pi(x\vert \theta)\pi(\theta)}{\pi(x)},$$
is the posterior distribution of $\theta$ given the sample $x$. The normalising constant $\pi(x)$ in the denominator does not depend on $\theta$, so it cancels out when you calculate $R(\theta_1,\theta_2)$. That is,
$$R(\theta_1,\theta_2)= \dfrac{\pi(x\vert \theta_1)\pi(\theta_1)}{\pi(x\vert \theta_2)\pi(\theta_2)},$$
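which does not involve the normalising constant, only the likelihood $\pi(x\vert \theta)$ and the prior $\pi(\theta)$. The posterior itself is still a proper distribution that integrates to one; the sampler simply never needs to evaluate $\pi(x)$ in order to draw from it.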
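To make the cancellation concrete, here is a minimal sketch of a random-walk Metropolis sampler (the symmetric-proposal special case of Metropolis-Hastings) that only ever evaluates likelihood times prior. The model is an assumption chosen for illustration and is not part of the question: 7 successes in 10 Bernoulli trials with a Beta(2, 2) prior, for which the exact posterior is known to be Beta(9, 5). The function names, step size, chain length and burn-in are likewise illustrative choices.

```python
import math
import random


def log_unnormalised_posterior(theta, successes=7, trials=10, a=2.0, b=2.0):
    """log(likelihood * prior) up to an additive constant; pi(x) is never computed.

    Hypothetical example model: Binomial likelihood with a Beta(a, b) prior.
    """
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero prior density outside (0, 1)
    log_lik = successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)
    log_prior = (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)
    return log_lik + log_prior


def metropolis(log_target, theta0, n_samples, step=0.2, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = theta0
    log_p = log_target(theta)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # log R(proposal, theta): any constant such as log pi(x) would cancel here.
        log_ratio = log_p_new - log_p
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            theta, log_p = proposal, log_p_new
        samples.append(theta)
    return samples


samples = metropolis(log_unnormalised_posterior, theta0=0.5, n_samples=20000)
kept = samples[2000:]  # discard an arbitrary burn-in period
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

Under these assumptions the exact posterior mean is 9/14 (about 0.643), so the printed estimate should land close to that value even though the code never computes the evidence. The draws behave like samples from the properly normalised posterior, which is why averages, quantiles and histograms computed from them are meaningful.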