Solved – Can a posterior expectation be used as an approximation of the true (prior) expectation?

Let's say that the likelihood of an observation $x$ given a random latent variable $z$ and a model parameter $\theta$ is defined as $p(x|\theta, z)$.

As far as I know, if I want to obtain $p(x|\theta)$, I would have to compute the expectation of $p(x|\theta, z)$ with respect to $z\sim p(z)$:

$p(x|\theta)=\mathbb E_{z\sim p(z)}[p(x|\theta, z)]=\int p(x|\theta, z)p(z)\,dz$

(assuming that the prior $p(z)$ has nothing to do with $\theta$)
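To make the prior expectation concrete, here is a minimal Monte Carlo sketch. The model is purely an assumption for illustration and is not part of the question: $z\sim N(0,1)$ and $x|\theta,z\sim N(\theta+z,1)$, so that $p(x|\theta)$ is known in closed form as $N(\theta,2)$. The names `normal_pdf` and `marginal_likelihood_mc` are hypothetical helpers.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def marginal_likelihood_mc(x, theta, n_samples=200_000, seed=0):
    """Estimate p(x|theta) by averaging p(x|theta, z) over draws from the prior p(z)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)            # assumed prior: z ~ N(0, 1)
    return normal_pdf(x, theta + z, 1.0).mean()   # assumed likelihood: N(theta + z, 1)

x_obs, theta = 1.3, 0.5
estimate = marginal_likelihood_mc(x_obs, theta)
exact = normal_pdf(x_obs, theta, 2.0)  # closed form under the assumed model
print(estimate, exact)
```

In this toy model the two printed values should agree up to Monte Carlo error, which is only possible because the draws come from the prior $p(z)$.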

However, I know only $p(z|x)$, the posterior of $z$ given $x$, not the prior $p(z)$.

Then can I use $\mathbb E_{z\sim p(z|x)}[p(x|\theta, z)]$ as an approximation of $p(x|\theta)$, or is this just nonsense?


This is a strange and unusual setting, and the question would benefit from an explanation of how the integrated posterior $p(z|x)$ is available. The question never mentions the prior distribution on $\theta$, $p(\theta)$, which matters in all subsequent calculations.

If both $p(x|z)$ and $p(z|x)$ are available [or approximated by converging estimators], Chib's formula [also known, from an earlier occurrence, as the candidate's formula] provides the prior $p(z)$ through Bayes' formula [or the dual decomposition of a joint distribution] as $$p(z) \overbrace{\propto}^{\text{as function of $z$}} \frac{p(z|x)}{p(x|z)}$$ which can be approximated by $$\frac{n\,p(z|x)}{\sum_{i=1}^n p(x|z,\theta_i)}\qquad\text{when}\qquad\theta_1,\ldots,\theta_n\sim p(\theta|x,z)$$ and when only the numerator is available in closed form. (This is a special case of Rao-Blackwellisation.)
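The proportionality $p(z)\propto p(z|x)/p(x|z)$ can be checked numerically on a toy model where everything is available in closed form. The sketch below assumes (hypothetically, none of this is in the question) $z\sim N(0,1)$, $\theta\sim N(0,\tau^2)$ and $x|\theta,z\sim N(\theta+z,1)$, which gives $p(x|z)=N(z,1+\tau^2)$ and $p(z|x)=N\big(x/(2+\tau^2),(1+\tau^2)/(2+\tau^2)\big)$. It only illustrates the exact identity, not the Monte Carlo approximation.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

TAU2 = 1.0     # assumed prior variance of theta
x_obs = 1.3    # a fixed observation
z_grid = np.linspace(-3.0, 3.0, 13)

# p(z|x) with theta integrated out (Gaussian conditioning in the assumed model)
post_z = normal_pdf(z_grid, x_obs / (2.0 + TAU2), (1.0 + TAU2) / (2.0 + TAU2))
# p(x|z) with theta integrated out
lik_z = normal_pdf(x_obs, z_grid, 1.0 + TAU2)

ratio = post_z / lik_z                    # candidate's formula, up to a constant in z
prior_z = normal_pdf(z_grid, 0.0, 1.0)    # the true prior p(z) in the assumed model
constant = ratio / prior_z                # should be flat in z, equal to 1 / p(x)
evidence = normal_pdf(x_obs, 0.0, 2.0 + TAU2)  # p(x) under the assumed model

print(constant)
print(1.0 / evidence)
```

The ratio differs from the prior only by the factor $1/p(x)$, which does not depend on $z$, so normalising the ratio in $z$ returns $p(z)$.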

Using instead $p(z|x)$ in the integral as you propose, i.e. $$p(x|\theta)\approx\int p(x|\theta, z)p(z|x)\,dz$$ is not coherent from a probabilistic viewpoint since $x$ appears on both sides of the conditioning bar, i.e. both as a conditioned and as a conditioning variable. For instance, as a function of $x$, this approximation does not integrate to one. The distinction between $p(z)$ and $p(z|x)$ is non-negligible, as shown for instance by the identity $$p(z|x) = \int p(z,\theta|x)\,\text{d}\theta \overbrace{\propto}^{\text{as function of $z$}} \int p(z)p(\theta)p(x|\theta,z)\,\text{d}\theta = p(z) \int p(\theta)p(x|\theta,z)\,\text{d}\theta$$ which involves a second function of $z$, depending on the choice of the prior distribution on $\theta$.
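The failure to integrate to one can be seen on a small example. The sketch below again assumes a hypothetical Gaussian model, $z\sim N(0,1)$, $\theta\sim N(0,\tau^2)$, $x|\theta,z\sim N(\theta+z,1)$, chosen only because both integrals have closed forms; `true_marginal` and `posterior_weighted` are illustrative names.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

TAU2 = 1.0    # assumed prior variance of theta
THETA = 0.5   # value of theta at which p(x|theta) is examined

def true_marginal(x):
    # int N(x; theta + z, 1) N(z; 0, 1) dz = N(x; theta, 2)
    return normal_pdf(x, THETA, 2.0)

def posterior_weighted(x):
    # p(z|x) = N(z; x / (2 + tau2), (1 + tau2) / (2 + tau2)) in the assumed model
    post_mean = x / (2.0 + TAU2)
    post_var = (1.0 + TAU2) / (2.0 + TAU2)
    # int N(x; theta + z, 1) N(z; post_mean, post_var) dz
    return normal_pdf(x, THETA + post_mean, 1.0 + post_var)

# Riemann sums over a wide, fine grid in x
xs = np.linspace(-40.0, 40.0, 200_001)
dx = xs[1] - xs[0]
mass_true = np.sum(true_marginal(xs)) * dx
mass_post = np.sum(posterior_weighted(xs)) * dx

print(mass_true)  # close to 1
print(mass_post)  # close to (2 + TAU2) / (1 + TAU2), i.e. 1.5 here
```

Under these assumptions the posterior-weighted integral has total mass $(2+\tau^2)/(1+\tau^2)$ in $x$, so it is not a density in $x$, and the excess depends on the prior variance of $\theta$.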
