Solved – Sampling from marginal distribution using conditional distribution

I want to sample from a univariate density $f_X$ but I only know the relationship:

$$f_X(x) = \int f_{X\vert Y}(x\vert y) f_Y(y) \, dy.$$

I want to avoid the use of MCMC (directly on the integral representation) and, since $f_{X\vert Y}(x\vert y)$ and $f_Y(y)$ are easy to sample from, I was thinking of using the following sampler:

  1. For $j=1,\dots, N$:
  2. Sample $y_j \sim f_Y$.
  3. Sample $x_j \sim f_{X\vert Y}(\cdot\vert y_j)$.

Then, I will end up with the pairs $(x_1,y_1),\dots,(x_N,y_N)$, and take only the marginal samples $(x_1,\dots,x_N)$. Is this correct?

Yes, this is correct. Basically, you have

$$f_{X,Y}(x,y) = f_{X|Y}(x|y) f_Y(y),$$

and as you said, you can sample from the joint density. Keeping just the $x$s from the joint samples gives you a sample from the marginal distribution of $X$.
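To spell this step out (a short derivation that only uses the factorization above): for any set $A$, the $x$-coordinate of a joint draw satisfies

$$P(X \in A) = \int \int_A f_{X|Y}(x|y) f_Y(y) \, dx \, dy = \int_A \left( \int f_{X|Y}(x|y) f_Y(y) \, dy \right) dx = \int_A f_X(x) \, dx,$$

so each $x_j$ has density $f_X$, and the $x_j$ are independent of one another because the pairs $(x_j, y_j)$ are drawn independently.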

This is because the act of ignoring the $y$ is akin to integrating over it. Let's understand this with an example.

Suppose $X$ = height of a mother and $Y$ = height of her daughter. The goal is to get a sample from $(X,Y)$ to understand the relation between the heights of daughters and their mothers. (I am assuming that there is only one daughter in each family, and restricting the population to daughters over age 18 to ensure full growth.)

You go out and get a representative sample $$(x_1, y_1), \dots, (x_N, y_N).$$

Thus for each mother, you have the height of her daughter. There should be a clear relationship between $X$ and $Y$. Now suppose you ignore all the data on the daughters in your dataset (drop the $Y$). What do you have? Exactly the heights of $N$ randomly chosen mothers, which are $N$ draws from the marginal of $X$.
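
If it helps to check this numerically, here is a minimal sketch of the two-step sampler from the question. The model is a toy assumption of mine, not something from the question: $Y \sim N(0,1)$ and $X \mid Y=y \sim N(y,1)$, chosen because the marginal is then known in closed form, $X \sim N(0,2)$. The seed and the sample size `N` are likewise arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
N = 200_000

# Assumed toy model (not from the question):
#   Y ~ Normal(0, 1),   X | Y = y ~ Normal(y, 1)
# so the marginal of X is Normal(0, 2) in closed form.
y = rng.normal(loc=0.0, scale=1.0, size=N)  # y_j ~ f_Y
x = rng.normal(loc=y, scale=1.0)            # x_j ~ f_{X|Y}(. | y_j), one draw per y_j

# Discard y and keep x: these are the draws from the marginal f_X.
sample_mean = x.mean()
sample_var = x.var()
print(f"mean = {sample_mean:.3f} (theory: 0), variance = {sample_var:.3f} (theory: 2)")
```

The sample mean and variance of `x` should land close to 0 and 2, as they would for direct draws from $N(0,2)$. Swapping in your own `f_Y` and `f_{X|Y}` samplers only changes the two `rng.normal` lines.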
