Solved – Integral of a conditional uniform distribution leads to improper integral

I have two uniform distributions, $X_1 \sim \mathit{U}(a,b)$ and $X_2 \sim \mathit{U}(X_1+\delta, b+\delta)$. I would like to compute $P(X_2 \in [a+\delta, b+\delta])$. So I do this:

$$\begin{align} P(X_2 \in [a+\delta, b+\delta]) &= \int_{a}^{b} P(X_2 \in [a+\delta, b+\delta] \mid X_1 = y) \cdot P(X_1 = y) \, dy \\
&= \frac{1}{b-a} \int_{a}^{b} \frac{1}{b-y} \, dy \\
&= \frac{1}{b-a} \ln\Bigg(\frac{b-a}{b-b}\Bigg) \end{align}$$

The wrong result is related to the fact that a uniform distribution has probability $\frac{1}{x}$, but it doesn't work for $x<1$ (in this case, the fraction becomes bigger than 1…).

How do we solve this problem formally, in a clean way?

You can see that the answer is $1$ without any calculation. By definition, you have $a \le X_1 \le b$ and $X_1+\delta \le X_2 \le b+\delta$, so: $$\begin{align} X_2 &\ge X_1 + \delta \ge a + \delta \quad \text{ and }\\ X_2 &\le b + \delta \end{align}$$ So $\Pr(X_2 \in [a+\delta, b+\delta]) = 1$.
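The bound argument above can also be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names and the values of `a`, `b`, and `delta` are illustrative assumptions, not part of the question.

```python
import random

def sample_x2(a, b, delta, rng):
    """Draw X1 ~ U(a, b), then X2 ~ U(X1 + delta, b + delta)."""
    x1 = rng.uniform(a, b)
    return rng.uniform(x1 + delta, b + delta)

def fraction_inside(a, b, delta, n=100_000, seed=0):
    """Fraction of simulated X2 draws that land in [a + delta, b + delta]."""
    rng = random.Random(seed)
    hits = sum(
        a + delta <= sample_x2(a, b, delta, rng) <= b + delta
        for _ in range(n)
    )
    return hits / n

# Illustrative values; the interval length b - a is deliberately below 1.
print(fraction_inside(a=0.0, b=0.5, delta=0.3))
```

Every draw should fall inside the interval, so the printed fraction should be $1$ (up to floating-point edge effects at the endpoints), regardless of whether $b-a$ is smaller or larger than $1$.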

A uniform distribution over an interval of length $l > 0$ has density $1/l$ at every point. When you integrate this $1/l$ over the interval you get $1$ as you should; whether $l < 1$ or $l > 1$ is irrelevant. Also, the actual probability of taking any particular exact value is $0$, not $1/l$ as you seem to think. (Your $P(X_1=y)$ is $0$ for any $y$.) For continuous variables, we need to work with the probability density function, not the probability mass function.
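As a concrete illustration (the numbers are chosen for this example and are not from the question), take a uniform distribution on $(0, \tfrac{1}{2})$, so $l = \tfrac{1}{2}$. Its density is $2$ at every point of the interval, which exceeds $1$, yet the total probability is still $1$: $$\int_{0}^{1/2} 2 \, dx = 1.$$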

So with all that in mind, the correct calculation would be as follows. The density $f_1(x)$ of $X_1$ is $\frac{1}{b-a}$ for $x \in (a,b)$, and $0$ outside. The density $f_2(x)$ of $X_2$, for a given value of $X_1$ (so let's write it as $f_{2,X_1}(x)$ actually), is $\frac{1}{b-X_1}$ for $x \in (X_1+\delta, b+\delta)$, and $0$ outside. To get this density as a function of $x$ alone, without depending on the value of $X_1$, you need to integrate over all values $y$ of $X_1$: that is, $$f_2(x) = \int_{y} f_{2,y}(x) f_1(y) \, dy.$$

Now note that for the second factor $f_1(y)$ to be nonzero, you need $a \le y \le b$ as we said above. For the first factor $f_{2,y}(x)$ to be nonzero, you need $y+\delta \le x \le b+\delta$, so you also need $y \le x - \delta$. Since $a+\delta \le x \le b+\delta$, you have $x - \delta \le b$, so the true bounds on $y$ are $a \le y \le \min(b, x-\delta)$, i.e., $a \le y \le x-\delta$. So

$$\begin{align} f_2(x) &= \int_{a}^{x-\delta} f_{2,y}(x) f_1(y) \, dy \\ &= \int_{a}^{x-\delta} \frac{1}{b-y} \frac{1}{b-a} \, dy \\ &= \frac{1}{b-a} \ln\frac{b-a}{b-(x-\delta)} \end{align}$$ for $x \in (a+\delta, b+\delta)$, and $0$ outside.
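One way to sanity-check this density is to compare it with a histogram-style estimate from simulated draws. The sketch below uses only the standard library; the helper names, the bin width, and the parameter values are assumptions made for illustration.

```python
import math
import random

def f2(x, a, b, delta):
    """Marginal density of X2 from the derivation above; 0 outside (a+delta, b+delta)."""
    if not (a + delta < x < b + delta):
        return 0.0
    return math.log((b - a) / (b - (x - delta))) / (b - a)

def empirical_density(x, a, b, delta, width=0.05, n=200_000, seed=1):
    """Share of simulated X2 draws in a bin of the given width centred at x,
    divided by the bin width (a crude density estimate)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n):
        x1 = rng.uniform(a, b)                    # X1 ~ U(a, b)
        x2 = rng.uniform(x1 + delta, b + delta)   # X2 | X1 ~ U(X1 + delta, b + delta)
        if abs(x2 - x) <= width / 2:
            count += 1
    return count / (n * width)

# Illustrative values only.
a, b, delta = 0.0, 2.0, 0.5
for x in (0.8, 1.5, 2.2):
    print(x, f2(x, a, b, delta), empirical_density(x, a, b, delta))
```

The two columns should agree up to Monte Carlo noise and binning error. The agreement gets worse close to $x = b+\delta$, where the formula grows without bound (the logarithm diverges) although it remains integrable.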

The fact that this $f_2(x)$ varies with $x$ shows that $X_2$ is not uniformly distributed. However, when you integrate over the entire region $(a+\delta, b+\delta)$, you get $1$ as you should, so it is a valid distribution: $P(X_2 \in (a+\delta, b+\delta))$ is

$$\begin{align} \int_{a+\delta}^{b+\delta} f_2(x) \, dx &= \int_{a+\delta}^{b+\delta} \frac{1}{b-a} \ln\frac{b-a}{b-(x-\delta)} \, dx \\ &= \int_{a}^{b} \frac{1}{b-a} \ln\frac{b-a}{b-u} \, du \quad \text{ substituting } u=x-\delta \\ &= \ln(b-a) - \frac{1}{b-a} \int_{a}^{b} \ln(b-u) \, du \\ &= \ln(b-a) - \frac{1}{b-a} \int_{0}^{b-a} \ln t \, dt \quad \text{ substituting } t=b-u \\ &= \ln(b-a) - \frac{1}{b-a} \bigl((b-a)\ln(b-a) - (b-a)\bigr) \\ &= 1 \end{align}$$
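The same result can be confirmed numerically. This is a rough sketch using a plain midpoint rule (chosen here because it never evaluates the integrand at the endpoint $x = b+\delta$, where the logarithm diverges); the function names and parameter values are illustrative assumptions.

```python
import math

def f2(x, a, b, delta):
    """Marginal density of X2 on (a+delta, b+delta), as derived above."""
    if not (a + delta < x < b + delta):
        return 0.0
    return math.log((b - a) / (b - (x - delta))) / (b - a)

def integrate_midpoint(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Illustrative values with b - a < 1, the case the question was worried about.
a, b, delta = 0.0, 0.5, 0.3
total = integrate_midpoint(lambda x: f2(x, a, b, delta), a + delta, b + delta)
print(total)
```

The printed value should be very close to $1$; the small remaining gap comes from the midpoint rule converging slowly near the integrable singularity.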
