Solved – UMVUE of $\frac{\theta}{1+\theta}$ while sampling from $\text{Beta}(\theta,1)$ population


Let $(X_1,X_2,\ldots,X_n)$ be a random sample from the density $$f_{\theta}(x)=\theta x^{\theta-1}\mathbf1_{0<x<1}\,,\qquad\theta>0$$

I am trying to find the UMVUE of $\frac{\theta}{1+\theta}$.

The joint density of $(X_1,\ldots,X_n)$ is

\begin{align}
f_{\theta}(x_1,\ldots,x_n)&=\theta^n\left(\prod_{i=1}^n x_i\right)^{\theta-1}\mathbf1_{0<x_1,\ldots,x_n<1}
\\&=\exp\left[(\theta-1)\sum_{i=1}^n\ln x_i+n\ln\theta+\ln(\mathbf1_{0<x_1,\ldots,x_n<1})\right],\qquad\theta>0
\end{align}

As the population pdf $f_{\theta}$ belongs to the one-parameter exponential family, a complete sufficient statistic for $\theta$ is $$T(X_1,\ldots,X_n)=\sum_{i=1}^n\ln X_i$$

Since $E(X_1)=\frac{\theta}{1+\theta}$, at first thought $E(X_1\mid T)$ would give me the UMVUE of $\frac{\theta}{1+\theta}$ by the Lehmann–Scheffé theorem. I am not sure whether this conditional expectation can be found directly, or whether one has to find the conditional distribution of $X_1\mid\sum_{i=1}^n\ln X_i$.

On the other hand, I considered the following approach:

We have $X_i\stackrel{\text{i.i.d.}}{\sim}\text{Beta}(\theta,1)\implies -2\theta\ln X_i\stackrel{\text{i.i.d.}}{\sim}\chi^2_2$, so that $-2\theta\,T\sim\chi^2_{2n}$.

So the $r$th raw moment of $-2\theta\,T$ about zero, calculated using the chi-square pdf, is $$E\left[(-2\theta\,T)^r\right]=2^r\frac{\Gamma(n+r)}{\Gamma(n)},\qquad n+r>0$$

So it seems that for different integer choices of $r$, I would get unbiased estimators (and UMVUEs) of different integer powers of $\theta$. For example, $E\left(-\frac{T}{n}\right)=\frac{1}{\theta}$ and $E\left(\frac{1-n}{T}\right)=\theta$ directly give me the UMVUEs of $\frac{1}{\theta}$ and $\theta$ respectively.
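
As a quick sanity check on these two estimators, here is a minimal Monte Carlo sketch in Python (assuming NumPy is available; the parameter values and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 3.0, 8, 200_000

# reps independent samples of size n from Beta(theta, 1)
X = rng.beta(theta, 1.0, size=(reps, n))
T = np.log(X).sum(axis=1)  # T = sum_i ln X_i, always negative

print((-T / n).mean(), 1 / theta)    # unbiased for 1/theta
print(((1 - n) / T).mean(), theta)   # unbiased for theta
```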

Now, when $\theta>1$ we have $\frac{\theta}{1+\theta}=\left(1+\frac{1}{\theta}\right)^{-1}=1-\frac{1}{\theta}+\frac{1}{\theta^2}-\frac{1}{\theta^3}+\cdots$.

I can definitely get the UMVUEs of $\frac{1}{\theta},\frac{1}{\theta^2},\frac{1}{\theta^3}$, and so on. So, combining these UMVUEs, I can get the required UMVUE of $\frac{\theta}{1+\theta}$. Is this method valid, or should I proceed with the first method? As the UMVUE is unique when it exists, both should give me the same answer.

To be explicit, I am getting $$E\left(1+\frac{T}{n}+\frac{T^2}{n(n+1)}+\frac{T^3}{n(n+1)(n+2)}+\cdots\right)=1-\frac{1}{\theta}+\frac{1}{\theta^2}-\frac{1}{\theta^3}+\cdots$$

That is, $$E\left(\sum_{r=0}^\infty \frac{T^r}{n(n+1)\cdots(n+r-1)}\right)=\frac{\theta}{1+\theta}$$

Is it possible that my required UMVUE is $\displaystyle\sum_{r=0}^\infty \frac{T^r}{n(n+1)\cdots(n+r-1)}$ when $\theta>1$?
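
To gain some confidence in this candidate, its unbiasedness can be checked by simulation for a particular $\theta>1$. A rough sketch (assuming NumPy; the empty product at $r=0$ is taken as $1$, and the series is truncated once its terms are negligible):

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps, terms = 2.0, 6, 200_000, 60

X = rng.beta(theta, 1.0, size=(reps, n))
T = np.log(X).sum(axis=1)

# g(T) = sum_{r>=0} T^r / [n(n+1)...(n+r-1)], with the r = 0 term equal to 1
g = np.ones_like(T)
term = np.ones_like(T)
for r in range(1, terms):
    term *= T / (n + r - 1)  # multiply in the next factor T/(n+r-1)
    g += term

print(g.mean(), theta / (1 + theta))  # should agree up to Monte Carlo error
```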

For $0<\theta<1$, I would instead expand $g(\theta)=\theta(1-\theta+\theta^2-\cdots)$, and so the resulting construction would differ.


Having been convinced that the conditional expectation in the first approach could not be found directly, and since $E(X_1\mid\sum\ln X_i=t)=E(X_1\mid\prod X_i=e^t)$, I proceeded to find the conditional distribution of $X_1\mid\prod X_i$. For that, I needed the joint density of $(X_1,\prod X_i)$.

I used the change of variables $(X_1,\ldots,X_n)\to(Y_1,\ldots,Y_n)$ with $Y_i=\prod_{j=1}^i X_j$ for $i=1,2,\ldots,n$. This led to the joint support of $(Y_1,\ldots,Y_n)$ being $S=\{(y_1,\ldots,y_n): 0<y_1<1,\ 0<y_j<y_{j-1}\text{ for }j=2,3,\ldots,n\}$.

The Jacobian determinant turned out to be $J=\left(\prod_{i=1}^{n-1}y_i\right)^{-1}$.

So I got the joint density of $(Y_1,\ldots,Y_n)$ as $$f_Y(y_1,y_2,\ldots,y_n)=\frac{\theta^n\, y_n^{\theta-1}}{\prod_{i=1}^{n-1}y_i}\mathbf1_S$$

The joint density of $(Y_1,Y_n)$ is hence $$f_{Y_1,Y_n}(y_1,y_n)=\frac{\theta^n\,y_n^{\theta-1}}{y_1}\int_0^{y_{n-2}}\int_0^{y_{n-3}}\cdots\int_0^{y_1}\frac{1}{y_3y_4\cdots y_{n-1}}\frac{\mathrm{d}y_2}{y_2}\cdots\,\mathrm{d}y_{n-2}\,\mathrm{d}y_{n-1}$$

Is there a different transformation I can use here that would make the derivation of the joint density less cumbersome? I am not sure if I have taken the correct transformation here.


Based on some excellent suggestions in the comment section, I found the joint density of $(U,U+V)$ instead of that of $(X_1,\prod X_i)$, where $U=-\ln X_1$ and $V=-\sum_{i=2}^n\ln X_i$.

It is immediately seen that $U\sim\text{Exp}(\theta)$ and $V\sim\text{Gamma}(n-1,\theta)$ are independent.

And indeed, $U+V\sim\text{Gamma}(n,\theta)$.
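
These three distributional claims are easy to spot-check by simulation. A small sketch, assuming SciPy (whose expon and gamma families are parametrized by a scale $1/\theta$ rather than a rate $\theta$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
theta, n, reps = 2.5, 6, 100_000

X = rng.beta(theta, 1.0, size=(reps, n))
U = -np.log(X[:, 0])                # -ln X_1
V = -np.log(X[:, 1:]).sum(axis=1)   # -sum_{i>=2} ln X_i

# Large p-values are consistent with the claimed distributions
print(stats.kstest(U, "expon", args=(0, 1 / theta)).pvalue)
print(stats.kstest(V, "gamma", args=(n - 1, 0, 1 / theta)).pvalue)
print(stats.kstest(U + V, "gamma", args=(n, 0, 1 / theta)).pvalue)
```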

For $n>1$, the joint density of $(U,V)$ is $$f_{U,V}(u,v)=\theta e^{-\theta u}\mathbf1_{u>0}\cdot\frac{\theta^{n-1}}{\Gamma(n-1)}e^{-\theta v}v^{n-2}\mathbf1_{v>0}$$

Changing variables, I got the joint density of $(U,U+V)$ as

$$f_{U,U+V}(u,z)=\frac{\theta^n}{\Gamma(n-1)}e^{-\theta z}(z-u)^{n-2}\mathbf1_{0<u<z}$$

So the conditional density of $U\mid U+V=z$ is $$f_{U\mid U+V}(u\mid z)=\frac{(n-1)(z-u)^{n-2}}{z^{n-1}}\mathbf1_{0<u<z}$$

Now, my UMVUE is exactly $E(e^{-U}\mid U+V=z)=E(X_1\mid\sum_{i=1}^n\ln X_i=-z)$, as I had mentioned right at the beginning of this post.

So all that is left to do is to find $$E(e^{-U}\mid U+V=z)=\frac{n-1}{z^{n-1}}\int_0^z e^{-u}(z-u)^{n-2}\,\mathrm{d}u$$

But that last integral has a closed form in terms of the incomplete gamma function according to Mathematica, and I wonder what to do now.
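
Before worrying about the closed form, the conditional expectation can at least be cross-checked numerically at a fixed $z$: invert the conditional CDF $F(u)=1-\left(\frac{z-u}{z}\right)^{n-1}$ to sample from $U\mid U+V=z$, and compare the Monte Carlo mean of $e^{-U}$ against quadrature. A sketch assuming NumPy/SciPy:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(4)
n, z, reps = 5, 2.0, 500_000

# Quadrature value of (n-1)/z^{n-1} * int_0^z e^{-u} (z-u)^{n-2} du
val, _ = quad(lambda u: np.exp(-u) * (z - u) ** (n - 2), 0, z)
exact = (n - 1) / z ** (n - 1) * val

# Inverse-CDF sampling from f_{U|U+V}(u|z)
P = rng.uniform(size=reps)
U = z * (1 - (1 - P) ** (1 / (n - 1)))

print(exact, np.exp(-U).mean())  # the two should agree closely
```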

It turns out that both approaches in my original post (my initial attempt and another based on suggestions in the comment section) give the same answer. I will outline both methods here for a complete answer to the question.

Here, $\text{Gamma}(n,\theta)$ means the gamma density $f(y)=\frac{\theta^n}{\Gamma(n)}e^{-\theta y}y^{n-1}\mathbf1_{y>0}$ where $\theta,n>0$, and $\text{Exp}(\theta)$ denotes an exponential distribution with mean $1/\theta$ ($\theta>0$). Clearly, $\text{Exp}(\theta)\equiv\text{Gamma}(1,\theta)$.

Since $T=\sum_{i=1}^n\ln X_i$ is complete sufficient for $\theta$ and $\mathbb E(X_1)=\frac{\theta}{1+\theta}$, by the Lehmann–Scheffé theorem $\mathbb E(X_1\mid T)$ is the UMVUE of $\frac{\theta}{1+\theta}$. So we have to find this conditional expectation.

We note that $X_i\stackrel{\text{i.i.d.}}{\sim}\text{Beta}(\theta,1)\implies-\ln X_i\stackrel{\text{i.i.d.}}{\sim}\text{Exp}(\theta)\implies-T\sim\text{Gamma}(n,\theta)$.

Method I:

Let $U=-\ln X_1$ and $V=-\sum_{i=2}^n\ln X_i$, so that $U$ and $V$ are independent. Indeed, $U\sim\text{Exp}(\theta)$ and $V\sim\text{Gamma}(n-1,\theta)$, implying $U+V\sim\text{Gamma}(n,\theta)$.

So, $\mathbb E(X_1\mid\sum_{i=1}^n\ln X_i=t)=\mathbb E(e^{-U}\mid U+V=-t)$.

Now we find the conditional distribution of $U\mid U+V$.

For $n>1$ and $\theta>0$, the joint density of $(U,V)$ is

\begin{align}f_{U,V}(u,v)&=\theta e^{-\theta u}\mathbf1_{u>0}\cdot\frac{\theta^{n-1}}{\Gamma(n-1)}e^{-\theta v}v^{n-2}\mathbf1_{v>0}\\&=\frac{\theta^n}{\Gamma(n-1)}e^{-\theta(u+v)}v^{n-2}\mathbf1_{u,v>0}\end{align}

Changing variables, it is immediate that the joint density of $(U,U+V)$ is $$f_{U,U+V}(u,z)=\frac{\theta^n}{\Gamma(n-1)}e^{-\theta z}(z-u)^{n-2}\mathbf1_{0<u<z}$$

Let $f_{U+V}(\cdot)$ be the density of $U+V$. Thus the conditional density of $U\mid U+V=z$ is \begin{align}f_{U\mid U+V}(u\mid z)&=\frac{f_{U,U+V}(u,z)}{f_{U+V}(z)}\\&=\frac{(n-1)(z-u)^{n-2}}{z^{n-1}}\mathbf1_{0<u<z}\end{align}

Therefore, $\displaystyle\mathbb E(e^{-U}\mid U+V=z)=\frac{n-1}{z^{n-1}}\int_0^z e^{-u}(z-u)^{n-2}\,\mathrm{d}u$.

That is, the UMVUE of $\frac{\theta}{1+\theta}$ is $$\mathbb E(X_1\mid T)=\frac{n-1}{(-T)^{n-1}}\int_0^{-T} e^{-u}(-T-u)^{n-2}\,\mathrm{d}u\tag{1}$$
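
As a numerical check of $(1)$, one can evaluate the integral by quadrature at each simulated value of $T$ and compare the average against $\theta/(1+\theta)$. A minimal sketch, assuming NumPy/SciPy and arbitrary parameter choices:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(5)
theta, n, reps = 2.0, 5, 5_000

def umvue(t, n):
    """Equation (1): E(X_1 | T = t), with z = -t > 0."""
    z = -t
    val, _ = quad(lambda u: np.exp(-u) * (z - u) ** (n - 2), 0, z)
    return (n - 1) / z ** (n - 1) * val

X = rng.beta(theta, 1.0, size=(reps, n))
T = np.log(X).sum(axis=1)
est = np.array([umvue(t, n) for t in T])

print(est.mean(), theta / (1 + theta))  # the averages should be close
```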

Method II:

As $T$ is a complete sufficient statistic for $\theta$, any unbiased estimator of $\frac{\theta}{1+\theta}$ which is a function of $T$ will be the UMVUE of $\frac{\theta}{1+\theta}$ by the Lehmann–Scheffé theorem. So we proceed to find the moments of $-T$, whose distribution is known to us. We have

$$\mathbb E\left[(-T)^r\right]=\int_0^\infty y^r\,\theta^n\frac{e^{-\theta y}y^{n-1}}{\Gamma(n)}\,\mathrm{d}y=\frac{\Gamma(n+r)}{\theta^r\,\Gamma(n)},\qquad n+r>0$$

Using this equation, we obtain unbiased estimators (and UMVUEs) of $1/\theta^r$ for every integer $r\ge1$.
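
If one wants to double-check this moment formula symbolically rather than by hand, SymPy can reproduce it; a sketch, assuming SymPy evaluates the integral for symbolic positive $n$ and $r$:

```python
import sympy as sp

y, theta = sp.symbols("y theta", positive=True)
n, r = sp.symbols("n r", positive=True)

# Gamma(n, theta) density of -T, as defined earlier in the post
pdf = theta**n * sp.exp(-theta * y) * y**(n - 1) / sp.gamma(n)
moment = sp.integrate(y**r * pdf, (y, 0, sp.oo))

print(sp.simplify(moment))  # expect gamma(n + r)/(theta**r * gamma(n))
```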

Now for $\theta>1$, we have $\displaystyle\frac{\theta}{1+\theta}=\left(1+\frac{1}{\theta}\right)^{-1}=1-\frac{1}{\theta}+\frac{1}{\theta^2}-\frac{1}{\theta^3}+\cdots$

Combining the unbiased estimators of $1/\theta^r$, we obtain $$\mathbb E\left(1+\frac{T}{n}+\frac{T^2}{n(n+1)}+\frac{T^3}{n(n+1)(n+2)}+\cdots\right)=1-\frac{1}{\theta}+\frac{1}{\theta^2}-\frac{1}{\theta^3}+\cdots$$

That is, $$\mathbb E\left(\sum_{r=0}^\infty \frac{T^r}{n(n+1)\cdots(n+r-1)}\right)=\frac{\theta}{1+\theta}$$

So assuming $\theta>1$, the UMVUE of $\frac{\theta}{1+\theta}$ is $$g(T)=\sum_{r=0}^\infty \frac{T^r}{n(n+1)\cdots(n+r-1)}\tag{2}$$


I am not certain about the case $0<\theta<1$ in the second method.

According to Mathematica, equation $(1)$ has a closed form in terms of the incomplete gamma function. And in equation $(2)$, we can express the product $n(n+1)(n+2)\cdots(n+r-1)$ in terms of the usual gamma function as $n(n+1)(n+2)\cdots(n+r-1)=\frac{\Gamma(n+r)}{\Gamma(n)}$. This perhaps explains the apparent connection between $(1)$ and $(2)$.

Using Mathematica I could verify that $(1)$ and $(2)$ are indeed the same thing.
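
For readers without Mathematica, the agreement between $(1)$ and $(2)$ can also be checked numerically, evaluating $(1)$ by quadrature and $(2)$ by truncating the series. A short sketch assuming NumPy/SciPy:

```python
import numpy as np
from scipy.integrate import quad

def eq1(t, n):
    """Equation (1) via quadrature, with z = -t > 0."""
    z = -t
    val, _ = quad(lambda u: np.exp(-u) * (z - u) ** (n - 2), 0, z)
    return (n - 1) / z ** (n - 1) * val

def eq2(t, n, terms=80):
    """Equation (2), truncated after `terms` terms."""
    total = term = 1.0
    for r in range(1, terms):
        term *= t / (n + r - 1)
        total += term
    return total

n = 5
for t in [-0.5, -1.0, -2.5, -6.0]:
    print(t, eq1(t, n), eq2(t, n))  # the last two columns should match
```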
