# Solved – Distribution of sum of order statistics

The question is problem 7.2.9, page 380, of Robert Hogg's *Introduction to Mathematical Statistics*, 6th edition.

The problem is:

We consider a random sample $X_1, X_2,\ldots ,X_n$ from a distribution
with pdf $f(x;\theta)=(1/\theta)\exp(-x/\theta)$, $0<x<\infty$.
Possibly, in a life testing situation, however, we only observe the
first $r$ order statistics $Y_1<Y_2<\cdots <Y_r$.

(a) Record the joint pdf of these order statistics and denote it by
$L(\theta)$.

(b) Under these conditions, find the mle, $\hat{\theta}$, by
maximizing $L(\theta)$.

(c) Find the mgf and pdf of $\hat{\theta}$.

(d) With a slight extension of the definition of sufficiency, is
$\hat{\theta}$ a sufficient statistic?

I can solve (a) and (b), but I am completely stuck on (c) and therefore cannot move forward to (d).

Solution to (a):

We know the joint pdf of $Y_1,Y_2,\ldots,Y_n$ is $g(y_1,y_2,\ldots,y_n)=n!f(y_1)f(y_2)\cdots f(y_n)$; integrating out the $(r+1)$th through $n$th variables gives the joint pdf of $Y_1,Y_2,\ldots,Y_r$.

$h(y_1,y_2,\ldots,y_r)=n!f(y_1)f(y_2)\cdots f(y_r)\int_{y_r}^{\infty}\int_{y_{r+1}}^{\infty}\cdots \int_{y_{n-1}}^{\infty}f(y_{r+1})f(y_{r+2})\cdots f(y_{n-1})f(y_n)\,dy_{n}\,dy_{n-1}\cdots dy_{r+1}$

Writing $f(y_n)\,dy_n=-d[1-F(y_n)]$, the innermost integral gives

$=n!f(y_1)f(y_2)\cdots f(y_r)\int_{y_r}^{\infty}\cdots\int_{y_{n-2}}^{\infty}f(y_{r+1})f(y_{r+2})\cdots f(y_{n-1})[1-F(y_{n-1})]\,dy_{n-1}\cdots dy_{r+1}$

and the same substitution applied to $y_{n-1}$ yields

$=n!f(y_1)f(y_2)\cdots f(y_r)\int_{y_r}^{\infty}\cdots\int_{y_{n-3}}^{\infty}f(y_{r+1})f(y_{r+2})\cdots f(y_{n-2})\frac{[1-F(y_{n-2})]^2}{2}\,dy_{n-2}\cdots dy_{r+1}$

Continuing by induction,

$=n!f(y_1)f(y_2)\cdots f(y_r)\frac{[1-F(y_r)]^{n-r}}{(n-r)!}$

With $f(y)=\theta^{-1}e^{-y/\theta}$ and $1-F(y)=e^{-y/\theta}$, this becomes

$=\frac{n!}{(n-r)!}\,\frac{1}{\theta}e^{-y_1/\theta}\,\frac{1}{\theta}e^{-y_2/\theta}\cdots \frac{1}{\theta}e^{-y_r/\theta}\left[e^{-y_r/\theta}\right]^{n-r}$

$=\frac{n!\,\theta^{-r}}{(n-r)!}e^{-\frac{1}{\theta}\left[\sum_{i=1}^{r}y_i+(n-r)y_r\right]}$
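As a sanity check on this closed form, the joint pdf should integrate to 1 over $0<y_1<\cdots<y_r<\infty$. Here is a minimal numerical sketch in Python (SciPy assumed, not part of the original post); the small case $n=3$, $r=2$, $\theta=1$ keeps the integral two-dimensional:

```python
from math import exp, factorial

from scipy.integrate import dblquad

# Joint pdf of (Y1, Y2), the first r = 2 order statistics of n = 3
# Exp(theta = 1) variables, as derived above:
#   h(y1, y2) = n!/(n-r)! * theta^{-r} * exp(-[y1 + y2 + (n-r) y2] / theta)
n, r, theta = 3, 2, 1.0

def h(y1, y2):
    const = factorial(n) / factorial(n - r) * theta ** (-r)
    return const * exp(-(y1 + y2 + (n - r) * y2) / theta)

# Integrate over the support 0 < y1 < y2; the upper limit 50 truncates
# a negligible exponential tail. dblquad integrates the FIRST argument
# (y1) on the inner range [0, y2] and the second (y2) on [0, 50].
total, _ = dblquad(h, 0, 50, 0, lambda y2: y2)
print(total)  # should be close to 1
```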

(b)

This part is not difficult; it is the standard way to calculate an MLE.

$\log L(\theta;y)=\log\frac{n!}{(n-r)!}-r\log\theta-\frac{1}{\theta}\left[\sum_{i=1}^{r}y_i+(n-r)y_r\right]$

Taking the derivative of the log-likelihood, we get:

$\frac{\partial \log L(\theta;y)}{\partial\theta}=\frac{1}{\theta^2}\left[\sum_{i=1}^{r}y_i+(n-r)y_r\right]-\frac{r}{\theta}$

Setting the derivative to zero, we get:

$\hat{\theta}=\frac{1}{r}\left[\sum_{i=1}^{r}y_i+(n-r)y_r\right]$
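As a quick numerical check of this closed form (a sketch in Python rather than R; NumPy assumed, and the sample below is simulated for illustration), perturbing $\hat{\theta}$ in either direction should only decrease the log-likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, theta_true = 10, 5, 2.0

# Simulate one censored sample: keep only the first r order statistics.
y = np.sort(rng.exponential(scale=theta_true, size=n))[:r]
s_r = y.sum() + (n - r) * y[-1]

theta_hat = s_r / r  # the closed-form MLE derived above

def log_lik(theta):
    # log L(theta) up to the additive constant log(n!/(n-r)!)
    return -r * np.log(theta) - s_r / theta

# The closed form should beat any nearby perturbation.
for eps in (0.9, 0.99, 1.01, 1.1):
    assert log_lik(theta_hat) >= log_lik(theta_hat * eps)
print(theta_hat)
```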

(c)

To solve (c), I think we at least need to know the distribution of $\sum_{i=1}^{r}Y_i$.

Searching the internet, I found a paper discussing this distribution: https://www.ocf.berkeley.edu/~wwu/articles/orderStatSum.pdf

But I think the method might not be correct: for order statistics the $F(y_i)$ are different, so we cannot use the binomial distribution there.

There is another paper here: http://www.jstor.org/stable/4615746?seq=1#page_scan_tab_contents

But I am totally lost at formula (2.2); if someone would explain the paper with more detailed calculations, it would be highly appreciated.

(d) can only be attempted after solving (c).


#### Best Answer

Since
$$(y_1,\ldots,y_r)\sim\frac{n!\,\theta^{-r}}{(n-r)!}e^{-\frac{1}{\theta}[\sum_{i=1}^{r}y_i+(n-r)y_r]}\mathbb{I}_{y_1\le y_2\le \ldots \le y_r}$$
you have the joint pdf of $(y_1,\ldots,y_r)$. From there, you can deduce the pdf of
$$s_r=\sum_{i=1}^{r}y_i+(n-r)y_r\,.$$
Indeed, because the Jacobian of the transform is constant,
\begin{align*}
f_s(y_1,\ldots,y_{r-1},s_r) &\propto f_Y\left(y_1,\ldots,\left\{s_r-\sum_{i=1}^{r-1}y_i\right\}\Big/(n-r+1)\right)\\
&\propto \theta^{-r}\exp\{-s_r/\theta\}\,\mathbb{I}_{y_1\le y_2\le\ldots\le\left\{s_r-\sum_{i=1}^{r-1}y_i\right\}/(n-r+1)}
\end{align*}
implies by integration in $y_1,\ldots,y_{r-1}$ that
$$f_s(s_r)\propto\theta^{-r}\exp\{-s_r/\theta\}\,s_r^{r-1}$$
Indeed,
\begin{align*}
f_s(s_r)&=\int\cdots\int f_s(y_1,\ldots,y_{r-1},s_r)\,\text{d}y_1\cdots\text{d}y_{r-1}\\
&=\theta^{-r}\exp\{-s_r/\theta\}\int\cdots\int\mathbb{I}_{y_1\le y_2\le\ldots\le\left\{s_r-\sum_{i=1}^{r-1}y_i\right\}/(n-r+1)}\,\text{d}y_1\cdots\text{d}y_{r-1}
\end{align*}
which leads to constraining $y_{r-1}$ by $y_{r-2}\le y_{r-1}$ and by
$$y_{r-1}\le\left\{s_r-\sum_{i=1}^{r-1}y_i\right\}\Big/(n-r+1)=\left\{s_r-\sum_{i=1}^{r-2}y_i\right\}\Big/(n-r+1)-\frac{y_{r-1}}{n-r+1}$$
which simplifies into
$$y_{r-1}\le\left\{s_r-\sum_{i=1}^{r-2}y_i\right\}\Big/(n-r+2)$$
If one starts integrating in $y_{r-1}$, the innermost integral is
\begin{align*}
\int_{y_{r-2}}^{\left\{s_r-\sum_{i=1}^{r-2}y_i\right\}/(n-r+2)}\text{d}y_{r-1}&=\left\{s_r-\sum_{i=1}^{r-2}y_i\right\}\Big/(n-r+2)-y_{r-2}\\
&=\left\{s_r-\sum_{i=1}^{r-3}y_i\right\}\Big/(n-r+2)-\frac{(n-r+3)\,y_{r-2}}{n-r+2}
\end{align*}
and from there one can proceed by recursion.

Hence
$$s_r\sim\mathcal{G}a(r,1/\theta)\,,$$
a Gamma distribution with shape $r$ and rate $1/\theta$.
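Part (c) asks for the mgf and pdf of $\hat{\theta}$ itself; as a completing step (a standard Gamma scaling, added here for illustration), note that $\hat{\theta}=s_r/r$, so

$$M_{\hat{\theta}}(t)=M_{s_r}(t/r)=\left(1-\frac{\theta t}{r}\right)^{-r},\qquad t<r/\theta\,,$$

which is the mgf of a $\mathcal{G}a(r,r/\theta)$ distribution, with pdf

$$f_{\hat{\theta}}(u)=\frac{(r/\theta)^{r}}{\Gamma(r)}\,u^{r-1}e^{-ru/\theta},\qquad u>0\,.$$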

Here is an R simulation to show the fit, obtained as follows:

```r
n = 10
r = 5
sim = matrix(rexp(n * 1e4), 1e4, n)   # 1e4 samples of n Exp(1) variables
sim = t(apply(sim, 1, sort))          # order statistics, row by row
res = apply(sim[, 1:r], 1, sum) + (n - r) * sim[, r]
hist(res, prob = TRUE)
curve(dgamma(x, shape = r, scale = 1), add = TRUE)  # Gamma(r, rate 1/theta), theta = 1
```
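For readers without R, the same Monte Carlo check can be sketched in Python (NumPy assumed; this re-implementation is not part of the original answer), comparing the empirical mean and variance of $s_r$ against the $\mathcal{G}a(r,1/\theta)$ values $r\theta$ and $r\theta^2$:

```python
import numpy as np

rng = np.random.default_rng(42)
n, r, theta = 10, 5, 1.0
m = 200_000  # number of Monte Carlo replicates

# Each row: one sample of n Exp(theta) variables, sorted into order statistics.
sim = np.sort(rng.exponential(scale=theta, size=(m, n)), axis=1)
s_r = sim[:, :r].sum(axis=1) + (n - r) * sim[:, r - 1]

# Gamma(shape = r, rate = 1/theta) has mean r*theta and variance r*theta^2.
print(s_r.mean(), s_r.var())  # both should be close to r*theta = 5 here
```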
