I am reading the proof of the invariance property of MLE from Casella and Berger.
In this proof we parametrize:
$\eta = \tau(\theta)$
There we define the induced likelihood function:
$$L_{1}^*(\eta|x) = \sup_{\{\theta : \tau(\theta) = \eta\}} L(\theta|x) \tag{1}$$
I have subscripted $L^*(\eta|x)$ with 1 to distinguish the induced likelihood of $\eta$ from the likelihood of $\eta$, which are both denoted by $L^*(\eta|x)$.
I am not sure why this is being done. (In what follows, $L^*$ is the likelihood of $\eta$.)
If $\theta_1$ and $\theta_2$ are such that $\tau(\theta_1) = \tau(\theta_2)$, then $L(\theta_1|x) = L^*(\eta = \tau(\theta_1)|x) = L^*(\eta = \tau(\theta_2)|x) = L(\theta_2|x)$, since $\tau(\theta_1) = \tau(\theta_2)$.
Hence there is no need for the supremum in (1).
What am I misunderstanding?
Best Answer
Perhaps the issues here are best understood in the context of an example. Suppose that we are interested in estimating the mean of a normal model with variance 1, i.e. we are considering models of the form $N(\theta,1)$. In this case, the likelihood (for a single data point $x$) is (ignoring the constant) $L(\theta | x)=\exp(-(x-\theta)^2/2)$.
Suppose that we are actually interested in a function of the mean, call it $\eta=\tau(\theta)$. How should we define the likelihood $L(\eta|x)$? If $\tau$ is invertible, then we just define $L(\eta|x)$ to be $L(\theta=\tau^{-1}(\eta) | x)$, i.e. we set $\theta$ equal to the unique value corresponding to the chosen value of $\eta$. For example, if $\tau(\theta)=2\theta$ then $L(\eta | x):=L(\theta=\frac{\eta}{2} |x)$.
What if $\tau$ is not invertible, e.g. $\tau(\theta)=\theta^2$? Should $L(\eta|x)$ be $L(\theta=+\sqrt{\eta} | x)$, or should it be defined as $L(\theta=-\sqrt{\eta} | x)$? These two values will usually be different, so the likelihood $L(\eta|x)$ is not well defined. Hence Casella and Berger define the induced likelihood, which takes the supremum over all $\theta$ with $\tau(\theta)=\eta$. With the chosen definition, it turns out that the invariance property (which is obvious when $\tau$ is invertible) still holds.
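To make the $\tau(\theta)=\theta^2$ case concrete, here is a minimal numerical sketch in Python. It is only an illustration under assumptions of my own: the observation `x = 1.5`, the test value `eta = 4.0`, the grid over $\eta$, and the function names `likelihood` and `induced_likelihood` are all hypothetical choices, not anything from Casella and Berger. It shows that the two candidate values disagree, that the induced likelihood takes the larger one, and that maximizing the induced likelihood over the grid recovers $\tau(\hat\theta)=x^2$.

```python
import math

def likelihood(theta, x):
    """L(theta | x) for the N(theta, 1) model, ignoring the constant."""
    return math.exp(-(x - theta) ** 2 / 2)

def induced_likelihood(eta, x):
    """L*(eta | x): sup of L(theta | x) over {theta : theta**2 == eta}."""
    if eta < 0:
        raise ValueError("eta = theta**2 cannot be negative")
    root = math.sqrt(eta)
    # The preimage of eta under tau(theta) = theta**2 is {+root, -root}.
    return max(likelihood(root, x), likelihood(-root, x))

x = 1.5    # hypothetical single observation (assumption for illustration)
eta = 4.0  # hypothetical value of eta to evaluate

# The two candidate definitions of L(eta | x) give different values.
print(likelihood(2.0, x), likelihood(-2.0, x))

# The induced likelihood resolves the ambiguity by taking the larger one.
print(induced_likelihood(eta, x))

# Invariance check: the MLE of theta is x, and a grid search over eta
# finds the maximizer of the induced likelihood at tau(x) = x**2.
grid = [i / 1000 for i in range(0, 10001)]
eta_hat = max(grid, key=lambda e: induced_likelihood(e, x))
print(eta_hat, x ** 2)
```

The grid search is just a crude stand-in for the maximization; the point is that the supremum in (1) is what makes $L^*(\eta|x)$ a well-defined function of $\eta$ when several values of $\theta$ map to the same $\eta$.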