In general inference, why are orthogonal parameters useful, and why is it worth trying to find a new parametrization that makes the parameters orthogonal?
I have seen a few textbook examples, but not many, and would be interested in more concrete examples and/or motivation.
Best Answer
In maximum likelihood, the term orthogonal parameters is used when you can achieve a clean factorization of a multi-parameter likelihood function. Say your model has two parameters, $\theta$ and $\lambda$. If you can rewrite the joint likelihood as
$L(\theta, \lambda) = L_{1}(\theta)\, L_{2}(\lambda)$
then we call $\theta$ and $\lambda$ orthogonal parameters. The obvious case is independence, but independence is not necessary for the definition as long as the factorization can be achieved. Orthogonal parameters are desirable because, if $\theta$ is the parameter of interest, you can perform inference on it using $L_{1}$ alone.
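As a minimal worked illustration (this concrete example is an addition, not part of the original answer): suppose we observe $x \sim \mathrm{Binomial}(n, \theta)$ independently of $y \sim \mathrm{Poisson}(\lambda)$. The joint likelihood factors as

$L(\theta, \lambda) = \binom{n}{x} \theta^{x} (1-\theta)^{n-x} \cdot \dfrac{e^{-\lambda} \lambda^{y}}{y!} = L_{1}(\theta)\, L_{2}(\lambda),$

so inference about $\theta$ depends on the data only through $x$ and can be carried out from $L_{1}$ alone.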
When we don't have orthogonal parameters, we try to find factorizations like
$L(\theta, \lambda) = L_{1}(\theta)\, L_{2}(\theta, \lambda)$
and perform inference using $L_{1}$. In that case, we must argue that the information lost by excluding $L_{2}$ is small. This leads to the concept of marginal likelihood.
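A standard concrete example of this second factorization (again an addition, not from the original answer): for $x_1, \dots, x_n$ i.i.d. $N(\mu, \sigma^2)$, with $\sigma^2$ of interest and $\mu$ a nuisance parameter, the sample mean $\bar{x}$ and the sum of squares $\sum_i (x_i - \bar{x})^2$ are independent, and the latter satisfies $\sum_i (x_i - \bar{x})^2 / \sigma^2 \sim \chi^2_{n-1}$. Hence $L(\sigma^2, \mu) \propto L_{1}(\sigma^2)\, L_{2}(\sigma^2, \mu)$, where $L_{1}$ comes from the $\chi^2_{n-1}$ density of the sum of squares and $L_{2}$ from the normal density of $\bar{x}$. Maximizing the marginal likelihood $L_{1}$ alone gives the $(n-1)$-divisor variance estimate (a REML-type estimate) rather than the full MLE with divisor $n$. The following Python sketch checks this numerically; the simulated data and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

# Illustrative simulated data: n i.i.d. draws from N(mu, sigma^2)
rng = np.random.default_rng(0)
n, mu_true, sigma2_true = 20, 5.0, 4.0
x = rng.normal(mu_true, np.sqrt(sigma2_true), size=n)

xbar = x.mean()
ss = np.sum((x - xbar) ** 2)  # (n - 1) * S^2

def neg_marginal_loglik(sigma2):
    # L1(sigma^2): ss / sigma^2 ~ chi^2_{n-1}, so the density of ss is the
    # chi^2 density evaluated at ss / sigma^2 times the Jacobian 1 / sigma^2.
    return -(chi2.logpdf(ss / sigma2, df=n - 1) - np.log(sigma2))

res = minimize_scalar(neg_marginal_loglik, bounds=(1e-6, 100.0), method="bounded")

print("maximizer of the marginal likelihood L1:", res.x)
print("S^2 with divisor n-1:                   ", ss / (n - 1))
print("full MLE with divisor n:                ", ss / n)
```

The marginal-likelihood maximizer coincides with $S^2 = \sum_i (x_i - \bar{x})^2 / (n-1)$, while the full MLE divides by $n$; the two differ by the factor $(n-1)/n$, which shrinks as $n$ grows, one way of seeing that the information about $\sigma^2$ lost by dropping $L_{2}$ is small.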