In general inference, why are orthogonal parameters useful, and why is it worth trying to find a new parametrization that makes the parameters orthogonal?

I have seen some textbook examples, not so many, and would be interested in more concrete examples and/or motivation.


#### Best Answer

In maximum likelihood inference, the term *orthogonal parameters* is used when you can achieve a clean factorization of a multi-parameter likelihood function. Say your model has two parameters $\theta$ and $\lambda$. If you can rewrite the joint likelihood as

$L(\theta, \lambda) = L_{1}(\theta) L_{2}(\lambda)$

then we call $\theta$ and $\lambda$ *orthogonal parameters*. The obvious case is independence, but independence is not necessary for the definition as long as the factorization can be achieved. Orthogonal parameters are desirable because, if $\theta$ is the parameter of interest, you can perform inference on it using $L_{1}$ alone.
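A small numeric sketch of this point, with hypothetical data (a Poisson sample carrying $\theta$ and an independent exponential sample carrying $\lambda$): because the log-likelihood is the sum $\log L_{1}(\theta) + \log L_{2}(\lambda)$, maximizing the joint likelihood and maximizing $L_{1}$ alone produce the same $\hat{\theta}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: x ~ Poisson(theta) and y ~ Exponential(rate lambda),
# drawn independently, so L(theta, lambda) = L1(theta) * L2(lambda).
theta_true, lam_true = 3.0, 0.5
x = rng.poisson(theta_true, size=200)
y = rng.exponential(1.0 / lam_true, size=200)

def log_L1(theta):
    """Poisson log-likelihood in theta (additive log(x!) constant dropped)."""
    theta = np.asarray(theta, dtype=float)[..., None]
    return (x * np.log(theta) - theta).sum(axis=-1)

def log_L2(lam):
    """Exponential log-likelihood in lambda."""
    lam = np.asarray(lam, dtype=float)[..., None]
    return (np.log(lam) - lam * y).sum(axis=-1)

thetas = np.linspace(1.0, 6.0, 501)
lams = np.linspace(0.1, 2.0, 501)

# Maximize L1 alone over a grid of theta values ...
theta_hat_L1 = thetas[np.argmax(log_L1(thetas))]

# ... and maximize the joint log-likelihood over the full 2-D grid.
joint = log_L1(thetas)[:, None] + log_L2(lams)[None, :]
i, j = np.unravel_index(np.argmax(joint), joint.shape)

# Same theta-hat either way; it also sits next to the analytic Poisson MLE, x-bar.
assert thetas[i] == theta_hat_L1
print(theta_hat_L1, x.mean())
```

Nothing about the exponential sample influences $\hat{\theta}$; that is exactly what makes orthogonality convenient in practice.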

When we don't have orthogonal parameters, we try to find factorizations like

$L(\theta, \lambda) = L_{1}(\theta) L_{2}(\theta, \lambda)$

and perform inference using $L_{1}$. In this case, we must argue that the information about $\theta$ lost by discarding $L_{2}$ is small. This leads to the concept of marginal likelihood.
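A standard example of this second form, assuming a normal sample $X_{1}, \dots, X_{n} \sim N(\mu, \sigma^{2})$ with $\sigma^{2}$ the parameter of interest: the sample mean $\bar{X}$ and sample variance $S^{2}$ are independent, so the likelihood factorizes as

$L(\sigma^{2}, \mu) = L_{1}(\sigma^{2}; S^{2}) L_{2}(\sigma^{2}, \mu; \bar{X})$

Maximizing $L_{1}$ alone, the marginal likelihood of $S^{2}$ (proportional to a $\sigma^{2} \chi^{2}_{n-1} / (n-1)$ density), yields $\hat{\sigma}^{2} = S^{2}$, the estimator with divisor $n - 1$; this is the idea behind restricted maximum likelihood (REML).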
