The likelihood can be defined in several ways, for instance:

- the function $L$ on $\Theta\times{\cal X}$ which maps $(\theta,x)$ to $L(\theta \mid x)$, i.e. $L:\Theta\times{\cal X} \rightarrow \mathbb{R}$;
- the random function $L(\cdot \mid X)$;
- we could also consider that the likelihood is only the "observed" likelihood $L(\cdot \mid x^{\text{obs}})$;
- in practice the likelihood brings information on $\theta$ only up to a multiplicative constant, hence we could consider the likelihood as an equivalence class of functions rather than a function.
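To make the "up to a multiplicative constant" point concrete, here is a minimal sketch under an assumed Bernoulli model (the model, the function names `lik_sequence` and `lik_count`, and the numbers are illustrative, not part of the question). The likelihood built from the full sequence of trials and the one built from the binomial count differ only by a factor that does not depend on $\theta$, so they would fall in the same equivalence class.

```python
from math import comb

def lik_sequence(theta, k, n):
    """Likelihood from the full Bernoulli sequence: k successes in n trials."""
    return theta**k * (1 - theta)**(n - k)

def lik_count(theta, k, n):
    """Likelihood from the binomial count only (same data, coarser record)."""
    return comb(n, k) * lik_sequence(theta, k, n)

k, n = 3, 10  # hypothetical observed data
thetas = [0.1, 0.3, 0.5, 0.7]

# The ratio of the two likelihoods is the same for every theta: comb(n, k).
ratios = [lik_count(t, k, n) / lik_sequence(t, k, n) for t in thetas]
print(ratios)
```

Any comparison between two values of $\theta$ (a likelihood ratio, or the location of the maximum) is therefore the same for both functions.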

Another question arises when considering a change of parametrization: if $\phi=\theta^2$ is the new parametrization, we commonly denote by $L(\phi \mid x)$ the likelihood on $\phi$, and this is not the evaluation of the previous function $L(\cdot \mid x)$ at $\theta^2$ but at $\sqrt{\phi}$. This is an abuse of notation, useful but liable to confuse beginners if it is not emphasized.
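A small sketch of the notational trap, assuming an exponential model with rate $\theta > 0$ (an illustrative choice, so that $\theta \mapsto \theta^2$ is one-to-one). The names `lik_theta` and `lik_phi` are hypothetical; the point is only that they are two different functions, even though both are usually written $L$.

```python
import math

def lik_theta(theta, x):
    """L(theta | x) = f(x | theta) for an Exponential(rate=theta) density."""
    return theta * math.exp(-theta * x)

def lik_phi(phi, x):
    """Likelihood on phi = theta**2: evaluate the theta-likelihood at sqrt(phi)."""
    return lik_theta(math.sqrt(phi), x)

x_obs = 1.5      # hypothetical observation
theta = 2.0
phi = theta**2   # the same model element, expressed in the new parametrization

print(lik_phi(phi, x_obs))    # equals lik_theta(theta, x_obs)
print(lik_theta(phi, x_obs))  # a different number: the old function at theta**2
```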

What is your favorite rigorous definition of the likelihood?

In addition, what do you call $L(\theta \mid x)$? I usually say something like "the likelihood on $\theta$ when $x$ is observed".

EDIT: In view of some comments below, I realize I should have specified the context. I consider a statistical model given by a parametric family $\{f(\cdot \mid \theta), \theta \in \Theta\}$ of densities with respect to some dominating measure, with each $f(\cdot \mid \theta)$ defined on the observation space ${\cal X}$. Hence we define $L(\theta \mid x)=f(x \mid \theta)$ and the question is "what is $L$?" (the question is not about a general definition of the likelihood).
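In this setting, the first three candidate definitions can be told apart in code. The sketch below assumes a $N(\theta, 1)$ model purely for illustration; `f`, `L`, `random_likelihood` and `L_obs` are hypothetical names, not standard ones.

```python
import math
import random
from functools import partial

def f(x, theta):
    """Density of N(theta, 1) at x (assumed model, for illustration only)."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)

def L(theta, x):
    """Candidate 1: a function on Theta x X, with L(theta | x) = f(x | theta)."""
    return f(x, theta)

def random_likelihood(rng, true_theta):
    """Candidate 2: draw X from the model and return theta -> L(theta | X).
    The returned function is itself random, since it depends on the draw."""
    X = rng.gauss(true_theta, 1.0)
    return partial(L, x=X)

# Candidate 3: the observed likelihood, a fixed function of theta only.
x_obs = 0.8  # hypothetical observed value
L_obs = partial(L, x=x_obs)

print(L(0.5, x_obs), L_obs(0.5))                     # same value, different objects
print(random_likelihood(random.Random(0), 0.0)(0.5))  # depends on the draw of X
```

The three objects have different types: a two-argument function, a random one-argument function, and a fixed one-argument function.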


#### Best Answer

Your third item is the one I have most often seen used as a rigorous definition.

The others are interesting too (+1). In particular the first is appealing, with the difficulty that, since the sample size is not (yet) defined, it is harder to define the "from" set (the domain).

To me, the fundamental intuition of the likelihood is that it is a function of the model + its parameters, not a function of the random variables (also an important point for teaching purposes). So I would stick to the third definition.

The source of the abuse of notation is that the "from" set of the likelihood is implicit, which is usually not the case for well defined functions. Here, the most rigorous approach is to realize that after the transformation, the likelihood relates to another model. It is equivalent to the first, but still another model. So the likelihood notation should show which model it refers to (by subscript or other). I never do it of course, but for teaching, I might.
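As a sketch of what such a subscript could look like (the names ${\cal M}_\theta$, ${\cal M}_\phi$, $\Phi$ and $g$ are introduced here for illustration, and $\Theta \subseteq [0,\infty)$ is assumed so that $\theta \mapsto \theta^2$ is one-to-one), take $\Phi = \{\theta^2 : \theta \in \Theta\}$ and write

$$
{\cal M}_\theta = \{f(\cdot \mid \theta),\ \theta \in \Theta\}, \qquad {\cal M}_\phi = \{g(\cdot \mid \phi),\ \phi \in \Phi\}, \qquad g(x \mid \phi) = f(x \mid \sqrt{\phi}),
$$

$$
L_{{\cal M}_\theta}(\theta \mid x) = f(x \mid \theta), \qquad L_{{\cal M}_\phi}(\phi \mid x) = g(x \mid \phi) = L_{{\cal M}_\theta}(\sqrt{\phi} \mid x).
$$

The two families contain the same densities, but the two likelihoods are different functions with different domains, which is what the subscript records.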

Finally, to be consistent with my previous answers, I say the "likelihood of $\theta$" in your last formula.
