Wikipedia tells us that the score plays an important role in the Cramér–Rao inequality. It also gives the definition:
$$V = \frac{\partial}{\partial \theta} \log L(\theta; X)$$
However, I cannot find an intuitive explanation of what this quantity expresses. Obviously, it somehow measures how a small change of $\theta$ will affect the log-likelihood of the observed data $X$, but what exactly does that mean?
The Wikipedia article also mentions that the expected value $\mathbb{E}[V \mid \theta] = 0$. Can this be interpreted somehow?
Going a bit further, in class we were told that the Fisher information (for which I have no intuitive understanding either) is $I(\theta) = \mathbb{E}[V^2 \mid \theta]$. Combined with $\mathbb{E}[V \mid \theta] = 0$, that would imply $I(\theta) = \text{Var}[V \mid \theta]$. Is this correct?
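My reasoning here is just the usual variance decomposition:

$$\text{Var}[V \mid \theta] = \mathbb{E}[V^2 \mid \theta] - \big(\mathbb{E}[V \mid \theta]\big)^2 = \mathbb{E}[V^2 \mid \theta] - 0 = I(\theta)$$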
Thanks in advance.
PS: This is not homework.
Best Answer
The Wikipedia article gives an example of a Bernoulli process, with $A$ successes and $B$ failures and the probability of success $\theta$, where the score is $V = \frac{A}{\theta} - \frac{B}{1-\theta}$. If $\theta = \frac{A}{A+B}$, i.e. $\frac{\theta}{1-\theta} = \frac{A}{B}$, then $V = 0$.
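A quick numerical check of this (a sketch of my own, not from the Wikipedia article; the function name is mine):

```python
def score(theta, A, B):
    """Bernoulli score V = A/theta - B/(1-theta) for A successes, B failures."""
    return A / theta - B / (1 - theta)

A, B = 7, 3
mle = A / (A + B)            # the theta that matches the data exactly
print(score(mle, A, B))      # ~0 at theta = A/(A+B)
print(score(0.5, A, B))      # positive: more successes than theta = 0.5 predicts
print(score(0.9, A, B))      # negative: fewer successes than theta = 0.9 predicts
```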
The score is more positive when there are more successes than would have been expected from the value of $theta$, and more negative when there are fewer successes.
The score might be seen intuitively as a measure of how far the parameter is from what the data suggest it should be (or the other way round, if you are that way inclined), signed for the direction of the difference. The variance of the score tends to increase with more data, so the variance is intuitively an indication of how much information the data will give you about the parameter.
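Both claims can be illustrated by simulation. The sketch below (pure standard library, names my own) draws repeated Bernoulli samples, computes the score each time, and checks that its mean is near zero while its variance grows with $n$ and tracks the Fisher information, which for $n$ Bernoulli trials is $\frac{n}{\theta(1-\theta)}$:

```python
import random

def simulate_scores(theta, n, reps=10000, seed=0):
    """Mean and variance of the Bernoulli score over repeated samples of size n."""
    rng = random.Random(seed)
    scores = []
    for _ in range(reps):
        a = sum(rng.random() < theta for _ in range(n))   # number of successes
        scores.append(a / theta - (n - a) / (1 - theta))  # score V
    mean = sum(scores) / reps
    var = sum((v - mean) ** 2 for v in scores) / reps
    return mean, var

theta = 0.3
for n in (10, 100):
    mean, var = simulate_scores(theta, n)
    fisher = n / (theta * (1 - theta))  # Fisher information for n trials
    print(f"n={n}: mean score ~ {mean:.2f}, var ~ {var:.0f}, I(theta) = {fisher:.0f}")
```

The mean score stays near zero for both sample sizes, while the variance roughly matches $I(\theta)$ and is about ten times larger for $n = 100$ than for $n = 10$.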
Similar Posts:
- Solved – For a Fisher Information matrix $I(\theta)$ of multiple variables, is it true that $I(\theta) = nI_1(\theta)$
- Solved – Connection between Fisher information and variance of score function
- Solved – Fisher’s Information for Laplace distribution
- Solved – explanation of why a UMVUE doesn’t necessarily have to achieve the CRLB