I'm curious about why we always find the MLE using the first (partial) derivative without checking the endpoints, singular points, or the second (partial) derivative? Thanks a lot!
Best Answer
It's not a stupid question at all. See this post for a case where a likelihood can have two maxima and a minimum.
When dealing with maximum likelihood in a general theoretical treatment, we tend to assume silently that the likelihood is a unimodal function (usually with a maximum). Moreover, many "known" distributions have log-concave densities (in their variable). This, coupled with the fact that in many cases the unknown coefficients have a linear relationship with the variable (or we can make the relationship linear through a one-to-one reparametrization, which leaves the MLE unaffected), makes the density log-concave in the unknown coefficients as well… and these are the arguments with respect to which we maximize the (by now concave) log-likelihood. In such cases, satisfaction of the second-order conditions follows automatically.
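As a concrete instance of such a log-concave case (a stock example of mine, not one taken from the post): for i.i.d. draws $x_1,\dots,x_n$ from an Exponential$(\lambda)$ density, the log-likelihood and its derivatives are

$$\ell(\lambda)=n\log\lambda-\lambda\sum_{i=1}^{n}x_i,\qquad \ell'(\lambda)=\frac{n}{\lambda}-\sum_{i=1}^{n}x_i,\qquad \ell''(\lambda)=-\frac{n}{\lambda^{2}}<0,$$

so the stationary point $\hat\lambda=n/\sum_i x_i$ given by the first-order condition is automatically the global maximum, and the second-order condition never needs a separate check.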
But in more specific theoretical work, where novel log-likelihoods arise, the researcher has, in my opinion, the responsibility to address explicitly whether the second-order conditions are satisfied.
Finally, in applied work, the software algorithms check on their own whether the Hessian is negative definite at the point they locate as stationary (and report on the matter), so at least we know whether we have a local maximum or not.
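To make that check concrete, here is a minimal Python sketch of the idea (the normal log-likelihood, the data, and the finite-difference step size are my own illustrative choices, not anything from the post): it maximizes a log-likelihood with `scipy.optimize.minimize` and then verifies that a numerically approximated Hessian is negative definite at the reported stationary point.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=200)   # illustrative data

def loglik(theta):
    """Normal log-likelihood in (mu, log sigma) -- a one-to-one reparametrization."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return np.sum(-0.5 * np.log(2 * np.pi) - np.log(sigma)
                  - 0.5 * ((x - mu) / sigma) ** 2)

# Maximize the log-likelihood by minimizing its negative.
res = minimize(lambda t: -loglik(t), x0=np.array([0.0, 0.0]))

def numerical_hessian(f, theta, eps=1e-4):
    """Central-difference approximation of the Hessian of f at theta."""
    k = len(theta)
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            e_i, e_j = np.zeros(k), np.zeros(k)
            e_i[i], e_j[j] = eps, eps
            H[i, j] = (f(theta + e_i + e_j) - f(theta + e_i - e_j)
                       - f(theta - e_i + e_j) + f(theta - e_i - e_j)) / (4 * eps ** 2)
    return H

H = numerical_hessian(loglik, res.x)
eigenvalues = np.linalg.eigvalsh(H)
print("stationary point:", res.x)
print("Hessian eigenvalues:", eigenvalues)
print("local maximum?", np.all(eigenvalues < 0))   # negative definite => local maximum
```

The eigenvalue test at the end is the negative-definiteness condition the answer refers to; in practice one would usually rely on whatever Hessian (or estimated covariance matrix) the fitting routine itself reports rather than recomputing it by hand.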