Solved – Why maximize the likelihood?

What properties of the MLE make it so useful for picking parameters? Why would we want to maximize the likelihood? Why not maximize $P(\text{Parameter} \mid \text{Data})$ or anything else?

Why would we want to maximize the likelihood?

Because – taking the form of the model as given – it's the set of parameter values that gives the highest probability of producing the sample we actually observed.
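As a minimal sketch of that idea (Python, with a made-up Bernoulli sample): scanning candidate values of $p$ shows the log-likelihood peaking at the sample proportion, i.e. the value of $p$ that makes this particular sample most probable.

```python
import numpy as np

# Hypothetical 0/1 sample (7 successes out of 10)
data = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])

def log_likelihood(p, data):
    # Bernoulli log-likelihood: log p for each success, log(1 - p) for each failure
    return np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

grid = np.linspace(0.01, 0.99, 99)
ll = np.array([log_likelihood(p, data) for p in grid])
p_hat = grid[np.argmax(ll)]
print(p_hat, data.mean())  # grid maximizer 0.7 matches the sample proportion
```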

Why not maximize $P(\text{Parameter} \mid \text{Data})$ or anything else?

In the framework where you would maximize likelihood, parameters are fixed but unknown; they don't have distributions.

For it to be meaningful to calculate distributions over parameters, you have to be taking a Bayesian view of the parameters … so you wouldn't have a reason to maximize the likelihood at all. However, you still use the likelihood:

$P(\text{parameter} \mid \text{data}) \propto P(\text{data} \mid \text{parameter}) \, P(\text{parameter})$

MAP estimation does what you're asking about here; it's similar to maximizing likelihood but it incorporates the effect of the prior on the parameters.
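For concreteness, here's a sketch of the difference for the same Bernoulli setup, assuming a hypothetical Beta(a, b) prior on $p$; for the Beta–Bernoulli model the posterior mode has a closed form, so no optimizer is needed.

```python
import numpy as np

data = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])
heads, n = data.sum(), len(data)

a, b = 2.0, 2.0  # hypothetical Beta prior, mildly favoring p near 0.5

mle = heads / n                               # maximizes P(data | p)
map_est = (heads + a - 1) / (n + a + b - 2)   # maximizes P(data | p) * P(p)
print(mle, map_est)  # 0.7 vs 0.666...; the prior pulls the estimate toward 0.5
```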

As for "anything else" — there are certainly cases where other estimators are used.

What properties of the MLE make it so useful for picking parameters?

MLEs have many useful properties.

The big one (aside from consistency, I guess) would be that they're asymptotically efficient, which means that in sufficiently large samples you can't really beat them: no other well-behaved estimator has lower asymptotic variance.
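Here's a quick simulation sketch of what efficiency buys you, assuming Normal data: the MLE of the mean (the sample mean) ends up with noticeably lower variance than a competitor such as the sample median, whose asymptotic variance is about $\pi/2 \approx 1.57$ times larger.

```python
import numpy as np

rng = np.random.default_rng(0)
# 10,000 replications of a Normal(0, 1) sample of size 200
samples = rng.normal(loc=0.0, scale=1.0, size=(10_000, 200))

mean_var = np.var(samples.mean(axis=1))        # variance of the MLE across replications
median_var = np.var(np.median(samples, axis=1))  # variance of the competing estimator
print(mean_var, median_var, median_var / mean_var)  # ratio close to pi/2 ≈ 1.57
```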

They have a number of other properties that broadly speaking make them "nice" to work with (such as functional invariance — in two different senses), but efficiency is a big selling point.
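A small illustration of functional invariance in the plug-in sense, again assuming Normal data: if $\hat\sigma^2$ is the MLE of $\sigma^2$, then $\sqrt{\hat\sigma^2}$ is the MLE of $\sigma$, with no separate derivation required.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)

mu_hat = x.mean()
sigma2_hat = np.mean((x - mu_hat) ** 2)  # MLE of the variance (divides by n, not n - 1)
sigma_hat = np.sqrt(sigma2_hat)          # MLE of sigma, by functional invariance
print(sigma2_hat, sigma_hat)
```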
