I've been reading up a lot on the AIC value for GLMs, and it has come to my attention that pretty much all of my literature claims that AIC penalizes models with too many variables without mentioning what the penalty actually is.

Is there anyone here who cares to explain to me how AIC penalizes models with too many variables? How does AIC show that an arbitrary model with, for example, 3 explanatory variables is better than a model with, say, 7 variables?


#### Best Answer

The definition of AIC is

$$ \mathrm{AIC} = 2k - 2 \ln(\hat{L}) $$

where $\hat{L}$ is the maximized likelihood of the model and $k$ is the number of estimated parameters. Lower AIC values indicate better fits: higher likelihoods achieved with fewer parameters.
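The definition translates directly into code. A minimal sketch (the log-likelihood value below is purely hypothetical):

```python
def aic(log_likelihood: float, k: int) -> float:
    """AIC = 2k - 2*ln(L_hat), written in terms of the log-likelihood."""
    return 2 * k - 2 * log_likelihood

# Hypothetical GLM fit: log-likelihood of -100 with 3 estimated parameters
print(aic(-100.0, 3))  # 206.0
```

In practice, fitted-model objects in statistical libraries usually expose this value directly, so you rarely compute it by hand.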

> explain to me how AIC penalizes models with too many variables?

The $2k$ term in AIC means that the AIC will go up by 2 for every additional parameter estimated. This is how AIC penalizes models for adding extra terms.
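This is easy to verify numerically: holding the (hypothetical) log-likelihood fixed, each extra parameter raises AIC by exactly 2.

```python
def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

# Same log-likelihood, one extra estimated parameter: AIC rises by exactly 2
print(aic(-100.0, 4) - aic(-100.0, 3))  # 2.0
```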

> How does AIC show that an arbitrary model with for example 3 explanatory variables is better than a model with say 7 variables?

When comparing a simple model to a complex one, the log-likelihood of the complex model must exceed the log-likelihood of the simple model by *at least* the number of additional parameters for the AIC to go down, indicating that the more complex model is a better fit.

In practice, a rule of thumb is often used: *if the change in AIC is less than 2, the difference in fit is negligible; if the change is more than 10, there is strong evidence in support of the model with lower AIC*. Using the "strong evidence" threshold of 10, a more complex model would need to improve the log-likelihood by at least 5 beyond the number of additional parameters for the complexity to be justified.
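Returning to the 3-vs-7-variable question, a comparison with made-up log-likelihoods illustrates the trade-off: here the larger model improves the log-likelihood by only 3, which is less than its 4 extra parameters, so its AIC is higher.

```python
def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

# Hypothetical fits: 3-variable model vs 7-variable model (4 extra parameters).
# The complex model's log-likelihood must improve by more than 4 to lower AIC.
aic_simple = aic(-120.0, 4)   # 3 predictors + intercept
aic_complex = aic(-117.0, 8)  # 7 predictors + intercept; log-lik up by only 3
print(aic_simple, aic_complex)  # 248.0 250.0 -> simpler model preferred
```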

Other metrics, such as the corrected AIC (AICc), also take the number of observations into account. You can browse some highly-voted questions on AIC for lots of interesting discussion.
