When fitting multiple variables to one outcome via the lm() function in R, summary(lm) gives me the p-values for individual regressors but not for the full model in an easily extractable (as in, just accessing fields) kind of way.
According to this question, it is possible to extract the p-value via summary(lm)$fstatistic by using the command:
pf(x$fstatistic[1],x$fstatistic[2],x$fstatistic[3],lower.tail=FALSE)
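For reference, that command can be wrapped in a small helper. This is only a sketch: the function name overall_p is hypothetical, and the built-in mtcars data set is used as stand-in data, not the model from this question. It relies on the fstatistic element of summary.lm being a named vector with entries value, numdf and dendf.

```r
# Hypothetical helper: overall F-test p-value of a fitted lm object.
overall_p <- function(model) {
  f <- summary(model)$fstatistic
  # fstatistic is NULL for models without any regressors (intercept only)
  if (is.null(f)) return(NA_real_)
  # Upper tail of the F distribution; [[ ]] drops the "value" name
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
}

# Stand-in example using a data set that ships with R
fit <- lm(mpg ~ wt + hp, data = mtcars)
p_value <- overall_p(fit)
print(p_value)
```

The result is a plain unnamed number, so it can be stored or compared directly instead of being read off the printed summary.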
However, while in the example linked this provides the same p-value as is printed, I get a different one:
> summary(model)
# ...
Residual standard error: 1.533 on 371 degrees of freedom
  (555 observations deleted due to missingness)
Multiple R-squared: 0.3364, Adjusted R-squared: 0.2864
F-statistic: 6.718 on 28 and 371 DF, p-value: < 2.2e-16
and:
> f = summary(model)$fstatistic
> pf(f[1],f[2],f[3],lower.tail=F)
       value
5.948007e-20
What are possible reasons for these values to be different, and which one is the "right" one for the significance of the whole model?
Best Answer
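The p-value calculated by print.summary.lm (use getAnywhere(print.summary.lm) to study the code) is the same one you computed with pf(); it is only formatted for display using format.pval. By default, format.pval reports any p-value smaller than its eps argument as "< eps", and eps defaults to .Machine$double.eps.
2.2e-16 is the (rounded) value of .Machine$double.eps, which is
the smallest positive floating-point number x such that 1 + x != 1
So the cutoff is not arbitrary, but chosen for numerical reasons. The two results do not contradict each other: 5.948007e-20 is smaller than 2.2e-16, so the summary prints it as "< 2.2e-16". The value returned by pf() is the actual p-value of the F-test for the whole model.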
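The display behaviour can be checked directly. The following is a minimal sketch that feeds the p-value from the question into format.pval; the variable names are only illustrative, and digits = 4 is an assumption chosen to resemble what the summary printout uses under default options.

```r
p   <- 5.948007e-20        # p-value computed with pf() in the question
eps <- .Machine$double.eps # about 2.2e-16

# The computed p-value lies below the display threshold
below_eps <- p < eps

# Default behaviour: values below eps are reported as "< eps"
shown <- format.pval(p, digits = 4)

# Setting eps = 0 turns the threshold off and shows the number itself
full <- format.pval(p, digits = 4, eps = 0)

print(below_eps)
print(shown)
print(full)
```

If the exact number is needed, for example for reporting or for multiple-testing corrections, computing it with pf() as in the question avoids the display threshold entirely.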