SAS code:

```sas
data pr;
  input y x1 x2 @@;
  cards;
5 2 2  7 3 3  3 2 0  5 2 4  4 3 3  7 2 4
;
run;

proc reg data = pr;
  model y = x1 x2;
run;
quit;
```

When I run the regression procedure, I get these results:

- F-test: p-value for model = 0.3757
- t-test: p-value for x1 = 0.9238
- t-test: p-value for x2 = 0.2045

I'm wondering why the p-value of the F-test is bigger than the p-value of the t-test for x2.

The F-test has the joint hypothesis $H0: \beta_1 = \beta_2 = 0$,

so its hypothesis is more restrictive than that of the t-test ($H0: \beta_2 = 0$).

My logic is this: when a hypothesis has more restrictions, it is harder for the data to satisfy, so its p-value should be lower.

Where am I going wrong?


#### Best Answer

Good question.

Now, the results are obviously not directly comparable: with the F-test you are testing (in this case) two linear restrictions at once, which is outside the scope of the t-test.

I will therefore write $H0_F$ and $H0_t$ to distinguish the two hypotheses.

With that being said, an intuitive approach to understand what is going on is to look at what you are testing.

The p-value of the F-test gives you the probability of observing an F-statistic at least as extreme as the one computed from your data if $H0_F$ were true, i.e. $P(Data|H0_F)$. For the statistic to actually follow an F-distribution, your model has to be correctly specified (if your errors are non-normal, no dice), because the F-test compares the residual variation of two models: the one without your $\beta$s and the one with them. A large p-value says, in essence, that the data do not prefer the model with your $\beta$s over the model without them.

The p-value of the t-test assesses the probability of the t-statistic taking a value as extreme as it did, if your $\beta_i$ really were zero and your model were truly correct. It is therefore, in essence, also $P(Data|H0_t)$.

But careful! This is not a t-test assessing the mean of the x2 variable; it is a t-test on the estimated influence of that variable. As such, it is based on the joint distribution of $x_i$ and $y_i$, and for $H0_t$ to be true your complete model has to be true. So your $H0_t$ really says $\beta_2 = 0$ AND the random variation comes from a non-systematic, Gaussian error term.

Phew, that's a lot.

Now you have the case that $P(Data|H0_t) < P(Data|H0_F)$.
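As background, a standard identity connects the two tests: when the F-test imposes a single restriction ($q = 1$), $F = t^2$ and the two p-values coincide exactly. With $q = 2$, as here, the F-statistic pools both restrictions and compares the restricted (no-regressor) model against the full model:

$$
F \;=\; \frac{(SSE_{\text{restricted}} - SSE_{\text{full}})/q}{SSE_{\text{full}}/(n-p)} \;\sim\; F_{q,\,n-p} \quad \text{under } H0_F,
$$

where $n$ is the number of observations and $p$ the number of estimated coefficients. In your data, $q = 2$ and $n - p = 6 - 3 = 3$, so the pooled statistic can land in an unremarkable region of its reference distribution even when one individual t-statistic is moderately large.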

How can this be? The intuition is that the correlation between your $x_i$ and $y_i$ is such that the model without any influence of the $x_i$ is more "likely" to be true than the claim that the influence of the correctly specified estimator is zero (this is not entirely correct in the inferential sense, but let's go with it). One possible explanation is that the variation of the true parameters partially cancels out: individually each has an influence, but taken together they look more like random fluctuation in $y$. As a result, the data fit the model without regressor influence, $H0_F$, better than the assumption that the single estimator is distributed around zero.

Does that make sense?

I would take this as a hint to check whether the model is correctly specified and to what degree things like multicollinearity are a problem.
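To make the numbers concrete, here is a sketch that reproduces all three reported p-values from the question's data in pure stdlib Python (no SAS or external packages). It fits OLS via the normal equations and then uses closed-form tail probabilities that happen to exist for these particular degrees of freedom: the $F(2, 3)$ survival function and the $t$ CDF with 3 degrees of freedom. The helper names (`inv`, `f_pvalue`, `t_pvalue`) are my own, not part of any library.

```python
import math

# Data from the question (6 observations)
y  = [5.0, 7.0, 3.0, 5.0, 4.0, 7.0]
x1 = [2.0, 3.0, 2.0, 2.0, 3.0, 2.0]
x2 = [2.0, 3.0, 0.0, 4.0, 3.0, 4.0]
n, p = len(y), 3                       # p = intercept + 2 slopes
X = [[1.0, a, b] for a, b in zip(x1, x2)]

def inv(M):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(M)
    A = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(M)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    return [row[k:] for row in A]

# OLS via the normal equations: beta = (X'X)^{-1} X'y
XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
XtX_inv = inv(XtX)
beta = [sum(XtX_inv[i][j] * Xty[j] for j in range(p)) for i in range(p)]

resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
sse = sum(e * e for e in resid)
mse = sse / (n - p)                    # error df = 3
ybar = sum(y) / n
sst = sum((v - ybar) ** 2 for v in y)
F = ((sst - sse) / (p - 1)) / mse      # overall F, df = (2, 3)

def f_pvalue(Fstat, d2):
    """Survival function of F(2, d2); closed form exists when the
    numerator df is exactly 2."""
    return (1.0 + 2.0 * Fstat / d2) ** (-d2 / 2.0)

def t_pvalue(t):
    """Two-sided p-value for Student's t with df = 3 (closed-form CDF)."""
    a = abs(t) / math.sqrt(3.0)
    cdf = 0.5 + (a / (1.0 + t * t / 3.0) + math.atan(a)) / math.pi
    return 2.0 * (1.0 - cdf)

tvals = [beta[j] / math.sqrt(mse * XtX_inv[j][j]) for j in range(p)]
p_model = f_pvalue(F, n - p)
p_x1, p_x2 = t_pvalue(tvals[1]), t_pvalue(tvals[2])

print(f"F-test p-value (model): {p_model:.4f}")   # ~0.3757, as in SAS
print(f"t-test p-value (x1):    {p_x1:.4f}")      # ~0.9238
print(f"t-test p-value (x2):    {p_x2:.4f}")      # ~0.2045
```

The output matches the SAS `proc reg` table, and you can see the ordering directly: the pooled F p-value (0.3757) sits between the two individual t p-values (0.9238 and 0.2045), because x1 contributes essentially nothing while x2 contributes moderately.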

### Similar Posts:

- Solved – Difference between F-test and T-test on OLS
- Solved – Strange results of Ljung-Box test (for white noise process)
- Solved – Test for subset of coefficients being zero
- Solved – How to represent statistical power graphically for a given hypothesis test