# Solved – OLS regression results: p-values > 0.10, how to proceed

In the Python statsmodels documentation there is an example with the goal:

> We want to know whether literacy rates (Literacy column) in the 85 French departments (Departments) are associated with per capita wagers on the Royal Lottery (Lottery) in the 1820s. We need to control for the level of wealth (Wealth) in each department, and we also want to include a series of dummy variables on the right-hand side of our regression equation to control for unobserved heterogeneity due to regional effects (Region; the levels E, N, S and W each enter as a 0/1 dummy, measured against an omitted baseline region). The model is estimated using ordinary least squares regression (OLS).

                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                Lottery   R-squared:                       0.338
    Model:                            OLS   Adj. R-squared:                  0.287
    Method:                 Least Squares   F-statistic:                     6.636
    Date:                Tue, 02 Feb 2021   Prob (F-statistic):           1.07e-05
    Time:                        07:07:06   Log-Likelihood:                -375.30
    No. Observations:                  85   AIC:                             764.6
    Df Residuals:                      78   BIC:                             781.7
    Df Model:                           6
    Covariance Type:            nonrobust
    ===============================================================================
                      coef    std err          t      P>|t|      [0.025      0.975]
    -------------------------------------------------------------------------------
    Intercept      38.6517      9.456      4.087      0.000      19.826      57.478
    Region[T.E]   -15.4278      9.727     -1.586      0.117     -34.793       3.938
    Region[T.N]   -10.0170      9.260     -1.082      0.283     -28.453       8.419
    Region[T.S]    -4.5483      7.279     -0.625      0.534     -19.039       9.943
    Region[T.W]   -10.0913      7.196     -1.402      0.165     -24.418       4.235
    Literacy       -0.1858      0.210     -0.886      0.378      -0.603       0.232
    Wealth          0.4515      0.103      4.390      0.000       0.247       0.656
    ==============================================================================
    Omnibus:                        3.049   Durbin-Watson:                   1.785
    Prob(Omnibus):                  0.218   Jarque-Bera (JB):                2.694
    Skew:                          -0.340   Prob(JB):                        0.260
    Kurtosis:                       2.454   Cond. No.                         371.
    ==============================================================================

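Prob (F-statistic) is 1.07e-05, so the null hypothesis (H0: all slope coefficients, i.e. every coefficient except the intercept, are jointly equal to zero) is rejected. There is therefore statistically significant evidence that the independent variables, taken together, are related to the dependent variable. But only Wealth has an individual p-value < 0.05; the p-values for Literacy and for every Region dummy are all above 0.10.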
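
As a sanity check on how these numbers fit together, the overall F-statistic, the adjusted R-squared and the individual t-statistics can be recomputed by hand from the values printed in the summary. This is only an arithmetic sketch using the rounded table values, so small differences in the last digit are expected.

```python
# Values copied from the summary table (already rounded there).
n_obs = 85         # No. Observations
k = 6              # Df Model: number of slopes, excluding the intercept
r_squared = 0.338  # R-squared

# Residual degrees of freedom: observations minus slopes minus intercept.
df_resid = n_obs - k - 1  # 78, matching "Df Residuals"

# Overall F-statistic: explained share per slope divided by
# unexplained share per residual degree of freedom.
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)

# Adjusted R-squared penalises R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Each t-statistic is simply coef / std err.
t_literacy = -0.1858 / 0.210
t_wealth = 0.4515 / 0.103

print(f"F-statistic    ~ {f_stat:.2f}")
print(f"Adj. R-squared ~ {adj_r_squared:.3f}")
print(f"t(Literacy)    ~ {t_literacy:.2f}")
print(f"t(Wealth)      ~ {t_wealth:.2f}")
```

The F-test pools all six slopes into one joint hypothesis, whereas each t-test looks at one coefficient with the others held in the model. That is why the joint test can be strongly significant while most individual p-values are not.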

Should the model be used as is? Or should all independent variables except Wealth be removed? What should be done based on the goal "We want to know whether literacy … We need to control for the level of wealth (Wealth) in each department …"?
