Is it true that, for OLS parameter estimates to be consistent, it must be the case that $E(u\mid x)=0$?

E(u|x)=0 is a required condition for unbiasedness. But as far as I understand, unbiasedness does not necessarily mean consistency. Therefore I am really confused.


#### Best Answer

Ok. The model is, in matrix notation and conformable dimensions, $$\mathbf y = \mathbf X\beta + \mathbf u$$

The OLS estimator is

$$\hat\beta = (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf y = (\mathbf X'\mathbf X)^{-1}\mathbf X'(\mathbf X\beta + \mathbf u)$$

$$= (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf X\beta + (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u = \beta + (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u$$
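The decomposition above can be checked numerically. The following is only a minimal sketch on simulated data: the sample size, the coefficient vector `beta`, and the normal distributions are illustrative assumptions, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n observations, k regressors, made-up true coefficients.
n, k = 200, 3
beta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, k))
u = rng.normal(size=n)
y = X @ beta + u

# OLS estimator (X'X)^{-1} X'y, computed by solving the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The "sampling error" term (X'X)^{-1} X'u from the decomposition.
sampling_error = np.linalg.solve(X.T @ X, X.T @ u)

# beta_hat should equal beta + (X'X)^{-1} X'u up to floating-point error.
identity_holds = np.allclose(beta_hat, beta + sampling_error)
print(identity_holds)
```

Everything that follows is about the behaviour of `sampling_error`: unbiasedness asks whether its expected value is zero, consistency asks whether its probability limit is zero.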

For consistency we examine

$$\operatorname{plim}\hat\beta = \operatorname{plim}\beta + \operatorname{plim}\left[(\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u\right] = \beta + \operatorname{plim}\left[\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\left(\frac 1n\mathbf X'\mathbf u\right)\right]$$

And here is the crucial point that lets us get away with a weaker assumption for consistency than for unbiasedness: for unbiasedness we would face $E\left[(\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u\right]$, and in order to "insert" the expected value into the expression we have to condition on $\mathbf X$, which leads us to the expression $E(\mathbf u\mid \mathbf X)$ and the need to assume that *it* is equal to zero, i.e. to assume "mean-independence" between the error term and the regressors.

But $\operatorname{plim}$ is a more "flexible" operator than $E$: under $\operatorname{plim}$ products can be decomposed into products of the individual probability limits, provided these exist (something that under the expected value requires independence), and $\operatorname{plim}$ can also "go inside" a function (while $E$ cannot unless the function is affine), as long as the function is a continuous transformation (and it very rarely isn't). So

$$\operatorname{plim}\left[\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\left(\frac 1n\mathbf X'\mathbf u\right)\right] = \operatorname{plim}\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\operatorname{plim}\left(\frac 1n\mathbf X'\mathbf u\right)$$

For consistency we need to assume that the first $\operatorname{plim}$ is finite, but this is an assumption on the properties of the regressor matrix, unrelated to the error term. So we are left with the second $\operatorname{plim}$ which, written out with sums for clarity, is

$$\operatorname{plim}\left(\frac 1n\mathbf X'\mathbf u\right) = \left[\begin{matrix} \operatorname{plim}\frac 1n\sum_{i=1}^n x_{1i}u_i \\ \vdots \\ \operatorname{plim}\frac 1n\sum_{i=1}^n x_{ki}u_i \end{matrix}\right] \rightarrow \left[\begin{matrix} \frac 1n\sum_{i=1}^n E(x_{1i}u_i) \\ \vdots \\ \frac 1n\sum_{i=1}^n E(x_{ki}u_i) \end{matrix}\right]$$

The last transformation is due to the usual assumptions that permit the application of the law of large numbers.

Exactly because we have been able to "separate" $(\mathbf X'\mathbf X)^{-1}$ from $\mathbf X'\mathbf u$ (due to the fact that we are examining the $\operatorname{plim}$ and not $E$), we ended up looking only at the contemporaneous relation between each regressor and the error term. And so what we need to assume for consistency of the OLS estimator is only that $E(x_{ki}u_i) = 0 \;\; \forall k, \; \forall i$ (contemporaneous uncorrelatedness), which is much weaker than $E(\mathbf u\mid \mathbf X) = 0$. The latter requires *mean-independence*, and moreover *not only* contemporaneously, but across time too (since we condition the whole error vector on the whole regressor matrix).
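A standard illustration of the gap between the two assumptions, not taken from the answer itself, is an AR(1) regression $y_t = \rho y_{t-1} + u_t$: the regressor $y_{t-1}$ is uncorrelated with the current error $u_t$, but $u_t$ feeds into all later regressors, so $E(\mathbf u\mid \mathbf X) = 0$ fails. The sketch below is a rough Monte Carlo under assumed settings (`rho = 0.5`, i.i.d. standard normal errors, sample sizes 20 and 2000, and the replication counts are all arbitrary choices); the helper names `ar1_ols` and `mean_estimate` are made up for this example.

```python
import numpy as np

def ar1_ols(n, rho, rng):
    """Simulate y_t = rho * y_{t-1} + u_t and return the OLS slope
    from regressing y_t on y_{t-1} (no intercept), using n observations."""
    u = rng.normal(size=n + 1)
    y = np.empty(n + 1)
    y[0] = u[0]
    for t in range(1, n + 1):
        y[t] = rho * y[t - 1] + u[t]
    x, target = y[:-1], y[1:]
    return (x @ target) / (x @ x)

def mean_estimate(n, rho, reps, seed=0):
    """Average the OLS slope over `reps` independent simulated samples."""
    rng = np.random.default_rng(seed)
    return float(np.mean([ar1_ols(n, rho, rng) for _ in range(reps)]))

rho = 0.5  # illustrative true parameter

# Small sample: the average estimate approximates E(rho_hat), which shows the bias.
mean_small = mean_estimate(n=20, rho=rho, reps=4000)

# Large sample: the estimates concentrate around rho, which is consistency at work.
mean_large = mean_estimate(n=2000, rho=rho, reps=200)

print("n=20:   mean estimate =", round(mean_small, 4))
print("n=2000: mean estimate =", round(mean_large, 4))
```

Under these assumptions the small-sample average should sit noticeably below `rho` (OLS is biased here because strict exogeneity fails), while the large-sample average should be very close to it, in line with the argument that contemporaneous uncorrelatedness is enough for consistency.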