Solved – the difference between bias and inconsistency

I am trying to learn about bias in simple linear regression. Specifically, I want to see what happens when the $cov(e,x) = 0$ assumption of the simple regression is violated.

If this assumption is violated, I arrive at

$$\hat{\beta}_1 \rightarrow \beta_1 + \frac{cov(e,x)}{var(x)}.$$

This derivation is from this web page (equations 1 through 6). The web page says

If $cov(e,x) \neq 0$, the OLS estimator is inconsistent, i.e. its value does not converge to the true value
of the parameter with the sample size. Moreover, the OLS estimator is biased.

To me, it is clear that $\hat{\beta}_1$ converges to a value that is not the true value $\beta_1$, so that makes it biased. However, the web page seems to conclude that this makes it inconsistent. Somehow, they conclude that the estimator is biased, but I am not sure (they simply use "moreover").

So here are my questions:

  1. What is the difference between bias and inconsistency in this case? (When they conclude that it is inconsistent, I conclude that it is biased.)
  2. Does it ever make sense to say that $\beta_1$ is biased? Or, can only an estimator $\hat{\beta}_1$ be biased?
  3. If the $cov(e,x) = 0$ assumption is violated, how can I find out what happens to the variance of $\hat{\beta}_1$? Can I tell if it increases or decreases?

EDIT To clarify question 3, I am wondering if there is a proof/argument for:

When the second assumption ($cov(e,x) = 0$) of Ordinary Least Squares is violated, the variance of $\hat{\beta}_1$ changes.

The only answer I can think of is using the result from omitted-variable bias. That is, comparing $var(\hat{\beta}_1)$ and $var(\tilde{\beta}_1)$ using the equations

$$var(\hat{\beta}_1) = \frac{\sigma^2}{SST_1(1-R_1^2)}$$

$$var(\tilde{\beta}_1) = \frac{\sigma^2}{SST_1}.$$

The full argument for the omitted-variable case comes from Wooldridge's text.

Since having an omitted variable is sufficient to violate $cov(e,x) = 0$, is the argument given by Wooldridge sufficient to prove that the variance is less than it would be if $cov(e,x) = 0$ held true?

(If my understanding is correct, I think that $\tilde{\beta}_1$ is the assumption-violating case. $\hat{\beta}_1$ is the 'true' case.)
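The two variance formulas above can be compared numerically. The following is only a sketch under stated assumptions: there is a single other regressor `x2`, so $R_1^2$ is the squared sample correlation between `x1` and `x2`; the data, the coefficient 0.6, and the function name `slope_variances` are hypothetical. It evaluates the formulas as written and does not by itself settle what happens under a general violation of $cov(e,x) = 0$.

```python
import numpy as np

def slope_variances(x1, x2, sigma2):
    """Return (var_hat, var_tilde) from the two formulas, conditional on the regressors."""
    sst1 = np.sum((x1 - x1.mean()) ** 2)        # SST_1: total sample variation in x1
    r1_sq = np.corrcoef(x1, x2)[0, 1] ** 2      # R_1^2 when x2 is the only other regressor
    var_hat = sigma2 / (sst1 * (1.0 - r1_sq))   # formula with x2 included
    var_tilde = sigma2 / sst1                   # formula with x2 omitted
    return var_hat, var_tilde

rng = np.random.default_rng(0)
x2 = rng.standard_normal(200)
x1 = 0.6 * x2 + rng.standard_normal(200)        # hypothetical correlated regressors
var_hat, var_tilde = slope_variances(x1, x2, sigma2=1.0)
print(var_hat, var_tilde, var_tilde <= var_hat)
```

Because $0 \le R_1^2 < 1$, the second formula can never exceed the first, whatever data are plugged in.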

I make some additional assumptions and simplify the notation, none of which should cause confusion. Suppose for simplicity that data is generated according to $Y = \beta X + \epsilon$, where all variables are $\mathbb{R}$-valued and $\epsilon$ has zero mean and variance $\sigma^2$. Assume $X$ has the necessary moments. We have $n$ independent copies of the pair $(Y, X)$; $x = [x_1, \dots, x_n]'$ and $y = [y_1, \dots, y_n]'$.

The OLS estimator of $\beta$ is

\begin{align}
\hat{\beta} &= y'x / x'x \\
&= (x\beta + e)'x / x'x \\
&= \beta + \frac{e'x}{x'x} \\
&= \beta + \frac{\frac{1}{n}\sum_i \epsilon_i x_i}{\frac{1}{n}\sum_i x_i^2}
\end{align}

where $e = [\epsilon_1, \dots, \epsilon_n]'$. The assertion that this approaches $\beta + {\rm Cov}(X, \epsilon) / {\rm Var}(X)$ as $n\to\infty$ is usually meant in the sense of convergence in probability (for the no-intercept model used here, the limit takes this form when $X$ has mean zero, so that $\mathbb{E}X^2 = {\rm Var}(X)$). According to the standard definition, an estimator is consistent if it converges in probability to the true parameter value, i.e. in this case if $\hat{\beta} \to \beta$ in probability. Here we have an extra term on the right-hand side that is non-zero in general, so the estimator is inconsistent.
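The convergence statement can be checked with a small simulation. This is a minimal sketch, not part of the original derivation: it assumes $X$ and $\epsilon$ are jointly normal with mean zero, unit variances, and ${\rm Cov}(X, \epsilon) = \rho$; the values $\beta = 2$, $\rho = 0.5$ and the function names are made up for illustration.

```python
import numpy as np

def ols_slope_no_intercept(x, y):
    """OLS slope for the no-intercept model: y'x / x'x."""
    return float(y @ x) / float(x @ x)

def simulate_slope(n, beta=2.0, rho=0.5, seed=0):
    """Draw n pairs with Var(X) = Var(eps) = 1 and Cov(X, eps) = rho; return the OLS slope."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    u = rng.standard_normal(n)                   # independent of x
    eps = rho * x + np.sqrt(1.0 - rho**2) * u    # error correlated with the regressor
    y = beta * x + eps
    return ols_slope_no_intercept(x, y)

beta, rho = 2.0, 0.5
print("probability limit:", beta + rho / 1.0)    # beta + Cov(X, eps) / Var(X)
for n in (100, 10_000, 1_000_000):
    print(n, simulate_slope(n, beta, rho))
```

As $n$ grows the printed slopes should settle near $\beta + \rho$ rather than near $\beta$, which is what inconsistency means here.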

On the other hand, we say that $\hat{\beta}$ is unbiased if $\mathbb{E}\hat{\beta} = \beta$. This statement has nothing to do with convergence; it concerns the expectation at a fixed sample size. All the same, the expectation of the right-hand side is again $\beta$ plus some term which is not zero in general. Thus, $\hat{\beta}$ is also biased. This answers the first question.
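Bias, by contrast, can be illustrated at a fixed small sample size by averaging the estimator over many independent samples. The sketch below assumes the same hypothetical design as one might use for the limit: $X$ and $\epsilon$ jointly normal, mean zero, unit variances, $\epsilon = \rho X + \sqrt{1-\rho^2}\,U$ with $U$ independent of $X$. In that particular design $\mathbb{E}\hat{\beta} = \beta + \rho$ for every $n$, because the $U$ part averages to zero conditional on $x$.

```python
import numpy as np

def mean_ols_slope(n=20, reps=20_000, beta=2.0, rho=0.5, seed=0):
    """Average the no-intercept OLS slope over `reps` independent samples of size n."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((reps, n))
    u = rng.standard_normal((reps, n))
    eps = rho * x + np.sqrt(1.0 - rho**2) * u            # Cov(X, eps) = rho
    y = beta * x + eps
    slopes = (x * y).sum(axis=1) / (x * x).sum(axis=1)   # y'x / x'x, one per sample
    return slopes.mean()

print(mean_ols_slope())          # Monte Carlo estimate of E[beta_hat] at n = 20
print(mean_ols_slope(rho=0.0))   # same design with Cov(X, eps) = 0
```

The first average should sit near $\beta + \rho = 2.5$ and the second near $\beta = 2$, even though $n$ is only 20 in both cases.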

Regarding the second question, notice that bias is usually regarded as a property of estimators, and $\beta$ is unknown so it's not an estimator. Therefore, it does not make sense to speak of the bias of $\beta$. If we are liberal in the usage of bias and let it apply to anything, we see that $\mathbb{E}\beta = \beta$ so it would be "unbiased".

Without further assumptions on the dependence between $X$ and $\epsilon$, I believe there really isn't any way to tell what the answer to question 3 is. I should say I did not have time to go through the calculations, so I may be wrong.

In light of the discussion under the other answers: If one declares a new definition of consistency in terms of the variance approaching zero, I don't see how inconsistency under the assumption ${\rm Cov}(X, \epsilon) \neq 0$ can be either confirmed or disproved without more assumptions. It's also important to note that consistency in the standard definition says nothing about decreasing variance or decreasing bias. These are separate concepts and should not be mixed up.
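To see that the two concepts really are separate, here is a sketch using two textbook-style estimators of a mean $\mu$ rather than the regression setting. Both estimators and all numbers are hypothetical illustrations: the first observation alone is unbiased but does not concentrate as $n$ grows, while the sample mean shifted by $1/n$ is biased at every $n$ yet converges to $\mu$.

```python
import numpy as np

def summarize(n, reps=5_000, mu=3.0, seed=1):
    """Monte Carlo bias and spread of two estimators of mu from samples of size n."""
    rng = np.random.default_rng(seed)
    data = rng.normal(mu, 1.0, size=(reps, n))
    first_obs = data[:, 0]                    # unbiased, but ignores all other observations
    shifted = data.mean(axis=1) + 1.0 / n     # biased by exactly 1/n, bias vanishes as n grows
    return {
        "first_obs_bias": first_obs.mean() - mu,
        "first_obs_sd": first_obs.std(),
        "shifted_bias": shifted.mean() - mu,
        "shifted_sd": shifted.std(),
    }

for n in (5, 500):
    print(n, summarize(n))
```

The spread of the first-observation estimator stays near 1 regardless of $n$, whereas both the bias and the spread of the shifted mean shrink as $n$ increases.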
