I am having trouble understanding the estimation of an AR process. In some textbooks, the AR(1) process is defined as $y_{t}=\theta y_{t-1}+\epsilon_t$ (with no constant term), and it is stated that the OLS estimator is biased. I am confused about the cause of the bias. It is explained that $y_{t-1}$ depends on $\epsilon_{t-1}$ even though it is independent of $\epsilon_t$. However, in linear regression, if the equation does not contain a constant, we cannot guarantee that the disturbance has zero expectation, so the OLS estimator is biased without a constant. Does this mean that the OLS estimator is unbiased if I add a constant to the AR process?


#### Best Answer

In the AR(1) model $y_{t}=\theta y_{t-1}+\epsilon_t$, with, say, $y_0 = \tilde y$, the OLS estimator is

$$\hat \theta_{OLS} = \frac {\sum_{t=1}^T y_{t-1}y_t}{\sum_{t=1}^T y_{t-1}^2} = \frac {\sum_{t=1}^T y_{t-1}(\theta y_{t-1}+\epsilon_t)}{\sum_{t=1}^T y_{t-1}^2} = \theta + \frac {\sum_{t=1}^T y_{t-1}\epsilon_t}{\sum_{t=1}^T y_{t-1}^2}$$

So

$$E[\hat \theta_{OLS}]-\theta = E\left(\frac {\tilde y\,\epsilon_1}{\sum_{t=1}^T y_{t-1}^2}+\frac {y_1\epsilon_2}{\sum_{t=1}^T y_{t-1}^2}+\dots+\frac {y_{T-1}\epsilon_T}{\sum_{t=1}^T y_{t-1}^2}\right)$$

The usual way of proving/examining unbiasedness when regressors are stochastic is to use the Law of Iterated Expectations and calculate the expected value of the estimator conditional on the regressors:

$$E[\hat \theta_{OLS}]-\theta = E\left[E\left(\frac {\tilde y\,\epsilon_1}{\sum_{t=1}^T y_{t-1}^2}+\frac {y_1\epsilon_2}{\sum_{t=1}^T y_{t-1}^2}+\dots+\frac {y_{T-1}\epsilon_T}{\sum_{t=1}^T y_{t-1}^2}\mid \{y_0, y_1,\dots,y_{T-1}\}\right)\right]$$

All terms are functions of the conditioning set (are "measurable" with respect to it) except the last disturbance $\epsilon_T$, so we have

$$E[\hat \theta_{OLS}]-\theta = E\left[\frac {\tilde y\,\epsilon_1}{\sum_{t=1}^T y_{t-1}^2}+\frac {y_1\epsilon_2}{\sum_{t=1}^T y_{t-1}^2}+\dots+\frac {y_{T-1}E\left(\epsilon_T\mid \{y_0, y_1,\dots,y_{T-1}\}\right)}{\sum_{t=1}^T y_{t-1}^2}\right]$$

$$E[\hat \theta_{OLS}]-\theta = E\left[\frac {\tilde y\,\epsilon_1}{\sum_{t=1}^T y_{t-1}^2}+\frac {y_1\epsilon_2}{\sum_{t=1}^T y_{t-1}^2}+\dots+\frac {y_{T-2}\epsilon_{T-1}}{\sum_{t=1}^T y_{t-1}^2}\right] + 0$$

so we are back to unconditional expectations. But

$$E\left[\frac {\tilde y\,\epsilon_1}{\sum_{t=1}^T y_{t-1}^2}\right] \neq \frac {E\left[\tilde y\,\epsilon_1\right]}{E\left[\sum_{t=1}^T y_{t-1}^2\right]}=0$$

because $\epsilon_1$ *is also included in the denominator*, so the expectation cannot be applied separately to numerator and denominator. Likewise for the other expected values. So the bias is non-zero.
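This finite-sample bias is easy to see numerically. Here is a minimal Monte Carlo sketch (my own illustration, not part of the original answer; $\theta = 0.5$ and $T = 30$ are arbitrary choices), computing the OLS estimator through the origin exactly as in the formula above:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, T, reps = 0.5, 30, 20_000   # illustrative values, chosen arbitrarily

estimates = np.empty(reps)
for r in range(reps):
    y = np.zeros(T + 1)            # y_0 = tilde y = 0
    for t in range(1, T + 1):
        y[t] = theta * y[t - 1] + rng.standard_normal()
    # OLS through the origin: sum_t y_{t-1} y_t / sum_t y_{t-1}^2
    estimates[r] = (y[:-1] @ y[1:]) / (y[:-1] @ y[:-1])

print(estimates.mean())            # falls visibly short of theta
```

With $T=30$ the average estimate lands noticeably below $\theta=0.5$; increasing $T$ shrinks the gap, consistent with the bias being a finite-sample phenomenon.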

Matters get worse when a constant term is included, since then the sample averages of the dependent variable also enter the formula for the OLS estimator, strengthening the correlation that produces the bias.
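The claim about the constant term can also be checked by simulation. A hedged sketch (again my own illustration): the same AR(1) data are fitted once without an intercept and once with one, the latter via the usual demeaned-variables slope formula.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, T, reps = 0.5, 30, 20_000   # illustrative values

bias_no_const = np.empty(reps)
bias_const = np.empty(reps)
for r in range(reps):
    y = np.zeros(T + 1)
    for t in range(1, T + 1):
        y[t] = theta * y[t - 1] + rng.standard_normal()
    x, z = y[:-1], y[1:]
    # OLS through the origin
    bias_no_const[r] = (x @ z) / (x @ x) - theta
    # OLS with an intercept: slope from demeaned variables
    xd, zd = x - x.mean(), z - z.mean()
    bias_const[r] = (xd @ zd) / (xd @ xd) - theta

print(bias_no_const.mean(), bias_const.mean())
```

Both average biases come out negative, and the one from the regression with an intercept is the larger in magnitude, matching the remark above.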

If in the expressions above we write $x_t$ in place of $y_{t-1}$, for some regressor that is assumed independent of *all* disturbances, i.e. not only the concurrent one, then we can see why unbiasedness holds; it is this assumption (strict exogeneity) that makes it happen, nothing less than that. From these expressions one can also see that assuming only that $x_t$ is independent of $\epsilon_t$, while possibly dependent on past disturbances, is not enough to make the estimator unbiased.
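For contrast, a sketch of the strictly exogenous case (an assumed setup, not from the question): the regressor $x_t$ is drawn independently of every disturbance, and the same through-the-origin OLS formula is now unbiased.

```python
import numpy as np

rng = np.random.default_rng(2)
beta, T, reps = 0.5, 30, 20_000    # illustrative values

estimates = np.empty(reps)
for r in range(reps):
    x = rng.standard_normal(T)     # independent of all eps_t: strict exogeneity
    eps = rng.standard_normal(T)
    y = beta * x + eps
    # same formula as before: sum_t x_t y_t / sum_t x_t^2
    estimates[r] = (x @ y) / (x @ x)

print(estimates.mean())            # centers on beta
```

Same sample size, same estimator formula; only the dependence between regressor and past disturbances has been removed, and the average estimate centers on the true coefficient.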
