# In linear regression, are the noise terms independent of the coefficient estimators?

In the Wikipedia article on the bias-variance tradeoff, the independence of the estimator $\hat f(x)$ and the noise term $\epsilon$ is used in a crucial way in the proof of the decomposition of the mean squared error. No justification for this independence is given, and I can't seem to figure it out. For example, if $f(t)=\beta_0 + \beta_1 t$, $Y_i=f(x_i) + \epsilon_i$ ($i=1,\ldots,n$), and $\hat f(x)=\hat\beta_0 + \hat\beta_1 x$ as in simple linear regression, are the $\epsilon_i$ independent of $\hat\beta_0$ and $\hat\beta_1$?


No, they're not independent. In multiple linear regression the OLS coefficient estimator can be written as:

$$\begin{aligned}
\hat{\boldsymbol{\beta}}
&= (\mathbf{x}^\text{T} \mathbf{x})^{-1} (\mathbf{x}^\text{T} \mathbf{y}) \\[6pt]
&= (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} (\mathbf{x} \boldsymbol{\beta} + \boldsymbol{\varepsilon}) \\[6pt]
&= \boldsymbol{\beta} + (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} \boldsymbol{\varepsilon}.
\end{aligned}$$

In regression problems we analyse the behaviour of the quantities conditional on the explanatory variables (i.e., conditional on the design matrix $\mathbf{x}$). The covariance between the coefficient estimators and errors is:

$$\begin{aligned}
\mathbb{Cov} ( \hat{\boldsymbol{\beta}}, \boldsymbol{\varepsilon} \mid \mathbf{x})
&= \mathbb{Cov} \Big( (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} \boldsymbol{\varepsilon}, \boldsymbol{\varepsilon} \;\Big|\; \mathbf{x} \Big) \\[6pt]
&= (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} \, \mathbb{Cov} ( \boldsymbol{\varepsilon}, \boldsymbol{\varepsilon} \mid \mathbf{x} ) \\[6pt]
&= (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} \, \mathbb{V} ( \boldsymbol{\varepsilon} \mid \mathbf{x} ) \\[6pt]
&= \sigma^2 (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T} \boldsymbol{I} \\[6pt]
&= \sigma^2 (\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T}.
\end{aligned}$$

In general, this covariance matrix is a non-zero matrix, and so the coefficient estimators are correlated with the error terms (conditional on the design matrix).
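As a quick sanity check, the formula $\mathbb{Cov}(\hat{\boldsymbol{\beta}}, \boldsymbol{\varepsilon} \mid \mathbf{x}) = \sigma^2 (\mathbf{x}^\text{T}\mathbf{x})^{-1}\mathbf{x}^\text{T}$ can be verified by Monte Carlo simulation. This is just a numerical sketch with an arbitrary small design matrix and error variance, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 6, 2.0
x = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed design matrix
beta = np.array([1.0, -0.5])
H = np.linalg.solve(x.T @ x, x.T)    # (x'x)^{-1} x'

# Monte Carlo: draw error vectors, form beta-hat = beta + H eps each time
reps = 200_000
eps = rng.normal(scale=sigma, size=(reps, n))
beta_hat = eps @ H.T + beta
cov_mc = (beta_hat - beta_hat.mean(0)).T @ (eps - eps.mean(0)) / (reps - 1)

cov_theory = sigma**2 * H            # sigma^2 (x'x)^{-1} x'
print(np.max(np.abs(cov_mc - cov_theory)))  # small Monte Carlo error
```

The simulated covariance matrix between $\hat{\boldsymbol{\beta}}$ and $\boldsymbol{\varepsilon}$ matches $\sigma^2 (\mathbf{x}^\text{T}\mathbf{x})^{-1}\mathbf{x}^\text{T}$ up to sampling noise, and its entries are visibly non-zero.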

Special case (simple linear regression): In the special case of a simple linear regression with an intercept term and a single explanatory variable, the design matrix is:

$$\mathbf{x} = \begin{bmatrix}
1 & x_1 \\[6pt]
1 & x_2 \\[6pt]
\vdots & \vdots \\[6pt]
1 & x_n
\end{bmatrix},$$

which gives:

$$\begin{aligned}
(\mathbf{x}^\text{T} \mathbf{x})^{-1} \mathbf{x}^\text{T}
&= \begin{bmatrix} n & \sum x_i \\[6pt] \sum x_i & \sum x_i^2 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & 1 & \cdots & 1 \\[6pt] x_1 & x_2 & \cdots & x_n \end{bmatrix} \\[6pt]
&= \frac{1}{n \sum x_i^2 - (\sum x_i)^2}
\begin{bmatrix} \sum x_i^2 & -\sum x_i \\[6pt] -\sum x_i & n \end{bmatrix}
\begin{bmatrix} 1 & 1 & \cdots & 1 \\[6pt] x_1 & x_2 & \cdots & x_n \end{bmatrix} \\[6pt]
&= \frac{1}{n \sum x_i^2 - (\sum x_i)^2}
\begin{bmatrix} \sum x_i(x_i-x_1) & \cdots & \sum x_i(x_i-x_n) \\[6pt] -\sum (x_i-x_1) & \cdots & -\sum (x_i-x_n) \end{bmatrix}.
\end{aligned}$$

Hence, we have:

$$\begin{aligned}
\mathbb{Cov}(\hat{\beta}_0, \varepsilon_k) &= \sigma^2 \cdot \frac{\sum x_i(x_i-x_k)}{n \sum x_i^2 - (\sum x_i)^2}, \\[10pt]
\mathbb{Cov}(\hat{\beta}_1, \varepsilon_k) &= - \sigma^2 \cdot \frac{\sum (x_i-x_k)}{n \sum x_i^2 - (\sum x_i)^2}.
\end{aligned}$$

We can also obtain the correlation, which is perhaps a bit more useful. To do this we note that:

$$\mathbb{V}(\varepsilon_k) = \sigma^2
\quad \quad \quad
\mathbb{V}(\hat{\beta}_0) = \frac{\sigma^2 \sum x_i^2}{n \sum x_i^2 - (\sum x_i)^2}
\quad \quad \quad
\mathbb{V}(\hat{\beta}_1) = \frac{\sigma^2 n}{n \sum x_i^2 - (\sum x_i)^2}.$$

Hence, the correlations are:

$$\begin{aligned}
\mathbb{Corr}(\hat{\beta}_0, \varepsilon_k) &= \frac{\sum x_i(x_i-x_k)}{\sqrt{(\sum x_i^2)(n \sum x_i^2 - (\sum x_i)^2)}}, \\[10pt]
\mathbb{Corr}(\hat{\beta}_1, \varepsilon_k) &= - \frac{\sum (x_i-x_k)}{\sqrt{n(n \sum x_i^2 - (\sum x_i)^2)}}.
\end{aligned}$$
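These correlation formulas can also be checked numerically. The sketch below (with an arbitrary choice of regressor values, and true coefficients set to zero without loss of generality, since adding a constant doesn't change correlations) compares the closed-form expressions against simulated correlations between the OLS estimators and each error term:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([0.0, 1.0, 2.0, 5.0])    # arbitrary fixed regressor values
n = len(x)
D = n * np.sum(x**2) - np.sum(x)**2   # common denominator n*sum(x^2) - (sum x)^2

# Closed-form correlations with each error eps_k from the formulas above:
# sum x_i(x_i - x_k) = sum x_i^2 - x_k sum x_i,  sum (x_i - x_k) = sum x_i - n x_k
corr_b0 = (np.sum(x**2) - x * np.sum(x)) / np.sqrt(np.sum(x**2) * D)
corr_b1 = -(np.sum(x) - n * x) / np.sqrt(n * D)

# Monte Carlo: draw errors, form OLS estimates, correlate with each eps_k
reps = 200_000
eps = rng.normal(size=(reps, n))
X = np.column_stack([np.ones(n), x])
beta_hat = eps @ np.linalg.solve(X.T @ X, X.T).T
mc_b0 = np.array([np.corrcoef(beta_hat[:, 0], eps[:, k])[0, 1] for k in range(n)])
mc_b1 = np.array([np.corrcoef(beta_hat[:, 1], eps[:, k])[0, 1] for k in range(n)])

print(np.round(corr_b0, 3), np.round(mc_b0, 3))
print(np.round(corr_b1, 3), np.round(mc_b1, 3))
```

The simulated and theoretical values agree up to Monte Carlo error. Note also that $\mathbb{Corr}(\hat{\beta}_1, \varepsilon_k) = 0$ exactly when $x_k = \bar{x}$, which the check confirms for the point $x_k = 2$ in this example.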
