# Solved – the difference between bias and residuals

I'm aware of the bias–variance trade-off.
Intuitively I understand how, as the model becomes more complex, the bias decreases and the variance increases, after a certain point.
But I don't really understand bias.

For example:

If we have a predictor variable x and we want to estimate y with a fitted coefficient B:

Bias = E[x*B] – y

residual = x*B – y, which on average seems to be the same thing as E[x*B] – y


A bias is a property of an estimator or a statistic, NOT of a stochastic realization. It means that the estimator or statistic is calculated in a way that makes it SYSTEMATICALLY different from the quantity it is supposed to summarize / estimate.

These things are NOT examples of bias:

• Residuals for a single experiment
• The difference of a parameter estimate or prediction from the truth for a single experiment (unless it is systematic)
• Anything else that is stochastic and not systematic
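The distinction can be sketched numerically. Here is a minimal simulation (my own illustration, assuming NumPy; the classic example of the variance estimator that divides by n) showing that any single run scatters randomly, while bias is the systematic offset of the average over many repeated experiments:

```python
import numpy as np

rng = np.random.default_rng(0)
true_var, n = 4.0, 10

# Repeat the "experiment" many times: each run yields one stochastic
# realization of the estimator; bias is the SYSTEMATIC offset of the mean.
biased_est, unbiased_est = [], []
for _ in range(20000):
    x = rng.normal(0.0, np.sqrt(true_var), size=n)
    biased_est.append(np.var(x))            # ddof=0: divides by n (biased)
    unbiased_est.append(np.var(x, ddof=1))  # ddof=1: divides by n - 1 (unbiased)

# Any single run can land above or below the truth (that scatter is NOT bias);
# the average offset of the ddof=0 estimator is roughly -true_var / n.
print(np.mean(biased_est) - true_var)    # clearly negative
print(np.mean(unbiased_est) - true_var)  # close to zero
```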

The bias–variance trade-off is maybe not an ideal name; it might better have been called the interpolation/extrapolation trade-off. Anyway, the motivation for the name is that when you add more parameters / complexity, you have

• Less systematic error (bias) in your model (supposedly, because it is more flexible, I would argue it depends on what you call error / bias)
• More variance in the estimation of the model parameters (because it is more flexible)
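The two points above can be sketched with a small simulation (my own sketch, not part of the answer; assumes NumPy, and all names are mine): fit polynomials of increasing degree to noisy samples of a known function, then measure the squared bias and the variance of the resulting predictions:

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    return np.sin(2 * np.pi * x)

x_grid = np.linspace(0.0, 1.0, 50)
n_train, n_reps, noise = 20, 500, 0.3

bias2, var = {}, {}
for degree in (1, 3, 9):
    preds = np.empty((n_reps, x_grid.size))
    for r in range(n_reps):
        x = rng.uniform(0.0, 1.0, n_train)
        y = true_f(x) + rng.normal(0.0, noise, n_train)
        preds[r] = np.polyval(np.polyfit(x, y, degree), x_grid)
    # Bias: systematic gap between the AVERAGE fit and the truth.
    bias2[degree] = np.mean((preds.mean(axis=0) - true_f(x_grid)) ** 2)
    # Variance: how much individual fits scatter around their own average.
    var[degree] = np.mean(preds.var(axis=0))
    print(f"degree {degree}: bias^2 = {bias2[degree]:.4f}, variance = {var[degree]:.4f}")
```

The low-degree fit misses the sine shape in the same way every time (high bias, low variance), while the high-degree fit tracks the noise of each particular sample (low bias, high variance).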
