# Solved – Why does the centered variable not have zero mean

It is well established that centering a variable, i.e. subtracting the mean of that variable from every value, produces a variable with zero mean. For example:

```
> data = c(1,2,3,4,5,6,7,8,9)
> mean = mean(data)
> data.centered = data-mean
> mean(data.centered)
[1] 0
```

So far, so good. However, an attempt at centering a logged variable from my dataset produces a mean that is close to zero, but not exactly zero:

`` -0.0000000000000004258896 ``

This is puzzling. I have two questions:

1) Why is the mean not exactly zero?

2) Is the fact that the mean is not exactly zero a problem for calculating regression interactions?


This is the result of numerical error. Computers store numbers with limited precision, so small errors like this are normal. It is easy to understand once you look at how a mean is computed:

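```
# naive running-sum mean; every addition is rounded to the nearest double
N = length(data)
sum = 0
for (i in seq_len(N)) {
    sum = sum + data[i]
}
mean = sum / N
```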

Floating point numbers are stored in memory as an exponent and a mantissa. As you sum numbers, the variable `sum` becomes large and its exponent grows. The values `data[i]` you are adding can then be too small relative to `sum` to fit fully into the mantissa, so their low-order digits are rounded away, or they do not change `sum` at all. This is a common source of numerical error.
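The absorption effect described above can be reproduced with a small sketch. It assumes standard IEEE 754 double precision, which is what R uses for numeric values; the helper `add.all` is a hypothetical name introduced only for this illustration, and it uses an explicit loop so that each partial sum is rounded to a double.

```r
# 2^53 is the point beyond which not every integer is representable as a double
big = 2^53
absorbed = (big + 1) == big          # TRUE: the added 1 is rounded away

# hypothetical helper: left-to-right summation, rounding after every addition
add.all = function(v) {
    s = 0
    for (x in v) s = s + x
    s
}

large.first = add.all(c(big, rep(1, 10)))   # each 1 is absorbed by the large sum
small.first = add.all(c(rep(1, 10), big))   # the 1s add up to 10 before meeting big
small.first - large.first                   # 10: same numbers, different order
```

The same ten values of 1 are lost completely in one ordering and kept in the other, which is why a sum (and therefore a mean) of floating point numbers is generally only accurate up to rounding.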

Another source of error is that many decimal numbers (for example `0.1`) cannot be represented exactly in binary floating point.
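A short sketch of this representation error, again assuming IEEE 754 doubles. It also shows the usual workaround of comparing with a tolerance via `all.equal` instead of testing exact equality.

```r
a = 0.1 + 0.2
exact.equal = (a == 0.3)                   # FALSE: both sides carry rounding error
close.enough = isTRUE(all.equal(a, 0.3))   # TRUE: equal within a small tolerance

# print more digits than R shows by default to reveal the stored value of 0.1
sprintf("%.20f", 0.1)                      # "0.10000000000000000555"
```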

For a better explanation of how numerical errors work, see this answer.

In practice, an error as small as the number you posted should not cause problems, unless your data are themselves of similarly small magnitude (as pointed out in the comment of @jbowman).
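As a rough check of this claim, the sketch below centers a simulated logged variable and fits an interaction model twice: once with the centered variable as is, and once after subtracting the tiny residual mean. Everything here is hypothetical: the data are simulated rather than taken from the question, and the names `x`, `z`, `y`, `tol`, `fit1`, `fit2` and the factor of 100 in the tolerance are illustrative choices only.

```r
set.seed(1)                              # simulated data, not the asker's dataset
x = log(runif(1000, 1, 100))
x.centered = x - mean(x)
m = mean(x.centered)                     # tiny, but possibly not exactly 0

# compare the residual mean with machine precision scaled to the data
tol = 100 * .Machine$double.eps * max(abs(x))
negligible = abs(m) < tol

# interaction model with the centered variable, then with the residual mean removed
z = rnorm(1000)
y = 1 + 2 * x.centered + 3 * z + 0.5 * x.centered * z + rnorm(1000)
fit1 = lm(y ~ x.centered * z)
x.recentered = x.centered - m
fit2 = lm(y ~ x.recentered * z)
same = isTRUE(all.equal(unname(coef(fit1)), unname(coef(fit2))))
```

Under these assumptions both `negligible` and `same` should come out `TRUE`: the leftover mean is far below the scale of the data, and the estimated main effects and interaction agree to within `all.equal`'s default tolerance.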
