It is well established that centering a variable, i.e. subtracting the mean of that variable from every value, produces a variable with zero mean. For example:

```
> data = c(1,2,3,4,5,6,7,8,9)
> mean = mean(data)
> data.centered = data - mean
> mean(data.centered)
[1] 0
```

So far, so good. However, an attempt at centering a logged variable from my dataset produces a mean that is close to zero, but not exactly zero:

```
[1] -0.0000000000000004258896
```
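The effect is easy to reproduce in any language that uses IEEE-754 double precision, as R's numeric type does. A minimal sketch in Python, with made-up data standing in for the logged variable from the question:

```python
import math

# Hypothetical data standing in for the logged variable in the question
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
logged = [math.log(x) for x in data]

# Center the logged values on their mean
m = sum(logged) / len(logged)
centered = [x - m for x in logged]

# The mean of the centered values is tiny, but not guaranteed to be exactly 0.0
residual_mean = sum(centered) / len(centered)
print(residual_mean)
```

Whether the result is exactly zero or merely tiny depends on how the rounding errors happen to cancel, which is the point of the question.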

This is puzzling. I have two questions:

1) Why is the mean not exactly zero?

2) Is the fact that the mean is not exactly zero a problem for calculating regression interactions?


#### Best Answer

This is a result of numerical error. Computers have limited precision, and such errors are normal. It is easy to understand if you realize how the mean is computed:

```
sum = 0
for i = 1..N
    sum += data[i]
mean = sum / N
```

Floating-point numbers are stored in memory as an exponent and a mantissa. As you sum the numbers, the variable `sum` becomes large and its exponent grows. It can happen that the numbers `data[i]` you are adding are simply too small to change the mantissa anymore. This is a common source of numerical errors.
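The mantissa-absorption effect described above can be demonstrated directly. An IEEE-754 double has a 53-bit mantissa, so at a magnitude of 2^53 the spacing between consecutive representable values is 2, and adding 1 no longer changes the stored value. A Python sketch (R's numeric type behaves the same way):

```python
# Doubles carry a 53-bit mantissa; at 2**53 the gap between
# consecutive representable doubles is 2.0
big = 2.0 ** 53

# The added 1.0 is absorbed: it is too small to change the mantissa
print(big + 1.0 == big)  # True
```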

Another thing is that many numbers cannot be represented exactly in binary floating point (for example, `0.1`).
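This is why arithmetic on such values carries a small rounding error even before any summation happens. The classic demonstration, here in Python (the same holds in R):

```python
# 0.1 and 0.2 are stored as the nearest representable doubles,
# so their sum is not exactly the double nearest to 0.3
print(0.1 + 0.2 == 0.3)  # False
print(0.1 + 0.2)         # 0.30000000000000004
```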

For a better explanation of how numerical errors work, see this answer.

Practically, an error as small as the number you posted should not cause problems for you, unless your data are themselves of similarly small magnitude (as pointed out in the comment by @jbowman).
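A practical consequence is that comparisons against zero should use a tolerance rather than exact equality. A hedged sketch in Python; R users would reach for `isTRUE(all.equal(x, 0))` for the same purpose:

```python
import math

# The near-zero mean reported in the question
residual = -4.258896e-16

# Compare against zero with an absolute tolerance instead of ==
print(math.isclose(residual, 0.0, abs_tol=1e-9))  # True
```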

### Similar Posts:

- Solved – Is centering a valid solution for multicollinearity
- Solved – p-values change after mean centering with interaction terms. How to test for significance
- Solved – Calculating threshold turn value using mean centered quadratic terms in regression
- Solved – Mean centering interaction terms