I have tried to compute the autocovariance of the following process:

$$ Y_t = \beta_0 + \beta_1 t + \epsilon_t, \qquad \epsilon_t \sim WN(0, \sigma^2) $$

I tried this way:

$$\operatorname{Cov}(Y_t, Y_{t-j}) = E[(Y_t - E(Y_t))(Y_{t-j} - E(Y_{t-j}))]$$

$$ = E[(\beta_0+\beta_1 t+\epsilon_t-(\beta_0+\beta_1 t))(\beta_0+\beta_1(t-j)+\epsilon_{t-j}-(\beta_0+\beta_1(t-j)))] $$

$$= E[\epsilon_t \cdot \epsilon_{t-j}] = 0 \quad \text{for } j \neq 0$$

But this cannot be right: when I simulate such a process in R and compute its sample ACF, the result is not zero. Where is my mistake?

#### Best Answer

### The underlying problem is that your series is not stationary!

After you simulate your series $y_t$, R computes the sample mean over time as:

$$ \bar{y} = \frac{1}{T} \sum_{\tau = 1}^T y_\tau $$

Whereas analytically, you computed the mean of $y_t$ as:

$$ E[y_t] = \beta_0 + \beta_1 t $$

Observe how completely different these two entities are! Because the process $\{y_t\}$ is not stationary, your sample mean taken over time is *NOT* an estimate of the population mean!

- The population mean of $y_t$, that is $E[y_t] = \beta_0 + \beta_1 t$, is a function of time $t$.
- On the other hand, the sample mean $\bar{y}$ is not an estimate of anything useful. It's a scalar, it doesn't depend on time $t$, and (for $\beta_1 \neq 0$) it diverges to $\pm\infty$, depending on the sign of $\beta_1$, as the sample size $T$ goes to infinity.
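
To make the second point concrete, here is a short worked step: substitute the model into the sample mean and use $\sum_{\tau=1}^T \tau = T(T+1)/2$. The symbol $\bar{\epsilon}$ for the average of the noise terms is notation introduced here for illustration, not part of the original post.

$$ \bar{y} = \frac{1}{T} \sum_{\tau = 1}^T \left( \beta_0 + \beta_1 \tau + \epsilon_\tau \right) = \beta_0 + \beta_1 \frac{T+1}{2} + \bar{\epsilon}, \qquad \bar{\epsilon} = \frac{1}{T} \sum_{\tau = 1}^T \epsilon_\tau $$

The middle term grows linearly in $T$, so for $\beta_1 \neq 0$ the sample mean keeps drifting as the sample gets longer instead of settling on a fixed value.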

So when R computes sample covariances etc. using $\bar{y}$, everything is already fouled up! Look at how nonsensical the calculation of the sample covariance is:

$$\hat{\gamma}(k) = \frac{1}{T-1} \sum_{\tau} \left( y_\tau - \bar{y} \right) \left( y_{\tau - k} - \bar{y} \right) $$

This makes sense for a stationary process, but it's useless in this case. Taking averages over time of a non-stationary process is generally a **huge** error.
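
A minimal sketch of this effect, written in Python with only the standard library rather than in R: it simulates one path of the trend model and applies a time-averaged autocorrelation estimator to it. The parameter values, the Gaussian noise, and the helper names `simulate_trend` and `sample_acf` are illustrative assumptions, and `sample_acf` is a hand-written estimator, not a reproduction of R's `acf`.

```python
import random

def simulate_trend(beta0, beta1, sigma, T, rng):
    """One path of y_t = beta0 + beta1*t + eps_t for t = 1..T.

    Assumption: the white noise is Gaussian (the model only requires WN).
    """
    return [beta0 + beta1 * t + rng.gauss(0.0, sigma) for t in range(1, T + 1)]

def sample_acf(y, k):
    """Lag-k sample autocorrelation built on the time average ybar."""
    T = len(y)
    ybar = sum(y) / T
    # Cross-products of deviations from the single, time-averaged mean
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

rng = random.Random(0)  # fixed seed so the run is reproducible
y = simulate_trend(beta0=1.0, beta1=0.5, sigma=1.0, T=200, rng=rng)

acf_by_lag = {k: sample_acf(y, k) for k in (1, 5, 10)}
print(acf_by_lag)
```

With a slope this large relative to the noise, the estimates at short lags should come out close to 1, because the deviations from $\bar{y}$ are dominated by the trend rather than by $\epsilon_t$.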

### If you want to simulate to match your math answers, what can you do?

If you want to verify results such as $E[y_t] = \beta_0 + \beta_1 t$ by simulation, you need multiple independent draws from the sample space $\Omega$ rather than more time steps $t$. You want to take averages over the space $\Omega$ rather than over time $t$.

If your process is non-stationary, it's still the case that for any fixed time $t$, $y_t$ is a random variable, and you can compute its sample mean etc. like that of any other random variable by taking IID draws from the sample space.
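
One way to sketch this, again in Python with only the standard library rather than in R: simulate many independent paths, then average across paths at fixed time indices. The parameter values, the Gaussian noise, the number of replications, and the helper names `simulate_trend` and `ensemble_moments` are all assumptions made for illustration.

```python
import random

def simulate_trend(beta0, beta1, sigma, T, rng):
    """One path of y_t = beta0 + beta1*t + eps_t for t = 1..T (Gaussian noise assumed)."""
    return [beta0 + beta1 * t + rng.gauss(0.0, sigma) for t in range(1, T + 1)]

def ensemble_moments(paths, t, j):
    """Mean of y_t and covariance of (y_t, y_{t-j}) taken ACROSS paths.

    t is 1-based, matching the model's time index.
    """
    a = [p[t - 1] for p in paths]      # y_t in each replication
    b = [p[t - 1 - j] for p in paths]  # y_{t-j} in each replication
    n = len(paths)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (z - mean_b) for x, z in zip(a, b)) / (n - 1)
    return mean_a, cov

beta0, beta1, sigma, T = 1.0, 0.5, 1.0, 10
rng = random.Random(0)  # fixed seed so the run is reproducible

# Each path is an independent draw from the sample space
paths = [simulate_trend(beta0, beta1, sigma, T, rng) for _ in range(20000)]

mean_t, cov_lag2 = ensemble_moments(paths, t=8, j=2)
_, var_t = ensemble_moments(paths, t=8, j=0)

print("ensemble mean at t=8:", mean_t, "theory:", beta0 + beta1 * 8)
print("ensemble cov(y_8, y_6):", cov_lag2, "theory: 0")
print("ensemble var(y_8):", var_t, "theory:", sigma ** 2)
```

Here the averages are taken over replications at a fixed $t$, so they should land near the analytical values: $\beta_0 + \beta_1 t$ for the mean, $0$ for the autocovariance at a nonzero lag, and $\sigma^2$ at lag zero.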

Sample averages over time only converge towards population averages across space in the case of stationary, ergodic processes.