Let's assume there is a time series Y of length n with Y(1) being the most recent observation.

In an MA process we plot the autocorrelation function (ACF) to see how many lags to use. If we look at an MA(3) process, for example, that implies the correlations between:

- `Y(1:n-1)` and `Y(2:n)`
- `Y(1:n-2)` and `Y(3:n)`
- `Y(1:n-3)` and `Y(4:n)`

are significant. So shouldn't MA process look something like:

`Y(t) = a0 + a1.Y(t-1) + a2.Y(t-2) + a3.Y(t-3) `

instead of:

`Y(t) = a0 + a1.ep(t-1) + a2.ep(t-2) + a3.ep(t-3) where ep ~ N(0,1) `

After all we see a direct correlation between Y(t) and its lagged observations.


#### Best Answer

Let's assume there is a time series Y of length n with Y(1) being the most recent observation.

This is the reverse of the normal notation, where the time-index doesn't run backwards like this. Your notation would result in relabelling *every* observation when you got a new data point. If you're going to reverse the normal convention you should only do it with a good reason. Is there one?

(However, in spite of this, the rest of your post appears to follow convention, so perhaps that was just a typo.)

In an MA process we plot the autocorrelation function (ACF) to see how many lags to use.

Correct, but this doesn't define the MA.

If we look at an MA(3) process, for example, that implies the correlations between Y(1:n-1) and Y(2:n), Y(1:n-2) and Y(3:n), and Y(1:n-3) and Y(4:n) are significant.

No, it implies the corresponding *population* correlations are nonzero, not that the sample correlations are significant. That's really a matter of sample size and how big the correlations are.

So shouldn't an MA process look something like Y(t) = a0 + a1.Y(t-1) + a2.Y(t-2) + a3.Y(t-3) instead of:

Y(t) = a0 + a1.ep(t-1) + a2.ep(t-2) + a3.ep(t-3) where ep ~ N(0,1)

That's an AR (except it's missing an error term). Both AR and MA processes can exhibit correlations at multiple lags. However, an AR(3) (or, for that matter, an AR(1)) will tend to exhibit an autocorrelation function that goes on for many periods *after* lag 3, while an MA(3) will show a population ACF that 'cuts off' at lag 3 (is zero after lag 3) – and a sample ACF that should look like 'noise' after lag 3 (more precisely, small autocorrelations consistent with random noise).
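To see that contrast empirically, here is a small NumPy sketch (the coefficients are illustrative, not taken from the question): the sample ACF of a simulated MA(3) drops to roughly zero after lag 3, while the sample ACF of an AR(3) is still sizeable well past lag 3.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# MA(3): Y_t = e_t + 0.6 e_{t-1} + 0.4 e_{t-2} + 0.3 e_{t-3}
e = rng.standard_normal(n + 3)
y_ma = e[3:] + 0.6 * e[2:-1] + 0.4 * e[1:-2] + 0.3 * e[:-3]

# AR(3): Y_t = 0.4 Y_{t-1} + 0.3 Y_{t-2} + 0.2 Y_{t-3} + u_t  (stationary)
y_ar = np.zeros(n)
u = rng.standard_normal(n)
for t in range(3, n):
    y_ar[t] = 0.4 * y_ar[t-1] + 0.3 * y_ar[t-2] + 0.2 * y_ar[t-3] + u[t]

def sample_acf(x, k):
    """Lag-k sample autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

# MA(3): near zero beyond lag 3; AR(3): decays slowly instead
print("MA(3):", [round(sample_acf(y_ma, k), 3) for k in range(1, 7)])
print("AR(3):", [round(sample_acf(y_ar, k), 3) for k in range(1, 7)])
```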

After all we see a direct correlation between Y(t) and its lagged observation.

Consider an AR(1):

$Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t$

$\text{ACF}_Y(k) = \phi_1^k$
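A quick numerical check of that geometric-decay formula (a sketch; the choice $\phi_1 = 0.8$ is arbitrary): the sample ACF of a simulated AR(1) tracks $\phi_1^k$ closely at each lag.

```python
import numpy as np

# Simulate an AR(1) with phi_1 = 0.8 and compare sample ACF with phi_1^k
rng = np.random.default_rng(1)
n = 200_000
phi = 0.8

y = np.zeros(n)
e = rng.standard_normal(n)
for t in range(1, n):
    y[t] = phi * y[t-1] + e[t]

def sample_acf(x, k):
    """Lag-k sample autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

for k in (1, 2, 3, 5, 10):
    print(k, round(sample_acf(y, k), 3), round(phi**k, 3))
```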

If your ACF cuts off at lag 3 it suggests an MA, not an AR.

Here are ACFs for an AR(1) and an MA(1). Note that the ACF for the AR(1) (Y in terms of previous Y's) decays geometrically, while the ACF for the MA(1) 'cuts off' abruptly.

An MA(3) cuts off at lag 3, and an AR(3) still shows the geometric decay (this particular AR decays slowly because it's close to the boundary of stationarity; some other ARs will decay more quickly).
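The MA cutoff is exact in the population ACF: for an MA(q) with coefficients $\theta_0 = 1, \theta_1, \dots, \theta_q$, the lag-$k$ autocorrelation is $\sum_i \theta_i \theta_{i+k} \big/ \sum_i \theta_i^2$, which is identically zero for $k > q$. A small sketch of that formula for an MA(3) (the coefficients are illustrative, not from the post):

```python
import numpy as np

# Population ACF of an MA(3): Y_t = e_t + 0.6 e_{t-1} + 0.4 e_{t-2} + 0.3 e_{t-3}
theta = np.array([1.0, 0.6, 0.4, 0.3])  # theta_0 = 1

def ma_acf(theta, k):
    q = len(theta) - 1
    if k > q:
        return 0.0                      # exact cutoff beyond lag q
    return float(np.dot(theta[:q + 1 - k], theta[k:]) / np.dot(theta, theta))

print([round(ma_acf(theta, k), 3) for k in range(7)])
# → [1.0, 0.596, 0.36, 0.186, 0.0, 0.0, 0.0]
```

Lags 0 through 3 are nonzero; everything beyond lag 3 is exactly zero, which is what the sample ACF's post-lag-3 'noise' is estimating.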
