# Solved – Sum-to-zero constraint in one-way ANOVA

I'm trying to understand my lecture notes but am a bit stuck on the concept of identifiability.
In one-way ANOVA, could someone please explain the reason for the constraint $\sum_{i=1}^{m} \beta_i = 0$? We have $m$ groups of observations, each group consisting of $k$ observations, with $Y_{ij}$ the $j$th observation from the $i$th group, $E(Y_{ij}) = \mu + \beta_i$ for $i = 1,\dots,m$ and $j = 1,\dots,k$, $\text{Var}(Y_{ij}) = \sigma^2$, and $H_0 : \beta_1 = \beta_2 = \dots = \beta_m$. I don't quite get the identifiability reason.

Consider for simplicity that $m=2$ and compare the models

• $\mu=0,\beta_1=0,\beta_2=2$,

• $\mu=1,\beta_1=-1,\beta_2=1$,

• $\mu=2,\beta_1=-2,\beta_2=0$.

These models are all special cases of $(\mu,\beta_1,\beta_2)=(\mu,-\mu,2-\mu)$. You can see that whatever $\mu$ we choose, $\mu+\beta_1=0$ and $\mu+\beta_2=2$, so there's an infinite set of parameter triples that match $E(Y_{1j})=0$ and $E(Y_{2j})=2$, and no way to distinguish between them.
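A quick numerical check of the point above (a minimal sketch; the design matrix and parameter triples are just the $m=2$ example from this answer):

```python
import numpy as np

# Design matrix for (mu, beta_1, beta_2) with m = 2 groups:
# row i encodes E(Y_ij) = mu + beta_i.
X = np.array([[1, 1, 0],
              [1, 0, 1]])

# The three parameter triples listed above.
params = [np.array([0, 0, 2]),
          np.array([1, -1, 1]),
          np.array([2, -2, 0])]

for p in params:
    print(X @ p)  # every triple produces the same group means [0, 2]

# The root cause: X has 3 columns but only rank 2, so the map
# p -> X @ p is not one-to-one and p cannot be identified from the means.
print(np.linalg.matrix_rank(X))  # 2
```

The rank deficiency is exactly the "extra degree of freedom" discussed below: three parameters, but only two estimable group means.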

Consequently, while the data will allow you to estimate the two group means, those two pieces of information (two df), no matter how precisely estimated, are not enough to pin down the three parameters (three df) in the model. There's an extra degree of freedom that lets you shift all three parameters in particular ways relative to each other while keeping the group means the same.

You need to restrict/constrain/regularize the situation in some way so that the model doesn't have more things to estimate than the design has the ability to identify.
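With the sum-to-zero constraint $\sum_i \beta_i = 0$ imposed, the parameters become unique functions of the group means: $\hat\mu$ is the mean of the group means and $\hat\beta_i$ the deviation of group $i$ from it. A small sketch (the simulated data and variable names are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative one-way layout: m groups, k observations each.
m, k = 3, 5
true_means = np.array([1.0, 3.0, 5.0])
Y = true_means[:, None] + rng.normal(scale=0.5, size=(m, k))

group_means = Y.mean(axis=1)

# Under the sum-to-zero constraint the solution is unique:
mu_hat = group_means.mean()        # grand mean of the group means
beta_hat = group_means - mu_hat    # deviations; they sum to zero by construction

print(mu_hat + beta_hat)           # reproduces the estimated group means
print(beta_hat.sum())              # 0 up to floating-point error
```

Any other constraint that removes the redundant degree of freedom (e.g. the reference-cell coding $\beta_1 = 0$ used by many software defaults) works equally well; the fitted group means are identical either way.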
