# Solved – Sum-to-zero constraint in one-way ANOVA

I'm trying to understand my lecture notes but am a bit stuck on the concept of identifiability.
In one-way ANOVA, could someone please explain the reason for the constraint $\sum_{i=1}^{m} \beta_i = 0$? We have $m$ groups of observations, each group consisting of $k$ observations, with $Y_{ij}$ the $j$th observation from the $i$th group, $E(Y_{ij}) = \mu + \beta_i$ for $i = 1,\dots,m$ and $j = 1,\dots,k$, $\text{Var}(Y_{ij}) = \sigma^2$, and $H_0 : \beta_1 = \beta_2 = \dots = \beta_m$. I don't quite get the identifiability reason.

Consider for simplicity that $m=2$ and compare the models

• $\mu=0,\beta_1=0,\beta_2=2$,

• $\mu=1,\beta_1=-1,\beta_2=1$,

• $\mu=2,\beta_1=-2,\beta_2=0$.

These models are all special cases of $(\mu,\beta_1,\beta_2)=(\mu,-\mu,2-\mu)$. You can see that whatever $\mu$ we choose, $\mu+\beta_1=0$ and $\mu+\beta_2=2$, so there's an infinite set of parameter triples that match $E(Y_{1j})=0$ and $E(Y_{2j})=2$, and no way to distinguish between them.
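A quick numerical check of the point above (a minimal sketch; the design matrix and parameter triples are just the $m=2$ example from this answer):

```python
import numpy as np

# Design matrix for (mu, beta_1, beta_2) with m = 2 groups:
# row i encodes E(Y_ij) = mu + beta_i.
X = np.array([[1, 1, 0],
              [1, 0, 1]])

# The three parameter triples listed above.
params = [np.array([0, 0, 2]),
          np.array([1, -1, 1]),
          np.array([2, -2, 0])]

for p in params:
    print(X @ p)  # every triple produces the same group means [0, 2]

# The root cause: X has 3 columns but only rank 2, so the map
# p -> X @ p is not one-to-one and p cannot be identified from the means.
print(np.linalg.matrix_rank(X))  # 2
```

The rank deficiency is exactly the "extra degree of freedom" discussed below: three parameters, but only two estimable group means.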

Consequently, while the data will allow you to estimate the two group means, those two pieces of information (two df), no matter how precisely estimated, are not enough to pin down the three parameters (three df) in the model. There's an extra degree of freedom that lets you shift all three parameters in particular ways relative to each other while keeping the group means the same.

You need to restrict/constrain/regularize the situation in some way so that the model doesn't have more things to estimate than the design has the ability to identify.
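With the sum-to-zero constraint $\sum_i \beta_i = 0$ imposed, the parameters become unique functions of the group means: $\hat\mu$ is the mean of the group means and $\hat\beta_i$ the deviation of group $i$ from it. A small sketch (the simulated data and variable names are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative one-way layout: m groups, k observations each.
m, k = 3, 5
true_means = np.array([1.0, 3.0, 5.0])
Y = true_means[:, None] + rng.normal(scale=0.5, size=(m, k))

group_means = Y.mean(axis=1)

# Under the sum-to-zero constraint the solution is unique:
mu_hat = group_means.mean()        # grand mean of the group means
beta_hat = group_means - mu_hat    # deviations; they sum to zero by construction

print(mu_hat + beta_hat)           # reproduces the estimated group means
print(beta_hat.sum())              # 0 up to floating-point error
```

Any other constraint that removes the redundant degree of freedom (e.g. the reference-cell coding $\beta_1 = 0$ used by many software defaults) works equally well; the fitted group means are identical either way.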
