Solved – use the chi-squared test of independence with skewed data

I have two variables, both categorical, one with skewed responses. How do you deal with skewed data in the chi-squared test? Are there any other relevant tests? I want to perform the test in SPSS.

For a categorical variable, skewness is not well defined (arguably: not even defined) unless the variable is ordered, as shuffling the categories would change the apparent shape of the distribution.

Even with an ordered categorical variable, any use of numerical values such as 1 to 5 is still no more than a convention, and skewness would vary with many acceptable recodings: e.g. squaring to 1, 4, 9, 16, 25 preserves order but certainly not skewness.
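To make the recoding point concrete, here is a small sketch in plain Python. The counts are hypothetical, chosen to be symmetric over five ordered categories, and `skewness` is just an illustrative helper computing the usual moment-based coefficient; none of this comes from the question.

```python
def skewness(scores, counts):
    """Moment coefficient of skewness for scores weighted by frequency counts."""
    n = sum(counts)
    mean = sum(s * c for s, c in zip(scores, counts)) / n
    m2 = sum(c * (s - mean) ** 2 for s, c in zip(scores, counts)) / n
    m3 = sum(c * (s - mean) ** 3 for s, c in zip(scores, counts)) / n
    return m3 / m2 ** 1.5

# Hypothetical frequencies for five ordered categories (assumed, symmetric).
counts = [10, 20, 40, 20, 10]

linear_codes = [1, 2, 3, 4, 5]       # the conventional coding
squared_codes = [1, 4, 9, 16, 25]    # order-preserving recoding

skew_linear = skewness(linear_codes, counts)
skew_squared = skewness(squared_codes, counts)

print(f"skewness with codes 1..5:      {skew_linear:.3f}")
print(f"skewness with squared codes:   {skew_squared:.3f}")
```

The same frequencies look symmetric under one coding and right-skewed under the other, so the skewness is a property of the coding as much as of the data.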

I guess what you mean is that some cell frequencies are much higher than others. That is not in itself a problem, except that you should watch out for very low expected frequencies in a chi-squared test. Older literature was more timid on this point than more recent literature (the traditional rule of thumb asked for expected frequencies of at least 5), but you should still watch out for expected frequencies $<1$.
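As a rough sketch of what "watching out" could look like, the expected frequency for each cell under independence is (row total × column total) / grand total, and it is easy to compute and inspect directly. The table below is made up for illustration, and `expected_frequencies` is a hypothetical helper name, not something from any particular package.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 3 table of observed counts with one sparse category (assumed data).
observed = [
    [50, 30, 1],
    [10, 8, 1],
]

expected = expected_frequencies(observed)

# Cells whose expected frequency falls below 1, as (row index, column index).
low_cells = [
    (i, j)
    for i, row in enumerate(expected)
    for j, e in enumerate(row)
    if e < 1
]

for row in expected:
    print([round(e, 2) for e in row])
print("cells with expected frequency < 1:", low_cells)
```

Note that it is the expected frequencies, not the observed ones, that matter here: both cells in the last column have an observed count of 1, but only one of them has an expected frequency below 1.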

No special action is needed here beyond keeping an eye on the expected frequencies. In extreme cases, you might need to aggregate categories, or you might decide that a chi-squared test is not helpful or appropriate.
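If aggregating categories is the route taken, the effect on the expected frequencies can be checked before running any test. The sketch below uses a made-up table and hypothetical helper names (`collapse_columns`, `expected_frequencies`); which categories it makes sense to merge is a substantive question that the code cannot answer.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def collapse_columns(table, groups):
    """Sum the columns listed in each group into a single new column."""
    return [[sum(row[j] for j in group) for group in groups] for row in table]

# Hypothetical observed counts; the third category is sparse (assumed data).
observed = [
    [50, 30, 1],
    [10, 8, 1],
]

# Merge the sparse third category into the second one (an illustrative choice).
collapsed = collapse_columns(observed, groups=[[0], [1, 2]])

min_before = min(min(row) for row in expected_frequencies(observed))
min_after = min(min(row) for row in expected_frequencies(collapsed))

print("collapsed table:", collapsed)
print(f"smallest expected frequency before: {min_before:.2f}")
print(f"smallest expected frequency after:  {min_after:.2f}")
```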

What software you are using or prefer to use is incidental to this point.
