I have two 4×4 contingency tables of frequency data. Both were built with the same sampling criteria on the same discrete variables, but under two conditions (before and after). I would like to compare them statistically to see how much, if at all, they differ.

A chi-square-type test seems appropriate, but the usual test compares the observed counts with theoretical expected counts to calculate the statistic. In other words, I need to swap the theoretical table for the second table. It doesn't have to be a basic chi-square test; any other appropriate test would be fine.

I have access to XLSTAT, Excel and SPSS, and would appreciate some help with this.

#### Best Answer

Did you ever figure this out? I am working on the same problem and can't find any references. This is different from a traditional chi-square test, in which you have one table (of arbitrary size) and you compare the observed frequencies to what would be expected under independence. That test is a test of independence between the two variables.

Let A and B be the two variables in the contingency table, and let C indicate which table a count belongs to (here C has two levels). You have two contingency tables that are independent of each other, and, as I understand the problem, you want to know whether the two tables have the same distribution. That is, it's possible that in each table (conditional on the value of C), A and B are NOT INDEPENDENT (a chi-square test would reject that hypothesis), but the way they are not independent is the same regardless of C.

I think here is how to do it: Let the first table (C=1) be the baseline. Do a sort of chi-square test on the second (C=2), where each cell's expected count is not the value under independence, but rather the count you would expect if table 2 had the same relative frequencies as table 1. So it's sort of a test of differences of paired data, where the pairs are the cell frequencies. The chi-square element comes into play because the total count in each table is fixed, so you have to reduce the degrees of freedom in the same way, (rows - 1)*(columns - 1), as you do in a chi-square test.
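The baseline idea above can be sketched in Python. This is only an illustration, assuming NumPy is available; the function name `baseline_chi_square` and the counts are made up and do not come from the question. The sketch stops at the statistic and the expected counts. Turning the statistic into a p-value requires choosing the degrees of freedom, and treating table 1 as fixed ignores its own sampling noise, so the result is probably best read as exploratory.

```python
import numpy as np


def baseline_chi_square(baseline, observed):
    """Pearson statistic for `observed`, with expected counts taken from
    the relative frequencies of `baseline` (table 1, C=1)."""
    baseline = np.asarray(baseline, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if baseline.shape != observed.shape:
        raise ValueError("tables must have the same shape")
    if np.any(baseline <= 0):
        # Assumption of this sketch: a zero baseline cell would give a
        # zero expected count and a division by zero below.
        raise ValueError("every baseline cell must be positive")
    # Expected count in table 2 = table-1 proportion * table-2 total.
    expected = baseline / baseline.sum() * observed.sum()
    statistic = float(((observed - expected) ** 2 / expected).sum())
    return statistic, expected


# Made-up 4x4 counts, for illustration only.
before = np.array([[12, 8, 5, 3],
                   [7, 15, 9, 4],
                   [4, 10, 14, 6],
                   [2, 5, 8, 11]])
after = np.array([[10, 9, 6, 4],
                  [6, 12, 11, 5],
                  [5, 8, 15, 9],
                  [3, 4, 7, 14]])

stat, expected = baseline_chi_square(before, after)
print(stat)
```

If table 2 is an exact multiple of table 1, the statistic is zero, which is a quick sanity check on the expected counts.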

I found this reference online (http://web.ntpu.edu.tw/~cflin/Teach/Cate/06CateUEN05ThreeWayPPT.pdf) for 3-way contingency tables, which is what @whuber proposes. This may work as well. Page 23 (using his numbering) mentions complete independence of A, B and C. I think you may want the joint independence of C (here representing the index of the table) on page 29. This says that the frequencies of (A,B), that is, the cell counts in each table, are independent of C (that is, of which table they are in), which is another way of saying that whatever relationship holds between A and B is the same in each table. I think it's similar to what I suggest above: the expected values are y_{ij+} (the total count for cell A=i, B=j across the tables) multiplied by y_{++k} (the total count in table k) and divided by n (the total over all tables). Page 32 says this amounts to a chi-square test of a new variable, whose values are all the combinations of A and B, versus C, the index of the table. I think the degrees of freedom would be (#A#B - 1)*(#C - 1), since each table has #A#B combinations and the total in each table is fixed, which leaves #A#B - 1 degrees of freedom.
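As a hedged sketch of that last idea, assuming Python with NumPy and SciPy (not one of the tools named in the question): flatten each table into a row of #A#B cells, stack the two rows, and run an ordinary chi-square test on the resulting 2 × 16 table. The helper name `compare_tables` and the counts are made up for illustration. `scipy.stats.chi2_contingency` builds its expected counts from the row and column totals, which here are the table totals y_{++k} and the pooled cell totals y_{ij+}.

```python
import numpy as np
from scipy.stats import chi2_contingency


def compare_tables(table1, table2):
    """Chi-square test of (A, B) combination versus table index C."""
    t1 = np.asarray(table1)
    t2 = np.asarray(table2)
    if t1.shape != t2.shape:
        raise ValueError("tables must have the same shape")
    # One row per table (level of C), one column per (A, B) cell.
    stacked = np.vstack([t1.ravel(), t2.ravel()])
    # Drop cells that are empty in both tables; their expected count
    # would be zero, which chi2_contingency rejects.
    stacked = stacked[:, stacked.sum(axis=0) > 0]
    chi2, p, dof, expected = chi2_contingency(stacked, correction=False)
    return chi2, p, dof, expected


# Made-up 4x4 counts, for illustration only.
before = np.array([[12, 8, 5, 3],
                   [7, 15, 9, 4],
                   [4, 10, 14, 6],
                   [2, 5, 8, 11]])
after = np.array([[10, 9, 6, 4],
                  [6, 12, 11, 5],
                  [5, 8, 15, 9],
                  [3, 4, 7, 14]])

chi2, p, dof, expected = compare_tables(before, after)
print(chi2, p, dof)
```

With two 4×4 tables and no empty cells, the reported degrees of freedom are (16 - 1)*(2 - 1) = 15, matching the count given above. Cells with small expected counts may make the chi-square approximation unreliable, so it is worth inspecting `expected` before trusting the p-value.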