Solved – ttest for categorical variable

I want to run a test that compares the education level of living people with the education level of deceased people. I got the education data for living people from the 2010 census, and the education data for deceased people from the CDC. The data looks like this:

Living people education level

+-----------------------------------+
| GROUP        | TOTAL_COUNT | MEAN |
+-----------------------------------+
| 11-          | 28587748    | 13.7 |
+-----------------------------------+
| Just 12      | 58440600    | 28.0 |
+-----------------------------------+
| Some College | 61206147    | 29.1 |
+-----------------------------------+
| Bachelor+    | 60821634    | 29.3 |
+-----------------------------------+

Deceased people education level

+----------------------------------------+
| GROUP        | TOTAL_COUNT | MEAN      |
+----------------------------------------+
| 11-          | 624934      | 32.887178 |
+----------------------------------------+
| Just 12      | 784319      | 41.274821 |
+----------------------------------------+
| Some College | 256255      | 13.485430 |
+----------------------------------------+
| Bachelor+    | 234728      | 12.352571 |
+----------------------------------------+

I want to find out whether there is a significant difference between matching groups, for instance between group 11- of living people and group 11- of deceased people. I thought about a t-test or ANOVA, but since my values are counts, there is no standard deviation to calculate.
What test can I run to find out how significant the difference is between group 11- of living people and group 11- of deceased people? I want to run the test for each group.
I'm using Jupyter Notebook.

Best Answer

I think you want to run multiple two-sample proportion tests, one per group.

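To compare the 11- group across both samples, run the proportion test with the proportions 0.137 (living) and 0.32887 (deceased); these are the MEAN column values, which are percentages of each sample. The sample sizes are the total number of living people and the total number of deceased people.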

http://stattrek.com/hypothesis-test/difference-in-proportions.aspx?Tutorial=AP
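Since you are working in a Jupyter notebook, here is a minimal sketch of the pooled two-sample z-test for proportions, applied to each group using the counts from your tables. The function name `two_proportion_ztest` and the dictionary layout are my own choices for illustration, and I am assuming a two-sided test; only the standard library is used.

```python
import math

# Counts copied from the two tables in the question.
living = {"11-": 28587748, "Just 12": 58440600,
          "Some College": 61206147, "Bachelor+": 60821634}
deceased = {"11-": 624934, "Just 12": 784319,
            "Some College": 256255, "Bachelor+": 234728}


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-sample z-test for proportions (two-sided).

    Hypothetical helper, not a library function.
    Returns (p1, p2, z, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Sample sizes are all living people and all deceased people.
n_living = sum(living.values())
n_deceased = sum(deceased.values())

results = {
    group: two_proportion_ztest(living[group], n_living,
                                deceased[group], n_deceased)
    for group in living
}

for group, (p1, p2, z, p_value) in results.items():
    print(f"{group:<13} living={p1:.4f} deceased={p2:.4f} "
          f"z={z:.1f} p={p_value:.3g}")
```

Because four tests are run, you may also want to adjust the significance level for multiple comparisons, although with samples this large it is unlikely to change the conclusion.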

However, I believe the tests will give you significant results at both the 10% and the 1% significance level, because your samples are very large.
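To see why sample size matters so much here, the sketch below keeps the two proportions for the 11- group fixed (0.137 and 0.32887) and only varies the sample size. The per-sample sizes 10, 100, 1000 and 10000 are hypothetical values chosen for illustration, not taken from your data, and the helper `z_and_p` is my own name for the pooled two-sided z-test.

```python
import math


def z_and_p(p1, p2, n):
    """Pooled two-sided z-test for two samples of equal size n.

    Hypothetical helper for illustration only.
    """
    pooled = (p1 + p2) / 2  # equal sample sizes, so a plain average
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


p_living, p_deceased = 0.137, 0.32887  # proportions from the question

# Hypothetical sample sizes: the same difference, more and more data.
by_size = {n: z_and_p(p_living, p_deceased, n)
           for n in (10, 100, 1000, 10000)}

for n, (z, p_value) in by_size.items():
    print(f"n={n:<6} z={z:8.2f} p={p_value:.3g}")
```

The standard error shrinks with the square root of the sample size, so the same difference that is not significant with 10 observations per sample becomes overwhelmingly significant with thousands, let alone millions. With data this large, the size of the difference is usually more informative than the p-value.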
