I want to run a test that compares the education level of living people with the education level of deceased people. The education data for living people comes from the 2010 census, and the data for deceased people comes from the CDC. The data looks like this:

## Living people education level

| GROUP        | TOTAL_COUNT | MEAN |
|--------------|-------------|------|
| 11-          | 28587748    | 13.7 |
| Just 12      | 58440600    | 28.0 |
| Some College | 61206147    | 29.1 |
| Bachelor+    | 60821634    | 29.3 |

## Deceased people education level

| GROUP        | TOTAL_COUNT | MEAN      |
|--------------|-------------|-----------|
| 11-          | 624934      | 32.887178 |
| Just 12      | 784319      | 41.274821 |
| Some College | 256255      | 13.485430 |
| Bachelor+    | 234728      | 12.352571 |

I want to test for a significant difference (t-test, ANOVA, etc.) between matching groups, for instance between group 11- of living people and group 11- of deceased people. I considered a t-test or ANOVA, but since my values are counts, there is no standard deviation for me to calculate.

Which test can I use to find out how significant the difference is between group 11- of living people and group 11- of deceased people? I want to run the test for each group.

I'm using Jupyter Notebook.

#### Best Answer

I think you want to run multiple two-sample proportion tests, one per group.

To compare `11-` across the two samples, you can run the proportion test with 0.137 and 0.32887. Your sample sizes are the total number of living people and the total number of dead people.

http://stattrek.com/hypothesis-test/difference-in-proportions.aspx?Tutorial=AP

However, because your samples are very large, I believe every test will come out significant, at the 10% significance level and at the 1% level alike.
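Since the question mentions Jupyter Notebook, here is a minimal Python sketch of the per-group comparison. It assumes the standard pooled two-proportion z-test with a two-sided normal p-value. The helper name `two_proportion_ztest` and the dictionary layout are illustrative choices, not part of the original post. The counts are copied from the two tables in the question.

```python
import math


def two_proportion_ztest(count_a, n_a, count_b, n_b):
    """Pooled two-sample z-test for H0: p_a == p_b.

    Returns (z, two-sided p-value). Hypothetical helper for illustration.
    """
    p_a = count_a / n_a
    p_b = count_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts per education group, taken from the tables in the question.
living = {"11-": 28587748, "Just 12": 58440600,
          "Some College": 61206147, "Bachelor+": 60821634}
deceased = {"11-": 624934, "Just 12": 784319,
            "Some College": 256255, "Bachelor+": 234728}

# Sample sizes: all living people and all deceased people.
n_living = sum(living.values())
n_deceased = sum(deceased.values())

results = {}
for group in living:
    # Positive z means the group is a larger share of the deceased sample.
    z, p = two_proportion_ztest(deceased[group], n_deceased,
                                living[group], n_living)
    results[group] = (z, p)
    print(f"{group:>12}: z = {z:.1f}, p = {p:.3g}")
```

With samples this large, the p-values may underflow to zero in floating point, which is consistent with the point about sample size above. If statsmodels is available, `statsmodels.stats.proportion.proportions_ztest` offers a library version of the same kind of test.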
