Let's say I want to run a difference-of-proportions test where each group has n = 23,000, but the observed proportions are 0.21% and 0.34%.

|      | group1 | group2 |
|------|--------|--------|
| n    | 23000  | 23000  |
| x    | 50     | 78     |
| prop | 0.21%  | 0.34%  |

Both groups satisfy `n(p) > 50` and `n(1-p) > 50`.

A standard z-score test will say this difference is significant.
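For concreteness, the pooled two-proportion z statistic for these counts can be checked directly. Here is a quick sketch in Python (the counts are taken from the table above):

```python
from math import sqrt
from statistics import NormalDist

# Observed counts from the question
n = 23000
x1, x2 = 50, 78
p1, p2 = x1 / n, x2 / n

# Pooled two-proportion z test
p_pool = (x1 + x2) / (2 * n)
se = sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (p2 - p1) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(p_value, 4))  # z is about 2.48, p about 0.013
```

So the two-sided p-value comes out around 0.01, below the usual .05 threshold.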

However, my *intuition* tells me the test should not work for such small proportions. If the true proportions were equal, and with such a rare event, I would actually expect to see large differences like this just from sampling variability. Am I right in thinking this? Does the difference of proportions test break down for tiny proportions?

Note: This is a purely hypothetical question. In real life, I don't care that group2 outperformed group1: the event rate is so low that there is little practical value in the difference. In other words, it is statistically significant but not clinically significant.


#### Best Answer

Whenever I have doubts about the performance of a particular method, I run a simulation study to examine how well the method works under similar conditions. Below is a simple example in R for the case you describe. Note that I set the true proportions equal for the two groups, to a value between the two observed sample proportions, so the simulation yields the empirical Type I error rate of the test. It should hopefully be close to .05, and setting the number of iterations large enough keeps the simulation error small. Also note that I run the test once without and once with Yates' continuity correction to see whether the correction is relevant here.

```r
iters <- 100000                 # number of simulated datasets
n     <- 23000
p     <- 0.0027                 # common true proportion under H0

x1i <- rbinom(iters, n, p)      # simulated event counts, group 1
x2i <- rbinom(iters, n, p)      # simulated event counts, group 2

pval1 <- rep(NA, iters)
pval2 <- rep(NA, iters)

for (i in 1:iters) {
   tab <- matrix(c(x1i[i], n - x1i[i], x2i[i], n - x2i[i]), nrow = 2, byrow = TRUE)
   pval1[i] <- chisq.test(tab, correct = FALSE)$p.value   # without Yates' correction
   pval2[i] <- chisq.test(tab, correct = TRUE)$p.value    # with Yates' correction
}

round(mean(pval1 <= .05), 3)    # empirical Type I error rate
round(mean(pval2 <= .05), 3)
```

Here are the results from one run:

```r
> round(mean(pval1 <= .05), 3)
[1] 0.05
> round(mean(pval2 <= .05), 3)
[1] 0.04
```

So, the test maintains its nominal Type I error rate when Yates' continuity correction is not used. With the correction, the test is slightly conservative.

If you want to find out about the power of the test, you can set the true proportions to two different values and then rerun the simulation.
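As an illustration, here is a sketch of such a power simulation in Python (rather than R), using the two observed rates from the question as the true proportions; to stay within the standard library, the binomial draws are approximated by a normal, which is reasonable here since n·p is around 50-80:

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)

iters = 5000
n = 23000
p1_true, p2_true = 0.0021, 0.0034   # two different true proportions (from the question)

def draw(p):
    # Normal approximation to Binomial(n, p), rounded to a nonnegative count
    return max(0, round(random.gauss(n * p, sqrt(n * p * (1 - p)))))

rejections = 0
for _ in range(iters):
    x1, x2 = draw(p1_true), draw(p2_true)
    p_pool = (x1 + x2) / (2 * n)
    se = sqrt(p_pool * (1 - p_pool) * (2 / n))
    z = (x2 / n - x1 / n) / se
    if 2 * (1 - NormalDist().cdf(abs(z))) <= 0.05:
        rejections += 1

power = rejections / iters
print(round(power, 3))  # roughly 0.75 for these settings
```

So even with these tiny proportions, the test has reasonable power to detect a difference of this size at n = 23,000 per group.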
