I am trying to perform Fisher's exact test on the following matrix:
269 118 55 48
Several free websites, as well as R's fisher.test(), return p = 0.0033, while vassarstats.net returns the output shown in the image.
Am I misinterpreting the data? Why the difference?
Best Answer
It is possible to perform the Fisher exact test manually in R to see which result is correct. To do this, we need to calculate the probability of each possible outcome of a contingency table with the same row and column totals as your table. (For computational reasons, namely avoiding arithmetic underflow, I will use log-probabilities instead.) The two-sided version of Fisher's exact test calculates the p-value as the sum of all the probabilities no greater than the probability of the observed contingency table.
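As a sketch of the definition just described, the calculation can be written out as follows. The symbols are my own labeling, chosen to match the arguments of R's dhyper: $m$ and $n$ are the two row totals, $k$ is the first column total, $x$ is the top-left cell count, and $x_{\text{obs}}$ is its observed value.

```latex
P(X = x) = \frac{\binom{m}{x}\binom{n}{k - x}}{\binom{m + n}{k}},
\qquad \max(0,\, k - n) \le x \le \min(m,\, k)

p_{\text{two-sided}} = \sum_{x \,:\, P(X = x) \,\le\, P(X = x_{\text{obs}})} P(X = x)
```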

    #Input the observed contingency table and set parameters
    DATA <- matrix(c(269, 118, 55, 48), nrow = 2)
    m    <- sum(DATA[1, 1:2])
    n    <- sum(DATA[2, 1:2])
    k    <- sum(DATA[1:2, 1])
    maxx <- sum(DATA[1:2, 2])

    #Calculate log-probabilities over all possible outcomes
    LOGPROBS <- rep(0, maxx+1)
    for (x in 0:maxx) {
      LOGPROBS[x+1] <- dhyper(DATA[1,1] - DATA[2,2] + x, m, n, k, log = TRUE)
    }

    #Calculate p-value for Fisher exact test
    LOWER   <- LOGPROBS[which(LOGPROBS <= LOGPROBS[DATA[2,2]+1])]
    P_VALUE <- exp(matrixStats::logSumExp(LOWER))
    P_VALUE

    [1] 0.003255474

This manual calculation gives the same p-value as the fisher.test function in R. I have also checked that the log-probabilities computed here yield a total probability of one over all possible outcomes (to within a very small tolerance). So, based on this investigation, it appears to me that the calculation in R is correct, which suggests that there is some issue with the calculation at the website. (I suggest reading its documentation carefully to see whether it uses some approximation method.) Note that one possible source of calculation error is arithmetic underflow, which can occur if you try to compute the Fisher exact p-value without converting to log-probability space.
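The two checks mentioned above (total probability of one, and agreement with fisher.test) can be reproduced with a short script along these lines. This is a minimal sketch using base R only; the variable names x, logp, total, p_manual and p_builtin are illustrative, and the small tolerance in the comparison is my own addition to guard against floating-point ties.

```r
# Observed table, entered column-wise as in the calculation above
DATA <- matrix(c(269, 118, 55, 48), nrow = 2)
m <- sum(DATA[1, ])   # first row total
n <- sum(DATA[2, ])   # second row total
k <- sum(DATA[, 1])   # first column total

# All values the top-left cell can take with the margins held fixed
x    <- max(0, k - n):min(m, k)
logp <- dhyper(x, m, n, k, log = TRUE)

# Check 1: the probabilities over all possible tables should sum to one
total <- sum(exp(logp))

# Check 2: two-sided p-value, summing every outcome that is no more
# probable than the observed table (tolerance is an assumption of this sketch)
obs       <- dhyper(DATA[1, 1], m, n, k, log = TRUE)
p_manual  <- sum(exp(logp[logp <= obs + 1e-7]))
p_builtin <- fisher.test(DATA)$p.value

print(c(total = total, manual = p_manual, builtin = p_builtin))
```

If the manual and built-in values agree, the discrepancy is most likely on the website's side rather than in R.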