I am confused about multiple-comparisons adjustments. I have $p$-values from Fisher's exact tests, many of which equal 1 (because many foreground scores are 0). Some $p$-values are significant without multiple-testing correction. The set consists of 1000 $p$-values summarized as
Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
0.0000013 0.2552000 0.6069000 0.5634000 0.8672000 0.9900000
plus 3000 $p$-values equal to 1. The $p$-values are available at https://dl.dropboxusercontent.com/u/2706915/pval.csv
If I remove all $p$-values equal to 1 and perform the multiple-testing correction, I get some significant results. I expected that adding these $p$-values $=1$ back would increase the $q$-values, since the distribution of $p$-values is shifting. However, R's $p$-value adjustment functions give $q = 1$ for every $p$-value. I cannot understand this behavior. The FDR assumes that the $p$-value distribution is uniform, which is not the case for my data. What mistake am I making?
Best Answer
A point by point response to your questions:
You do not say what kind of test-statistic your $p$-values apply to. If you are talking about continuous distributions, such as for t or z statistics, then technically all of your $p$-values are strictly less than 1, although some of them may be very close to 1.
You test a bunch of hypotheses, and some of them are significant (without multiple comparisons adjustments), and some of them are not. Great.
Generally, one does not need to remove any $p$-values prior to conducting multiple-comparisons adjustments for step-wise adjustment procedures (although the FDR gives the same results for a given level of $\alpha$). All but one adjusted $p$-value (i.e. $q$-value) will always be larger than the corresponding unadjusted $p$-value. Conversely, one can think of multiple-comparisons adjustments as adjusting the rejection probability (e.g. $\alpha$) rather than adjusting the $p$-values, and here all but one of the adjusted rejection probabilities are less than the nominal type I error rate. One advantage of working the math out this way is that one never has to adjust $p$-values so that they are larger than, or truncated at, the value 1.
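To make this concrete, here is a minimal pure-Python sketch of the Benjamini–Hochberg adjustment (the same computation as R's `p.adjust(p, method = "BH")`); the toy $p$-values are invented for illustration, not taken from your file. Note that the adjusted values are never smaller than the raw ones and are truncated at 1, and that $p$-values of 1 simply stay at 1 without forcing the small $q$-values up to 1:

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), matching
    R's p.adjust(p, method = "BH")."""
    m = len(pvals)
    # Sort positions by p-value, largest first.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value gets the multiplier m / k.
        k = m - rank
        q = pvals[idx] * m / k
        running_min = min(running_min, q)      # enforce monotonicity
        adjusted[idx] = min(running_min, 1.0)  # truncate at 1
    return adjusted

# Toy example with two p-values equal to 1 mixed in:
print(bh_adjust([0.001, 0.01, 0.04, 1.0, 1.0]))
```

Adding $p$-values equal to 1 increases $m$ and therefore inflates every multiplier $m/k$, which is why the small $q$-values do grow when the 3000 ones are added back; they do not all have to become 1, though.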
It sounds like, after adjustment for multiple comparisons using the FDR, you would not reject any hypotheses. This is a possibility (without seeing your vector of $p$-values it is not possible to show you the math).
The FDR does not assume a uniform distribution of $p$-values.
You are seemingly not making any mistake, beyond being surprised that your results differ from what you expected.
Update: Have a look at this spreadsheet, which computes both the adjusted alphas (i.e. the FDR) and, alternatively, the adjusted $p$-values for the 927 $p$-values in the file you supplied.
Notice that: (1) column B contains the $p$-values $<1$ sorted largest to smallest; (2) column C contains the sorting order ($i$); (3) the adjusted $\frac{\alpha}{2} = \frac{0.05}{2}\times\frac{927+1-i}{927}$; (4) the adjusted $p$-values $=\frac{927}{927+1-i}p_{i}$; and finally, (5) you would reject the hypotheses corresponding to the two smallest $p$-values because (a) $3.78\times 10^{-5} < 5.39\times 10^{-5}$ (i.e. $p_{926} < \alpha_{926}^{*}$), or alternately (b) $0.0175 < 0.025$ (i.e. $q_{926} < \frac{\alpha}{2}$).
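The arithmetic above can be sketched in a few lines (the quoted $p$-value $3.78\times 10^{-5}$ comes from the linked spreadsheet; everything else is just the formulas from the update):

```python
m = 927          # number of p-values < 1 in the supplied file
alpha = 0.05     # nominal level; the answer works with alpha / 2

def bh_threshold(i, m=m, alpha=alpha):
    """Rejection threshold for the p-value at sort position i
    (i = 1 is the largest p-value, i = m the smallest)."""
    return (alpha / 2) * (m + 1 - i) / m

def bh_qvalue(p, i, m=m):
    """Adjusted p-value (q-value) for the p-value at position i."""
    return p * m / (m + 1 - i)

# The second-smallest p-value from the spreadsheet sits at i = 926:
p_926 = 3.78e-5
print(bh_threshold(926))      # adjusted alpha for position 926
print(bh_qvalue(p_926, 926))  # corresponding q-value
# Reject because p_926 < its threshold, or equivalently q_926 < alpha/2.
```

Both comparisons are the same test stated two ways: scale the threshold down to the $p$-value, or scale the $p$-value up to the threshold.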