# Study on multiple outcomes: do I adjust or not adjust p-values?

I did a study comparing 2 groups on multiple outcomes/characteristics. I am still learning the ropes when it comes to statistics, so I did not ask for the p-values to be adjusted. Some of the results that came back were significant, with unadjusted p-values in the range of p=.00x to p=.0x (less than .05).

My question is: should I leave the results as is and just report the p-values as unadjusted?

Or, if I need to re-run the tests with p-value adjustment, my question is this: the first time around, I included subcategories and subtests, so the number of outcomes measured ballooned. If I re-run the analysis with adjusted p-values, I plan to run it on only the composite measures, which streamlines the endpoints of interest to fewer than 10 (by the way, this includes outcomes that came out both significant and non-significant in the first analysis). What method of p-value adjustment would you suggest? I understand that Bonferroni is the most conservative, but are there other adjustment methods that are less punishing?

Contents

There are many post-hoc adjustment methods. Whether adjustment is needed, and how to adjust, is a matter of some debate. Ideally you should settle these issues before conducting your analysis, because contingent decision-making invalidates the standard interpretation of p-values (e.g., Wagenmakers, 2007). In practice, the "correct" answer depends quite a bit on the conventions in your field of study. I quote extensively from another source below, but my short answer would be to use the Holm adjustment. Theoretically, the false-discovery-rate approaches might make more sense, but I don't know how frequently people use them in practice.
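To make the Holm suggestion concrete, here is a minimal sketch in plain Python that compares Bonferroni and Holm adjusted p-values. The eight input p-values are made up for illustration and are not from the study in the question; the function names `bonferroni` and `holm` are also just illustrative. In R, the equivalent would be `p.adjust(p, method = "holm")`.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm (1979) step-down adjusted p-values, returned in input order."""
    m = len(pvals)
    # Indices of the p-values, smallest p first.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * pvals[i])
        # Keep adjusted values non-decreasing along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical unadjusted p-values for eight composite outcomes (assumed data).
p_raw = [0.003, 0.007, 0.009, 0.030, 0.045, 0.200, 0.510, 0.740]

p_bonf = bonferroni(p_raw)
p_holm = holm(p_raw)

for raw, b, h in zip(p_raw, p_bonf, p_holm):
    print(f"raw={raw:.3f}  bonferroni={b:.3f}  holm={h:.3f}")
```

Holm's adjusted values are never larger than Bonferroni's, which is why it is the less punishing choice. With these made-up inputs, two outcomes stay below .05 after Holm and only one after Bonferroni.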

Quoting from R's help file on `p.adjust` for a sampling of the available options:

The adjustment methods include the Bonferroni correction ("bonferroni") in which the p-values are multiplied by the number of comparisons. Less conservative corrections are also included by Holm (1979) ("holm"), Hochberg (1988) ("hochberg"), Hommel (1988) ("hommel"), Benjamini & Hochberg (1995) ("BH" or its alias "fdr"), and Benjamini & Yekutieli (2001) ("BY"), respectively.

The first four methods are designed to give strong control of the family-wise error rate. There seems no reason to use the unmodified Bonferroni correction because it is dominated by Holm's method, which is also valid under arbitrary assumptions.

Hochberg's and Hommel's methods are valid when the hypothesis tests are independent or when they are non-negatively associated (Sarkar, 1998; Sarkar and Chang, 1997). Hommel's method is more powerful than Hochberg's, but the difference is usually small and the Hochberg p-values are faster to compute.

The "BH" (aka "fdr") and "BY" methods of Benjamini, Hochberg, and Yekutieli control the false discovery rate, the expected proportion of false discoveries amongst the rejected hypotheses. The false discovery rate is a less stringent condition than the family-wise error rate, so these methods are more powerful than the others.
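As a rough illustration of the false-discovery-rate approach described in the quoted passage, the sketch below computes Benjamini-Hochberg adjusted p-values in plain Python. The input p-values are hypothetical and the function name `benjamini_hochberg` is only illustrative; in R the corresponding call would be `p.adjust(p, method = "BH")`.

```python
def benjamini_hochberg(pvals):
    """Benjamini & Hochberg (1995) step-up adjusted p-values, in input order."""
    m = len(pvals)
    # Indices of the p-values, largest p first.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # 1-based rank of this p-value in ascending order
        candidate = pvals[i] * m / rank
        # Keep adjusted values non-increasing as we move to smaller p-values.
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted p-values for eight composite outcomes (assumed data).
p_values = [0.003, 0.007, 0.009, 0.030, 0.045, 0.200, 0.510, 0.740]

q_values = benjamini_hochberg(p_values)
n_discoveries = sum(1 for q in q_values if q < 0.05)

for p, q in zip(p_values, q_values):
    print(f"raw={p:.3f}  BH-adjusted={q:.3f}")
print("outcomes below .05 after BH adjustment:", n_discoveries)
```

Keep in mind that the two kinds of adjustment answer different questions: a family-wise method such as Holm limits the chance of any false positive, while BH limits the expected share of false positives among the outcomes declared significant.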

References (and commentary on same) from the r-help file:

• Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B, 57, 289–300.
• Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 29, 1165–1188.
• Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65–70.
• Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika, 75, 383–386.
• Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75, 800–803.
• Shaffer, J. P. (1995). Multiple hypothesis testing. Annual Review of Psychology, 46, 561–576. (An excellent review of the area.)
• Sarkar, S. (1998). Some probability inequalities for ordered MTP2 random variables: a proof of Simes conjecture. Annals of Statistics, 26, 494–504.
• Sarkar, S., and Chang, C. K. (1997). Simes' method for multiple hypothesis testing with positively dependent test statistics. Journal of the American Statistical Association, 92, 1601–1608.
• Wright, S. P. (1992). Adjusted P-values for simultaneous inference. Biometrics, 48, 1005–1013. (Explains the adjusted P-value approach.)
