I ran Tassel3 and filtered the results to keep only p-values no greater than 0.05. That is fine for drawing a Manhattan plot, but it causes a problem for a QQ-plot.
Say I have 40 thousand SNPs and, after filtering, about 1,800 of them were kept in the output file. How do I draw a QQ-plot when I only have part of the p-values?
ppoints only generates points spread over the whole interval (0,1). I think the expected points should instead lie in the filtered range (0,0.05), and there should be the same number of them (1,800).
runif can do the trick, but it is random: each call generates different results. I think this would affect the QQ-plot line.
I have come up with three ways to solve this, but I am not sure whether any of them is right. I hope you can give me some advice.
The three methods are:
1. 0.05*ppoints(1800)
2. ppoints(1800*100/0.05, a=0.05), keeping only the first 1/20 of the results.
3. ppoints(40000), using only the values below 0.05 in the plot.
Best Answer
You are right not to use random points. Version 3 will provide what you want. But as ppoints(40000) just produces 40,000 equally spaced values between 0 and 1, it is not difficult to code what you want directly. To bypass ppoints() altogether, use e.g. (-0.5+1:1800)/40000; this gives the expected position (under the null) of the 1800 smallest p-values, from a sample of 40000 p-values.
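
As a rough illustration of the formula above, here is a minimal sketch of a truncated QQ-plot in R. The names n_total, p_all, p_kept and expected are hypothetical, and the p-values are simulated with a fixed seed purely so that the snippet runs on its own; in practice p_kept would be the filtered p-values read from the Tassel output.

```r
# Assumption: n_total is the number of SNPs tested BEFORE filtering.
n_total <- 40000

# Stand-in data only: uniform p-values (i.e. the null), filtered at 0.05.
set.seed(1)
p_all  <- runif(n_total)
p_kept <- sort(p_all[p_all <= 0.05])
k      <- length(p_kept)

# Expected positions of the k smallest of n_total p-values under the null.
# For n_total > 10 this equals ppoints(n_total)[1:k].
expected <- (seq_len(k) - 0.5) / n_total

# QQ-plot on the -log10 scale, with the y = x reference line.
plot(-log10(expected), -log10(p_kept),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1)
```

The key choice is to divide by the total number of tests rather than by the number of p-values kept. By contrast, 0.05*ppoints(k) always stretches the expected points across the whole of (0,0.05), however many p-values actually passed the filter.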