# Solved – Checking for a statistically significant peak

I have a set of data, \$y\$ and \$x\$. I would like to test the following hypothesis: There is a peak in \$y\$; that is as \$x\$ increases, \$y\$ first increases and then decreases.

My first idea was to regress \$y\$ on \$x\$ and \$x^2\$ in a linear regression. If the coefficient on \$x\$ is significantly positive and the coefficient on \$x^2\$ is significantly negative, I have support for the hypothesis. However, this checks for only one type of relationship (quadratic) and may not capture the existence of a peak.
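
A minimal sketch of this quadratic check, using only the Python standard library. The function names (`invert`, `quadratic_peak_check`) and the data are made up for illustration. One caveat: the coefficient on \$x\$ is the fitted slope at \$x = 0\$, so its sign depends on where the origin of \$x\$ sits. The sketch therefore reports the \$x^2\$ coefficient with its t-statistic and checks whether the fitted turning point \$-b_1 / (2 b_2)\$ falls inside the observed range of \$x\$. It treats \$y\$ as roughly continuous, which is only an approximation for counts.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def quadratic_peak_check(x, y):
    """Fit y = b0 + b1*x + b2*x**2 by least squares and summarise the turning point."""
    n = len(x)
    rows = [[1.0, xi, xi * xi] for xi in x]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(3)) for i in range(3)]
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - 3)   # residual variance
    se_b2 = math.sqrt(sigma2 * xtx_inv[2][2])
    vertex = -beta[1] / (2.0 * beta[2])                # turning point of the parabola
    return {
        "b1": beta[1],
        "b2": beta[2],
        "t_b2": beta[2] / se_b2,   # compare with a t distribution on n - 3 df
        "vertex": vertex,
        "vertex_in_range": beta[2] < 0 and min(x) < vertex < max(x),
    }

# Made-up example data (not from the question).
x = list(range(11))
y = [2, 4, 7, 9, 12, 13, 11, 10, 6, 5, 1]
result = quadratic_peak_check(x, y)
print(result)
```

A clearly negative `t_b2` together with `vertex_in_range` being true is what this sketch would read as support for a peak, and only under the quadratic assumption.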

My second idea was to look for a region \$b\$ of the sorted values of \$x\$ that lies between two other regions, \$a\$ and \$c\$, each containing at least as many points as \$b\$, such that \$\bar{y}_b > \bar{y}_a\$ and \$\bar{y}_b > \bar{y}_c\$ significantly. If the hypothesis is true, we should expect many such regions \$b\$. Thus, if the number of such regions is sufficiently large, there is support for the hypothesis.
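
One way to make the region idea concrete is a permutation test on a single fixed split, sketched below with the standard library only. This simplifies the proposal above: it uses one triple of regions rather than counting many, the statistic (mean of \$b\$ minus the larger of the means of \$a\$ and \$c\$) is an assumption, and the function names, block sizes, and data are hypothetical. Shuffling \$y\$ against the sorted \$x\$ gives the reference distribution under "no relationship between \$x\$ and \$y\$", and makes no distributional assumption about \$y\$.

```python
import random

def block_mean(values):
    return sum(values) / len(values)

def peak_statistic(y_sorted, start_b, end_b):
    """Mean of the middle block b minus the larger of the outer-block means (a and c)."""
    a, b, c = y_sorted[:start_b], y_sorted[start_b:end_b], y_sorted[end_b:]
    return block_mean(b) - max(block_mean(a), block_mean(c))

def permutation_peak_test(x, y, n_perm=2000, seed=0):
    """One-sided permutation p-value for 'the middle block is higher than both outer blocks'."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    n = len(y_sorted)                 # needs n >= 3 so that every block is non-empty
    size_b = n // 3                   # middle block is never larger than a or c
    start_b = (n - size_b) // 2
    end_b = start_b + size_b
    observed = peak_statistic(y_sorted, start_b, end_b)

    rng = random.Random(seed)
    shuffled = y_sorted[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)         # breaks any link between x and y
        if peak_statistic(shuffled, start_b, end_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Made-up example data (not from the question).
x = list(range(12))
y = [1, 3, 2, 5, 9, 12, 14, 10, 6, 3, 2, 1]
observed, p_value = permutation_peak_test(x, y)
print(observed, p_value)
```

Scanning many candidate regions and counting the "significant" ones would need a correction for multiple comparisons; a fixed split avoids that at the cost of power when the peak is off-centre.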

Do you think I am on the right track to find a suitable test for my hypothesis? Or am I reinventing the wheel, and is there an established method for this problem? I would greatly appreciate your input.

UPDATE. My dependent variable \$y\$ is a count (non-negative integer).
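
Because \$y\$ is a count, one option is to move the quadratic idea to a Poisson log-linear model, \$\log E[y] = b_0 + b_1 u + b_2 u^2\$ with \$u = x - \bar{x}\$. The sketch below fits it by Fisher scoring using only the standard library. It is an illustration under assumptions: the Poisson model and the log-quadratic shape are choices rather than something the data guarantee, and the function names and data are made up. In practice a GLM routine from a statistics package would be the usual tool.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def poisson_quadratic_fit(x, y, max_iter=50, tol=1e-10):
    """Fisher scoring for log E[y] = b0 + b1*u + b2*u**2, where u = x - mean(x)."""
    center = sum(x) / len(x)
    rows = [[1.0, xi - center, (xi - center) ** 2] for xi in x]
    beta = [math.log(sum(y) / len(y)), 0.0, 0.0]   # start from a flat fit

    def fitted(b):
        return [math.exp(sum(bj * v for bj, v in zip(b, r))) for r in rows]

    def information(mu):
        # Fisher information X' diag(mu) X for the Poisson log link
        return [[sum(m * r[i] * r[j] for m, r in zip(mu, rows)) for j in range(3)]
                for i in range(3)]

    for _ in range(max_iter):
        mu = fitted(beta)
        score = [sum((yi - m) * r[i] for yi, m, r in zip(y, mu, rows)) for i in range(3)]
        cov = invert(information(mu))
        step = [sum(cov[i][j] * score[j] for j in range(3)) for i in range(3)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break

    cov = invert(information(fitted(beta)))        # covariance at the final estimate
    return {
        "b2": beta[2],
        "z_b2": beta[2] / math.sqrt(cov[2][2]),    # Wald z-statistic for curvature
        "peak_x": center - beta[1] / (2.0 * beta[2]),
    }

# Made-up example counts (not from the question).
x = list(range(11))
y = [2, 4, 7, 9, 12, 13, 11, 10, 6, 5, 1]
fit = poisson_quadratic_fit(x, y)
print(fit)
```

A clearly negative `z_b2` with `peak_x` inside the observed range of \$x\$ would point to a peak under this model. Overdispersed counts would make the Wald statistic too optimistic.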

I was also thinking of the smoothing idea. But there is a whole area, response surface methodology, that searches for peaks in noisy data (primarily by using local quadratic fits to the data), and I recall a famous paper with "Bump hunting" in the title. Here are some links to books on response surface methodology. Ray Myers's books are particularly well written. I will try to find the bump hunting paper.

Response Surface Methodology: Process and Product Optimization Using Designed Experiments

Response Surface Methodology And Related Topics

Response surface methodology

Empirical Model-Building and Response Surfaces

Although not the article I was looking for, here is a very relevant article by Jerry Friedman and Nick Fisher that deals with these ideas applied to high-dimensional data.

Here is an article with some online comments.

So I hope you at least appreciate my response. I think your ideas are good and on the right track, but yes, I do think you might be reinventing the wheel, and I hope you and others will look at these excellent references.
