Scott's and Freedman–Diaconis rules of thumb are based on the following formula:
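In their commonly cited form (an assumption about what was displayed here, since the formula itself is not reproduced in the text), both rules set the bin width $h$ from the sample size $n$ and a measure of spread, and the number of bins $k$ then follows from the range of the data:

```latex
h_{\text{Scott}} = 3.49\,\hat{\sigma}\,n^{-1/3}, \qquad
h_{\text{FD}} = 2\,\operatorname{IQR}(x)\,n^{-1/3}, \qquad
k = \left\lceil \frac{\max(x) - \min(x)}{h} \right\rceil
```

Here $\hat{\sigma}$ is the sample standard deviation and $\operatorname{IQR}(x)$ the interquartile range. Both widths depend on a single global spread estimate, which is why the rules are tuned to roughly Gaussian, unimodal data.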
In the article here it is stated that:
"While these appear to be useful estimates for unimodal densities similar to a Gaussian distribution, they are known to be suboptimal for multimodal densities."
My question is: will Scott's and Freedman–Diaconis rules of thumb estimate the correct number of bins for distributions with more than one peak?
What are the disadvantages of these methods, and how can they be overcome?
Best Answer
Comment continued. Here is a mixture of three normal samples (each of size 50) with means sufficiently far apart, relative to their standard deviations, to show separate modes. The default binning in R (Sturges' rule) provides a histogram that does find the modes. The default KDE in R (with the default bandwidth) roughly matches the three modes (at 12, 18, and 25).
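set.seed(930)
# Pool three normal samples (n = 50 each; means 12, 18, 25; SD 2) into one vector
x = c(rnorm(50,12,2), rnorm(50,18,2), rnorm(50,25,2))
hist(x, prob=TRUE, col="skyblue2"); rug(x)
# Overlay the default kernel density estimate
lines(density(x), col="red", lwd=2)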
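To see how the rules from the question behave on this same sample, one can compare the bin counts they suggest. This is only a sketch: it assumes the pooled data are held in a plain vector `x`, and it relies on the `nclass.Sturges`, `nclass.scott` and `nclass.FD` helpers that ship with R's grDevices package. The result for one seed says nothing general about multimodal data.

```r
# Same simulated mixture as in the answer above.
set.seed(930)
x <- c(rnorm(50, 12, 2), rnorm(50, 18, 2), rnorm(50, 25, 2))

# Number of bins suggested by each rule built into R (grDevices).
k <- c(Sturges = nclass.Sturges(x),
       Scott   = nclass.scott(x),
       FD      = nclass.FD(x))
print(k)

# One histogram per rule. hist() treats the count as a suggestion and
# passes it through pretty(), so the drawn bins may differ slightly.
op <- par(mfrow = c(1, 3))
for (rule in names(k)) {
  hist(x, breaks = rule, prob = TRUE, col = "skyblue2", main = rule)
  rug(x)
}
par(op)
```

If a rule smooths away a mode that the rug plot clearly shows, narrowing the bins by hand through the `breaks` argument, or inspecting a kernel density estimate with several bandwidths, is a reasonable next step.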