Solved – Scott’s and Freedman–Diaconis rules of thumb for selecting bin width – disadvantages

Scott's and the Freedman–Diaconis rules of thumb are based on the following formula:
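For reference, the two rules are usually stated in roughly the following form, where $h$ is the bin width, $n$ the sample size, $\hat\sigma$ the sample standard deviation and $\mathrm{IQR}$ the sample interquartile range. The symbols and the bin-count expression $k$ are notation assumed here, and the constant in Scott's rule is often rounded to 3.5:

```latex
h_{\text{Scott}} = 3.49\,\hat\sigma\,n^{-1/3},
\qquad
h_{\text{FD}} = 2\,\mathrm{IQR}\,n^{-1/3},
\qquad
k = \left\lceil \frac{\max_i x_i - \min_i x_i}{h} \right\rceil
```

Both widths depend on the data only through a single spread estimate and $n$, which is why they say nothing directly about how many modes the density has.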

In the article here it is stated that:

While these appear to be useful estimates for unimodal densities
similar to a Gaussian distribution, they are known to be
suboptimal for multimodal densities.

My question is: will the Scott and Freedman–Diaconis rules of thumb give the correct number of bins for distributions with more than one peak?

What are the disadvantages of these methods, and how can they be overcome?

Comment continued. Here is a mixture of three normal samples (each of size 50) whose means are far enough apart, relative to their standard deviations, to show separate modes. The default binning in R (Sturges' rule) gives a histogram that does find the modes. The default KDE in R (with the default bandwidth) roughly matches the three modes (at 12, 18, and 25).

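    set.seed(930)
    # Pool three normal samples (n = 50 each) into one vector
    x = c(rnorm(50, 12, 2), rnorm(50, 18, 2), rnorm(50, 25, 2))
    hist(x, prob = TRUE, col = "skyblue2"); rug(x)   # default binning, with data ticks
    lines(density(x), col = "red", lwd = 2)          # default KDE overlaid in red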
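To see what the two rules from the question would pick for this same sample, one could compare them with the default along the following lines. This is only a sketch: the names `h_scott`, `h_fd` and `k` are made up for illustration, the widths use the textbook constants (R's own helpers may round slightly differently), and `hist()` treats the resulting bin count as a suggestion that it may adjust to get "pretty" break points.

```r
# Same simulated mixture as above: three normal components, 50 draws each.
set.seed(930)
x <- c(rnorm(50, 12, 2), rnorm(50, 18, 2), rnorm(50, 25, 2))
n <- length(x)

# Bin widths from the textbook forms of the two rules (illustrative names).
h_scott <- 3.49 * sd(x) * n^(-1/3)
h_fd    <- 2 * IQR(x) * n^(-1/3)

# Number of bins suggested by R's built-in helpers (package grDevices).
k <- c(Sturges = nclass.Sturges(x),
       Scott   = nclass.scott(x),
       FD      = nclass.FD(x))
print(round(c(h_scott = h_scott, h_fd = h_fd), 2))
print(k)

# One histogram per rule, side by side, each with the data ticks.
op <- par(mfrow = c(1, 3))
for (rule in names(k)) {
  hist(x, breaks = rule, freq = FALSE, col = "skyblue2", main = rule)
  rug(x)
}
par(op)
```

Looking at the three panels next to each other is a quick way to check whether a given rule merges or splits the modes for a particular sample; a different seed or different component spacings can change the picture.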

