# Solved – Density estimation and histograms

This is an excerpt from B. W. Silverman's *Density Estimation for Statistics and Data Analysis*:

> The oldest and most widely used density estimator is the histogram.
> Given an origin \$x_0\$ and a bin width of \$h\$, we define the bins of the
> histogram to be the intervals \$[x_0+mh, x_0+(m+1)h)\$ for integers \$m\$. The
> histogram is defined by
>
> \$\$\hat{f}(x) = \dfrac{1}{nh}(\text{no. of } X_i \text{ in the same bin as } x)\$\$

Since I am seeing \$\hat{f}\$, does this mean that we are talking about an estimator? I also did not understand the formula. I am not sure about the switch from \$m\$ to \$n\$; I just assumed that they are the same thing.


As @tristan comments, \$m\$ is an integer index that labels the bins, while \$n\$ is the total number of data points in the sample and \$h\$ is the histogram bin width. So \$m\$ and \$n\$ are not the same thing, and the formula is correct.
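To make the roles of \$m\$, \$n\$ and \$h\$ concrete, here is a minimal Python sketch of the formula. The function name `histogram_density`, the sample values and the choice of bins that are closed on the left and open on the right (so that every point belongs to exactly one bin) are assumptions made for illustration, not part of the excerpt.

```python
import math

def histogram_density(x, data, x0=0.0, h=1.0):
    """Histogram estimate at x: (1 / (n * h)) * (no. of X_i in the same bin as x)."""
    n = len(data)  # n: total number of data points
    # m: integer index of the bin [x0 + m*h, x0 + (m+1)*h) that contains x
    m = math.floor((x - x0) / h)
    # Count the data points whose bin index equals m
    count = sum(1 for xi in data if math.floor((xi - x0) / h) == m)
    return count / (n * h)

sample = [0.2, 0.7, 1.1, 1.5, 1.9, 3.4]  # made-up data, n = 6
estimate = histogram_density(1.3, sample, x0=0.0, h=1.0)
empty_bin = histogram_density(2.5, sample, x0=0.0, h=1.0)
print(estimate, empty_bin)
```

Here \$x = 1.3\$ falls in the bin with \$m = 1\$, which holds three of the six points, so the estimate is \$3/(6 \cdot 1) = 0.5\$. The bin containing \$2.5\$ is empty, so the estimate there is 0.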

It may be easier to understand if you consider the case where you have \$M\$ bins and the same number of data points, \$\frac{n}{M}\$, in each bin. Then the histogram height is the same for every bin: \$\hat{f}=\frac{1}{nh}\cdot\frac{n}{M}=\frac{1}{Mh}\$. So you have \$M\$ bins, each of width \$h\$ and height \$\frac{1}{Mh}\$, for a total area of \$M\cdot h\cdot\frac{1}{Mh}=1\$, as a density should have.

In fact, this is not special to equal counts: whatever the number of data points in each bin, the histogram always has a total area of 1. Again, this is just what a density should be.
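The general case can be written out in the same way. As a sketch, write \$n_m\$ for the number of data points in bin \$m\$ (this symbol is introduced here for illustration and does not appear in the excerpt), so that \$\sum_m n_m = n\$. Each bin is a rectangle of width \$h\$ and height \$\frac{n_m}{nh}\$, which gives:

\$\$\int \hat{f}(x)\,dx = \sum_m h\cdot\frac{n_m}{nh} = \frac{1}{n}\sum_m n_m = 1\$\$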

And yes, \$\hat{f}\$ is an estimator. It is an estimate of the density within the space of step functions. You can approximate most "normal" functions using step functions (in the sense that the integral of the absolute difference between the step function and the function to be approximated goes to zero as \$h\to 0\$), so step functions are a logical, simple approximation.

In fact, histograms can be seen as related to kernel density estimators, with "kernels" that depend not only on \$\frac{x-x_i}{h}\$ but also on \$x\$ itself: i.e., "counting kernels" that count how many \$x_i\$ fall into the interval (bin) containing \$x\$. This is a somewhat contrived way of looking at histograms, but I actually find it a bit enlightening.
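The contrast can be sketched in Python. The code below is an illustration under assumptions: the function names and sample values are made up, bins are taken as closed on the left and open on the right, and the box (uniform) kernel of width \$h\$ centred at \$x\$ is just one simple choice of an ordinary kernel to compare against.

```python
import math

def counting_kernel_estimate(x, data, x0, h):
    """Histogram written as a sum of 'counting kernels'.

    The weight of X_i is 1 if X_i lies in the same bin as x, else 0,
    so it depends on where x sits relative to the bin edges.
    """
    m = math.floor((x - x0) / h)
    weights = [1.0 if math.floor((xi - x0) / h) == m else 0.0 for xi in data]
    return sum(weights) / (len(data) * h)

def box_kernel_estimate(x, data, h):
    """Kernel estimate with a box kernel: the weight of X_i depends only on (x - X_i) / h."""
    weights = [1.0 if abs((x - xi) / h) <= 0.5 else 0.0 for xi in data]
    return sum(weights) / (len(data) * h)

sample = [0.2, 0.7, 1.1, 1.5, 1.9, 3.4]  # made-up data
for x in (1.05, 1.95):
    print(x,
          counting_kernel_estimate(x, sample, x0=0.0, h=1.0),
          box_kernel_estimate(x, sample, h=1.0))
```

Both evaluation points lie in the same bin, so the histogram value is identical at \$1.05\$ and \$1.95\$. The box-kernel value changes because its window moves with \$x\$.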
