Solved – the PDF for the minimum difference between a random number and a set of random numbers

I have a list (let's call it $L_N$) of $N$ random numbers $R \in (0,1)$ (chosen from a uniform distribution). Next, I roll another random number from the same distribution (let's call this number "b").
Now I find the element in the list $L_N$ that is closest to the number "b" and record this distance.

If I repeat this process, I can plot the distribution of distances that are obtained through this process.

When $N \to \infty$, what does this distribution approach?

When I simulate this in Mathematica, it appears as though it approaches an exponential function. And if the list was 1 element long, then I believe this would exactly follow an exponential distribution.

Looking at the Wikipedia article on the exponential distribution, I can see that there is some discussion on the topic:

[Wikipedia excerpt on the distribution of the minimum of exponential random variables]

But I'm having trouble interpreting what they are saying here. What is $k$ here? Is my case what they are describing in the limit where $n \to \infty$?

EDIT: After a very helpful intuitive answer by Bayequentist, I now understand that the behavior as $N \to \infty$ should approach a Dirac delta function. But I'd still like to understand why my data (which is like the minimum of a bunch of exponential distributions) also appears to be exponential. And is there a way to figure out what this distribution is exactly (for large but finite $N$)?

Here is a picture of what such a distribution looks like for large but finite $N$:
[histogram of simulated nearest-neighbour distances for large but finite $N$]

Here's some python code to simulate these distributions:

```python
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

numpoints = 10000
NBINS = 1000

# the list L_N and the query points b
randarray1 = np.random.random_sample((numpoints,))
randarray2 = np.random.random_sample((numpoints,))

dtbin = []
for i in range(len(randarray1)):
    # signed distance from randarray1[i] to its nearest neighbour in randarray2
    dt = 10000000
    for j in range(len(randarray2)):
        delta = randarray1[i] - randarray2[j]
        if abs(delta) < abs(dt):
            dt = delta
    dtbin.append(dt)

plt.figure()
plt.hist(dtbin, bins=NBINS)
```
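As a side note, the double loop is $O(\text{numpoints}^2)$. If the list is sorted first, each query only has to compare its two bracketing neighbours; here is an equivalent vectorized sketch using `np.searchsorted` (variable names are mine, not from the original code):

```python
import numpy as np

rng = np.random.default_rng(0)
numpoints = 10000

L = np.sort(rng.random(numpoints))  # the list L_N, sorted so we can binary-search it
b = rng.random(numpoints)           # the query points

# index of the first list element >= each b
idx = np.searchsorted(L, b)
# the two candidate neighbours: just below and just at/above each b
lo = np.clip(idx - 1, 0, numpoints - 1)
hi = np.clip(idx, 0, numpoints - 1)
d_lo = b - L[lo]
d_hi = b - L[hi]
# keep whichever signed difference has the smaller magnitude
dt = np.where(np.abs(d_lo) < np.abs(d_hi), d_lo, d_hi)
```

The resulting `dt` array can be histogrammed exactly like `dtbin` above, but the whole computation is $O(\text{numpoints}\,\log\,\text{numpoints})$.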

If you had been looking for the distance to the next value above, and if you inserted an extra value at $1$ so this always had an answer, then using rotational symmetry the distribution of these distances $D$ would be the same as the distribution of the minimum of $n+1$ independent uniform random variables on $[0,1]$.

That would have $P(D \le d) = 1-(1-d)^{n+1}$ and so density $f(d)=(n+1)(1-d)^n$ when $0 \le d \le 1$. For large $n$ and small $d$ this density can be approximated by $f(d) \approx n e^{-nd}$, explaining the exponential shape you have spotted.
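This CDF is easy to check numerically. A quick sketch (the sample sizes are my own choice, not from the question):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
trials = 50000

# minimum of n+1 independent Uniform(0,1) variables, repeated many times
mins = rng.random((trials, n + 1)).min(axis=1)

# compare the empirical CDF at d = 1/n with the exact and approximate forms
d = 1.0 / n
empirical = (mins <= d).mean()
exact = 1 - (1 - d) ** (n + 1)          # P(D <= d) = 1 - (1-d)^(n+1)
exponential_approx = 1 - np.exp(-n * d)  # exponential approximation
```

For $d = 1/n$ all three values should agree to within simulation noise, illustrating how good the exponential approximation already is at moderate $n$.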

But your question is slightly more complicated, as you are interested in the signed distance to the nearest value above or below. As your Wikipedia link shows, the minimum of two i.i.d. exponential random variables with rate $\lambda$ is an exponential random variable with rate $2\lambda$. So you need to change the approximation to the density to reflect both the doubled rate and the possibility of negative values of $d$. The approximation actually becomes a Laplace distribution with $$f(d) \approx n e^{-2n|d|}$$ remembering this is for large $n$ and small $d$ (in particular the true density is $0$ unless $-\tfrac12 \lt d \lt \tfrac12$). As $n$ increases, this concentrates almost all of the density at $0$, consistent with the Dirac delta limit in Bayequentist's answer.
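The Laplace approximation can likewise be checked by simulating the signed nearest-neighbour distance directly and comparing $P(|D| \le d)$ with the Laplace CDF $1 - e^{-2nd}$ (a sketch; the parameters are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
trials = 10000

# for each trial: a fresh list of n uniforms and one query point b
dists = np.empty(trials)
for t in range(trials):
    L = rng.random(n)
    b = rng.random()
    deltas = b - L
    dists[t] = deltas[np.argmin(np.abs(deltas))]  # signed distance to nearest

# Laplace approximation: P(|D| <= d) ~= 1 - exp(-2 n d)
d = 1.0 / n
empirical = (np.abs(dists) <= d).mean()
laplace = 1 - np.exp(-2 * n * d)
```

Apart from small boundary effects when $b$ falls within $d$ of $0$ or $1$, the empirical value should match the Laplace prediction to within simulation noise.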

With $n=10^6$ the approximation to the density would look like this, matching the shape of your simulated data.

[plot of the Laplace approximation $f(d) = n e^{-2n|d|}$ for $n = 10^6$]
