I have a few numbers such as:
24
65
62
63
39
52
88
21
65
65
62
75
Using Excel, I am looking to identify a range of numbers whose maximum and minimum are not more than 10% different from one another, and subsequently find the range that contains the most numbers (i.e. the modal range).
The 10% difference would be defined by $$(max-min) le 0.1(max+min)/2.$$
Best Answer
Problem statement
Let the $n$ numbers be sorted so they can be written
$$x_1 le x_2 le cdots le x_n.$$
Using $2lambda$ as a general name for the 10% value, we seek an interval of the form $[x_t, x_{t+k}]$ where
$$x_{t+k} – x_t le lambda left(x_t + x_{t+k}right)$$
and $k$ is as large as possible.
Solution
That criterion is algebraically equivalent to
$$x_{t+k} le frac{1+lambda}{1-lambda}x_t.$$
Thus, all one has to do is compute the multiples of the data $mu x_t$ for $mu = frac{1+lambda}{1-lambda}$ and count how many lie within each interval of the form $[x_t, mu x_t]$.
Excel implementation
Arrange the data in a column and sort them in ascending order. To illustrate, I put them in column A
beginning at the second row.
In a parallel column (such as column B
), multiply the values by $mu$.
In another parallel column, count the intervals using COUNTIF
. The expressions in the example look like
=COUNTIF(A2:A$100, "<=" & B2) =COUNTIF(A3:A$100, "<=" & B3) ... =COUNTIF(A13:A$100, "<=" & B13)
Find the largest value(s) in this column: they are next to the desired intervals.
This is what the formulas look like:
Data
is the range A2:A13
containing the sorted values. Indicator
is the range under the heading "Mode"; its non-blank values show where the modal intervals begin. Idx
(short for "Index", which is a reserved word for Excel), Lambda
, and Mu
are cells just to the right of the corresponding names.
(I apologize that the illustrated value for $lambda$ is twice that requested in the question.)
Similar Posts:
- Solved – elastic net- confidence intervals for parameters
- Solved – elastic net- confidence intervals for parameters
- Solved – How to correctly weight predicted values from a fitted linear model
- Solved – T-test Problem, t-value extremely high and p value =0
- Solved – T-test Problem, t-value extremely high and p value =0