Solved – How to figure out what kind of distribution represents this data on ping response times

I've sampled a real world process, network ping times. The "round-trip-time" is measured in milliseconds. Results are plotted in a histogram:

enter image description here

enter image description here

Latency has a minimum value, but a long upper tail.

I want to know what statistical distribution this is, and how to estimate its parameters.

Even though the distribution is not a normal distribution, I can still show what I am trying to achieve.

The normal distribution uses the function:

alt text

with the two parameters

  • μ (mean)
  • σ2  (variance)

Parameter estimation

The formulas for estimating the two parameters are:

alt text

Applying these formulas against the data I have in Excel, I get:

  • μ = 10.9558 (mean)
  • σ2  = 67.4578 (variance)

With these parameters I can plot the "normal" distribution over top my sampled data:

alt text

Obviously it's not a normal distribution. A normal distribution has an infinite top and bottom tail, and is symmetrical. This distribution is not symmetrical.


  • What principles would I apply; what
    flowchart would I apply to determine
    what kind of distribution this is?
  • Given that the distribution has no negative tail, and long positive tail: what distributions match that?
  • Is there a reference that matches distributions to the observations you're taking?

And cutting to the chase, what is the formula for this distribution, and what are the formulas to estimate its parameters?


I want to get the distribution so I can get the "average" value, as well as the "spread":
alt text

I am actually plotting the histogram in software, and I want to overlay the theoretical distribution:

alt text

Note: Cross-posted from math.stackexchange.com


Update: 160,000 samples:

enter image description here

Months and months, and countless sampling sessions, all give the same distribution. There must be a mathematical representation.


Harvey suggested putting the data on a log scale. Here's the probability density on a log scale:

enter image description here

Tags: sampling, statistics, parameter-estimation, normal-distribution


It's not an answer, but an addendum to the question. Here's the distribution buckets. I think the more adventurous person might like to paste them into Excel (or whatever program you know) and can discover the distribution.

The values are normalized

Time    Value 53.5    1.86885613545469E-5 54.5    0.00396197500716395 55.5    0.0299702228922418 56.5    0.0506460012708222 57.5    0.0625879919763777 58.5    0.069683415770654 59.5    0.0729476844872482 60.5    0.0508017392821101 61.5    0.032667605247748 62.5    0.025080049337802 63.5    0.0224138145845533 64.5    0.019703973188144 65.5    0.0183895443728742 66.5    0.0172059354870862 67.5    0.0162839664602619 68.5    0.0151688822994406 69.5    0.0142780608748739 70.5    0.0136924859524314 71.5    0.0132751080821798 72.5    0.0121849420031646 73.5    0.0119419907055555 74.5    0.0117114984488494 75.5    0.0105528076448675 76.5    0.0104219877153857 77.5    0.00964952717939773 78.5    0.00879608287754009 79.5    0.00836624596638551 80.5    0.00813575370967943 81.5    0.00760001495084908 82.5    0.00766853967581576 83.5    0.00722624372375815 84.5    0.00692099722163388 85.5    0.00679017729215205 86.5    0.00672788208763689 87.5    0.00667804592402477 88.5    0.00670919352628235 89.5    0.00683378393531266 90.5    0.00612361860383988 91.5    0.00630427469693383 92.5    0.00621706141061261 93.5    0.00596788059255199 94.5    0.00573115881539439 95.5    0.0052950923837883 96.5    0.00490886211579433 97.5    0.00505214108617919 98.5    0.0045413204091549 99.5    0.00467214033863673 100.5   0.00439181191831853 101.5   0.00439804143877004 102.5   0.00432951671380337 103.5   0.00419869678432154 104.5   0.00410525397754881 105.5   0.00440427095922156 106.5   0.00439804143877004 107.5   0.00408656541619426 108.5   0.0040616473343882 109.5   0.00389345028219728 110.5   0.00392459788445485 111.5   0.0038249255572306 112.5   0.00405541781393668 113.5   0.00393705692535789 114.5   0.00391213884355182 115.5   0.00401804069122759 116.5   0.0039432864458094 117.5   0.00365672850503968 118.5   0.00381869603677909 119.5   0.00365672850503968 120.5   0.00340131816652754 121.5   0.00328918679840026 122.5   0.00317082590982146 123.5   0.00344492480968815 124.5   0.00315213734846692 125.5   0.00324558015523965 126.5   0.00277213660092446 127.5   0.00298394029627599 128.5   0.00315213734846692 129.5   0.0030649240621457 130.5   0.00299639933717902 131.5   0.00308984214395176 132.5   0.00300885837808206 133.5   0.00301508789853357 134.5   0.00287803844860023 135.5   0.00277836612137598 136.5   0.00287803844860023 137.5   0.00265377571234566 138.5   0.00267246427370021 139.5   0.0027472185191184 140.5   0.0029465631735669 141.5   0.00247311961925171 142.5   0.00259148050783051 143.5   0.00258525098737899 144.5   0.00259148050783051 145.5   0.0023485292102214 146.5   0.00253541482376687 147.5   0.00226131592390018 148.5   0.00239213585338201 149.5   0.00250426722150929 150.5   0.0026288576305396 151.5   0.00248557866015474 152.5   0.00267869379415173 153.5   0.00247311961925171 154.5   0.00232984064886685 155.5   0.00243574249654262 156.5   0.00242328345563958 157.5   0.00231738160796382 158.5   0.00256656242602444 159.5   0.00221770928073957 160.5   0.00241705393518807 161.5   0.00228000448525473 162.5   0.00236098825112443 163.5   0.00216787311712744 164.5   0.00197475798313046 165.5   0.00203705318764562 166.5   0.00209311887170926 167.5   0.00193115133996985 168.5   0.00177541332868196 169.5   0.00165705244010316 170.5   0.00160098675603952 171.5   0.00154492107197588 172.5   0.0011150841608213 173.5   0.00115869080398191 174.5   0.00107770703811221 175.5   0.000946887108630378 176.5   0.000853444301857643 177.5   0.000822296699600065 178.5   0.00072885389282733 179.5   0.000753771974633393 180.5   0.000766231015536424 181.5   0.000566886361087923 

Bonus Reading

Best Answer

Weibull is sometimes used for modelling ping time. try a weibull distribution. To fit one in R:

x<-rweibull(n=1000,shape=2,scale=100) #generate a weibull (this should be your data). hist(x) #this is an histogram of your data. library(survival) a1<-survreg(Surv(x,rep(1,1000))~1,dist='weibull') exp(a1$coef) #this is the ML estimate of the scale parameter 1/a1$scale     #this is the ML estimate of the shape parameter 

If you're wondering for the goofy names (i.e. $scale to get the inverse of the shape) it's because "survreg" uses another parametrization (i.e. it is parametrized in terms of the "inverse weibull" which is more comon in actuarial sciences).

Similar Posts:

Rate this post

Leave a Comment