Which of the following two data sets is more dispersed? Assume that both data sets are normally distributed.
Data Set A: $\mu_A = 1$, $\sigma_A = 1$.
Data Set B: $\mu_B = 10$, $\sigma_B = 1$.
Answer 1: Data Set A because $CV_A = \dfrac{\sigma_A}{\mu_A} = \dfrac{1}{1} = 1 > 0.1 = \dfrac{1}{10} = \dfrac{\sigma_B}{\mu_B} = CV_B$.
Answer 2: They are equally dispersed since the graph of Data Set B is exactly the same graph as Data Set A shifted $\mu_B - \mu_A = 10 - 1 = 9$ units to the right.
How do we reconcile these two equally convincing answers?
Best Answer
Standard deviation and coefficient of variation are both measures of dispersion of a distribution, but which one is more useful will depend on context. SD is widely applicable, but there are situations where you definitely should not use CV.
Some measurements are taken on an interval scale, meaning that the scale has no true, non-arbitrary zero point, so you can really only compare differences between measurements. Temperature in degrees Fahrenheit or Celsius is an example: 0 degrees F and 0 degrees C are equally valid zero points, and neither has any special meaning. (Temperature in Kelvin has a non-arbitrary zero, so it is on a ratio scale, not an interval scale.) For anything measured on an interval scale, the coefficient of variation is essentially meaningless: shifting the zero point changes the mean but not the SD, and so yields an entirely different CV. There is no reason to prefer a CV computed in Fahrenheit over one computed in Celsius, which suggests that neither is a useful measure. CV should only be used on ratio scales, for quantities such as mass or length that have a non-arbitrary zero point.
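To make the interval-scale problem concrete, here is a minimal sketch in Python. The temperature and mass readings are made-up values chosen for illustration, and `coefficient_of_variation` is a hypothetical helper (population SD divided by the mean), not something from the original question.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical temperature readings in degrees Celsius (illustrative only).
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
kelvin = [c + 273.15 for c in celsius]

# Same physical temperatures, three different CVs.
cv_c = coefficient_of_variation(celsius)
cv_f = coefficient_of_variation(fahrenheit)
cv_k = coefficient_of_variation(kelvin)
print(f"CV in Celsius:    {cv_c:.4f}")
print(f"CV in Fahrenheit: {cv_f:.4f}")
print(f"CV in Kelvin:     {cv_k:.4f}")

# Contrast with a ratio scale: changing units only rescales the data,
# so the CV is unchanged (up to floating-point error).
mass_kg = [1.0, 2.0, 3.0]
mass_g = [m * 1000 for m in mass_kg]
cv_kg = coefficient_of_variation(mass_kg)
cv_g = coefficient_of_variation(mass_g)
print(f"CV in kg: {cv_kg:.4f}, CV in g: {cv_g:.4f}")
```

The three temperature CVs disagree because converting between Celsius, Fahrenheit, and Kelvin moves the zero point, while the kilogram and gram CVs agree because that conversion is a pure rescaling.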
If the data is on a ratio scale, CV and SD are both acceptable, but must be interpreted differently. In your example, Data Sets A and B have identical SDs ($\sigma_A = \sigma_B = 1$), so their variation is the same in an absolute sense. Their different means give different CVs, however (1 versus 0.1), so Data Set A has greater dispersion in a relative sense: its spread is a larger fraction of its mean.
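As a rough check of both answers from the question, the sketch below simulates Data Set A and builds Data Set B by shifting the same draws 9 units to the right. The sample size and random seed are arbitrary assumptions, so the printed values will only approximate the theoretical ones.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, used only for reproducibility

# Simulated Data Set A: normal with mean 1 and SD 1 (sample size is an assumption).
data_a = [random.gauss(1.0, 1.0) for _ in range(10_000)]
# Data Set B: the same draws shifted 9 units to the right, so the mean is about 10.
data_b = [x + 9.0 for x in data_a]

# Absolute dispersion: the shift leaves the SD unchanged.
sd_a = statistics.pstdev(data_a)
sd_b = statistics.pstdev(data_b)

# Relative dispersion: the same SD divided by a larger mean gives a smaller CV.
cv_a = sd_a / statistics.mean(data_a)
cv_b = sd_b / statistics.mean(data_b)

print(f"SD: A = {sd_a:.3f}, B = {sd_b:.3f}")
print(f"CV: A = {cv_a:.3f}, B = {cv_b:.3f}")
```

The two SDs match (Answer 2, absolute dispersion), while the CV of A comes out roughly ten times that of B (Answer 1, relative dispersion).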