Solved – Coefficient of Variation for between groups

I am planning an ANOVA to check for evidence of differences between the group means. As part of this, I will report the CV (sd/mean) to quantify the amount of variation within each group. That started me thinking about quantifying the variation between groups. Can I take the mean across all three groups and use the group standard deviation from the ANOVA to calculate a between-groups coefficient of variation?

When dealing with a linear model (as when conducting an ANOVA), the coefficient of variation for the model can be calculated as the root mean square error divided by the grand mean, multiplied by 100 to express it as a percentage.
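Written out, the calculation looks like the following. The symbols are my own notation for illustration: $y_i$ are the observed values, $\hat{y}_i$ the values predicted by the model, $\bar{y}$ the grand mean, and $n$ the number of observations. Note that this form divides by $n$, matching the calculation in the R code further down, rather than by the residual degrees of freedom.

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{CV}_{\%} = 100 \times \frac{\mathrm{RMSE}}{\bar{y}}
$$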

A similar procedure could also be conducted on a single group of values.

But note that when observed values are both positive and negative, dividing by the mean may be of limited utility. In these cases, you might consider other measures of accuracy, like root mean square error.
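As a quick illustration of that caveat, here is a minimal sketch with made-up values (the vector `x` is hypothetical, not from the question). Because the values straddle zero, the mean is close to zero and the ratio becomes very large, even though the spread itself is modest.

```r
# Hypothetical values, some negative and some positive (illustration only)
x = c(-2, -1, 0, 1, 3)

mean(x)
    ### 0.2

# Root mean square deviation from the mean (divides by n)
rmse = sqrt(mean((x - mean(x))^2))

rmse
    ### 1.72

# Dividing by a mean near zero inflates the ratio
cv_prcnt = rmse / mean(x) * 100

cv_prcnt
    ### 860.2
```

Here the RMSE on its own (about 1.72, in the units of the data) is the more interpretable summary.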

The following uses R code, but I think it's all easy enough to follow.

Source, with the caveat that I am the author of this function: https://rdrr.io/cran/rcompanion/man/accuracy.html


Make some toy data, construct linear model, and conduct anova

Treatment = rep(c("A", "B", "C"), each = 5)
Value     = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)

Treatment

    ### "A" "A" "A" "A" "A" "B" "B" "B" "B" "B" "C" "C" "C" "C" "C"

Value

    ### 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

model = lm(Value ~ Treatment)

anova(model)

    ### Analysis of Variance Table
    ###
    ###           Df Sum Sq Mean Sq F value    Pr(>F)
    ### Treatment  2    250   125.0      50 1.513e-06 ***
    ### Residuals 12     30     2.5

The following uses the predicted values from the model (predy), the observed values (actual), and uses them to calculate the mean square error (mse), root mean square error (rmse), root mean square error divided by the grand mean (nrmse), and then this multiplied by 100% (cv_prcnt).

actual = Value
predy  = predict(model)

mse      = mean((actual - predy)^2)
rmse     = sqrt(mse)
nrmse    = rmse/mean(actual)
cv_prcnt = nrmse * 100

cv_prcnt

    ### 17.68
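Note that the mean square error above divides by the number of observations (15), so it is not the same as the residual mean square in the ANOVA table (2.5), which divides by the residual degrees of freedom (12). Some software and references define the model CV with the latter. The sketch below recreates the toy data and compares the two. It relies on `sigma()` from base R's stats package, which returns the residual standard error of the fitted model. The names `rmse_n` and `rmse_df` are my own.

```r
# Recreate the toy data and model
Treatment = rep(c("A", "B", "C"), each = 5)
Value     = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
model     = lm(Value ~ Treatment)

# RMSE dividing by n = 15, as in the procedure above
rmse_n  = sqrt(mean(residuals(model)^2))

# Residual standard error, dividing by residual df = 12
# (the square root of the residual Mean Sq in the ANOVA table)
rmse_df = sigma(model)

rmse_n / mean(Value) * 100
    ### 17.68

rmse_df / mean(Value) * 100
    ### 19.76
```

Whichever version is reported, it is worth stating which denominator was used, since the two differ noticeably in small samples.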

Using this procedure on a single group will yield the same value as the population standard deviation divided by the mean. But note that this will be a different result than if the sample standard deviation is used.

A = c(1,2,3,4,5)

actual = A
predy  = mean(A)

mse        = mean((actual - predy)^2)
rmse       = sqrt(mse)
nrmse_mean = rmse/mean(actual)
cv_prcnt   = nrmse_mean * 100

cv_prcnt

    ### 47.14

This is the same result as dividing the population standard deviation by the mean (shown here as a proportion rather than a percentage).

population_sd = sqrt(sum((A - mean(A))^2)/(length(A)))

population_sd / mean(A)

    ### 0.4714

Software will often default to using the sample standard deviation. This will return a different result than the previous procedure.

sd(A)/mean(A)

    ### 0.5270
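The two results differ only in the denominator of the variance (n versus n - 1), so one can be converted to the other. As a minimal sketch, rescaling the sample standard deviation by sqrt((n - 1)/n) recovers the population version. The names `n` and `pop_sd` are my own.

```r
A = c(1,2,3,4,5)
n = length(A)

# Convert the sample sd (n - 1 denominator) to the population sd (n denominator)
pop_sd = sd(A) * sqrt((n - 1) / n)

pop_sd / mean(A)
    ### 0.4714
```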

For R users, there is a function that will calculate CV for several types of models. (With the caveat that I am the author of this function.)

if(!require(rcompanion)){install.packages("rcompanion")}

library(rcompanion)

accuracy(list(model))

    ### $Fit.criteria
    ###  NRMSE.mean  CV.prcnt
    ###       0.177      17.7
