I'm currently analyzing a dataset about quality of fodder. The data for e.g. crude protein content are given in %, I don't have absolute values. Because of the data structure (I don't give you the details here), I opted for the beta-regression from R's betareg package.

I want to know the influence of various factors (and their interactions) on the crude protein content. Some factors have two levels (e.g. fertilization: with/without), others have three or more (e.g. site: site1, site2, site3) or the harvest time of year the samples come from (cut: spring, summer, autumn).

I have specified a model that works quite well on the data I have, residuals look well-behaved, predictors and model parameters are good, etc.

Now I would love to have some more information than `summary(mymodel)` gives. I would like to test not only whether the other factor levels are different from my reference level, but whether they are significantly different from each other. In my example that would be: on which of my sites is the crude protein level in the fodder larger than on the others, and are all those differences significant?

I would think of doing a post-hoc test similar to e.g. the Tukey test I could do after a simple ANOVA. I googled, searched a lot online and asked people, but I can't find anything that works with betareg objects.

So my question is: How do I do a post-hoc test for parameters of betareg-models in R? Or is this a bad idea? If so, why? Or is there no method yet? Please help me, thank you a lot!

I'm not a statistics/math pro; I learned almost everything I know from the modelling books by Alain Zuur and this site, so please be gentle.

My model (in the simplified version of my dataset) is specified like this:

```r
library(betareg)
prot <- protein/100
mymodel <- betareg(prot ~ sward * Fert + site + cut + repetition | site + cut,
                   data = mydata, link = "loglog")
str(mydata)
## 'data.frame': 848 obs. of 58 variables:
##  $ protein    : num 16.4 16.5 13.7 13.5 15 ...
##  $ cut        : Factor w/ 3 levels "autumn","spring",..: 1 1 2 2 3 3 1 2 2 3 ...
##  $ loc        : Factor w/ 3 levels "G","O","S": 1 1 1 1 1 1 1 1 1 1 ...
##  $ repetition : Factor w/ 4 levels "I","II","III",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Fert       : Factor w/ 2 levels "fertilized","non-fertilized": 1 1 1 1 1 1 1 1 1 1 ...
##  $ sward      : Factor w/ 2 levels "diverse","species-poor": 1 1 1 1 1 1 1 1 1 1 ...
```

A reproducible but very simplified example would be (note that my actual model is much larger!):

```r
library(betareg)
site <- c("site1","site2","site3","site1","site2","site3","site1","site2","site3","site2")
prot <- c(0.1642038, 0.1650442, 0.1369376, 0.1350139, 0.1502178,
          0.1515794, 0.1354457, 0.1301206, 0.1311298, 0.1308463)
fert <- c("with","without","with","without","with","without","with","without","with","without")
cut  <- c("autumn","autumn","spring","spring","summer","summer","autumn","spring","spring","summer")
mymodel <- betareg(prot ~ fert + site + cut | site, link = "loglog")
```


#### Best Answer

Post-hoc testing for beta regressions works in the same way that it does for other maximum-likelihood (regression) models. In R there are various packages with object-oriented implementations of such procedures, e.g., `lmtest`, `car`, and `multcomp`, among others.
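As a quick illustration of how these object-oriented tools apply to the fitted model, `lmtest`'s `coeftest()` can be called directly on a `betareg` object to obtain Wald z tests for all coefficients of both submodels (a minimal sketch, assuming `mymodel` from the example above has been fitted):

```r
library(lmtest)

## Wald z tests for every coefficient, covering both the mean
## and the precision (phi) submodel of the betareg fit
coeftest(mymodel)
```

This is the same machinery that the hypothesis tests below build on: it only needs `coef()` and `vcov()` methods, which `betareg` provides.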

For testing individual hypotheses it is probably easiest to use `linearHypothesis()` from `car`. For example, to test equality of the second and third `site` effects in the mean regression:

```r
linearHypothesis(mymodel, "sitesite2 = sitesite3")
## Linear hypothesis test
##
## Hypothesis:
## sitesite2 - sitesite3 = 0
##
## Model 1: restricted model
## Model 2: prot ~ fert + site + cut | site
##
##   Res.Df Df  Chisq Pr(>Chisq)
## 1      2
## 2      1  1 2.2491     0.1337
```

And for the dispersion (phi) submodel, the coefficients have to be prefixed with `(phi)_`, as shown in `coef(mymodel)`:

```r
linearHypothesis(mymodel, "(phi)_sitesite2 = (phi)_sitesite3")
## Linear hypothesis test
##
## Hypothesis:
## phi)_sitesite2 - phi)_sitesite3 = 0
##
## Model 1: restricted model
## Model 2: prot ~ fert + site + cut | site
##
##   Res.Df Df  Chisq Pr(>Chisq)
## 1      2
## 2      1  1 4.7682    0.02899 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

For testing multiple hypotheses (e.g., Dunnett- or Tukey-type contrasts) one can use `glht()` from `multcomp` in a similar fashion:

```r
summary(glht(mymodel, linfct = c("sitesite2 = 0", "sitesite3 = 0")))
## Simultaneous Tests for General Linear Hypotheses
##
## Fit: betareg(formula = prot ~ fert + site + cut | site, link = "loglog")
##
## Linear Hypotheses:
##                Estimate Std. Error z value Pr(>|z|)
## sitesite2 == 0  0.01488    0.03448   0.431    0.806
## sitesite3 == 0  0.04103    0.03494   1.174    0.320
## (Adjusted p values reported -- single-step method)

summary(glht(mymodel, linfct = c("`(phi)_sitesite2` = 0", "`(phi)_sitesite3` = 0")))
## Simultaneous Tests for General Linear Hypotheses
##
## Fit: betareg(formula = prot ~ fert + site + cut | site, link = "loglog")
##
## Linear Hypotheses:
##                      Estimate Std. Error z value Pr(>|z|)
## (phi)_sitesite2 == 0    1.084      1.080   1.004  0.49442
## (phi)_sitesite3 == 0    3.443      1.155   2.982  0.00552 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
```

Note that for `glht()` the phi coefficient names have to be quoted with backticks. Also, the `mcp()` interface for constructing sets of hypotheses does not work for `betareg` objects, because there are two submodels (mean and phi) rather than just one.
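Since `mcp()` is unavailable, a Tukey-type set of all pairwise `site` comparisons can still be written out by hand as character hypotheses for `glht()`. A minimal sketch for the three-site example above (site1 is the reference level, so its coefficient is implicitly zero, which makes site2 vs. site1 and site3 vs. site1 reduce to single coefficients):

```r
library(multcomp)

## All pairwise comparisons of the three sites in the mean submodel,
## with single-step adjustment of the p values as in a Tukey test
summary(glht(mymodel, linfct = c(
  "sitesite2 = 0",              # site2 vs. site1 (reference)
  "sitesite3 = 0",              # site3 vs. site1 (reference)
  "sitesite2 - sitesite3 = 0"   # site2 vs. site3
)))
```

The same pattern works for the phi submodel by prefixing the coefficient names with `(phi)_` and backtick-quoting them, as described above.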
