I came upon this because I wanted to emulate Welch's t-test using `gls`. I found the answer here:

https://stats.stackexchange.com/a/144480/141304

and it says to add weights with

`gls(y ~ group, data = dat, weights = varIdent(form = ~ 1 | group))`

`y` and `group` are variables in the model. I don't know what `form` is. I read through the help for `gls`, `glm`, `weights`, etc., but couldn't find anything that addressed the issue.

Some tutorials on R formulas filled me in that the pipe means conditioning, just like in probability. I understand conditioning in probability, but I can't wrap my head around what it means in regression.

Suppose I have four predictor variables A, B, C, D and a response variable X. A and B are continuous; C and D are categorical with two levels.

What would formulas such as the ones below (or any other ones an answerer might want to explain) mean?

`X ~ A + A|B`

`X ~ A + B|C`

`X ~ A + B + C|D`


#### Best Answer

Assume there are only two groups: group 1 and group 2. The `gls()` call you specified fits two sub-models to your $y$ observations – one sub-model for the $y$ observations in the first group and another sub-model for the $y$ observations in the second group.

The sub-model for the observations $y$ in group 1 postulates that $y = \beta_0 + \epsilon$, where $\epsilon$ denotes a random error term coming from a normal distribution with mean 0 and unknown variance $\sigma_1^2$. In other words, these observations are grouped about the true group mean $\beta_0$, with their spread about this true group mean being captured by $\sigma_1^2$.

The sub-model for the observations $y$ in group 2 postulates that $y = \beta_0 + \beta_1 + \epsilon$, where $\epsilon$ denotes a random error term coming from a normal distribution with mean 0 and unknown variance $\sigma_2^2$. In other words, these observations are grouped about the true group mean $\beta_0 + \beta_1$, with their spread about this true group mean being captured by $\sigma_2^2$.

The `gls()` call you provided allows the spread (or variability) of the $y$ values in the two groups about their respective true group means to differ across groups (that is, it allows $\sigma_1^2$ to differ from $\sigma_2^2$) via the option `weights = varIdent(form = ~ 1 | group)`.
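A minimal sketch of the two-variance setup, assuming the `nlme` package is available; the data frame `dat` and its simulated means and variances are made up here for illustration, while the variable names `y` and `group` follow the question:

```r
library(nlme)

# Simulated two-group data with deliberately unequal spreads:
# group 1 has sd 1 (sigma_1^2 = 1), group 2 has sd 3 (sigma_2^2 = 9).
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("1", "2"), each = 30)),
  y = c(rnorm(30, mean = 0, sd = 1),
        rnorm(30, mean = 1, sd = 3))
)

# varIdent(form = ~ 1 | group) estimates a separate residual variance
# for each level of `group`, mirroring Welch's unequal-variance assumption.
fit <- gls(y ~ group, data = dat, weights = varIdent(form = ~ 1 | group))
summary(fit)

# For comparison, Welch's t-test on the same data; the group coefficient
# from gls() should closely match the Welch mean difference.
t.test(y ~ group, data = dat)
```

Here `form = ~ 1 | group` is the stratification formula: the part after the pipe names the grouping factor whose levels each get their own variance parameter, which is why the pipe reads as "per level of" rather than probabilistic conditioning.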
