# Solved – How to understand the vertical bar (pipe) in R formulas

I came upon this because I wanted to emulate Welch's t-test using `gls`. I found the answer here:

https://stats.stackexchange.com/a/144480/141304

and it says to add weights with

`gls(y ~ group, data = dat, weights = varIdent(form = ~ 1 | group))`

`y` and `group` are variables in the model. I don't know what `form` is. I read through help on `gls`, `glm`, `weights`, etc. but couldn't find anything that addressed the issue.

Some tutorials on R formulas filled me in that the pipe means conditioning, just like in probability. I understand conditioning in probability, but I can't wrap my head around what it means in regression.

Suppose I have four predictor variables A, B, C, D and a response variable X. A and B are continuous; C and D are categorical with two levels.

What would formulas such as the ones below (or any other ones an answerer might want to explain) mean?

`X ~ A + A|B`
`X ~ A + B|C`
`X ~ A + B + C|D`

## Answer

Assume there are only two groups: group 1 and group 2. The `gls()` call you specified fits two sub-models to your $y$ observations: one sub-model for the $y$ observations in the first group and another sub-model for the $y$ observations in the second group.
The sub-model for the observations $y$ in group 1 postulates that $y = \beta_0 + \epsilon$, where $\epsilon$ denotes a random error term drawn from a normal distribution with mean 0 and unknown variance $\sigma_1^2$. In other words, these observations are scattered about the true group mean $\beta_0$, with their spread about this true group mean captured by $\sigma_1^2$.
The sub-model for the observations $y$ in group 2 postulates that $y = \beta_0 + \beta_1 + \epsilon$, where $\epsilon$ denotes a random error term drawn from a normal distribution with mean 0 and unknown variance $\sigma_2^2$. In other words, these observations are scattered about the true group mean $\beta_0 + \beta_1$, with their spread about this true group mean captured by $\sigma_2^2$.
The `gls()` call you provided allows the spread (or variability) of the $y$ values in the two groups about their respective true group means to differ across groups (that is, it allows $\sigma_1^2$ to differ from $\sigma_2^2$) via the option `weights = varIdent(form = ~ 1 | group)`. It is the formula `~ 1 | group` on the right-hand side of the `|` that tells `varIdent()` to estimate a separate residual variance for each level of `group`.
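The two sub-models can be written as a single heteroscedastic model (here $d_i$ is an indicator equal to 1 for observations in group 2 and 0 otherwise, and $g(i)$ denotes the group of observation $i$; these symbols are introduced for notation only):

```latex
y_i = \beta_0 + \beta_1 d_i + \epsilon_i,
\qquad
\epsilon_i \sim N\!\left(0, \sigma^2_{g(i)}\right),
\qquad
\sigma^2_{g(i)} =
\begin{cases}
\sigma_1^2 & \text{if } i \text{ is in group 1,} \\
\sigma_2^2 & \text{if } i \text{ is in group 2.}
\end{cases}
```

This is exactly the relaxation of the ordinary two-sample model that Welch's t-test makes: a common mean structure per group, but a separate error variance per group.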
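As a sketch of how this plays out in practice (the `nlme` package is assumed to be installed; the data frame `dat`, the seed, and the chosen means and SDs below are all illustrative, not from the original post):

```r
library(nlme)

set.seed(1)
# Simulate two groups with the same model structure but different
# error variances: sigma_1 = 1 for group 1, sigma_2 = 3 for group 2.
dat <- data.frame(
  group = factor(rep(c("1", "2"), each = 30)),
  y     = c(rnorm(30, mean = 5, sd = 1),
            rnorm(30, mean = 7, sd = 3))
)

# varIdent(form = ~ 1 | group) estimates a separate residual
# standard deviation for each level of `group`.
fit <- gls(y ~ group, data = dat,
           weights = varIdent(form = ~ 1 | group))
summary(fit)

# For comparison: Welch's unequal-variance t-test on the same data.
t.test(y ~ group, data = dat, var.equal = FALSE)
```

In this setup the estimated group effect and its t statistic from `gls()` correspond to the unequal-variance comparison that Welch's test performs, though the two procedures compute degrees of freedom differently.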