I came upon this because I wanted to emulate Welch's t-test using gls. I found the answer here:
https://stats.stackexchange.com/a/144480/141304
and it says to add weights with
gls(y ~ group, data = dat, weights=varIdent(form = ~ 1 | group))
y and group are variables in the model. I don't know what form is. I read through the help on gls, glm, weights, etc. but couldn't find anything that addressed the issue.
Some tutorials on R formulas filled me in that the pipe means conditioning, just like in probability. I understand conditioning in probability, but I can't wrap my head around what it means in regression.
Suppose I have four predictor variables A, B, C, D and a response variable X. A and B are continuous; C and D are categorical with two levels.
What would formulas such as the ones below (or any other ones an answerer might want to explain) mean?
X ~ A + A|B
X ~ A + B|C
X ~ A + B + C|D
Best Answer
Assume there are only two groups: group 1 and group 2. The gls() call you specified fits two sub-models to your $y$ observations – one sub-model for the $y$ observations in the first group and another sub-model for the $y$ observations in the second group.
The sub-model for the observations $y$ in group 1 postulates that $y = \beta_0 + \epsilon$, where $\epsilon$ denotes a random error term coming from a normal distribution with mean 0 and unknown variance $\sigma_1^2$. In other words, these observations are grouped about the true group mean $\beta_0$, with their spread about this true group mean being captured by $\sigma_1^2$.
The sub-model for the observations $y$ in group 2 postulates that $y = \beta_0 + \beta_1 + \epsilon$, where $\epsilon$ denotes a random error term coming from a normal distribution with mean 0 and unknown variance $\sigma_2^2$. In other words, these observations are grouped about the true group mean $\beta_0 + \beta_1$, with their spread about this true group mean being captured by $\sigma_2^2$.
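For reference, the two sub-models can be collapsed into a single equation. The observation index $i$, the indicator notation, and the group label $g(i)$ are notation introduced here for illustration, not part of the original call:

$$
y_i = \beta_0 + \beta_1 \, \mathbf{1}\{g(i) = 2\} + \epsilon_i,
\qquad
\epsilon_i \sim N\!\left(0, \sigma_{g(i)}^2\right),
$$

where $g(i) \in \{1, 2\}$ is the group of observation $i$. Setting $g(i) = 1$ recovers the first sub-model and $g(i) = 2$ recovers the second.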
The gls() call you provided allows the spread (or variability) of the $y$ values in the two groups about their respective true group means to differ across groups (that is, it allows $\sigma_1^2$ to differ from $\sigma_2^2$) via the option weights=varIdent(form = ~ 1 | group).
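As for what form is: in varIdent, the one-sided formula ~ 1 | group names a grouping factor to the right of the bar, so a separate residual standard deviation is estimated for each level of group (the ~ 1 part just means there is no variance covariate). The sketch below tries this on simulated data. The data frame, the group labels g1 and g2, and the sample sizes are made up for illustration, so treat it as a rough check rather than a definitive recipe.

```r
library(nlme)  # recommended package bundled with standard R installations

set.seed(1)
# Simulated data (hypothetical, for illustration only): unequal group spreads
dat <- data.frame(
  group = factor(rep(c("g1", "g2"), times = c(30, 40))),
  y     = c(rnorm(30, mean = 0, sd = 1), rnorm(40, mean = 1, sd = 3))
)

# One mean per group, and one residual standard deviation per level of group
fit <- gls(y ~ group, data = dat, weights = varIdent(form = ~ 1 | group))

# (Intercept) estimates beta_0; groupg2 estimates beta_1
coef(fit)

# Estimated ratio sigma_2 / sigma_1 (the reference group g1 is fixed at 1)
ratio <- coef(fit$modelStruct$varStruct, unconstrained = FALSE)
ratio

# t statistic for the group difference, from gls and from Welch's t-test
t_gls   <- summary(fit)$tTable["groupg2", "t-value"]
welch   <- t.test(y ~ group, data = dat, var.equal = FALSE)
t_welch <- unname(welch$statistic)
c(gls = t_gls, welch = t_welch)
```

The two t statistics come out with opposite signs because t.test() takes g1 minus g2 while the groupg2 coefficient is g2 minus g1. With the default REML fit their magnitudes should agree up to optimizer tolerance. The p-values will generally differ, since gls does not use the Welch-Satterthwaite degrees of freedom.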