In the Classical Regression Model, i.e. $\big(E(y\mid x)=\alpha +\beta x$ and $\operatorname{Var}(y\mid x)=\sigma^2\big)$, with only two coefficients, an intercept $\alpha$ and a slope $\beta$ on a dummy variable $x$, we can interpret $\alpha$ as the mean of the values for which $x=0$ and $\beta$ as the difference between the means of the data where $x=1$ and $x=0$, respectively. This makes intuitive sense, but how can I formally show that I can deduce those special expressions from the standard definitions:

$$\hat{\alpha}=\bar y -\hat{\beta} \bar x$$ and

$$\hat{\beta}=\frac{\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y)}{\frac{1}{n}\sum (x_i-\bar x)^2}.$$

I cannot reach the formulation in terms of group means.


#### Best Answer

The *theoretical* model is

$$E(Y\mid X)=\alpha +\beta X$$

Assuming that $X$ is a $0/1$ binary variable, we notice that

$$E(Y\mid X=1) - E(Y\mid X=0)=\alpha +\beta -\alpha = \beta $$

I think the OP is asking: *"does the OLS estimator 'mimic' this relationship, being perhaps its sample analogue?"*

Let's see: we have that

$$\hat{\beta}=\frac{\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y)}{\frac{1}{n}\sum (x_i-\bar x)^2} = \frac {\widehat{\operatorname{Cov}}(Y,X)}{\widehat{\operatorname{Var}}(X)} $$

Now since $X$ is a binary variable, i.e. a Bernoulli random variable, we have that $\operatorname{Var}(X) = p(1-p)$ where $p\equiv P(X=1)$. Under a stationarity assumption, the sample estimate of this probability is simply the sample mean of $X$, denoted $\bar x$, and one can verify that indeed $$\frac{1}{n}\sum (x_i-\bar x)^2 = \widehat{\operatorname{Var}}(X)=\bar x (1-\bar x) =\hat p(1-\hat p)$$
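A quick numerical sketch of this identity (the variable names here are my own, for illustration): for a $0/1$ variable, the biased sample variance $\frac{1}{n}\sum (x_i-\bar x)^2$ should equal $\bar x(1-\bar x)$ exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=50)  # a 0/1 dummy variable

xbar = x.mean()
lhs = np.mean((x - xbar) ** 2)   # (1/n) * sum (x_i - xbar)^2
rhs = xbar * (1 - xbar)          # phat * (1 - phat)

print(np.isclose(lhs, rhs))  # True
```

The identity is algebraic (since $x_i^2=x_i$ for binary data), so it holds exactly up to floating-point error.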

Let's turn now to the covariance. We have

$$\widehat{\operatorname{Cov}}(Y,X)=\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y) = \frac{1}{n}\sum x_iy_i -\bar x \bar y$$

Denote by $n_1$ the number of observations for which $x_i=1$. We can write

$$\frac{1}{n}\sum x_iy_i = \frac{1}{n}\sum_{x_i=1} y_i = \frac{n_1}{n}\cdot \frac{1}{n_1}\sum_{x_i=1} y_i = \hat p\cdot (\bar y \mid X=1) = \hat p \cdot \hat E(Y\mid X=1)$$

Also $\bar y = \hat E(Y)$, and using the law of total expectation we have

$$\hat E(Y) = \hat E(Y \mid X=1) \cdot \hat p + \hat E(Y \mid X=0)\cdot (1-\hat p)$$

Inserting all these results in the expression for the sample covariance we have

$$\widehat{\operatorname{Cov}}(Y,X)= \hat p \cdot \hat E(Y\mid X=1) - \hat p\cdot \left[\hat E(Y \mid X=1) \cdot \hat p + \hat E(Y \mid X=0)\cdot (1-\hat p)\right]$$

$$= \hat p(1-\hat p)\cdot \left[\hat E(Y \mid X=1) - \hat E(Y \mid X=0)\right]$$
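This covariance identity can also be checked numerically. The sketch below (variable names are my own) compares the biased sample covariance against $\hat p(1-\hat p)$ times the difference of group means:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=100)          # 0/1 dummy
y = 2.0 + 1.5 * x + rng.normal(size=100)  # arbitrary outcome

phat = x.mean()
cov = np.mean((x - x.mean()) * (y - y.mean()))        # (1/n) sum (x_i - xbar)(y_i - ybar)
diff = y[x == 1].mean() - y[x == 0].mean()            # Ehat(Y|X=1) - Ehat(Y|X=0)

print(np.isclose(cov, phat * (1 - phat) * diff))  # True
```

Again the equality is exact in the algebra, not an approximation, so it holds for any sample with observations in both groups.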

Inserting all of this into the expression for $\hat \beta$ we have

$$\hat{\beta} = \frac {\widehat{\operatorname{Cov}}(Y,X)}{\widehat{\operatorname{Var}}(X)} = \frac {\hat p(1-\hat p)\cdot \left[\hat E(Y \mid X=1) - \hat E(Y \mid X=0)\right]}{\hat p(1-\hat p)} $$

$$\Rightarrow \hat{\beta} = \hat E(Y \mid X=1) - \hat E(Y \mid X=0)$$

which is the sample analogue/feasible implementation of the theoretical relationship. I leave the demonstration for $\hat \alpha$ to the OP to work out.
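For a final end-to-end check, this sketch computes the OLS estimates directly from the definitions in the question and compares them with the group means (it confirms numerically, without spoiling the algebra, that $\hat\alpha$ also reduces to a group mean):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.integers(0, 2, size=200).astype(float)  # 0/1 dummy
y = 1.0 + 0.5 * x + rng.normal(size=200)

# OLS estimates from the standard formulas in the question
beta_hat = np.mean((x - x.mean()) * (y - y.mean())) / np.mean((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Slope = difference of group means; intercept = mean of the x=0 group
print(np.isclose(beta_hat, y[x == 1].mean() - y[x == 0].mean()))  # True
print(np.isclose(alpha_hat, y[x == 0].mean()))                    # True
```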
