# Reformulation of OLS estimators in a simple regression with a dummy variable

In the classical regression model, i.e. $\big(E(y\mid x)=\alpha +\beta x$ and $\operatorname{Var}(y\mid x)=\sigma^2\big)$, with only two coefficients, an intercept $\alpha$ and the slope $\beta$ of a dummy variable $x$, we can interpret $\alpha$ as the mean of the $y$ values for which $x=0$ and $\beta$ as the difference between the means of the data where $x=1$ and $x=0$, respectively. This makes intuitive sense, but how can I formally show that I can deduce those special expressions from the standard definitions:
$$\hat{\alpha}=\bar y -\hat{\beta} \bar x$$ and
$$\hat{\beta}=\frac{\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y)}{\frac{1}{n}\sum (x_i-\bar x)^2}.$$

I cannot reach the formulation in terms of group means.
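Before working through the algebra, a numerical sanity check (not a proof) can confirm the claim: with a 0/1 regressor, the covariance-over-variance slope formula should match the difference of the group means of $y$. The data below are made up purely for illustration.

```python
# Illustrative data: x is a 0/1 dummy, y is an arbitrary response.
x = [0, 0, 0, 1, 1, 1, 1]
y = [2.0, 3.0, 4.0, 7.0, 8.0, 6.0, 9.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Textbook OLS slope: sample covariance over sample variance of x.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
         / sum((xi - x_bar) ** 2 for xi in x)

# Group-mean formulation: mean(y | x=1) - mean(y | x=0).
y1 = [yi for xi, yi in zip(x, y) if xi == 1]
y0 = [yi for xi, yi in zip(x, y) if xi == 0]
diff_means = sum(y1) / len(y1) - sum(y0) / len(y0)

assert abs(beta_hat - diff_means) < 1e-12  # the two formulas agree
```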


The theoretical model is

$$E(Y\mid X)=\alpha +\beta X$$

Assuming that $X$ is a $0/1$ binary variable, we notice that

$$E(Y\mid X=1) - E(Y\mid X=0)=\alpha +\beta -\alpha = \beta$$

I think the OP is asking whether the OLS estimator "mimics" this relationship, i.e. whether it is its sample analogue.

Let's see: we have that

$$\hat{\beta}=\frac{\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y)}{\frac{1}{n}\sum (x_i-\bar x)^2} = \frac {\widehat{\operatorname{Cov}}(Y,X)}{\widehat{\operatorname{Var}}(X)}$$

Now, since $X$ is a binary variable, i.e. a Bernoulli random variable, we have that $\operatorname{Var}(X) = p(1-p)$, where $p\equiv P(X=1)$. Under a stationarity assumption, the sample estimate of this probability is simply the sample mean of $X$, denoted $\bar x$, and one can verify that indeed $$\frac{1}{n}\sum (x_i-\bar x)^2 = \widehat{\operatorname{Var}}(X)=\bar x (1-\bar x) =\hat p(1-\hat p)$$
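This variance identity is easy to check numerically (illustrative data only): since each $x_i\in\{0,1\}$, $x_i^2 = x_i$, so the divide-by-$n$ sample variance collapses to $\bar x - \bar x^2$.

```python
# For a 0/1 variable, each x_i**2 equals x_i, so the (biased,
# divide-by-n) sample variance reduces to x_bar * (1 - x_bar).
x = [0, 1, 1, 0, 1, 1, 1, 0]
n = len(x)
x_bar = sum(x) / n
var_hat = sum((xi - x_bar) ** 2 for xi in x) / n
assert abs(var_hat - x_bar * (1 - x_bar)) < 1e-12
```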

Let's turn now to the covariance. We have

$$\widehat{\operatorname{Cov}}(Y,X)=\frac{1}{n}\sum (x_i-\bar x)(y_i-\bar y) = \frac{1}{n}\sum x_iy_i -\bar x \bar y$$

Denote by $n_1$ the number of observations for which $x_i=1$. We can write

$$\frac{1}{n}\sum x_iy_i = \frac{1}{n}\sum_{x_i=1} y_i = \frac{n_1}{n}\cdot \frac{1}{n_1}\sum_{x_i=1} y_i = \hat p\cdot (\bar y \mid X=1) = \hat p \cdot \hat E(Y\mid X=1)$$
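A quick check of this cross-term identity, again on made-up data: the $x_i=0$ observations drop out of $\sum x_i y_i$, leaving $\hat p$ times the within-group mean of $y$ for $x_i=1$.

```python
# With x_i in {0,1}, (1/n) * sum(x_i * y_i) equals
# p_hat * mean(y | x = 1): the x_i = 0 terms vanish from the sum.
x = [1, 0, 1, 1, 0]
y = [5.0, 2.0, 7.0, 6.0, 1.0]
n = len(x)
p_hat = sum(x) / n                      # sample mean of x estimates P(X=1)
y1 = [yi for xi, yi in zip(x, y) if xi == 1]
cross = sum(xi * yi for xi, yi in zip(x, y)) / n
assert abs(cross - p_hat * (sum(y1) / len(y1))) < 1e-12
```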

Also, $\bar y = \hat E(Y)$, and using the law of total expectation we have

$$\hat E(Y) = \hat E(Y \mid X=1) \cdot \hat p + \hat E(Y \mid X=0)\cdot (1-\hat p)$$

Inserting all these results in the expression for the sample covariance we have

$$\widehat{\operatorname{Cov}}(Y,X)= \hat p \cdot \hat E(Y\mid X=1) - \hat p\cdot \left[\hat E(Y \mid X=1) \cdot \hat p + \hat E(Y \mid X=0)\cdot (1-\hat p)\right]$$

$$= \hat p(1-\hat p)\cdot \left[\hat E(Y \mid X=1) - \hat E(Y \mid X=0)\right]$$
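The covariance identity just derived can also be verified on illustrative data: the sample covariance of $Y$ with a dummy equals $\hat p(1-\hat p)$ times the difference of group means.

```python
# Check: Cov_hat(Y, X) = p_hat * (1 - p_hat) * (mean(y|x=1) - mean(y|x=0)).
x = [0, 1, 1, 0, 1]
y = [1.0, 4.0, 5.0, 2.0, 6.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
cov_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
p_hat = x_bar                                # for a dummy, x_bar estimates p
m1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / sum(x)
m0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / (n - sum(x))
assert abs(cov_hat - p_hat * (1 - p_hat) * (m1 - m0)) < 1e-12
```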

Inserting all of this into the expression for $\hat \beta$, we have

$$\hat{\beta} = \frac {\widehat{\operatorname{Cov}}(Y,X)}{\widehat{\operatorname{Var}}(X)} = \frac {\hat p(1-\hat p)\cdot \left[\hat E(Y \mid X=1) - \hat E(Y \mid X=0)\right]}{\hat p(1-\hat p)}$$

$$\Rightarrow \hat{\beta} = \hat E(Y \mid X=1) - \hat E(Y \mid X=0)$$

which is the sample analogue/feasible implementation of the theoretical relationship. I leave the demonstration related to $\hat \alpha$ for the OP to work out.
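As a numerical hint for the $\hat \alpha$ part (again on made-up data, not a substitute for the algebra): plugging the slope result into $\hat{\alpha}=\bar y -\hat{\beta} \bar x$ should return the sample mean of $y$ within the $x=0$ group.

```python
# Hint for the intercept: alpha_hat = y_bar - beta_hat * x_bar
# should equal mean(y | x = 0) when x is a 0/1 dummy.
x = [0, 0, 1, 1, 1, 0]
y = [2.0, 4.0, 9.0, 7.0, 8.0, 3.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
m1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / sum(x)
m0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / (n - sum(x))
beta_hat = m1 - m0            # slope = difference of group means (shown above)
alpha_hat = y_bar - beta_hat * x_bar
assert abs(alpha_hat - m0) < 1e-12
```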
