I have a more general question. Could somebody please explain, in a very simple way, the general difference between OLS and FE (Fixed Effects), both for panel data and in general?
Thank you!
Best Answer
Suppose you observe data generated according to the panel version of the classical linear regression model, viz. $$Y_{i,t}=X_{i,t}\beta+\epsilon_{i,t}.$$ If $\epsilon_{i,t}$ is iid over both $i$ and $t$, you can estimate $\beta$ by pooled OLS without problem. The simplest way to see this intuitively is to note that, since the error is iid, the double index $\{i,t\}$ is "unnecessary" in the sense that you do not lose any information by creating a new index for each observation, say $j$, and treating the data as if they were generated by $$Y_j=X_j\beta + \epsilon_j.$$
Since the errors are assumed iid over both $i$ and $t$, they are of course iid over $j$, and you are back in the classical linear regression setting, for which you presumably know all kinds of nice results for the OLS estimator. You also know that the OLS estimator of $\beta$ is given by $$\hat{\beta}=(X'X)^{-1}X'Y,$$ where the observations of $X_{i,t}$ are stacked in the matrix $X$ and the observations of $Y_{i,t}$ are stacked in the column vector $Y$.
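As a minimal sketch of the stacking idea, the snippet below computes the pooled OLS estimate with NumPy. The simulated data, the seed, the sample sizes and the function name `pooled_ols` are all assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np

def pooled_ols(X, Y):
    """Pooled OLS: solve (X'X) b = X'Y on the stacked observations.

    X has shape (N*T, k) and Y has shape (N*T,). The panel indices
    (i, t) play no role here; each row is just an observation j.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.linalg.solve(X.T @ X, X.T @ Y)

# Hypothetical simulated panel with iid errors (illustrative values only).
rng = np.random.default_rng(0)
N, T, beta = 200, 5, 2.0
X = rng.normal(size=(N * T, 1))   # one regressor, already stacked over (i, t)
eps = rng.normal(size=N * T)      # iid over both i and t
Y = X[:, 0] * beta + eps

beta_hat = pooled_ols(X, Y)
print(beta_hat)  # should be close to the true beta of 2.0
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse $(X'X)^{-1}$ explicitly, which is the usual numerical practice; the estimate is the same as the formula in the text.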
Now, suppose instead that the error term really consists of two parts, viz. $$\epsilon_{i,t} = \alpha_i+u_{i,t},$$ where $u_{i,t}$ is iid over both $i$ and $t$ while, as you see from the indexing, $\alpha_i$ is an individual effect that is constant over time but differs among individuals (indexed by $i$). Thus, you cannot ignore the double indexing, i.e. the panel structure, as you could before. Moreover, if $\alpha_i$ is correlated with the regressor $X_{i,t}$, then $\epsilon_{i,t}$ is also correlated with the regressor. As you may know, correlation between the error term and the regressor is a big problem for the OLS estimator: it is both biased and inconsistent in this setting. This is where the FE estimator comes in. It solves this endogeneity problem by exploiting the fact that the individual effect (the fixed effect) is constant over time.
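To make the bias concrete, here is a small simulation sketch. The data-generating process is an assumption chosen for illustration: a single regressor built as $X_{i,t}=\alpha_i+\text{noise}$, so that it is correlated with the individual effect, with unit variances throughout and no intercept. Under these assumptions the pooled OLS slope converges to $\beta + \mathrm{Cov}(X,\alpha)/\mathrm{Var}(X) = 2 + 1/2$ rather than to the true $\beta = 2$.

```python
import numpy as np

# Hypothetical design (not from the original answer): balanced panel,
# one regressor, individual effect alpha_i correlated with X_{i,t}.
rng = np.random.default_rng(1)
N, T, beta = 500, 5, 2.0

alpha = rng.normal(size=(N, 1))            # individual effect, constant over t
X = alpha + rng.normal(size=(N, T))        # regressor correlated with alpha_i
u = rng.normal(size=(N, T))                # idiosyncratic iid error
Y = beta * X + alpha + u                   # composite error is alpha_i + u_{i,t}

# Pooled OLS ignores the panel structure: stack everything and regress.
x, y = X.ravel(), Y.ravel()
beta_pooled = (x @ y) / (x @ x)            # (X'X)^{-1} X'Y for a single regressor

print(f"true beta = {beta}, pooled OLS estimate = {beta_pooled:.3f}")
```

Increasing `N` does not move the pooled estimate back towards 2; that is what inconsistency means here.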
For each individual, you center all variables by their time average (this is known as the within transformation) to get $$Y_{i,t} -T^{-1} \sum_{t=1}^T Y_{i,t}=\Big(X_{i,t}-T^{-1}\sum_{t=1}^T X_{i,t}\Big)\beta+\epsilon_{i,t}-T^{-1}\sum_{t=1}^T\epsilon_{i,t},$$ where $T$ is the number of time periods observed for each individual in your sample. Now note that $$T^{-1}\sum_{t=1}^T\epsilon_{i,t}=T^{-1}\sum_{t=1}^T\alpha_{i}+T^{-1}\sum_{t=1}^T u_{i,t}=\alpha_i+T^{-1}\sum_{t=1}^T u_{i,t},$$ so that $$\epsilon_{i,t}-T^{-1}\sum_{t=1}^T\epsilon_{i,t}=u_{i,t}-T^{-1}\sum_{t=1}^T u_{i,t},$$ i.e. the within transformation removes the individual effects. Now, denote $\tilde{Y}_{i,t}:=Y_{i,t} -T^{-1} \sum_{t=1}^T Y_{i,t}$, and likewise for the regressor and the error term. Then you have the equation $$\tilde{Y}_{i,t}=\tilde{X}_{i,t}\beta+\tilde{\epsilon}_{i,t}=\tilde{X}_{i,t}\beta+\tilde{u}_{i,t}.$$ The FE estimator is pooled OLS applied to this new equation, in which you have 'annihilated' the 'fixed effects' $\alpha_i$ by the within transformation. In other words, to compute the estimator you now proceed exactly as when there was no fixed effect: stack all observations of the within-transformed variables in $\tilde{Y}$ and $\tilde{X}$ and compute $$\hat{\beta}_{FE}=(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{Y}.$$
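The within transformation and the resulting estimator can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not a definitive implementation: the panel is balanced and stored as arrays of shape `(N, T)`, there is a single regressor, and the function name `fe_estimator` and the simulated data are hypothetical.

```python
import numpy as np

def fe_estimator(X, Y):
    """FE (within) estimator for one regressor on a balanced panel.

    X and Y have shape (N, T): one row per individual, one column per period.
    """
    # Within transformation: subtract each individual's time average.
    X_tilde = X - X.mean(axis=1, keepdims=True)
    Y_tilde = Y - Y.mean(axis=1, keepdims=True)
    # Pooled OLS on the stacked, demeaned data.
    x, y = X_tilde.ravel(), Y_tilde.ravel()
    return (x @ y) / (x @ x)

# Hypothetical data: the individual effect alpha_i is correlated with X_{i,t}.
rng = np.random.default_rng(2)
N, T, beta = 500, 5, 2.0
alpha = rng.normal(size=(N, 1))
X = alpha + rng.normal(size=(N, T))
u = rng.normal(size=(N, T))
Y = beta * X + alpha + u

# Check that demeaning removes alpha_i: Y_tilde equals X_tilde * beta + u_tilde.
X_tilde = X - X.mean(axis=1, keepdims=True)
Y_tilde = Y - Y.mean(axis=1, keepdims=True)
u_tilde = u - u.mean(axis=1, keepdims=True)
print(np.allclose(Y_tilde, beta * X_tilde + u_tilde))

beta_fe = fe_estimator(X, Y)
print(f"true beta = {beta}, FE estimate = {beta_fe:.3f}")
```

Note that demeaning also wipes out any regressor that does not vary over time, so the within estimator cannot identify the coefficient of such a variable.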