Solved – When is a fixed effect truly fixed

Consider a linear unobserved effects model of the type:
$$y_{it} = X_{it}\beta + c_{i} + e_{it}$$
where $c_{i}$ is an unobserved, time-invariant characteristic, $e_{it}$ is an error term, and $i$ and $t$ index individuals and time periods, respectively. The typical approach in a fixed effects (FE) regression is to remove $c_{i}$ via individual dummies (LSDV), de-meaning, or first differencing.
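
For reference, a short sketch of why these transformations remove $c_{i}$, assuming it carries no $t$ subscript at all within the sample. Here $\bar{y}_i$, $\bar{X}_i$ and $\bar{e}_i$ denote averages over $t$ for individual $i$ (notation introduced for this sketch). Averaging the model over time and subtracting gives the within (de-meaned) form:
$$\bar{y}_i = \bar{X}_i\beta + c_{i} + \bar{e}_i \quad\Rightarrow\quad y_{it} - \bar{y}_i = (X_{it} - \bar{X}_i)\beta + (e_{it} - \bar{e}_i)$$
and subtracting the previous period gives the first-differenced form:
$$y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})\beta + (e_{it} - e_{i,t-1})$$
In both cases $c_{i}$ cancels only because it takes the same value in every period that enters the subtraction.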

What I have always wondered: when is $c_{i}$ truly "fixed"?

This might appear to be a trivial question, but let me give two examples of why I ask.

  1. Suppose we interview a person today and ask for her income, weight, etc., so we get our $X$. For the next 10 days we go back to that same person and interview her again each day, so we have panel data for her. Should we treat unobserved characteristics as fixed for this period of 10 days when surely they will change at some other point in the future? In 10 days her personal ability might not change, but it will as she gets older. Or, to put it in a more extreme way: if I interview this person every hour for 10 hours in a day, her unobserved characteristics are likely to be fixed in this "sample", but how useful is this?

  2. Now suppose we instead interview a person every month from the start to the end of her life, for 85 years or so. What will remain fixed over this time? Place of birth, gender and eye color most likely, but apart from that I can hardly think of anything else. Even more importantly: what if there is a characteristic which changes at one single point in her life, but the change is infinitesimally small? Strictly speaking it is then no longer a fixed effect, because it changed, even though in practice the characteristic is quasi-fixed.

From a statistical point of view it is relatively clear what a fixed effect is, but intuitively I find it hard to make sense of. Maybe someone else has had these thoughts before and has come up with an argument about when a fixed effect is really a fixed effect. I would very much appreciate other thoughts on this topic.

If you are interested in this formulation for causal inference about $\beta$, then the unknown quantities represented by $c_i$ need only be stable for the duration of the study / data for fixed effects to identify the relevant causal quantity.
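
The stability requirement can be illustrated with a small simulation. This is only a sketch under assumed settings: one regressor, a balanced panel, normal errors, and a hypothetical linear drift in the unit effect controlled by a `drift` parameter. None of these choices come from the question; they just make the point concrete. With `drift=0` the effect is constant inside the sample and the within estimator should land near the true $\beta$, while pooled OLS is pulled away by the correlation between $X$ and $c_i$. With a nonzero drift that is correlated with $X$, de-meaning no longer removes the effect.

```python
import numpy as np


def within_estimate(y, x):
    """Slope from regressing unit-de-meaned y on unit-de-meaned x.

    y and x have shape (n_units, n_periods).
    """
    y_dm = y - y.mean(axis=1, keepdims=True)
    x_dm = x - x.mean(axis=1, keepdims=True)
    return float((x_dm * y_dm).sum() / (x_dm ** 2).sum())


def pooled_estimate(y, x):
    """Slope from pooled OLS of y on x with a common intercept."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def simulate(drift, n_units=2000, n_periods=10, beta=2.0, seed=0):
    """Return (within, pooled) slope estimates for one simulated panel.

    drift = 0 keeps the unit effect constant over the sample;
    drift != 0 lets it change linearly in time at a unit-specific rate
    (an assumed, purely illustrative form of instability).
    """
    rng = np.random.default_rng(seed)
    c = rng.normal(size=(n_units, 1))       # effect at the first period
    rate = rng.normal(size=(n_units, 1))    # unit-specific drift rate
    t = np.arange(n_periods)
    c_it = c + drift * rate * t             # shape (n_units, n_periods)

    # The regressor is correlated with the (possibly drifting) effect.
    x = c_it + rng.normal(size=(n_units, n_periods))
    e = rng.normal(size=(n_units, n_periods))
    y = beta * x + c_it + e
    return within_estimate(y, x), pooled_estimate(y, x)


fe_stable, ols_stable = simulate(drift=0.0)
fe_drift, _ = simulate(drift=0.5)
print("stable effect:   within =", round(fe_stable, 3),
      " pooled =", round(ols_stable, 3))
print("drifting effect: within =", round(fe_drift, 3))
```

Nothing in the stable case depends on what the effect does before or after the sampled periods, which is the sense in which only within-sample stability matters.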

If you are concerned that the quantities represented by $c_i$ aren't stable even over this period, then fixed effects won't do what you want. In that case you can use random effects instead, although if you expect correlation between the random $c_i$ and $X_i$ you'd want to condition $c_i$ on $\bar{X}_i$ in a multilevel setup. Concern about this correlation is often one of the motivations for a fixed effects formulation, because under many (but not all) circumstances you don't need to worry about it then.
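
Conditioning on $\bar{X}_i$ can be sketched as follows. To stay self-contained, this uses plain pooled least squares with the unit mean added as an extra regressor, rather than a real multilevel / random effects estimator, and all data-generating choices are assumptions made for illustration. In a balanced panel with a single regressor, the coefficient on $X_{it}$ from this augmented regression coincides with the within estimator, which shows that the unit mean absorbs the between-unit correlation with $c_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, beta = 500, 5, 2.0

# Simulated panel: a constant unit effect c that is correlated with x.
c = rng.normal(size=(n_units, 1))
x = 0.8 * c + rng.normal(size=(n_units, n_periods))
y = beta * x + c + rng.normal(size=(n_units, n_periods))

# Unit means of x, repeated for every period of the same unit.
x_bar = np.broadcast_to(x.mean(axis=1, keepdims=True), x.shape)

# Pooled least squares of y on [1, x_it, x_bar_i].
Z = np.column_stack([np.ones(x.size), x.ravel(), x_bar.ravel()])
coef, *_ = np.linalg.lstsq(Z, y.ravel(), rcond=None)
beta_augmented = float(coef[1])

# Within (de-meaned) estimator for comparison.
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
beta_within = float((x_dm * y_dm).sum() / (x_dm ** 2).sum())

print("augmented regression:", round(beta_augmented, 4))
print("within estimator:    ", round(beta_within, 4))
```

A multilevel fit would add a random intercept on top of this mean term. The sketch only shows the role the mean term plays.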

In short, your concern about variation in the quantities represented by $c_i$ is very reasonable, but mostly as it affects the data for the period you have rather than periods you might have had or that you may eventually have but don't.
