Solved – Difference-in-differences with no pre-treatment

The typical difference-in-differences estimator (as fixed effects) fits a model of the form
$$
y_{it} = \alpha_i + \delta T_{it} + X_{it}'\beta + \epsilon_{it}
$$

where $T_{it}$ is an indicator equal to one when unit $i$ is treated at time $t$.

The coefficient $\delta$ is identified from the jump between time periods when $T$ goes from zero to one, essentially using as counterfactuals the units that didn't get treated during that period, after controlling for unobservables that don't vary over time.

Normally the (panel) dataset starts with everyone untreated, and ends with some remaining untreated while others get treated. Alternatively, if everyone (eventually) gets treated, you can still include post-treatment data to improve statistical precision: $\delta$ is still identified from the time periods where some got treated and others didn't.
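
To make the identification argument concrete, here is a minimal two-period sketch. It assumes, beyond what the model above states, that $X_{it}'\beta$ reduces to a common period effect $\lambda_t$ and that $\delta$ is constant. The group labels $B$ (switches from untreated to treated) and $C$ (never treated) are introduced only for illustration. First-differencing removes $\alpha_i$:

$$
\Delta y_B = (\lambda_2 - \lambda_1) + \delta + \Delta\epsilon_B, \qquad \Delta y_C = (\lambda_2 - \lambda_1) + \Delta\epsilon_C
$$

so that, in expectation,

$$
E[\Delta y_B] - E[\Delta y_C] = \delta
$$

provided $E[\Delta\epsilon_B] = E[\Delta\epsilon_C]$, which is the common trend requirement.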

My question: is it legit to fit a model where one group starts treated, the other group starts untreated, and then the untreated group gets treated? This is basically the mirror image of a situation in which one group stays untreated and the other gets treated: we still have heterogeneity in treatment status in some time periods. Mathematically it seems identical, and the standard error-components motivation still seems to apply.

Am I missing something?

The issue I see with your approach is that you will not be able to learn anything about the pre-treatment differences unless you have very precise information about the experiment or policy. It will be hard or even impossible to say anything about the common trend assumption between the treatment and control groups, which is a vital part of difference-in-differences.

For instance, say you have a job market program which is mandatory, but in period 1 only motivated individuals attend it. In period 2, which is the starting point of your data, the policy maker forces the other individuals to attend the program, and finally in period 3 you see only "treated" individuals. In this case it is hard to claim that those treated in period 1 and those treated in period 2 have the same trend in their outcomes, for two reasons:

  1. the unobserved factors that led to treatment selection in the first round
  2. the fact that individuals treated in period 1 have already been treated, so their trend has already changed (if the policy had an effect).

Of course this is a very artificial example, and it is problematic mostly because treatment is non-random, but I think you see the point. Without more knowledge about the experiment you cannot credibly sell a difference-in-differences analysis in this set-up, because you cannot say anything about the pre-treatment differences in the outcomes of the two groups. Even if you know that treatment was random, you can't be sure about the common trend assumption. Actually, you can rarely be sure about it anyway, but with pre-treatment data you can at least get an idea.
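
The second point, that the already-treated group's trend may itself be affected by treatment, can be illustrated with a small simulation. This is only a sketch on made-up data: the function `simulate_did`, the parameters `delta`, `growth`, and `trend`, and all numeric values are hypothetical choices for illustration and are not taken from any real study. Treatment is assigned at random here, so the only problem is that the early-treated group's effect is allowed to keep growing by `growth` between the two observed periods.

```python
import numpy as np


def simulate_did(delta=2.0, growth=0.0, n=5000, seed=0):
    """Two-period DiD where the comparison group is already treated.

    Group A is treated in both observed periods; group B switches from
    untreated to treated. `growth` is the (hypothetical) amount by which
    group A's treatment effect increases between the two periods.
    """
    rng = np.random.default_rng(seed)
    trend = 0.5  # common time effect shared by both groups (assumption)

    # Time-invariant unit effects; they cancel in first differences.
    alpha_a = rng.normal(1.0, 1.0, n)
    alpha_b = rng.normal(0.0, 1.0, n)

    # Group A: treated in both periods, effect may grow over time.
    y_a1 = alpha_a + delta + rng.normal(0.0, 1.0, n)
    y_a2 = alpha_a + trend + delta + growth + rng.normal(0.0, 1.0, n)

    # Group B: untreated in period 1, treated in period 2.
    y_b1 = alpha_b + rng.normal(0.0, 1.0, n)
    y_b2 = alpha_b + trend + delta + rng.normal(0.0, 1.0, n)

    # DiD: change for the switchers minus change for the already-treated.
    return (y_b2 - y_b1).mean() - (y_a2 - y_a1).mean()


if __name__ == "__main__":
    print("constant effect:", simulate_did(growth=0.0))
    print("growing effect: ", simulate_did(growth=1.0))
```

With `growth=0.0` the estimate lands close to `delta`, which is the sense in which the mirror-image design is "mathematically identical". With `growth=1.0` it lands close to `delta - growth` instead, and nothing in the two observed periods reveals which of the two situations you are in.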
