Is it possible to estimate a tobit model (e.g. a nonlinear model) with a DiD (difference-in-differences) specification? If so, how does such specification look like?

If it is possible is this implemented in R or Stata?

Comment: @Dimitriy V. Masterov: Thanks for your answer! Would this still work if I used lognormal data? I mean a log transformation of the dependent variable y causes a missing if y=0. Cameron & Trivedi suggest to "trick" Stata for lognormal data in tobit models by setting the censoring point "slightly smaller than the minimum noncensored value of ln(y)". Does this also work with a DID specification?

**Contents**hide

#### Best Answer

You can easily convince yourself that this works with a simulation, though this is not really a substitute for a proof.

D-in-D is really the just the difference between 4 means, so any model that estimates the expected value can be turned into a D-in-D estimator by using a dummy for belonging to the treatment group, a dummy for the after-treatment periods, and their interaction. The interaction is the coefficient you care about. Here's a 2 period simulation done in Stata with censoring below zero:

`. clear . set obs 10000 obs was 0, now 10000 . gen id=_n . gen TG = mod(_n,2) . expand 2 (10000 observations created) . bys id: gen after =_n . set seed 12345 . replace after = after - 1 (20000 real changes made) . gen ystar = 2 + TG + after*TG *3 + rnormal() . bys TG after: sum ystar ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ -> TG = 0, after = 0 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ystar | 5000 2.002076 1.003261 -2.200537 5.71231 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ -> TG = 0, after = 1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ystar | 5000 2.012059 1.00011 -1.647926 5.297162 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ -> TG = 1, after = 0 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ystar | 5000 2.987374 .9996143 -1.112056 6.657701 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ -> TG = 1, after = 1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ystar | 5000 5.972865 .9933744 2.354758 9.37486 . gen y = cond(ystar>0,ystar,0) . tobit y i.after##i.TG, ll(0) Tobit regression Number of obs = 20000 LR chi2(3) = 25812.39 Prob > chi2 = 0.0000 Log likelihood = -28325.453 Pseudo R2 = 0.3130 ------------------------------------------------------------------------------ y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- 1.after | .0092987 .0199684 0.47 0.641 -.029841 .0484384 1.TG | .9836551 .0199577 49.29 0.000 .9445363 1.022774 | after#TG | 1 1 | 2.976207 .0282234 105.45 0.000 2.920886 3.031527 | _cons | 2.003704 .0141204 141.90 0.000 1.976027 2.031382 -------------+---------------------------------------------------------------- /sigma | .9972531 .0050298 .9873942 1.007112 ------------------------------------------------------------------------------ Obs. summary: 214 left-censored observations at y<=0 19786 uncensored observations 0 right-censored observations `

You could have also used a panel version of the Tobit here, though you could only assume random effects unless you want to go the semi-parametric route for your problem. Finally, the Tobit relies on normality and homoskedasticity of the error term, so you may want to play with that in your simulation, as well as the degree of censoring.

On the Tobit model in logs:

I am not a huge fan of doing this. In *Microeconometrics Using Stata*, Cameron & Trivedi recommend replacing $log (y)=min{log(y mid y>0)}-0.0000001$ for cases where $y=0$. I've often found my estimates to be sensitive to how many zeros there are, so that is definitely something to play with and be honest about when reporting your results. The DiD estimator will inherit this sensitivity.