Solved – Zero-inflated two-part models for semi-continuous data

I am trying to study predictors of companies' pollution output for some specific chemicals. The data contain many 0's (i.e., the company emitted none of those chemicals), and the nonzero values are continuous with a long right tail. I have seen others model this kind of data by adding 1 to the dependent variable and then taking the log. My sense is that this is wrong, but I don't understand why. Could someone explain? That approach is much simpler than what I think I should be doing, which is using zero-inflated two-part models for semi-continuous data, so I'd be thrilled if adding 1 and logging turned out to be fine.

Second, I have found a Stata ado file to run zero-inflated two-part models for semi-continuous data. Is there a way to incorporate fixed effects into this type of model?

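I'm not sure about Stata, but R can fit zero-inflated models with fixed effects. For a semi-continuous outcome, check out the gamlss package, which includes zero-adjusted families such as ZAGA (zero-adjusted gamma). zeroinfl() from the pscl package is another option, but it fits count models (Poisson and negative binomial), so it only applies if the outcome is a count.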
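To make the two-part idea concrete, here is a minimal sketch in Python with NumPy. It is not the Stata ado file or the gamlss implementation. It assumes that "fixed effects" means group-level intercepts (for example, one per industry), that the positive part is log-normal, and that the two parts can be fit separately. The data are simulated, and the names fit_logit and fit_two_part are made up for this illustration.

```python
import numpy as np


def fit_logit(X, z, n_iter=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (z - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_two_part(x, group, y):
    """Two-part model with group fixed effects entered as dummy variables.

    Part 1: logit for whether y > 0.
    Part 2: OLS of log(y) on the same regressors, using positive y only.
    """
    groups = np.unique(group)
    # One dummy per group (these replace the intercept), then the covariate x.
    dummies = (group[:, None] == groups[None, :]).astype(float)
    X = np.column_stack([dummies, x])
    positive = y > 0
    beta_zero = fit_logit(X, positive.astype(float))
    beta_pos, *_ = np.linalg.lstsq(X[positive], np.log(y[positive]), rcond=None)
    return beta_zero, beta_pos


# Simulated data (hypothetical): 4 groups, 1000 observations each.
rng = np.random.default_rng(0)
n_groups, n_per = 4, 1000
group = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=group.size)
alpha = np.array([-0.5, 0.0, 0.5, 1.0])  # true group effects

# Part 1 truth: probability of any pollution, slope 1.0 on x.
p_pos = 1.0 / (1.0 + np.exp(-(alpha[group] + 1.0 * x)))
is_pos = rng.random(group.size) < p_pos

# Part 2 truth: log-normal amount given pollution, slope 0.8 on x.
log_amount = 1.0 + alpha[group] + 0.8 * x + rng.normal(scale=0.5, size=group.size)
y = np.where(is_pos, np.exp(log_amount), 0.0)

beta_zero, beta_pos = fit_two_part(x, group, y)
slope_zero = beta_zero[-1]  # effect of x on the log-odds of y > 0
slope_pos = beta_pos[-1]    # effect of x on log(y) among positive y
print("logit slope:", slope_zero)
print("log-amount slope:", slope_pos)
```

The last element of each coefficient vector is the slope on x, and the others are the group intercepts. Dummy-variable fixed effects in the logit part are reasonable when there are few groups with many observations each. With many small groups the logit estimates suffer from incidental-parameters bias, and a conditional logit is the usual alternative.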
