Solved – Dummy variables and number of predictors in logistic regression

I have a question about logistic regression.
I read (here) that one of the assumptions of a logistic regression model is a minimum of, for example, 50 observations per predictor. But if I create dummy variables in Stata using the "i." operator, does every dummy category count as a new predictor?

Example:

  • i.maritalstatus
      • 1 = Ref.
      • 2
      • 3 …

Thank you very much.

It's not an assumption; it's a rule of thumb to avoid over-fitting. You'd have to ask the author of the document you link to in exactly what circumstances they envisaged it applying, but it clearly isn't going to work if you have a large number of categories per predictor, or if the number of observations in one response category is very small.

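More common rules of thumb state that for each coefficient you're estimating (bar the intercept) you should have at least 10–20 observations in the least common response category. Under that counting, each non-reference dummy is its own coefficient, so a categorical predictor with k levels entered via "i." counts as k − 1 predictors, not one. But you should check for over-fitting using bootstrap validation or cross-validation in any case when it's a concern.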
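As a rough illustration of the counting, here is a minimal Python sketch. The predictor names, the number of marital-status levels, and the outcome counts are all made up for the example (the question does not give them), and the helper functions are hypothetical rather than part of Stata or any library.

```python
def count_coefficients(predictors):
    """Count slope coefficients, excluding the intercept.

    `predictors` maps a name to its number of levels for a categorical
    predictor, or to None for a continuous one.
    """
    total = 0
    for levels in predictors.values():
        # A k-level factor with one reference level contributes k - 1 dummies;
        # a continuous predictor contributes a single coefficient.
        total += 1 if levels is None else levels - 1
    return total


def observations_per_coefficient(n_events, n_nonevents, n_coef):
    """Observations in the least common response category per coefficient."""
    return min(n_events, n_nonevents) / n_coef


# Hypothetical model: 4-level marital status, continuous age, binary sex.
predictors = {"maritalstatus": 4, "age": None, "sex": 2}
n_coef = count_coefficients(predictors)  # 3 + 1 + 1

# Hypothetical outcome counts: 80 events, 420 non-events.
ratio = observations_per_coefficient(80, 420, n_coef)

print(n_coef, ratio)
```

With these invented numbers the model estimates 5 coefficients and has 16 observations per coefficient in the rarer outcome category, which falls inside the 10–20 range mentioned above. Note that the total sample size of 500 never enters the calculation; only the smaller response category does.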
