I ran a binary logistic regression in SPSS 23 and got some strange results for the last six variables (NOACprev through No_Prev_treatment): the B coefficients are very large, the standard errors are extremely high, the Wald statistic is 0, and no 95% C.I. is reported.
Does anyone know what has gone wrong?
Best Answer
mdewey already gave a good answer. However, given that SPSS did give you parameter estimates, I suspect you don't have full separation, but more probably multicollinearity, also known simply as "collinearity" – some of your predictors carry almost the same information, which commonly leads to large parameter estimates of opposite signs (which you have) and large standard errors (which you also have). I suggest reading up on multicollinearity.
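To see this mechanism in action, here is a minimal simulation sketch in Python/statsmodels (not SPSS, which the question uses): two nearly identical predictors are fed to a logistic regression, and the fit typically shows exactly the pattern described above, i.e. large coefficients of opposite sign with huge standard errors. All names and values here are illustrative only.

```python
# Minimal simulation (Python/statsmodels, not SPSS) illustrating how two
# nearly identical predictors inflate standard errors in logistic regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)   # almost a copy of x1 -> collinear
p = 1 / (1 + np.exp(-(0.5 + 1.0 * x1)))    # true model uses only x1
y = rng.binomial(1, p)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.Logit(y, X).fit(disp=0)
print(fit.params)   # typically two huge coefficients of opposite sign
print(fit.bse)      # and very large standard errors for x1 and x2
```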
mdewey already addressed how to detect separation: it occurs if one predictor (or a set of predictors) allows a perfect fit to your binary target variable. (Multi-)collinearity is present when some subset of your predictors carries almost the same information. This is a property of the predictors alone, not of the dependent variable (in particular, the concept is the same for OLS and for logistic regression, unlike separation, which is intrinsic to logistic regression). Collinearity is commonly detected using Variance Inflation Factors (VIFs), although there are alternatives.
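If you export your data, one way to compute VIFs is sketched below (Python/statsmodels rather than SPSS; `predictors` is a hypothetical pandas DataFrame holding only the predictor columns, not the outcome):

```python
# Sketch of a VIF computation with statsmodels.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(predictors: pd.DataFrame) -> pd.Series:
    """Return the VIF for each predictor; values above roughly 5-10 suggest collinearity."""
    X = sm.add_constant(predictors)  # compute VIFs from a design matrix with an intercept
    vifs = {col: variance_inflation_factor(X.values, i)
            for i, col in enumerate(X.columns) if col != "const"}
    return pd.Series(vifs)
```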
How you should address separation or collinearity depends on your scientific question. If you have separation, you may actually be quite happy, since you have a perfectly fitting model! In the case of collinearity, you may simply want to drop one or more of the collinear predictors, or transform them via a Principal Components Analysis (PCA), retaining only the first principal component(s). Or you may want to look at this earlier question with some excellent suggestions. In either case, I'd suggest checking whether the original or the modified model predicts well on a new sample. (If you don't have a new sample, you may want to perform cross-validation.)
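A sketch of that PCA-plus-cross-validation idea, using scikit-learn on simulated data (again an assumption on my part, not something SPSS produces for you):

```python
# One way to act on collinearity (a sketch, not the only option): replace the
# collinear predictors by their leading principal components and judge the
# result by cross-validated out-of-sample performance.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy data with redundant (collinear) features, standing in for your predictors.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=3, random_state=0)

model = make_pipeline(
    StandardScaler(),                 # PCA is scale-sensitive
    PCA(n_components=0.95),           # keep components explaining 95% of the variance
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(scores.mean())                  # cross-validated performance estimate
```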
Incidentally, you don't get a confidence interval for numerical reasons. For the odds ratio, SPSS takes the parameter estimate, adds 1.96 times the standard error, and exponentiates the result, i.e. it computes $e^{\hat\beta + 1.96\,\mathrm{SE}}$. Unfortunately, with a standard error in the thousands, $e^{5000+}$ won't really fit into the table window…
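A quick numerical illustration of that overflow, with made-up values of roughly the order reported in the question:

```python
# With a standard error in the thousands, the upper confidence limit for the
# odds ratio, exp(b + 1.96*se), exceeds the largest double (~1.8e308).
import math

b, se = 21.0, 2800.0          # hypothetical values, not taken from the question's output
upper = b + 1.96 * se
try:
    print(math.exp(upper))    # exp(5509) cannot be represented as a float
except OverflowError:
    print("OverflowError: math range error")
```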