Blocks and other questions about logistic regression with SPSS

I am trying to use logistic regression in SPSS. I am wondering, do I have to tell SPSS that a variable such as Gender is categorical?

Also, I am planning to add more explanatory variables in a step-by-step manner to predict a dependent variable; in total I will have 7 models. Do I have to use the blocks option in logistic regression? Or can I just run each model independently, one at a time?

Welcome to Cross Validated!

"I am wondering, do I have to tell SPSS that, for example Gender, is a categorical variable?"

This depends on how you code your variable. If you coded gender as 1 and 0, the model fit and the p-value will be the same whether or not you define it as categorical; only the sign of the coefficient may change, depending on which level SPSS uses as the reference group.

In all other cases, e.g. you coded it as 1 and 2, or 0 and 2, or you stored the variable as a string variable ("Male", "Female"), use the "Categorical" button to define the variable, so that the coding does not lead to errors in interpreting the coefficients.

In addition, if any of your predictors is categorical with more than two levels (e.g. high/medium/low income group), you should also use the "Categorical" button.


"Also, I am planning to add more explanatory variables in a step-by-step manner to predict a dependent variable, in total I will have 7 models. Do I have to use the blocks option in logistic regression? Or can I just do each model independently one at a time?"

Given no missing data, the results of the 7 models will not differ whether you obtain them through blocks or by running each model individually. The benefit of blocks is that SPSS will automatically provide omnibus test results telling you whether each new block significantly improves the model.
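To make the block test concrete, here is a minimal sketch of the likelihood-ratio chi-square that an omnibus test for a block is based on. The counts are invented for illustration, and the closed-form likelihoods only work because the example has a single binary predictor; it is not a reproduction of SPSS's own computation.

```python
import math

# Hypothetical counts (not from the original post): (happy, not happy) per group.
groups = {"male": (60, 40), "female": (40, 60)}


def binomial_loglik(successes, failures):
    """Maximised Bernoulli log-likelihood when the group has its own rate."""
    n = successes + failures
    p = successes / n
    return successes * math.log(p) + failures * math.log(1 - p)


# Block 0: constant only, i.e. one common rate for everybody.
total_happy = sum(h for h, _ in groups.values())
total_not = sum(n for _, n in groups.values())
loglik_block0 = binomial_loglik(total_happy, total_not)

# Block 1: constant plus the binary predictor, i.e. one rate per group.
loglik_block1 = sum(binomial_loglik(h, n) for h, n in groups.values())

# Likelihood-ratio chi-square for adding the block (1 degree of freedom here).
chi_square = 2 * (loglik_block1 - loglik_block0)

# Upper-tail probability of a chi-square with 1 df, via the normal distribution.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"Block chi-square = {chi_square:.3f}, df = 1, p = {p_value:.4f}")
```

If you run the models one at a time instead, you can still get this test by hand: take the difference in -2 log likelihood between two nested models and compare it with a chi-square distribution whose degrees of freedom equal the number of added parameters.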

Also, if you do have missing data and you have specified blocks, SPSS will exclude every case with a missing value on any of the specified variables before building the step-by-step regression models (so that your omnibus tests are all based on the same sample); it's quite a friendly feature.
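The screening described here is listwise deletion across all blocks. A small sketch of the idea, with made-up cases and variable names (none of them come from the original question):

```python
# Hypothetical records (not from the original post); None marks a missing value.
cases = [
    {"id": 1, "happy": 1, "male": 1, "age": 34, "income": 52000},
    {"id": 2, "happy": 0, "male": 0, "age": None, "income": 41000},
    {"id": 3, "happy": 1, "male": 0, "age": 51, "income": None},
    {"id": 4, "happy": 0, "male": 1, "age": 29, "income": 38000},
]

# Predictors entered block by block; the names are illustrative only.
blocks = [["male"], ["age"], ["income"]]

# The outcome plus every variable used in any block.
needed = ["happy"] + [name for block in blocks for name in block]

# Listwise deletion: keep a case only if all needed variables are present.
complete = [c for c in cases if all(c[name] is not None for name in needed)]

kept_ids = [c["id"] for c in complete]
print(kept_ids)
```

Note that case 2 is dropped even from the first block, which only uses "male". If you ran that model on its own, case 2 would be included, and that is why separately run models can differ from blocked ones when data are missing.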


"It is still not yet clear to me what the difference is between specifying gender as a categorical variable and leaving it as a dichotomous variable. If I specify the categorical option, should I report the (B) in the output of Gender (0), referring to females, as the (B) of gender?"

The best way to get this answered is to try both ways yourself. Here is an imaginary example. Suppose we are using male (1 = male, 0 = female) to predict happiness (1 = happy, 0 = not happy). Here is the output if we just model the predictor as a binary variable without specifying categorical:

(Screenshot of the SPSS output.)

The p-value of the predictor is 0.006.

Now, here is the output when we specify the predictor as categorical:

(Screenshot of the SPSS output: the "Categorical Variable Coding" table above the regression results.)

Notice that the p-value is the same, but the coefficient's sign flipped from positive to negative! This is because, when you specify a variable as categorical, SPSS by default uses the last (highest) level as the reference group, so the variable here effectively represents female instead of male.
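The sign flip can be checked by hand. With a single binary predictor, B is simply the log odds ratio of the non-reference group against the reference group, so swapping the reference group negates it. A minimal sketch with invented counts (not the data behind the screenshots):

```python
import math

# Hypothetical 2x2 counts (not from the original post): happiness by sex.
counts = {
    "male": {"happy": 60, "not_happy": 40},
    "female": {"happy": 40, "not_happy": 60},
}


def log_odds(group):
    """Log odds of being happy within one group."""
    c = counts[group]
    return math.log(c["happy"] / c["not_happy"])


def fit_binary_predictor(reference):
    """Closed-form logistic fit for one two-level predictor.

    The model is saturated, so the maximum-likelihood constant is the log
    odds in the reference group and B is the log odds ratio of the other
    group against the reference group.
    """
    other = "male" if reference == "female" else "female"
    constant = log_odds(reference)
    b = log_odds(other) - log_odds(reference)
    return constant, b


const_f, b_male = fit_binary_predictor(reference="female")  # predictor means "male"
const_m, b_female = fit_binary_predictor(reference="male")  # predictor means "female"

print(f"reference = female: B = {b_male:+.4f}, Exp(B) = {math.exp(b_male):.4f}")
print(f"reference = male:   B = {b_female:+.4f}, Exp(B) = {math.exp(b_female):.4f}")
```

The two B values have the same magnitude and opposite signs, and the two Exp(B) values are reciprocals of each other, so both outputs describe the same relationship from opposite directions.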

To avoid confusion, check the table in the output called "Categorical Variable Coding" (the one I showed on top of the regression outcome).

If you'd like to change the reference group to the other sex, you can do that after clicking the "Categorical" button; explore a bit there and you should get a good idea of how it works.


"So what you are saying is that in both cases the coefficient will remain the same, and the difference is whether it refers to males or females? Screenshots are indeed helpful, thank you."

Yes. Just be careful: because SPSS chooses the reference group itself when you use the Categorical option, always double-check how SPSS recoded the categorical predictors by looking at the coding table in the output.

"One more comment, a bit off topic: do weights have an effect on the regression analysis? Or should I turn off the weight when doing the regression?"

Yes, they do. You can run the regression with and then without weighting; you should see a different sample size, which in turn changes the other outputs.
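As a rough illustration of why the outputs change, the sketch below assumes the weight acts like a frequency weight, i.e. each case counts as "weight" cases. The counts and weights are invented, and the closed-form B and standard error apply only to a single binary predictor.

```python
import math

# Hypothetical cells: (sex, happy, number of cases, weight per case).
cells = [
    ("male", 1, 60, 2.0),
    ("male", 0, 40, 1.0),
    ("female", 1, 40, 1.0),
    ("female", 0, 60, 1.5),
]


def log_odds_ratio(use_weights):
    """Sample size, B (male vs female) and its standard error from the 2x2 table."""
    table = {}
    for sex, happy, n, w in cells:
        # Assumption: a weight multiplies the number of cases in the cell.
        table[(sex, happy)] = n * w if use_weights else n
    b = math.log(
        table[("male", 1)] * table[("female", 0)]
        / (table[("male", 0)] * table[("female", 1)])
    )
    se = math.sqrt(sum(1 / count for count in table.values()))
    n_total = sum(table.values())
    return n_total, b, se


n_plain, b_plain, se_plain = log_odds_ratio(use_weights=False)
n_weighted, b_weighted, se_weighted = log_odds_ratio(use_weights=True)

print(f"unweighted: N = {n_plain}, B = {b_plain:.4f}, SE = {se_plain:.4f}")
print(f"weighted:   N = {n_weighted}, B = {b_weighted:.4f}, SE = {se_weighted:.4f}")
```

Whether the weight should be on or off depends on what the weight variable represents in your data, so it is worth checking how it was constructed before deciding.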

"I am asking if it is necessary to use the categorical option because I have variables with 3 groups, but I don't want to get a (B) for each group, I want the (B) of the variable itself. For example: Age (Young, Middle Aged and Older Adults). I want the (B) of age, not the (B) of Young and Middle Aged with Older Adults as a reference group. Is it possible to do this?"

No, you don't have to use the SPSS Categorical function, but if you don't, you need to create the binary indicators (dummy variables) yourself. In any case, never feed a categorical variable with more than two levels into the regression without specifying that it is categorical. If you do, the results will likely be invalid.
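Making the indicators yourself amounts to creating one 0/1 variable for every level except the reference level. A minimal sketch, using the age groups from the question with invented data and hypothetical variable names:

```python
# Hypothetical age-group values (not from the original post).
age_group = ["Young", "Middle Aged", "Older Adults", "Young", "Older Adults"]

levels = ["Young", "Middle Aged", "Older Adults"]

# The reference level gets no indicator of its own: all indicators are 0 for it.
reference = "Older Adults"
indicator_levels = [lvl for lvl in levels if lvl != reference]

# One 0/1 indicator per non-reference level for every case.
rows = [
    {f"age_{lvl}": int(value == lvl) for lvl in indicator_levels}
    for value in age_group
]

for value, row in zip(age_group, rows):
    print(value, row)
```

Three levels therefore give two indicators and two B values, each comparing one group with the reference group.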

For a categorical variable with three levels, there is no way to get a single regression coefficient out of it (that's what you referred to as "(B)", just to make sure we are talking about the same thing). There is, however, a way to get a single p-value for the variable as a whole (see the omnibus test table), but as far as I know you cannot get a combined regression coefficient.
