Solved – When does it make sense to log-transform input variables in multi-variable logistic regression?

When does it make sense to log-transform input variables in multi-variable logistic regression? The transformation improves model metrics a little, but I'm not sure how to justify it or whether it counts as an additional degree of freedom.

EDIT

There are several questions here about interpretation of models that are based on log-transformed data ([1], [2]). I'm interested in the question "when do we need to log-transform data?"

For the input (predictor) variables, the same general rule applies as in standard linear regression: choose transformations under which each input is linearly related to the output variable. The only difference is that in logistic regression the output on the linear scale is the log-odds of the outcome, so a log transform makes sense when the log-odds are closer to linear in log(x) than in x.
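One rough way to check whether a log transform brings a predictor closer to a linear relation with the log-odds is to bin the predictor, compute the empirical log-odds in each bin, and compare how linear those log-odds look against x and against log(x). The sketch below is only an illustration on simulated data: the data-generating model, the helper names (`simulate`, `empirical_logits`, `pearson`), the number of bins, and the 0.5 continuity correction are all assumptions made for the example, not part of the original answer.

```python
import math
import random


def simulate(n=20000, seed=0):
    """Simulate (x, y) pairs whose true log-odds are linear in log(x).

    Assumed model for illustration only: logit(p) = -2.5 + 1.0 * log(x).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = math.exp(rng.uniform(0.0, 5.0))  # right-skewed, strictly positive
        logit = -2.5 + 1.0 * math.log(x)
        p = 1.0 / (1.0 + math.exp(-logit))
        y = 1 if rng.random() < p else 0
        data.append((x, y))
    return data


def empirical_logits(data, n_bins=10):
    """Split the data into equal-count bins of x; return (mean x, empirical log-odds)."""
    ordered = sorted(data)
    size = len(ordered) // n_bins
    points = []
    for b in range(n_bins):
        chunk = ordered[b * size:(b + 1) * size]
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        events = sum(y for _, y in chunk)
        # The 0.5 correction avoids log(0) when a bin has all or no events.
        logit = math.log((events + 0.5) / (len(chunk) - events + 0.5))
        points.append((mean_x, logit))
    return points


def pearson(u, v):
    """Plain Pearson correlation, used here as a crude linearity measure."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


points = empirical_logits(simulate())
xs = [x for x, _ in points]
logits = [l for _, l in points]

r_raw = pearson(xs, logits)                         # log-odds against x
r_log = pearson([math.log(x) for x in xs], logits)  # log-odds against log(x)

print(f"correlation of log-odds with x:      {r_raw:.3f}")
print(f"correlation of log-odds with log(x): {r_log:.3f}")
```

On data simulated this way, the binned log-odds should track log(x) more closely than x. With real data, plotting the binned log-odds is usually more informative than a single correlation, and choosing the transformation from such a check is itself a data-driven decision.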

Similarly, the effects on degrees of freedom and statistical inference are the same as for standard multiple regression. The transformations per se don't cost degrees of freedom if they are chosen without regard to their relation to the output variable. If the transformations are chosen by looking at the data, however, that choice should be taken into account in subsequent inference. Harrell's Regression Modeling Strategies text and related online references provide much more detailed information.
