With logistic regression, a one unit change in $X_1$ is associated with a $\beta_1$ change in the log odds of 'success' (alternatively, an $\exp(\beta_1)$-fold change in the odds), all else being equal. But if one applies an initial normalization to cross-correlated features (e.g. subtract the mean and divide by the standard deviation), is it valid to simply apply the inverse transformation to $\beta_1$ in order to interpret a one unit change in the un-normalized value when considering the raw data? The normalization described above has no effect on the cross-correlation of the features themselves, but I am curious whether it would affect the outputs (and the signs specifically) of the $\beta_i$ that are being trained.
Best Answer
The interpretation of logistic regression coefficients is similar when you've standardized the data (subtracted the mean and divided by the standard deviation of each feature). Standardizing effectively changes the units to standard deviations above/below the mean, so a one standard deviation increase in $X_1$ corresponds to a $\beta_1$ increase in the log odds. If you fit to standardized data, you can transform the coefficients back to the original units (or vice versa): if $\tilde{\beta}_1$ is the coefficient on the standardized feature and $s_1$ is the standard deviation of $X_1$, the per-raw-unit coefficient is $\beta_1 = \tilde{\beta}_1 / s_1$, and the intercept absorbs the shift by the means. Since $s_1 > 0$, this rescaling cannot flip the sign of a slope in the unpenalized case.
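Here is a minimal sketch of that equivalence (assuming numpy and statsmodels are available; the simulated data and variable names are invented purely for illustration). It fits an unpenalized logistic regression on raw and on standardized features and maps the standardized coefficients back to the raw scale:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(loc=[2.0, -1.0], scale=[3.0, 0.5], size=(n, 2))
X[:, 1] += 0.3 * X[:, 0]                      # induce some cross-correlation
true_beta = np.array([0.8, -1.2])
p = 1 / (1 + np.exp(-(0.5 + X @ true_beta)))  # logistic data-generating process
y = rng.binomial(1, p)

# Plain maximum-likelihood fit on the raw features
fit_raw = sm.Logit(y, sm.add_constant(X)).fit(disp=0)

# Same model fit on standardized features
mu, s = X.mean(axis=0), X.std(axis=0)
Z = (X - mu) / s
fit_std = sm.Logit(y, sm.add_constant(Z)).fit(disp=0)

# Map standardized coefficients back to the raw scale:
# beta_raw_j = beta_std_j / s_j, intercept absorbs the means
beta_std = fit_std.params[1:]
beta_back = beta_std / s
intercept_back = fit_std.params[0] - np.sum(beta_std * mu / s)

print(fit_raw.params[1:], beta_back)      # slopes agree up to numerical error
print(fit_raw.params[0], intercept_back)  # intercepts agree as well
```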
If you fit a vanilla logistic regression model to standardized vs. non-standardized data, the coefficients will take different values in each case, but both models will fit equally well (or poorly). But this is not necessarily true if you're fitting a regularized model (e.g. with $\ell_1$ or $\ell_2$ penalties on the coefficients). In that case, it's common practice to standardize first, so that all features are penalized equally.
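As a sketch of that common practice (assuming scikit-learn; the pipeline setup and names are just for illustration, with `X`, `y` as in the previous sketch), standardization is usually placed inside the model pipeline so the penalty sees all features on the same scale:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Standardize, then fit an L2-penalized logistic regression
model = make_pipeline(StandardScaler(), LogisticRegression(penalty="l2", C=1.0))
model.fit(X, y)

# The fitted coefficients are on the standardized scale; dividing by the
# feature standard deviations gives per-raw-unit slopes, but with a penalty
# these generally differ from what a penalized fit on the raw data would give,
# because the penalty is scale-dependent.
scaler = model.named_steps["standardscaler"]
clf = model.named_steps["logisticregression"]
beta_per_raw_unit = clf.coef_.ravel() / scaler.scale_
```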