Bias and variance of coefficient estimation of logistic regression

For a linear regression problem $y = X\beta + \epsilon$, I think we know very well that the least-squares estimate $\hat{\beta} = (X^TX)^{-1}X^Ty$ is unbiased, and that its variance is the variance introduced by $\epsilon$.
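For reference, the standard argument behind this claim can be sketched as follows. It assumes that $X$ has full column rank, that $E[\epsilon \mid X] = 0$, and, for the variance formula only, that $\operatorname{Var}(\epsilon \mid X) = \sigma^2 I$. The symbol $\sigma^2$ and these assumptions are not stated in the question and are added here for illustration.

$$
\begin{aligned}
\hat{\beta} &= (X^TX)^{-1}X^Ty = (X^TX)^{-1}X^T(X\beta + \epsilon) = \beta + (X^TX)^{-1}X^T\epsilon, \\
E[\hat{\beta} \mid X] &= \beta + (X^TX)^{-1}X^T\,E[\epsilon \mid X] = \beta, \\
\operatorname{Var}(\hat{\beta} \mid X) &= (X^TX)^{-1}X^T\,\operatorname{Var}(\epsilon \mid X)\,X(X^TX)^{-1} = \sigma^2 (X^TX)^{-1}.
\end{aligned}
$$

The argument relies on $\hat{\beta}$ being linear in $\epsilon$, which is the property that the logistic regression estimate lacks.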

It sounds reasonable to me that, over the years, we should have gained a similarly good understanding of the same question for logistic regression, but I cannot find any such results.

I wonder whether such studies exist for logistic regression, or whether it is not even possible to study these questions because logistic regression does not have a closed-form solution for $\beta$?

Answers by Alecos Papadopoulos on this site show two distinct ways in which logistic regression coefficients based on maximum likelihood estimation (MLE) are biased.

This page shows a closed-form calculation based on a simple illustrative situation. Although the probability estimates themselves are unbiased in this situation, the maximum likelihood estimates of the coefficients are biased for finite sample sizes.
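The linked calculation is not reproduced here. As a minimal sketch of the same phenomenon, assume an intercept-only logistic model fitted to $n$ independent Bernoulli($p$) outcomes with $k$ successes. In that model the MLE of the probability is $k/n$ and the MLE of the intercept is $\operatorname{logit}(k/n)$, which exists only when $0 < k < n$. The function names and the value $p = 0.7$ are hypothetical choices for illustration, and the expectations are computed exactly from the binomial distribution rather than simulated.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def binom_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def exact_means(n, p):
    """Return (E[p_hat], E[logit(p_hat) | 0 < k < n]), computed exactly.

    p_hat = k / n is the MLE of the probability. logit(p_hat) is the MLE of
    the intercept; it is infinite for k = 0 or k = n, so those two outcomes
    are excluded and the remaining probabilities are renormalised.
    """
    mean_p_hat = sum(binom_pmf(k, n, p) * k / n for k in range(n + 1))
    interior = {k: binom_pmf(k, n, p) for k in range(1, n)}
    total = sum(interior.values())
    mean_logit = sum(w * logit(k / n) for k, w in interior.items()) / total
    return mean_p_hat, mean_logit


if __name__ == "__main__":
    p = 0.7  # hypothetical true probability
    for n in (20, 200):
        mean_p_hat, mean_logit = exact_means(n, p)
        print(
            f"n={n}: E[p_hat]={mean_p_hat:.4f}, "
            f"E[intercept MLE]={mean_logit:.4f}, "
            f"true intercept={logit(p):.4f}"
        )
```

Under these assumed values the average probability estimate equals $p$, while the average intercept estimate lies above the true log-odds, and the gap shrinks as $n$ grows. The bias comes from the logit being a nonlinear transformation of an unbiased quantity.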

An additional difference between ordinary linear regression and logistic regression is the potential contribution of omitted-variable bias in the two situations. Unlike in ordinary linear regression, omitting a predictor associated with the outcome in logistic regression necessarily leads to bias toward 0 in the regression coefficients of the included predictors, even if the omitted predictor is uncorrelated with the included predictors. Some discussion and a nice closed-form derivation for the related case of a probit model are on this page.
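The attenuation in the logistic case can be illustrated without fitting anything, as a population-level sketch under assumptions that are not taken from the linked pages: a binary included predictor `x`, an independent binary omitted predictor `z` with $P(z = 1) = 0.5$, and hypothetical coefficients `b0`, `b1`, `b2`. With a single binary predictor, the logistic model of `y` on `x` alone is saturated, so its large-sample slope is the marginal log odds ratio computed below.

```python
import math


def expit(t):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def marginal_slope(b0, b1, b2, prob_z=0.5):
    """Large-sample slope for x when the independent predictor z is omitted.

    The full model is logit P(y=1 | x, z) = b0 + b1*x + b2*z with binary
    x and z. Averaging over z gives P(y=1 | x); the slope of the reduced
    model is the difference in log-odds between x = 1 and x = 0.
    """

    def p_y_given_x(x):
        return (1.0 - prob_z) * expit(b0 + b1 * x) + prob_z * expit(
            b0 + b1 * x + b2
        )

    return logit(p_y_given_x(1)) - logit(p_y_given_x(0))


if __name__ == "__main__":
    b0, b1 = -1.0, 1.0  # hypothetical intercept and true slope for x
    for b2 in (0.0, 1.0, 2.0, 4.0):  # hypothetical effects of omitted z
        slope = marginal_slope(b0, b1, b2)
        print(f"b2={b2}: slope for x with z omitted = {slope:.4f} (true b1 = {b1})")
```

With `b2 = 0` the reduced-model slope equals `b1`. For the nonzero values of `b2` tried here it falls between 0 and `b1`, even though `z` is independent of `x`, which is the behavior described above.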
