Can we express logistic loss minimization as a maximum likelihood problem?

I have a simple question about the equivalence of loss minimization and likelihood maximization for logistic regression.

Say we are given some data $(x_i,y_i)$ for $i = 1,\ldots,N$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1,1\}$.

In logistic regression, we fit the coefficients of a linear classification model $w in mathbb{R}^d$ by minimizing the logistic loss function. This results in the optimization problem:

$$ \min_w \sum_{i=1}^N \log(1+\exp(-y_iw^Tx_i)) $$

Let $w^*$ be the optimal set of coefficients obtained from this problem.

I am wondering: can we recover $w^*$ by solving a maximum likelihood problem? If so, what is the exact maximum likelihood problem that we would need to solve? And what would be the corresponding likelihood function for $w$?

Yes, it is equivalent to the maximum likelihood approach. The different appearance results only from the different coding of $y_i$, which is arbitrary.

Keeping in mind that $y_i \in \{-1,1\}$, and denoting

$$\Lambda(z) = [1+\exp(-z)]^{-1}$$

we have that

$$\arg\min_w \sum_{i=1}^N \log[1+\exp(-y_iw^Tx_i)] = \arg\max_w \sum_{i=1}^N \log \Lambda(y_iw^Tx_i)$$

since $\log[1+\exp(-u)] = -\log\Lambda(u)$, and minimizing a function is the same as maximizing its negative. Using $\Lambda(-z) = 1-\Lambda(z)$ for the observations with $y_i=-1$, this equals

$$\arg\max_w \Big\{\sum_{y_i=1} \log \Lambda(w^Tx_i) + \sum_{y_i=-1} \log [1-\Lambda(w^Tx_i)]\Big\} \tag{1}$$
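The split of the sum by label class relies on the symmetry of the logistic function, which can be checked directly:

$$1-\Lambda(z) = 1-\frac{1}{1+e^{-z}} = \frac{e^{-z}}{1+e^{-z}} = \frac{1}{1+e^{z}} = \Lambda(-z)$$

so for the observations with $y_i=-1$ we have $\Lambda(y_iw^Tx_i) = \Lambda(-w^Tx_i) = 1-\Lambda(w^Tx_i)$.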

In the usual maximum likelihood approach we would use labels $z_i \in \{0,1\}$ ($0$ instead of $-1$), and we would assume that

$$P(z_i = 1 \mid x_i) = \Lambda(w^Tx_i)$$

With an independent sample, we would arrive at the log-likelihood maximization problem

$$\max_w \Big\{\sum_{i=1}^N z_i\log \Lambda(w^Tx_i) + \sum_{i=1}^N (1-z_i)\log [1-\Lambda(w^Tx_i)]\Big\} \tag{2}$$

Now note that, since $z_i = 1$ exactly when $y_i = 1$ and $z_i = 0$ exactly when $y_i = -1$,

$$\sum_{y_i=1} \log \Lambda(w^Tx_i) = \sum_{i=1}^N z_i\log \Lambda(w^Tx_i)$$

$$\sum_{y_i=-1} \log [1-\Lambda(w^Tx_i)] = \sum_{i=1}^N(1-z_i) \log [1-\Lambda(w^Tx_i)]$$

so $(1)$ and $(2)$ are equivalent.
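As a quick numerical sanity check (a minimal sketch with synthetic data; the function and variable names are my own), the logistic loss with $y_i \in \{-1,1\}$ equals the negative Bernoulli log-likelihood with $z_i = (y_i+1)/2 \in \{0,1\}$ pointwise in $w$, so the minimizer of one is the maximizer of the other:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))
y = rng.choice([-1.0, 1.0], size=N)  # labels coded in {-1, 1}
z = (y + 1) / 2                      # the same labels recoded in {0, 1}

def logistic_loss(w):
    # sum_i log(1 + exp(-y_i w^T x_i))
    return np.sum(np.log1p(np.exp(-y * (X @ w))))

def log_likelihood(w):
    # Bernoulli log-likelihood with P(z_i = 1 | x_i) = Lambda(w^T x_i)
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return np.sum(z * np.log(p) + (1 - z) * np.log(1 - p))

# The two objectives agree (up to sign) at any w, not just at the optimum.
w = rng.normal(size=d)
assert np.isclose(logistic_loss(w), -log_likelihood(w))
```

Because the equality holds for every $w$, any optimizer applied to one objective recovers the solution of the other.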
