Solved – the classic loss function for a convolutional auto encoder

As stated in the title, I'm working on a convolutional autoencoder with RGB images. Should the loss function be MSE, binary cross entropy, or something custom?

Thanks

I think people typically use binary cross entropy (BCE, i.e., the cost/loss function used in vanilla logistic regression), since images are typically scaled to the range 0 to 1 (e.g., by dividing each pixel by 255, which is the cheapest and most trivial way).

However, I am not sure that's the most intuitive loss function to use, since even if you feed identical images, the loss does not go to zero: you will get a mean BCE of ~0.5 (for pixel values spread roughly evenly over the 0 to 1 range). That's because the typical implementation goes like this:

-y * log(y_pred) - (1 - y) * log(1 - y_pred)

In binary classification, for example, y is either 0 or 1, so one of the terms [y * log(y_pred)] or [(1 - y) * log(1 - y_pred)] is always zero. That's not the case when you compare two images, because the pixel values are continuous, so both terms contribute.
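To make that concrete, here is a minimal NumPy sketch. The random "image", its 64x64x3 shape, and the eps clipping (added only to avoid log(0)) are my own assumptions for illustration, not part of any particular framework's implementation:

```python
import numpy as np

def bce(y, y_pred, eps=1e-7):
    """Mean binary cross entropy; eps clipping is an assumption to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(np.mean(-y * np.log(y_pred) - (1 - y) * np.log(1 - y_pred)))

def mse(y, y_pred):
    """Mean squared error."""
    return float(np.mean((y - y_pred) ** 2))

rng = np.random.default_rng(0)

# Hypothetical RGB "image" with pixel values drawn uniformly from [0, 1).
image = rng.random((64, 64, 3))

# Compare the image with itself: a perfect reconstruction.
bce_identical = bce(image, image)  # roughly 0.5, not 0
mse_identical = mse(image, image)  # exactly 0

# With binary targets (as in classification), one term always vanishes.
binary = (image > 0.5).astype(float)
bce_binary = bce(binary, binary)   # practically 0

print(bce_identical, mse_identical, bce_binary)
```

For a perfect reconstruction, the per-pixel BCE equals the entropy of the pixel value, which is zero only when the pixel is exactly 0 or 1.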

Out of curiosity, I plotted the BCE for a pair of pixel values:

[Figure: BCE as a function of the two pixel values]

In comparison, the squared loss:

[Figure: squared loss as a function of the two pixel values]

In practice though, I never noticed any difference between BCE and MSE in terms of the results. So, I'd say either of them is fine.
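One possible reason the two behave so similarly: for a fixed target pixel, both losses are minimized at the same prediction, y_pred = y, even though the BCE value at that minimum is not zero. Here is a rough sketch of that, where the target value 0.3 and the grid of candidate predictions are arbitrary choices of mine:

```python
import numpy as np

y = 0.3                                # hypothetical target pixel value
y_pred = np.linspace(0.01, 0.99, 99)   # candidate reconstructions, step 0.01

# Per-pixel losses as a function of the predicted value.
bce_curve = -y * np.log(y_pred) - (1 - y) * np.log(1 - y_pred)
se_curve = (y - y_pred) ** 2

# Prediction at which each loss is smallest.
best_bce = y_pred[np.argmin(bce_curve)]
best_se = y_pred[np.argmin(se_curve)]

print(best_bce, bce_curve.min())  # minimum at y_pred ~ 0.3, value ~ 0.61
print(best_se, se_curve.min())    # minimum at y_pred ~ 0.3, value ~ 0
```

So the two losses agree on where the optimum is and differ only in the offset and in how steeply they penalize deviations from it.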
