Why Isn't Cross Entropy Used in SVMs?

Modern neural networks use cross entropy as a loss function because of the shortcomings of MSE. So why is MSE still used in SVMs (and maybe other learners) instead of being replaced by cross entropy?

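SVMs don't use MSE. They use the hinge loss, which is what gives them their maximum-margin property. So the real question is why the hinge loss isn't replaced by cross entropy.

Neural nets are a broad class of models, and many different loss functions can be used with them. SVMs are much narrower in scope. In fact, SVMs can be thought of as a particular kind of shallow neural net. You can't just pop in an arbitrary loss function, or you'd no longer have an SVM.

Furthermore, cross entropy is defined on probability distributions, so it can only be used as a loss function when the classifier outputs a probability distribution over classes. A standard SVM doesn't do this: it outputs a real-valued score whose sign gives the predicted class, so cross entropy can't be applied to it directly. For a linear model, squashing the score through a sigmoid and training with cross entropy gives you logistic regression, not an SVM.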
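To make the difference concrete, here is a minimal sketch in Python comparing the two losses on a single example. It assumes binary labels y in {-1, +1} and a raw real-valued classifier score. The function names `hinge_loss` and `logistic_cross_entropy` are made up for this illustration and don't come from any library. The hinge loss works on the score directly, while cross entropy first needs the score turned into a probability (a sigmoid here, which is an assumption of this sketch and not something an SVM does).

```python
import math


def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for a label y in {-1, +1}."""
    # Zero once the example is on the correct side with margin >= 1.
    return max(0.0, 1.0 - y * score)


def logistic_cross_entropy(y, score):
    """Cross entropy -log p(y), with p(+1) = sigmoid(score) (illustrative assumption)."""
    # For y in {-1, +1} this simplifies to log(1 + exp(-y * score)).
    return math.log1p(math.exp(-y * score))


if __name__ == "__main__":
    # Compare both losses for a positive example at several scores.
    for score in [-2.0, 0.0, 0.5, 1.0, 3.0]:
        print(score, hinge_loss(+1, score), logistic_cross_entropy(+1, score))
```

Note that the hinge loss is exactly zero for any example classified correctly with a margin of at least 1, whereas the cross entropy stays strictly positive for every finite score. That flat region is what ties the hinge loss to the margin, so swapping it out changes the model.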

