Solved – Support Vector Machine (SVM) and log transformation

Why might a log (natural logarithm) transformation improve the results of SVM prediction (regression, eps-SVM)? Is SVM based on an assumption of normal distribution, or on something else?

Update 1: I use a radial basis function (RBF) kernel.

SVM doesn't assume normality. But it's still a regression that minimizes a symmetric loss function (for eps-SVM regression this is the ε-insensitive loss; the symmetry comes from the loss itself, not from the kernel).
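To make "symmetric loss" concrete, here is a minimal sketch of the ε-insensitive loss that eps-SVM regression is usually described with. The function name and the value `eps=0.1` are illustrative assumptions, not anything taken from the question.

```python
import math

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero while |y_true - y_pred| <= eps, then grows linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)

# Symmetric: over- and under-predicting by the same amount costs the same.
over = eps_insensitive_loss(1.0, 1.5)
under = eps_insensitive_loss(1.0, 0.5)

# On the log scale the residual depends on the ratio, not the difference:
# predicting 150 for 100 costs about the same as predicting 1.5 for 1.
big = eps_insensitive_loss(math.log(100.0), math.log(150.0))
small = eps_insensitive_loss(math.log(1.0), math.log(1.5))
```

The last two values illustrate why a log-transformed target tends to favor relative error: the loss no longer cares whether the target is large or small, only about the proportion by which the prediction misses.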

So… this is just a feeling and I'm too tired to justify/prove all this but:

  1. Your output variable probably has a highly skewed distribution;
  2. You use eps-SVM regression, which minimizes a symmetric loss (the ε-insensitive loss: zero inside a band of width ε around the target and, if I remember correctly, linear rather than squared outside it; the Gaussian kernel shapes the fitted function, it doesn't define the loss);
  3. If you minimize this loss on the original output variable, SVM still estimates something close to a conditional mean of your data;
  4. When you log-transform the output variable and minimize that symmetric loss on it, then in terms of the original variable it estimates something like a conditional median (if the log of the output is roughly symmetric, the back-transformed center of the log values is the median, not the mean, of the original values);
  5. It's well known that the mean minimizes average squared error and the median minimizes average absolute error, so when you fit the regression on the log-transformed output you get a worse MSE but a better MAPE (mean absolute percentage error).
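The mean-versus-median point can be checked numerically without any SVM at all. The sketch below is only an illustration under stated assumptions: it draws a hypothetical log-normal target (so the log of the output is symmetric), uses constant predictions instead of a fitted regression, and the helper names `mse` and `mape` are made up for this example.

```python
import math
import random
import statistics

random.seed(0)

# Assumption: a log-normal target, so log(y) is symmetric but y is right-skewed.
y = [random.lognormvariate(0.0, 1.0) for _ in range(20000)]

def mse(pred, ys):
    """Mean squared error of a constant prediction."""
    return sum((v - pred) ** 2 for v in ys) / len(ys)

def mape(pred, ys):
    """Mean absolute percentage error (as a fraction) of a constant prediction."""
    return sum(abs(v - pred) / v for v in ys) / len(ys)

# Best constant on the original scale under squared error.
mean_pred = statistics.fmean(y)
# Sample median, for comparison.
median_pred = statistics.median(y)
# Center of the log values, mapped back to the original scale.
back_transformed = math.exp(statistics.fmean([math.log(v) for v in y]))

for name, pred in [("mean", mean_pred),
                   ("median", median_pred),
                   ("exp(mean(log y))", back_transformed)]:
    print(f"{name:>18}: pred={pred:.3f}  MSE={mse(pred, y):.3f}  MAPE={mape(pred, y):.3f}")
```

With this setup the back-transformed value lands near the sample median rather than the mean, and it has a higher MSE but a lower MAPE than the mean. A real SVM fit adds the kernel, ε and regularization on top of this, so treat it as intuition rather than proof.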

Hope this helps.
