I want to use the one-class SVM in LIBSVM to train a model on a set of training samples. I will then use the model to predict whether new test data is of the same type as the training data. I have the following questions about the training process:
- Should the training samples all be positive examples or not?
- Which kernel function gives better results, the linear kernel or the RBF kernel?
- What is the effect of the value of nu on the model?
Should the training samples all be positive examples or not?
Yes, in a one-class SVM (and any other outlier detection algorithm) you need just one class. Whether it is called positive or negative depends on your naming convention, but it is more probable that you will be looking for positive examples, which are underrepresented.
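A minimal sketch of training on a single class, assuming scikit-learn's `OneClassSVM` (which is built on LIBSVM) rather than the LIBSVM command-line tools. The synthetic Gaussian data, the two test points, and the parameter values are illustrative assumptions only:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Assumption: synthetic 2-D data standing in for the single training class.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Fit on one class only; no labels are passed to fit().
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# One point near the training data, one far away from it.
X_new = np.array([[0.1, -0.2], [8.0, 8.0]])

# predict() returns +1 for "same type as the training data", -1 for outliers.
pred = model.predict(X_new)
print(pred)
```

Note that the model never sees a second class: the -1 label only appears at prediction time.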
Which kernel function gives better results, the linear kernel or the RBF kernel?
"There is no free lunch." There is no general answer: the reason for having many kernels (not just linear and RBF) is that they work well in different applications. It is a data-dependent decision, so you will have to test at least those two.
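One way to test both kernels can be sketched as follows, assuming scikit-learn's `OneClassSVM` and assuming you can set aside a small validation set that contains some known outliers. The synthetic data, the variable names, and the accuracy metric are all hypothetical choices for illustration:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Assumption: training data comes from the single target class only.
X_train = rng.normal(size=(300, 2))

# Assumption: a labeled validation set with inliers (+1) and known outliers (-1).
X_val_in = rng.normal(size=(100, 2))
X_val_out = rng.uniform(low=4.0, high=6.0, size=(100, 2))
X_val = np.vstack([X_val_in, X_val_out])
y_val = np.r_[np.ones(100), -np.ones(100)]

# Fit one model per kernel and record validation accuracy.
scores = {}
for kernel in ("linear", "rbf"):
    model = OneClassSVM(kernel=kernel, gamma="scale", nu=0.1).fit(X_train)
    scores[kernel] = float(np.mean(model.predict(X_val) == y_val))

print(scores)
```

Whichever kernel scores better here says nothing about your data; the point is only that the comparison has to be run on a validation set drawn from your own application.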
What is the effect of the value of nu on the model?
It corresponds to bounds on the fraction of points that become support vectors, so it controls the model's complexity (the smaller the number of SVs, the simpler the model: less prone to overfitting, yet more prone to underfitting). As stated in the http://www.cms.livjm.ac.uk/library/archive/Grid%20Computing/NoveltyDetection/sch00support.pdf paper, it directly corresponds to:
- "an upper bound on the fraction of outliers"
- "a lower bound on the fraction of SVs".
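These two bounds can be checked empirically. The sketch below assumes scikit-learn's `OneClassSVM` (built on LIBSVM) and synthetic Gaussian data; the chosen nu values are arbitrary. In practice the bounds hold only up to the solver's numerical tolerance, so the measured outlier fraction may land slightly above nu:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Assumption: synthetic 2-D data standing in for the training class.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))

results = {}
for nu in (0.05, 0.2, 0.5):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X)
    # Fraction of training points the model rejects as outliers.
    frac_outliers = float(np.mean(model.predict(X) == -1))
    # Fraction of training points kept as support vectors.
    frac_svs = len(model.support_) / len(X)
    results[nu] = {"frac_outliers": frac_outliers, "frac_svs": frac_svs}
    print(f"nu={nu}: outliers={frac_outliers:.3f}, SVs={frac_svs:.3f}")
```

Raising nu therefore both rejects more of the training data and keeps more support vectors, which tightens the boundary around the training set.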