I'm trying to tune some hyperparameters for an SVM classifier.
I want to know whether there is any correlation between the kernel parameters and the regularization parameter C. If there is not, I could optimize C first and, only once that is done, move on to the kernel parameter, which would save me a lot of runtime.
In principle, no: you cannot optimize one parameter and then the other, because the best value of C depends on the kernel parameters.
There is (at least) one paper that proposes optimizing C first (using a linear SVM) and then gamma, but I tried this and it did not work well on many datasets.
There were two problems: (a) the selection it makes is not that great, and (b) it takes a surprisingly long time, because the linear SVM is not that fast (I did not use the LibLinear implementation; I used libSVM with the linear kernel).
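To illustrate why tuning one parameter at a time can miss the best pair, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `toy_score` is a made-up surface standing in for cross-validated accuracy (it is not a real SVM result), the grid ranges are arbitrary powers of two, and `joint_search` and `sequential_search` are hypothetical helper names, not the method from the paper mentioned above.

```python
import itertools
import math


def toy_score(c, gamma):
    """Made-up stand-in for cross-validated accuracy (not a real SVM result).

    The best gamma shifts as C changes: the surface has a ridge along
    log2(C) + log2(gamma) = 0 and peaks at C = 2**7, gamma = 2**-7.
    """
    u, v = math.log2(c), math.log2(gamma)
    return -((u + v) ** 2 + 0.05 * (u - 7) ** 2)


def joint_search(score, c_values, gamma_values):
    """Score every (C, gamma) pair on the grid and return the best pair."""
    return max(itertools.product(c_values, gamma_values),
               key=lambda pair: score(*pair))


def sequential_search(score, c_values, gamma_values, gamma_start):
    """Tune C with gamma held at gamma_start, then tune gamma with that C."""
    best_c = max(c_values, key=lambda c: score(c, gamma_start))
    best_gamma = max(gamma_values, key=lambda g: score(best_c, g))
    return best_c, best_gamma


# Log-spaced grids (powers of two); the ranges are an assumption.
c_values = [2.0 ** k for k in range(-5, 16, 2)]
gamma_values = [2.0 ** k for k in range(-15, 4, 2)]

joint = joint_search(toy_score, c_values, gamma_values)
sequential = sequential_search(toy_score, c_values, gamma_values,
                               gamma_start=gamma_values[0])

print("joint:     ", joint, toy_score(*joint))
print("sequential:", sequential, toy_score(*sequential))
```

On this toy surface the sequential search settles on C = 2**15, gamma = 2**-15, while the joint search finds the peak at C = 2**7, gamma = 2**-7. The joint search costs `len(c_values) * len(gamma_values)` evaluations instead of `len(c_values) + len(gamma_values)`, which is the runtime price of not assuming the parameters are independent. In real use, `toy_score` would be replaced by a function that trains the SVM and returns a cross-validation score.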