So… I have been trying to make a radial basis kernel for hours, but I am not sure what my final matrix should look like. I have 30 features and 200,000 data points. Should my matrix K be 200,000 × 200,000 or 30 × 30?
My code so far produces a 30 × 30 matrix:

```python
import numpy as np

def build_kernel(x, func, option):
    x = x.T
    K = np.zeros([x.shape[0], x.shape[0]])
    for i in range(x.shape[0]):
        xi = x[i, :]
        for j in range(x.shape[0]):
            xj = x[j, :]
            K[i, j] = func(xi, xj, option)
    return K

def radial_basis(xi, xj, gamma):
    r = np.exp(-gamma * np.linalg.norm(xi - xj) ** 2)
    return r
```
My goal is to use the kernel trick in ridge regression, as explained here:
http://www.ics.uci.edu/~welling/classnotes/papers_class/Kernel-Ridge.pdf
But I have no idea how to implement this manually (I have to do it manually for school!)
Does anybody know how to do such a thing? 🙂
Thanks!
Best Answer
The kernel function compares data points, so it would be $200{,}000 \times 200{,}000$. (It seems that your data in `x` is stored as instances by features, but then you do `x = x.T` for some reason, swapping it. The $30 \times 30$ matrix you've computed isn't anything meaningful as far as I know.)
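
To spell that out: with $n$ data points $x_1, \dots, x_n$ (each with 30 features), the Gram matrix has one entry per *pair* of points,

$$K_{ij} = \exp\!\left(-\gamma \lVert x_i - x_j \rVert^2\right), \qquad i, j = 1, \dots, n,$$

so $K$ is $n \times n$ no matter how many features each point has.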
That's going to be very challenging to work with on a normal personal computer. If you just removed the `x = x.T` line so that your code computed the proper thing, the matrix `K` would take 298 GiB of memory ($200{,}000^2$ entries at 8 bytes each is $3.2 \times 10^{11}$ bytes). (Plus, the way you've implemented it, with Python nested loops and 40 billion calls to the function `radial_basis`, it's going to take a very long time to compute even if you do have that much memory.)
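
If you want to check the shapes and experiment anyway, you can build the Gram matrix for a small subsample without any Python loops, using the identity $\lVert x_i - x_j \rVert^2 = \lVert x_i \rVert^2 + \lVert x_j \rVert^2 - 2 x_i^\top x_j$. A minimal sketch (the function name and subsample size are just placeholders; it assumes `x` is stored instances-by-features, with no transpose):

```python
import numpy as np

def rbf_kernel(X, gamma):
    # X has shape (n, d): one row per data point, one column per feature.
    sq_norms = np.sum(X ** 2, axis=1)                        # shape (n,)
    # ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2 <xi, xj>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)                  # clip tiny negative rounding errors
    return np.exp(-gamma * sq_dists)                         # shape (n, n)

# Sanity check on a subsample that fits in memory:
rng = np.random.default_rng(0)
X_sub = rng.standard_normal((1000, 30))                      # 1,000 points, 30 features
K = rbf_kernel(X_sub, gamma=0.1)
print(K.shape)                                               # (1000, 1000)
```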
This is an example of a situation where directly using the kernel trick is, frankly, a bad idea.
If you're dead-set on doing kernel ridge regression, there are various approximations that make it computationally reasonable on data of that size, and I can point you to some of them. But it seems unlikely that a school assignment would really require you to do that.
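
For what it's worth, once the Gram matrix does fit in memory (say, on a subsample), the kernel ridge regression step from those notes is short: solve $(K + \lambda I)\alpha = y$ for the dual coefficients $\alpha$, then predict on a new point $x_*$ with $\hat{y}_* = \sum_i \alpha_i \, k(x_*, x_i)$. A rough sketch along those lines, reusing the `rbf_kernel` above (`lam` is a regularization strength you would have to tune):

```python
def krr_fit(X, y, gamma, lam):
    # Dual ridge solution: alpha = (K + lam * I)^{-1} y
    K = rbf_kernel(X, gamma)
    return np.linalg.solve(K + lam * np.eye(X.shape[0]), y)

def krr_predict(X_train, X_test, alpha, gamma):
    # Cross-kernel between test and training points: shape (m, n)
    sq_tr = np.sum(X_train ** 2, axis=1)
    sq_te = np.sum(X_test ** 2, axis=1)
    sq_dists = sq_te[:, None] + sq_tr[None, :] - 2.0 * (X_test @ X_train.T)
    K_cross = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    return K_cross @ alpha                                   # shape (m,)
```

Usage would look like `alpha = krr_fit(X_sub, y_sub, gamma=0.1, lam=1.0)` followed by `krr_predict(X_sub, X_new, alpha, gamma=0.1)`.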