I have trained a linear SVM that takes a pair of objects, computes features, and is expected to learn a semantic similarity function between the objects (i.e., it predicts whether the two objects are similar enough to be merged). The problem I am facing is that the predictions can range from $-\infty$ to $+\infty$, while I need a score in $[0,1]$ to use as a semantic similarity.
One suggestion I received was to use min-max normalization on the scores (sketched below). Is there a better way, one that is more generic and does not depend on the minimum and maximum values from the training data? Please also mention any assumptions your method makes.
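For concreteness, the suggested min-max approach would look roughly like this (a minimal sketch; the use of NumPy and the example scores are my own assumptions):

```python
import numpy as np

# Hypothetical raw SVM margins observed on the training set.
train_scores = np.array([-3.2, -0.5, 0.1, 2.7, 4.8])
lo, hi = train_scores.min(), train_scores.max()

def min_max_normalize(score):
    """Map a raw score into [0, 1]; clip scores outside the training range."""
    return float(np.clip((score - lo) / (hi - lo), 0.0, 1.0))

print(min_max_normalize(1.0))   # lands somewhere inside [0, 1]
print(min_max_normalize(9.9))   # outside the training range, clipped to 1.0
```

The dependence on `lo` and `hi` from the training data is exactly what I would like to avoid.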
Thanks
Best Answer
Use a sigmoid function $y = \frac{1}{1 + e^{-f(x)}}$, where $f(x)$ is your raw output in $(-\infty, \infty)$; then $y$ will lie in $(0, 1)$. This is a well-known technique; see, for example, the multilayer perceptron (MLP), where the same squashing function is used as an activation.
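A minimal sketch of this, assuming a scikit-learn `LinearSVC` and placeholder data (the variable names and random features here are illustrative, not from the question):

```python
import numpy as np
from sklearn.svm import LinearSVC

# Placeholder pairwise feature matrix and merge/no-merge labels.
rng = np.random.default_rng(0)
X_pairs = rng.standard_normal((100, 8))
y = rng.integers(0, 2, size=100)

svm = LinearSVC().fit(X_pairs, y)

def sigmoid(f):
    # Squashes the unbounded margin f(x) into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-f))

raw_scores = svm.decision_function(X_pairs)  # values in (-inf, inf)
similarity = sigmoid(raw_scores)             # values in (0, 1)
```

If you want the mapping calibrated to your data rather than fixed at slope 1, the standard refinement is Platt scaling, i.e. fitting the sigmoid's slope and intercept on held-out decision values; scikit-learn's `CalibratedClassifierCV` with `method='sigmoid'` implements this.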