Solved – Prediction uncertainty estimates for different kinds of models

For which kinds of supervised machine learning techniques is it possible to estimate how uncertain the model is about its predictions for a given level/range of the predictor, once the model has been trained on a set of data?

I can imagine that this is possible, e.g., for a random forest, by looking at the variance of the votes that the forest gives for a data point in the evaluation dataset. On the other hand, it seems to me impossible to estimate model uncertainty for linear regression and similar methods.

Could anyone explain for which machine learning techniques this can be done or point me to the relevant literature?

Many models can in fact provide an uncertainty measure. To name a few:

  • Naive Bayes yields the posterior probability P(y|x) (obtained via Bayes' rule from its model of P(x|y) and the prior P(y)), which is exactly what you are asking for
  • Support Vector Machine defines a hyperplane, and the distance of a point to this hyperplane is a certainty measure (the closer the point, the less certain the model). In libraries like Python's sklearn you can access it through decision_function(x), which returns the signed score with the intercept already included; for a linear kernel, dividing this score by the norm of the weight vector gives the geometric distance
  • Multilayer Neural Network: if you train a network for an M-class classification task with M output neurons (where the expected output for an element of the first class is 1 0 0 ..., for the second 0 1 0 ..., and so on), the output neurons' values can be used as a certainty measure (0 0.7 0 ... can be interpreted as "quite likely a member of the second class"). Of course, some more interesting measures can be used here (e.g. the Kullback–Leibler divergence)
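
To make the measures above concrete, here is a minimal sketch in plain Python. It is not tied to any library: the weight vector, the intercept, the query points and the example output vectors are all made-up values, and the helper names (signed_distance, entropy, kl_from_uniform) are hypothetical. It assumes the classifier outputs sum to 1 (e.g. after a softmax), and comparing against the uniform distribution is just one possible way to use the Kullback–Leibler divergence.

```python
import math

def signed_distance(w, b, x):
    """Signed geometric distance from x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return score / norm

def entropy(p):
    """Shannon entropy (in nats) of a probability vector; 0 = fully certain."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_from_uniform(p):
    """KL(p || uniform) over len(p) classes; 0 = maximally uncertain."""
    m = len(p)
    return sum(pi * math.log(pi * m) for pi in p if pi > 0)

# Hypothetical linear separator and two query points (illustrative values only)
w, b = [3.0, 4.0], -5.0
near = signed_distance(w, b, [1.0, 0.6])  # close to the boundary: low certainty
far = signed_distance(w, b, [3.0, 4.0])   # far from the boundary: high certainty

# Hypothetical class-probability outputs for a 3-class problem
confident = [0.05, 0.9, 0.05]
unsure = [0.3, 0.4, 0.3]

print("distances:", near, far)
print("entropy:", entropy(confident), entropy(unsure))
print("KL from uniform:", kl_from_uniform(confident), kl_from_uniform(unsure))
```

A larger absolute distance, a lower entropy, or a larger divergence from the uniform distribution each indicate a more confident prediction. None of these scores is a calibrated probability on its own.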
