Solved – Is this the definition of over-fitting

Overfitting is when we have a model which has memorized the training data and does not perform well in real-world cases.

Okay, say that I had some training points which look like this:

[Image: plot of the training points (black dots) with a red curve drawn through them]

What if the red curve were the actual 'real-world' relationship, and I found this exact model through a learning algorithm trained on those observations?

Has my model overfit by definition despite it being the 'real-world' relationship? I am assuming yes, but I just want to make sure.

Thanks

I think it may be useful to rephrase the definition of overfitting to something like:

A model that does not generalize well to real-world cases although it fits the training data well.
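To make that definition concrete, here is a minimal sketch (not taken from the question) that compares the error on the training points with the error on fresh data. The sine-shaped `true_relationship`, the noise level, and the polynomial degrees are all assumptions picked purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_relationship(x):
    # Hypothetical "real-world" relationship, assumed only for this sketch.
    return np.sin(2 * np.pi * x)

def sample(n):
    # Draw n noisy observations of the assumed relationship.
    x = rng.uniform(0.0, 1.0, n)
    y = true_relationship(x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = sample(10)    # the few points we learn from
x_test, y_test = sample(1000)    # stand-in for unseen real-world cases

def mse(coeffs, x, y):
    # Mean squared error of a polynomial with the given coefficients.
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train MSE {results[degree][0]:.4f}, "
          f"test MSE {results[degree][1]:.4f}")
```

The degree-9 polynomial can pass through all 10 training points, so its training error is close to zero, yet its error on the fresh data is much larger. That gap between a good fit and poor generalization is what the rephrased definition describes.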

As for your example:

  • If the real world looks like the red line, there is by definition no overfitting.
  • But at the same time, if the black dots are all the real-world test data you have, you probably still cannot prove this: in real-world situations, 10 cases are just not enough to prove that a function of the shown complexity was fit successfully.

  • To give you an idea from one real-world field: in analytical chemistry, a series of 10 concentration steps covering your desired range of analyte concentrations is usually required to show that your method yields a linear response, and a straight line is far simpler than the curve shown.
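The point about 10 cases can be illustrated with a small simulation. This is only a sketch under assumed conditions: the model is taken to be exactly the true relationship (a hypothetical sine curve with an assumed noise level), and the only thing that varies is how many test cases are used to estimate its error.

```python
import numpy as np

rng = np.random.default_rng(1)
NOISE_SD = 0.2  # assumed measurement noise, not a value from the question

def true_relationship(x):
    # Hypothetical "real-world" curve; the model is assumed to match it exactly.
    return np.sin(2 * np.pi * x)

def estimated_mse(n_cases):
    # Error of the correct model, estimated from n_cases noisy test cases.
    x = rng.uniform(0.0, 1.0, n_cases)
    y = true_relationship(x) + rng.normal(0.0, NOISE_SD, n_cases)
    return float(np.mean((true_relationship(x) - y) ** 2))

spread = {}
for n_cases in (10, 1000):
    # Repeat the experiment many times to see how much the estimate varies.
    estimates = np.array([estimated_mse(n_cases) for _ in range(2000)])
    spread[n_cases] = float(estimates.std())
    print(f"{n_cases:>4} test cases: mean MSE {estimates.mean():.4f}, "
          f"spread {spread[n_cases]:.4f}")
```

Even though the model is correct in every run, the error estimated from 10 cases scatters far more than the one estimated from 1000. So a single set of 10 cases gives weak evidence either way about whether a model generalizes.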
