Overfitting occurs when a model has memorized the training data and does not perform well on real-world cases.

Okay, say that I had some training points which look like this:

What if the red curve were the actual 'real-world' relationship, and I found this exact model through a learning algorithm based on those training observations?

Has my model overfit by definition despite it being the 'real-world' relationship? I am assuming yes, but I just want to make sure.

Thanks


#### Best Answer

I think it may be useful to rephrase the definition of overfitting to something like:

A model that does not generalize well to real-world cases, although it fits the training data well.

As for your example:

- If the real world looks like the red line, there is by definition no overfitting.
But at the same time, if the black dots are all the real-world test data you have, you probably still cannot prove this: in real-world situations, 10 cases are simply not enough to show that a function of the shown complexity was successfully fit.
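
A minimal sketch of this point, under assumed made-up data (the true curve, noise level, and number of points are all hypothetical, not taken from the question's figure): with only 10 training points, a flexible model can achieve near-zero training error whether or not it matches the real-world relationship, so training fit alone cannot tell you which situation you are in.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real-world" relationship (stand-in for the red curve).
def true_curve(x):
    return np.sin(2 * np.pi * x)

# Only 10 noisy training observations, as in the question's example.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_curve(x_train) + rng.normal(0.0, 0.2, size=10)

# A dense grid standing in for unseen real-world cases.
x_test = np.linspace(0.0, 1.0, 200)
y_test = true_curve(x_test)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# A degree-9 polynomial interpolates all 10 points (near-zero training
# error), yet that says nothing about how it behaves between the points.
for degree in (3, 9):
    train_mse, test_mse = fit_and_score(degree)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Both fits can look excellent on the training points themselves; only evaluation on data the model has not seen reveals whether it generalizes.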

To give you an idea about one real-world field: in analytical chemistry, a series of 10 concentration steps covering your desired range of analyte concentrations is usually required to show that your method yields a *linear* response.