Solved – Why does noisy data result in better prediction performance?

I have tested a regression framework's robustness to noise, and I have noticed that in some cases adding noise improves the prediction performance, while in other cases the performance degrades.

What could be the reasons for this? If there are multiple possible reasons, how do I determine which is the cause?

Some more details about what I am doing:

The framework uses ridge regression. The inputs are vectors of extracted image features. The outputs are vectors of angles (in degrees, -180 to 180). To test robustness to noise, I am applying three levels of additive white Gaussian noise to the angles (the targets), with the noise variance proportional to each angle's individual variance (2%, 5%, and 10% of the variance of each angle).

I have noticed that in some observations, adding a small amount of noise (2-5%) leads to a small improvement in performance, and in one case, all levels of noise give an improvement. In my tests, the regularisation term is fixed across all noise levels, and I have run each noise level test several times to account for the fluctuations of the random noise.
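For concreteness, the noise-injection scheme described above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the target matrix `Y` and the helper `add_target_noise` are hypothetical, and the noise variance is taken as a fraction of each angle's sample variance, as described in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical targets: one row per observation, one column per angle (degrees).
Y = rng.uniform(-180.0, 180.0, size=(200, 3))

def add_target_noise(Y, level, rng):
    """Add white Gaussian noise whose variance is `level` (e.g. 0.02 for 2%)
    times the per-column (per-angle) variance of the targets."""
    sigma = np.sqrt(level * Y.var(axis=0))        # per-angle noise std dev
    return Y + rng.normal(0.0, sigma, size=Y.shape)

Y_noisy = add_target_noise(Y, 0.05, rng)          # the 5% noise level
```

One subtlety this sketch ignores: perturbed angles can leave the (-180, 180) range, so depending on how errors are measured, wrapping the noisy targets back into that range may matter.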

Also, I have two broad sets of observation data. The first set was observed relatively accurately. The second set was more complex (significantly more heterogeneous, leading to notable performance degradation relative to the first set) and exhibited a number of minor errors, because the observation technique used for it was more limited than the one used for the first set.

In the first set, the phenomenon of better performance through adding noise did not occur. However, more noise was sometimes better than less noise.

If more information is required to better answer the question, I'd be happy to provide it.

Your description is quite sketchy. Adding noise can (seem to) improve prediction if the method used to develop the predictions is overfitting. Likewise, if you are overfitting, you can "improve" prediction by deleting progressively more of your data. Depending on your sample size, improvements are best demonstrated by bootstrapping or by 100 repeats of 10-fold cross-validation.
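The repeated cross-validation the answer recommends can be sketched with a closed-form ridge fit. Everything here is illustrative: the synthetic `X`/`Y` data, the helpers `ridge_fit` and `repeated_kfold_mse`, and the small repeat count (the answer suggests on the order of 100 repeats; a handful are used here for speed) are all assumptions, not the asker's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (the real inputs are image-feature vectors).
X = rng.normal(size=(120, 8))
Y = X @ rng.normal(size=(8, 2)) + rng.normal(scale=0.1, size=(120, 2))

def ridge_fit(X, Y, alpha):
    """Closed-form ridge solution: (X'X + alpha*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def repeated_kfold_mse(X, Y, alpha=1.0, k=10, repeats=5, rng=rng):
    """Mean test MSE over `repeats` reshuffled rounds of k-fold CV."""
    n = len(X)
    scores = []
    for _ in range(repeats):
        idx = rng.permutation(n)
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)
            W = ridge_fit(X[train], Y[train], alpha)
            scores.append(np.mean((X[fold] @ W - Y[fold]) ** 2))
    return float(np.mean(scores))

mse = repeated_kfold_mse(X, Y)
```

Comparing this averaged score across noise levels (with the regularisation term fixed, as in the question) gives a much less fluctuation-prone estimate than a single train/test split, which is the point of the answer's suggestion.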
