Solved – Is it true that the type of ML model used is irrelevant?

I am training a model on a dataset, and all of the relevant algorithms I have tried converge to roughly the same accuracy score, i.e. no single one performs significantly better than the others. For example, if you train a random forest and a neural network on MNIST, you'll observe an accuracy score of around 98% for both. Why does the bottleneck in performance seem to be dictated by the input data rather than by the choice of algorithm?
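
To make the example concrete, here is a minimal sketch of that kind of comparison (assuming scikit-learn is installed; the exact scores depend on the split and hyperparameters):

```python
# Minimal sketch: compare a random forest and a small neural network on MNIST.
# Assumes scikit-learn; exact scores will vary with the split, seed and settings.
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load MNIST (70,000 28x28 digit images, flattened to 784 features).
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X / 255.0, y, test_size=10000, random_state=0
)

models = {
    "random forest": RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0),
    "neural network": MLPClassifier(hidden_layer_sizes=(256,), max_iter=30, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.3f}")  # both typically land in the high 0.9x range
```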

There's a lot of truth in what you say, and that's certainly the argument made by what some people have branded data-centric AI. For a start, a lot of academic research looks at optimizing some measure (e.g. accuracy) on a fixed, given dataset (e.g. ImageNet), which makes sense as a way of measuring progress in algorithms. In practice, however, instead of tinkering with minute improvements to the algorithm it is often better to just get more data (or label it in different ways). Similarly, in Kaggle competitions there will often be pretty small differences between well-tuned XGBoost, LightGBM, Random Forest and certain neural network architectures on tabular data (plus you can often squeeze out a bit more by ensembling them), but in practice you might be pretty happy just using one of these (never mind that you could do better by a few decimal points, which for many applications might be irrelevant, or at least less important than the model running fast and cheaply).
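
To illustrate that point, here is a minimal sketch of such a comparison on a synthetic tabular dataset (assuming scikit-learn and xgboost are installed; LightGBM would slot in the same way, and a simple probability average stands in for fancier ensembling):

```python
# Rough illustration: tree ensembles tend to land close together on tabular data,
# and averaging predicted probabilities often squeezes out a little more.
# Assumes scikit-learn and xgboost; the dataset here is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=20000, n_features=30, n_informative=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=300, n_jobs=-1, random_state=0),
    "xgboost": XGBClassifier(n_estimators=300, learning_rate=0.1, random_state=0),
}

probas = []
for name, model in models.items():
    model.fit(X_train, y_train)
    p = model.predict_proba(X_test)
    probas.append(p)
    print(f"{name}: {accuracy_score(y_test, p.argmax(axis=1)):.4f}")

# Simple ensemble: average the predicted class probabilities across models.
ensemble_pred = np.mean(probas, axis=0).argmax(axis=1)
print(f"ensemble: {accuracy_score(y_test, ensemble_pred):.4f}")
```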

On the other hand, it is clear that some algorithms are just much better at certain tasks than others. E.g. look at the spread in performance on ImageNet: results got better year by year, and the error rate was roughly halved from 2011 to 2012 when a convolutional neural network was first used. You even see a big spread in neural network performance when models are assessed on a newly created, similar test set, ranging from below 70% to over 95% accuracy. That certainly is a huge difference in performance. Or, if you get a new image classification task and have just 50 to 100 images of some reasonable size (say, 100 or more pixels in each dimension) from each class, your first thought should really be transfer learning with some kind of neural network (e.g. a convolutional NN or some vision transformer), picked by trading off good performance on ImageNet against a feasible model size. In contrast, it's pretty unlikely that training a random forest, XGBoost, or a neural network from scratch would come anywhere near that approach in performance.
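
To make the transfer-learning route concrete, a minimal sketch along those lines (assuming PyTorch and torchvision; the data path and hyperparameters are placeholders, not a recipe) could look like this:

```python
# Minimal transfer-learning sketch for a small image dataset.
# Assumes PyTorch and torchvision; "data/train" is a placeholder folder with
# one subfolder per class, and the hyperparameters are only illustrative.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing so the pretrained weights see familiar inputs.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_data = datasets.ImageFolder("data/train", transform=preprocess)  # placeholder path
loader = torch.utils.data.DataLoader(train_data, batch_size=16, shuffle=True)

# Load a pretrained backbone, freeze it, and train only a new classification head.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```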

Additionally, let's not forget that often a lot can be gained by creating the right features (especially for tabular data) or by representing the data in a good way (e.g. it turns out that you can turn audio into spectrograms and then apply image neural networks to those, and that works pretty well). Conversely, if one fails to create the right features or represents the data poorly, even a theoretically good model will struggle.
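
As a small illustration of the spectrogram idea, here is a sketch assuming librosa is installed (the file name is a placeholder):

```python
# Sketch: turn an audio clip into a log-scaled mel spectrogram, i.e. a 2-D
# "image" of frequency content over time, which an image model can then consume.
# Assumes librosa; "clip.wav" is a placeholder file name.
import numpy as np
import librosa

waveform, sample_rate = librosa.load("clip.wav", sr=22050)
mel = librosa.feature.melspectrogram(y=waveform, sr=sample_rate, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)

# log_mel has shape (n_mels, n_frames) and can be fed to an image model,
# e.g. stacked to 3 channels for a pretrained CNN.
print(log_mel.shape)
```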
