Solved – Can the Apriori algorithm be applied effectively to an extremely small dataset?

I have a medical dataset of just 50 samples and I need to study the relationships between several features. I want to use the Apriori algorithm for this, but I am not sure whether it will be effective on such a small dataset. Are there any other algorithms I could use to find these relationships?

Be careful when searching for "patterns" in such tiny data.

First of all, Apriori does work on small data, too. There is nothing in the algorithm that requires huge data (and in fact, Apriori does not always scale well).
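To illustrate that nothing in the algorithm depends on data size, here is a minimal pure-Python sketch of Apriori run on a handful of records. The function name `apriori`, the `min_count` parameter, and the toy records are made up for illustration; this is not a reference implementation, and a library version would be the better choice in practice.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset(itemset): count} for every itemset that
    appears in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count the surviving candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result

# Hypothetical toy records, invented purely for illustration.
records = [
    {"smoker", "hypertension", "diabetes"},
    {"smoker", "hypertension"},
    {"smoker", "hypertension", "obesity"},
    {"diabetes", "obesity"},
    {"smoker", "hypertension", "diabetes"},
    {"obesity"},
]
itemsets = apriori(records, min_count=3)
for itemset, count in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

The threshold is given as an absolute count rather than a fraction, which makes it easier to reason about when there are only a few dozen samples.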

The reason Apriori and similar algorithms require a minimum support is to avoid false positives. You should not accept a low support threshold with such tiny data. Consider requiring at least 10 (better 20) of your 50 samples to show the pattern. Otherwise you will get too many random results, and as you may know, medicine already has too many irreproducible results. So whatever patterns you find: a) run a statistical test to check their validity, and perform bias correction (with this sample size I doubt you will get anywhere near p<0.001); b) accept that you may not find anything; c) do a careful follow-up study to confirm anything you find!
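As a rough sketch of step a), the snippet below runs a one-sided Fisher's exact test on a single candidate pattern "A implies B" and then applies a Bonferroni correction. The choice of Fisher's test and of Bonferroni is my assumption, used as a simple stand-in for the test and correction mentioned above. The cell counts and the number of patterns tested are invented for illustration.

```python
from math import comb  # requires Python 3.8+

def fisher_one_sided(a, b, c, d):
    """One-sided p-value for positive association in the 2x2 table
    [[a, b], [c, d]]: the probability of seeing a or more in the
    top-left cell when all row and column totals are held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a, upper + 1))
    return tail / total

# Hypothetical counts for one pattern in 50 samples (assumed values):
#   a = A and B, b = A without B, c = B without A, d = neither
a, b, c, d = 12, 3, 8, 27
p_raw = fisher_one_sided(a, b, c, d)

# Bonferroni: multiply by the number of patterns that were tested.
# 40 is an assumed figure; use the real number of candidate rules.
n_tested = 40
p_adj = min(1.0, p_raw * n_tested)

print(f"support count = {a}, raw p = {p_raw:.2g}, adjusted p = {p_adj:.2g}")
```

Even a pattern backed by 12 of 50 samples loses much of its apparent significance once the number of candidate patterns is taken into account, which is why a high support threshold and a confirmatory study both matter.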
