Solved – Is a one-class naive Bayes possible?

I have a simple question – I think.

I have recently read a paper:

It uses a one-class naive Bayes. My question is: can I do the same as the one-class multinomial naive Bayes, but using a Gaussian distribution?

The paper above used a threshold to identify the class of interest in a test dataset.

If I make the following assumptions:

The standard deviation is greater than one for my features in the training data

Sum the logs of the Gaussian pdfs over all variables for each sample

Could I then use a threshold, some number of standard deviations derived from the normal (maybe 3), to identify data points that are close to my one-class training data?
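
Roughly, I am picturing something like the sketch below (the function names, the synthetic data, and the 3-standard-deviation cut-off on the training scores are only one way to illustrate the idea, not taken from the paper):

```python
import numpy as np
from scipy.stats import norm

def fit_one_class_gaussian(X_train):
    """Estimate per-feature mean and standard deviation from the single training class."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0, ddof=1)
    return mu, sigma

def log_likelihood(X, mu, sigma):
    """Naive-Bayes score: sum of per-feature Gaussian log-pdfs for each sample."""
    return norm.logpdf(X, loc=mu, scale=sigma).sum(axis=1)

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(200, 4))   # one-class training data (synthetic)

mu, sigma = fit_one_class_gaussian(X_train)
train_scores = log_likelihood(X_train, mu, sigma)

# Cut-off: 3 standard deviations below the mean training score (illustrative choice)
threshold = train_scores.mean() - 3 * train_scores.std(ddof=1)

X_test = rng.normal(loc=5.0, scale=2.0, size=(10, 4))
accepted = log_likelihood(X_test, mu, sigma) >= threshold  # True = close to the training class
print(accepted)
```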

According to the paper One-class document classification via Neural Networks by Manevitz and Yousef, it seems to be possible to construct a one-class Naive Bayes classifier, even without a standard deviation.

I quote the relevant passage where the authors describe how to implement the core of the classifier:

We calculate $$p(d|E)$$ as the product of $$p(w|E)$$ for all words in the dictionary that appear in the document $$d$$. Each of the $$p(w|E)$$ is estimated independently using the formula:

$$p(w|E) = \dfrac{n_w + 1}{n + |\text{dictionary}|}$$,

where $$n_w$$ is the number of times word $$w$$ occurs in $$E$$, and $$n$$ is the total number of words in $$E$$. We calculate a threshold $$delta$$ by the minimum over all examples in $$E$$, of the value $$p(d|E)$$ for each document in the set of examples. Then we experiment with values $$lambdacdotdelta$$ for $$0 < lambda leq 1$$ as in the previous algorithms using $$F_1$$ to find the optimal threshold for acceptance. That is, given a new document $$d$$, we accept it if the calculated value $$p(d|E)$$ is larger than the determined $$lambdacdotdelta$$. For this classifier algorithm we store $$delta$$ and $$lambda$$.

A more detailed picture of the algorithm is given in the doctoral dissertation Characteristic Concept Representations by Piew Datta.