Naive Bayes is one of the simplest algorithms used in Machine Learning, yet it is reliable, particularly for Natural Language Processing (NLP) problems.
Naive Bayes classifiers calculate the probability that a sample belongs to a certain category, based on prior knowledge. They apply Bayes' Theorem together with the ‘naive’ assumption that the effect of each feature of a sample is independent of the other features.
That means that each feature of a sample contributes independently to the probability of each category, and the classifier outputs the category with the highest probability for that sample.
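In symbols, and assuming a sample with features \(x_1, \dots, x_n\) and a candidate category \(C\) (notation introduced here only for illustration), the independence assumption lets the rule be written as:

\[
P(C \mid x_1, \dots, x_n) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C)
\]

The classifier then picks the category \(C\) for which this product is largest.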
To put it in an example: we might consider an animal to be a dog because it has four legs and a tail. But a Naive Bayes classifier would also classify as a dog everything else that has four legs or a tail, for instance a squirrel. Naive Bayes doesn’t take into consideration the relationships between variables (we know that having four legs or a tail does not, by itself, make an animal a dog), and that is the reason it is called ‘Naive’.
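Here is a minimal sketch of that idea in Python. All the probabilities, the feature names and the helper functions `naive_score` and `classify` are made up for illustration; they are not taken from any real data set. Each feature simply multiplies into the score on its own, so an animal with four legs and a tail ends up labelled as a dog:

```python
from math import prod

# Hypothetical, hand-picked probabilities (illustration only).
priors = {"dog": 0.5, "squirrel": 0.5}
likelihoods = {
    "dog": {"four_legs": 0.9, "tail": 0.95},
    "squirrel": {"four_legs": 0.9, "tail": 0.9},
}

def naive_score(category, features):
    """Prior times the product of the per-feature likelihoods (unnormalized)."""
    return priors[category] * prod(likelihoods[category][f] for f in features)

def classify(features):
    """Return the normalized probability of every category."""
    scores = {c: naive_score(c, features) for c in priors}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

posterior = classify(["four_legs", "tail"])
best = max(posterior, key=posterior.get)
print(posterior, best)
```

With these made-up numbers the two categories score almost the same, and ‘dog’ wins by a small margin, because nothing in the model looks at how the features relate to each other.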
The most common types of Naive Bayes classifiers, according to the probability distribution they assume for the features, are:
- Gaussian Naive Bayes: assumes that the features follow a Gaussian (normal) distribution, which suits continuous values.
- Multinomial Naive Bayes: assumes that the features follow a multinomial distribution, which suits counts such as word frequencies.
- Bernoulli Naive Bayes: assumes that the features follow a Bernoulli distribution, which suits binary features such as whether a word is present or not.
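As a rough sketch, the difference between these variants comes down to how the likelihood of the features given a category is computed. The function names and parameters below are hypothetical helpers written for this post, not the API of any library:

```python
import math

def gaussian_likelihood(x, mean, var):
    """Density of a continuous feature value x under a normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def multinomial_likelihood(counts, probs):
    """Likelihood of feature counts, leaving out the multinomial coefficient
    (it is the same for every category, so it does not change the winner)."""
    return math.prod(p ** n for n, p in zip(counts, probs))

def bernoulli_likelihood(present, probs):
    """Likelihood of binary features: p if the feature is present, 1 - p if not."""
    return math.prod(p if x else 1 - p for x, p in zip(present, probs))

print(gaussian_likelihood(0.0, mean=0.0, var=1.0))
print(multinomial_likelihood([2, 1], [0.5, 0.5]))
print(bernoulli_likelihood([1, 0], [0.8, 0.3]))
```

In practice the means, variances and probabilities would be estimated from the training data for each category.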
How is it used in NLP?
Let me explain it with a visual example:
Given the following data:
Our task is to determine whether the text ‘I feel so terrible’ belongs to the ‘mood’ category or to the ‘not mood’ category.
Naive Bayes will count the occurrences of each word of the text in the data we have for each category. It will find that the words “I” and “feel” appear in the texts classified as ‘mood’.
It will count 2 occurrences for the ‘mood’ category and none for the ‘not mood’ category, so it identifies the text ‘I feel so terrible’ as belonging to the ‘mood’ category.
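The counting step can be sketched in a few lines of Python. The training texts below are invented for this sketch and are not the data from the example above. The helper names (`matches`, `log_score`, `classify`) are also hypothetical. Besides the plain word matching described above, the sketch includes a score with add-one (Laplace) smoothing, a common addition that stops unseen words such as “so” and “terrible” from turning the whole probability into zero:

```python
import math
from collections import Counter

# Hypothetical training texts, invented for illustration.
training = [
    ("I feel happy", "mood"),
    ("what a sad day", "mood"),
    ("the train leaves at noon", "not mood"),
    ("cats are mammals", "not mood"),
]

def tokenize(text):
    return text.lower().split()

# Word counts per category, and number of texts per category.
word_counts = {}
doc_counts = Counter()
for text, label in training:
    word_counts.setdefault(label, Counter()).update(tokenize(text))
    doc_counts[label] += 1

vocabulary = {w for counts in word_counts.values() for w in counts}

def matches(text, label):
    """Number of words of the text that were seen in the given category."""
    return sum(1 for w in tokenize(text) if word_counts[label][w] > 0)

def log_score(text, label):
    """log P(label) plus the sum of log P(word | label), with add-one smoothing."""
    total = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for w in tokenize(text):
        score += math.log((word_counts[label][w] + 1) / (total + len(vocabulary)))
    return score

def classify(text):
    return max(word_counts, key=lambda label: log_score(text, label))

query = "I feel so terrible"
print(matches(query, "mood"), matches(query, "not mood"), classify(query))
```

Logarithms are used so that multiplying many small probabilities does not underflow; the category with the highest log score is the same one that has the highest probability.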
As you can see, this approach is not good at handling negations and sarcasm in texts, because it only looks at which words appear and not at how they relate to each other. That is one of the reasons research in the NLP field is still ongoing. Overall, though, despite its simplicity, Naive Bayes is a good choice for text analysis.
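A tiny sketch shows why negation is a problem. The two sentences and the `bag_of_words` helper are made up for illustration. Once a text is reduced to word counts, word order is lost, so two sentences with opposite meanings can look identical to the classifier:

```python
from collections import Counter

def bag_of_words(text):
    """Word counts of a text, ignoring case, commas and word order."""
    return Counter(text.lower().replace(",", "").split())

a = bag_of_words("I am happy, not sad")
b = bag_of_words("I am sad, not happy")
print(a == b)
```

Both sentences produce the same counts, so any classifier that only sees the counts has to give them the same category.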
I hope this ELI5 (explain like I’m 5) introduction to the Naive Bayes algorithm is clear enough. If not, comment below and I’ll be happy to help clarify any doubts 🙂