Machine Learning Part 10: Naive Bayes Classification



Naive Bayes is a simple yet powerful classification algorithm based on Bayes’ Theorem. It’s particularly popular for text classification and spam filtering.

Bayes’ Theorem ⚱

The heart of this algorithm is Bayes’ Theorem, which describes the probability of an event based on prior knowledge of conditions that might be related to the event.

$$P(B|A) = \frac{P(A|B) \cdot P(B)}{P(A)}$$

In machine learning terms, we can think of this as: $$P(\text{Class} | \text{Data}) = \frac{P(\text{Data} | \text{Class}) \cdot P(\text{Class})}{P(\text{Data})}$$
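As a rough sketch, the formula above can be turned into a few lines of Python. The function name `posterior` and the spam/ham numbers are made up for illustration. The sketch also assumes the listed classes cover every possibility, so that $P(\text{Data})$ can be obtained by summing the numerators over all classes.

```python
def posterior(priors, likelihoods):
    """Return P(class | data) for each class.

    priors:      {class: P(class)}
    likelihoods: {class: P(data | class)}
    """
    # Numerator of Bayes' Theorem for each class: P(data | class) * P(class).
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Assumption: the classes are exhaustive, so P(data) is the sum of the numerators.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers for a two-class spam filter.
result = posterior(
    priors={"spam": 0.4, "ham": 0.6},
    likelihoods={"spam": 0.5, "ham": 0.1},
)
print(result)
```

Dividing by the evidence only rescales the scores so they sum to 1; the class with the largest numerator is also the class with the largest posterior.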

Why “Naive”?

The algorithm is called “Naive” because it assumes that all features are independent of each other once the class is known. In reality, features are often related (e.g., if you see “Machine” in a text, “Learning” is much more likely to appear too), but despite this “naive” assumption, the algorithm works surprisingly well!
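To make the independence assumption concrete, here is a minimal sketch of how it simplifies the computation: the likelihood of a whole text becomes the product of per-word probabilities. The word probabilities, the priors, and the names `word_probs`, `naive_score` and `classify` are all invented for this example; a real classifier would estimate the probabilities from training data.

```python
import math

# Hypothetical values of P(word | class), made up for illustration.
word_probs = {
    "spam": {"free": 0.30, "money": 0.20, "meeting": 0.01},
    "ham": {"free": 0.03, "money": 0.02, "meeting": 0.20},
}
# Hypothetical class priors P(class).
priors = {"spam": 0.4, "ham": 0.6}


def naive_score(words, label):
    # Naive assumption: P(w1, w2, ... | class) = P(w1 | class) * P(w2 | class) * ...
    likelihood = math.prod(word_probs[label][w] for w in words)
    # Numerator of Bayes' Theorem; P(Data) is the same for every class, so it is skipped.
    return likelihood * priors[label]


def classify(words):
    # Pick the class with the highest score.
    return max(priors, key=lambda label: naive_score(words, label))


print(classify(["free", "money"]))
print(classify(["meeting"]))
```

With these made-up numbers, a message containing “free” and “money” scores higher as spam, while one containing “meeting” scores higher as ham.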


Example: Are you ill? 🤒

Imagine a new rare disease called “Scarius.” You have a fever—what is the probability you actually have Scarius?

  • P(Scarius): The probability of having the disease (it’s rare: 0.0001).
  • P(Fever | Scarius): The probability that you have a fever if you have the disease (let’s say 0.96).
  • P(Fever): The probability that any random person has a fever (let’s say 0.1).

Calculation: $$P(\text{Scarius} | \text{Fever}) = \frac{0.96 \cdot 0.0001}{0.1} = 0.00096$$

Even though you have the symptom, the probability you have this specific rare disease is still less than 0.1%! This shows why considering the “Prior” ($P(B)$, here $P(\text{Scarius})$) is so important.
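The same arithmetic can be checked in Python. This is only a sketch using the numbers from the example; the variable names are chosen here for readability.

```python
# Values taken from the Scarius example.
p_scarius = 0.0001            # prior: P(Scarius)
p_fever_given_scarius = 0.96  # likelihood: P(Fever | Scarius)
p_fever = 0.1                 # evidence: P(Fever)

# Bayes' Theorem: P(Scarius | Fever) = P(Fever | Scarius) * P(Scarius) / P(Fever)
p_scarius_given_fever = p_fever_given_scarius * p_scarius / p_fever

print(f"{p_scarius_given_fever:.5f}")  # 0.00096
print(f"{p_scarius_given_fever:.3%}")  # 0.096%
```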

Summary

Naive Bayes is:

  • Fast: It trains quickly and can work with a relatively small amount of training data.
  • Scalable: It handles high-dimensional data well.
  • Effective: Great for things like spam detection or sentiment analysis.

Exercise

Try to derive Bayes’ Theorem using a Venn diagram. It’s a great way to visualize how conditional probability works!

Written by

Abdur-Rahmaan Janhangeer

Chef

Python author of 9+ years having worked for Python companies around the world
