Machine Learning Part 12: Association Analysis Explained


Association Analysis is an unsupervised learning technique used to discover interesting relationships hidden in large datasets. It’s most famous for its use in “Market Basket Analysis.”

The Goal 🛒

Imagine you own a grocery store. You want to know which items are frequently bought together. If you find that customers who buy bread also tend to buy butter, you can place them next to each other to increase sales.

Key Terms:

  • Itemset: A collection of one or more items (e.g., {Bread, Milk, Butter}).
  • Support: How frequently an itemset appears in the dataset.
  • Confidence: How often item B is bought in transactions where item A is bought.

Measuring Strength: Support, Confidence, and Lift ✳

1. Support

The fraction of all transactions that contain the itemset (often quoted as a percentage). $$Support(A) = \frac{\text{Number of transactions containing A}}{\text{Total transactions}}$$
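To make the formula concrete, here is a minimal sketch in Python. The `transactions` list and the `support` helper are made up for illustration; they are not from a real store or a specific library.

```python
# Hypothetical toy dataset: each transaction is the set of items in one basket.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Milk"},
    {"Milk", "Eggs"},
    {"Bread", "Butter", "Eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    # `itemset <= t` is True when itemset is a subset of the basket t.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"Bread"}, transactions))            # 4 of 5 baskets -> 0.8
print(support({"Bread", "Butter"}, transactions))  # 3 of 5 baskets -> 0.6
```

Storing each basket as a set keeps the "does this basket contain the itemset?" check to a single subset comparison.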

2. Confidence

The likelihood that item B is purchased given that item A is purchased. $$Confidence(A \rightarrow B) = \frac{Support(A \cup B)}{Support(A)}$$
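As a quick worked example with made-up numbers: suppose a store logs 5 transactions, 4 of which contain bread and 3 of which contain both bread and butter. Then $$Confidence(\{Bread\} \rightarrow \{Butter\}) = \frac{Support(\{Bread, Butter\})}{Support(\{Bread\})} = \frac{3/5}{4/5} = 0.75$$

In other words, 75% of the baskets that contain bread also contain butter.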

3. Lift

Lift measures how much more likely item B is to be bought given item A, compared to how often B is bought anyway. $$Lift(A \rightarrow B) = \frac{Support(A \cup B)}{Support(A) \cdot Support(B)}$$

  • Lift = 1: A and B are independent.
  • Lift > 1: A and B are positively associated (A makes B more likely).
  • Lift < 1: A and B are negatively associated (A makes B less likely).
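The three measures can be sketched together in a few lines of Python. The dataset and the function names (`support`, `confidence`, `lift`) are hypothetical, chosen only to mirror the formulas above.

```python
# Hypothetical toy dataset: each transaction is the set of items in one basket.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Milk"},
    {"Milk", "Eggs"},
    {"Bread", "Butter", "Eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(a, b, transactions):
    """Support(A and B together) / Support(A). Fails if A never occurs."""
    return support(set(a) | set(b), transactions) / support(a, transactions)


def lift(a, b, transactions):
    """Support(A and B together) / (Support(A) * Support(B))."""
    joint = support(set(a) | set(b), transactions)
    return joint / (support(a, transactions) * support(b, transactions))


# Bread -> Butter: confidence is about 0.75 and lift is about 1.25 (> 1).
print(confidence({"Bread"}, {"Butter"}, transactions))
print(lift({"Bread"}, {"Butter"}, transactions))

# Milk -> Butter: lift is below 1, so the pair is negatively associated here.
print(lift({"Milk"}, {"Butter"}, transactions))
```

Note that lift is symmetric: swapping A and B gives the same value, whereas confidence generally changes.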

The Apriori Algorithm ✳

Checking every possible combination of items in a large store would be incredibly slow: with n distinct items there are 2^n − 1 possible non-empty itemsets. The Apriori Algorithm simplifies this by using the “Apriori Principle”:

If an itemset is frequent, then all of its subsets must also be frequent.

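Turned around, this means that if an itemset is infrequent, every larger itemset that contains it must be infrequent too. This allows the algorithm to “prune” (skip) all of those larger combinations without ever counting them, making the analysis much faster.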
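Here is a simplified sketch of the idea in Python. It is a teaching version, not an optimized or library-grade implementation; the `apriori` function name, the `min_support` parameter, and the toy dataset are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(1 for t in baskets if itemset <= t) / n

    # Level 1: keep only the frequent single items.
    items = {item for t in baskets for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join: build size-k candidates from pairs of frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count support only for the candidates that survived pruning.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy dataset.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Milk"},
    {"Milk", "Eggs"},
    {"Bread", "Butter", "Eggs"},
]

for itemset, s in sorted(apriori(transactions, 0.4).items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

With a minimum support of 0.4 on this data, {Bread, Butter, Milk} is never counted: its subset {Butter, Milk} appears in only 1 of 5 baskets, so the candidate is pruned before the dataset is scanned for it.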

Summary

Association analysis helps businesses understand customer behavior and optimize store layouts, recommendation engines, and marketing campaigns.

Exercise: Look up “Market Basket Analysis” examples. You’ll find interesting (and sometimes weird) stories about items people frequently buy together!

Written by

Abdur-Rahmaan Janhangeer

Chef

Python author of 9+ years having worked for Python companies around the world
