Association rule mining is a data mining technique that aims to discover interesting relationships, patterns, and correlations within large datasets. It focuses on identifying strong associations between different items or variables in the data. It presents these associations in the form of if-then
rules, commonly known as association rules.
An association rule consists of an antecedent (if
part) and a consequent (then
part). The dataset contains an antecedent, and we derive a consequent by using the antecedent.
Association rules are carefully derived from the dataset. Several metrics are commonly used to evaluate the performance of association rule mining algorithms. Let us consider the following transaction table.
Transaction ID | Items purchased |
1 | Item1, Item2 |
2 | Item1, Item3 ,Item4, Item5 |
3 | Item2, Item3, Item4, Item6 |
4 | Item1, Item2, Item3, Item4 |
5 | Item1, Item2, Item3, Item6 |
Support is the proportion of transactions in the dataset that contain a specific itemset. It indicates the frequency with which the itemset appears in the data. Higher support values indicate that the rule is more common or significant in the dataset. Rules with low support are considered less relevant.
In the above table, we have:
There are five transactions; three of those have
Out of five transactions,
It is the ratio of the number of transactions containing both the antecedent and the consequent to the number of transactions containing only the antecedent.
It is not symmetric, meaning the confidence for
is not the same as .
Lift quantifies how likely the consequent is to occur when the antecedent is present compared to when the two events are independent.
Different fields use association rules for analysis and calculations. Some examples are:
Market basket analysis: In market basket analysis, association rule mining finds items that customers frequently buy together to boost sales and meet some business objectives.
Healthcare: Researchers use association rules to analyze patient data, identify co-occurring medical conditions, and discover potential risk factors. This information aids in disease diagnosis, treatment planning, and medical research.
In conclusion, association rule mining is a technique that discovers interesting relationships and patterns within large datasets. It involves identifying frequent itemsets, which are combinations of items that appear together frequently in the data, and generating association rules from these frequent itemsets.
Free Resources