What is FP-Growth?

Overview

Frequent pattern-growth (FP-Growth) is the mining of pattern itemsets, subsequences, and substructures that appear frequently in a dataset.

A Frequent itemset refers to the most common items bought together. A Subsequence where items are bought by a customer is called a frequent sequential pattern.

A Substructure refers to different structural forms, such as subgraphs and subtrees, that are combined with frequent structural patterns to analyze and find relations with different items of frequent itemsets.

Note: An example of frequent itemset mining is Market Basket Analysis.

Generating Frequent Pattern-Growth

To generate frequent pattern-growth, we do the following:

  • In the first scan of the database, we try to find the minimum support of itemsets.
  • If the frequency of items is less than the specified minimum support, we simply delete them from the database.
  • The set of frequent items is sorted in the descending order of support.
  • Then, an FP-tree is constructed.
  • The initial node of the FP-tree is called the root node. It is labeled null.
  • The items in the transactions are processed when the node is created.
  • Whenever we try to pass the itemset, we can create a child node and expand it if the itemset follows the root node.
  • If a branch encounters a common prefix for the existing path, we increment the value of the root node.
  • We repeat the process for each itemset until we reach the end of the itemsets in the transaction.
  • To facilitate tree traversal, we build an item header table so that each item points to occurrences in thetree via a chain of node-links.

1 of 6

Frequent Item Sets are:

  1. {K,E,M,O:1},{K,E,O:1},{K,M:1}}
  2. {{K,E,M:1},{K,E:2}}
  3. {{K,E:2},{K:1}}
  4. {{K:4}}

Free Resources