What is the Apriori Algorithm?
The Apriori algorithm is an unsupervised machine learning algorithm that generates association rules from a given data set. An association rule states that if an item A occurs, then item B also occurs with a certain probability. Most association rules are expressed in IF-THEN format. For example, IF people buy an iPad THEN they also buy an iPad case to protect it. To derive such a rule, the algorithm first counts how many people bought an iPad case while purchasing an iPad. This yields a ratio: for example, out of 100 people who purchased an iPad, 85 also purchased an iPad case, which gives the rule a confidence of 85%.
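The ratio described above is usually called the confidence of a rule. As a minimal sketch, assuming transactions are stored as Python sets, it could be computed as follows. The `confidence` helper and the `transactions` list are hypothetical and only mirror the iPad example; they are not taken from any particular library.

```python
def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Keep only the transactions in which the IF part of the rule holds.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    # Of those, count the ones in which the THEN part also holds.
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Made-up data mirroring the example: 100 iPad buyers, 85 of whom also bought a case.
transactions = [{"iPad", "iPad Case"}] * 85 + [{"iPad"}] * 15

rule_confidence = confidence(transactions, {"iPad"}, {"iPad Case"})
print(rule_confidence)  # 0.85
```

A rule is typically kept only if its confidence clears a threshold chosen by the analyst.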
Key Concepts Of Apriori Algorithm
Frequent Itemsets: Itemsets whose support meets or exceeds the minimum support threshold.
Apriori Property: Any subset of a frequent itemset must be frequent.
Join Operation: To find L(k), the set of frequent k-itemsets, a set of candidate k-itemsets is generated by joining L(k-1) with itself.
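The join operation and the Apriori property can be illustrated together. The sketch below is a simplified variant under stated assumptions: it joins any two (k-1)-itemsets whose union has exactly k items, whereas the classic formulation joins only itemsets that share their first k-2 items. The function name `generate_candidates` and the sample itemsets are hypothetical.

```python
from itertools import combinations


def generate_candidates(prev_frequent, k):
    """Join L(k-1) with itself to build candidate k-itemsets, then prune."""
    prev_frequent = set(prev_frequent)
    candidates = set()
    for a in prev_frequent:
        for b in prev_frequent:
            union = a | b
            if len(union) != k:
                continue
            # Apriori property: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(s) in prev_frequent for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Hypothetical frequent 2-itemsets, L(2).
L2 = {
    frozenset(pair)
    for pair in [("bread", "butter"), ("bread", "milk"), ("butter", "milk"), ("bread", "jam")]
}

C3 = generate_candidates(L2, 3)
print(C3)
```

Here only {bread, butter, milk} survives as a candidate 3-itemset. Any candidate containing jam is pruned because one of its 2-item subsets, such as {butter, jam}, is not in L(2).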
Applications of Apriori Algorithm
Google auto-complete is a popular application of the Apriori algorithm: when the user types a word, the search engine looks for other associated words that people usually type after that word.
Market Basket Analysis
Many e-commerce giants like Amazon use Apriori to draw data insights on which products are likely to be purchased together and which are most responsive to promotion. For example, a retailer might use Apriori to predict that people who buy sugar and flour are also likely to buy another related product.
Detecting Adverse Drug Reactions
The Apriori algorithm is used for association analysis on healthcare data such as the drugs taken by patients, the characteristics of each patient, the adverse effects patients experience, the initial diagnosis, etc. This analysis produces association rules that help identify the combinations of patient characteristics and medications that lead to adverse side effects of the drugs.
Basic principle on which the Apriori machine learning algorithm works:
- If an item set occurs frequently, then all subsets of the item set also occur frequently.
- If an item set occurs infrequently, then all supersets of the item set also occur infrequently.
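These two principles are what allow a level-wise search to skip most of the possible item sets. The following is a minimal sketch of such a search, assuming support is measured as a raw transaction count; the function `apriori`, the parameter `min_support`, and the `baskets` data are illustrative assumptions rather than a reference implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate; keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: single items.
    current = count({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Build size-k candidates only from frequent (k-1)-itemsets, and drop any
        # candidate that has an infrequent (k-1)-subset (second principle above).
        candidates = {
            a | b
            for a in prev
            for b in prev
            if len(a | b) == k
            and all(frozenset(s) in prev for s in combinations(a | b, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Made-up shopping baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "jam"},
]

result = apriori(baskets, min_support=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With these baskets, jam appears only once, so no item set containing jam is ever counted. Likewise {bread, butter, milk} is never counted, because its subset {butter, milk} is already known to be infrequent.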
Advantages of Apriori Algorithm
- It is easy to implement and can be parallelized easily.
- It makes use of large (frequent) item set properties to prune the search space.