arules

Mining Association Rules and Frequent Itemsets with R

The arules package for R provides the infrastructure for representing, manipulating, and analyzing transaction data and patterns using frequent itemsets and association rules. The package also provides a wide range of interest measures and mining algorithms, including Christian Borgelt’s popular and efficient C implementations of the association mining algorithms Apriori and Eclat. Additional mining algorithms are available via fim4r.

Code examples can be found in Chapter 5 of the web book R Companion for Introduction to Data Mining.
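
As a rough illustration of the interest measures mentioned above, the following base R sketch computes support, confidence, and lift by hand for a single rule. The `baskets` data and the `support()` helper are hypothetical and exist only for this example; they are not part of arules, which computes these measures far more efficiently.

```r
# Toy transactions (hypothetical data, not shipped with arules)
baskets <- list(
    c("bread", "milk"),
    c("bread", "butter"),
    c("bread", "milk", "butter"),
    c("milk")
)

# Hypothetical helper: fraction of transactions containing all `items`
support <- function(items) {
    mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Measures for the rule {bread} => {milk}
supp_rule <- support(c("bread", "milk"))    # support of the whole rule
conf_rule <- supp_rule / support("bread")   # P(milk | bread)
lift_rule <- conf_rule / support("milk")    # confidence relative to the baseline

c(support = supp_rule, confidence = conf_rule, lift = lift_rule)
```

A lift below 1, as in this toy example, suggests that the left-hand side makes the right-hand side slightly less likely than its overall frequency would predict.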

arules core packages:

Additional mining algorithms

In-database analytics

Interface

Classification

Outlier Detection

Recommendation/Prediction

Installation

Stable CRAN version: Install from within R with

install.packages("arules")

Current development version: Install from r-universe.

install.packages("arules", repos = "https://mhahsler.r-universe.dev")

Usage

Load the package and mine some association rules.

library("arules")
data("IncomeESL")

trans <- transactions(IncomeESL)
trans
## transactions in sparse format with
##  8993 transactions (rows) and
##  84 items (columns)
rules <- apriori(trans, supp = 0.1, conf = 0.9, target = "rules")
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.9    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 899 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[84 item(s), 8993 transaction(s)] done [0.01s].
## sorting and recoding items ... [42 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 done [0.02s].
## writing ... [457 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
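
The "Absolute minimum support count" line in the output above can be related to the `supp` argument with a little arithmetic. The sketch below assumes that the relative threshold is converted to a count by rounding down, which matches the reported value but is not confirmed here as the exact rule used internally.

```r
# Numbers taken from the output above
n_transactions <- 8993
min_support <- 0.1

# Assumption: the relative threshold is truncated to a whole count
min_count <- floor(min_support * n_transactions)
min_count
```

With this threshold, an itemset has to appear in at least that many of the 8993 transactions to be considered frequent.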

Inspect the rules with the highest lift.

inspect(head(rules, n = 3, by = "lift"))
##     lhs                           rhs                      support confidence coverage lift count
## [1] {dual incomes=no,                                                                            
##      householder status=own}   => {marital status=married}    0.10       0.97     0.10  2.6   914
## [2] {years in bay area=>10,                                                                      
##      dual incomes=yes,                                                                           
##      type of home=house}       => {marital status=married}    0.10       0.96     0.10  2.6   902
## [3] {dual incomes=yes,                                                                           
##      householder status=own,                                                                     
##      type of home=house,                                                                         
##      language in home=english} => {marital status=married}    0.11       0.96     0.11  2.6   988
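
Beyond looking at the top rules, it is often useful to filter and reorder a rule set. The following is a minimal sketch using `subset()`, `sort()`, and `quality()` from arules; the lift threshold of 2 is an arbitrary value chosen for illustration.

```r
library("arules")
data("IncomeESL")

trans <- transactions(IncomeESL)
rules <- apriori(trans, supp = 0.1, conf = 0.9, target = "rules",
    control = list(verbose = FALSE))

# Keep only rules with a lift of at least 2 (illustrative threshold)
strong <- subset(rules, subset = lift >= 2)

# Order the remaining rules by confidence, highest first
strong <- sort(strong, by = "confidence")

# quality() returns the interest measures as a data.frame
head(quality(strong), n = 3)
```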

Using arules with tidyverse

arules works seamlessly with tidyverse. For example, we can remove the ethnic information column before creating transactions and then mine and inspect rules.

library("tidyverse")
library("arules")
data("IncomeESL")

trans <- IncomeESL %>%
    select(-`ethnic classification`) %>%
    transactions()
rules <- trans %>%
    apriori(supp = 0.1, conf = 0.9, target = "rules", control = list(verbose = FALSE))
rules %>%
    head(n = 3, by = "lift") %>%
    inspect()
##     lhs                           rhs                      support confidence coverage lift count
## [1] {dual incomes=no,                                                                            
##      householder status=own}   => {marital status=married}    0.10       0.97     0.10  2.6   914
## [2] {years in bay area=>10,                                                                      
##      dual incomes=yes,                                                                           
##      type of home=house}       => {marital status=married}    0.10       0.96     0.10  2.6   902
## [3] {dual incomes=yes,                                                                           
##      householder status=own,                                                                     
##      type of home=house,                                                                         
##      language in home=english} => {marital status=married}    0.11       0.96     0.11  2.6   988
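
Rules can also be moved into a data frame so that the usual dplyr verbs apply. The sketch below assumes that `DATAFRAME()` from arules returns one row per rule with the quality measures as columns; the lift cutoff is again only illustrative.

```r
library("tidyverse")
library("arules")
data("IncomeESL")

rules <- IncomeESL %>%
    transactions() %>%
    apriori(supp = 0.1, conf = 0.9, target = "rules", control = list(verbose = FALSE))

# Convert the rules to a tibble, then filter and sort with dplyr
rules_tbl <- rules %>%
    DATAFRAME() %>%
    as_tibble() %>%
    filter(lift > 2) %>%
    arrange(desc(confidence))

rules_tbl
```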

Using arules from Python

See Getting started with arules using Python.

Support

Please report bugs on GitHub. Questions should be posted on Stack Overflow and tagged with arules.
