Efficient strategies for parallel mining class association rules


Expert Systems with Applications 41 (2014) 4716–4729



http://dx.doi.org/10.1016/j.eswa.2014.01.038
0957-4174/© 2014 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +84 083974186. E-mail addresses: [email protected] (D. Nguyen), [email protected] (B. Vo), [email protected] (B. Le).

Dang Nguyen a, Bay Vo b,*, Bac Le c

a University of Information Technology, Vietnam National University, Ho Chi Minh, Viet Nam
b Information Technology Department, Ton Duc Thang University, Ho Chi Minh, Viet Nam
c Department of Computer Science, University of Science, Vietnam National University, Ho Chi Minh, Viet Nam


Keywords: Associative classification; Class association rule mining; Parallel computing; Data mining; Multi-core processor

Mining class association rules (CARs) is an essential, but time-intensive task in Associative Classification (AC). A number of algorithms have been proposed to speed up the mining process. However, sequential algorithms are not efficient for mining CARs in large datasets, while existing parallel algorithms require communication and collaboration among computing nodes, which introduces the high cost of synchronization. This paper addresses these drawbacks by proposing three efficient approaches for mining CARs in large datasets relying on parallel computing. To date, this is the first study which tries to implement an algorithm for parallel mining CARs on a computer with the multi-core processor architecture. The proposed parallel algorithm is theoretically proven to be faster than existing parallel algorithms. The experimental results also show that our proposed parallel algorithm outperforms a recent sequential algorithm in mining time.


1. Introduction

Classification is a common topic in machine learning, pattern recognition, statistics, and data mining. Therefore, numerous approaches based on different strategies have been proposed for building classification models. Among these strategies, Associative Classification (AC), which uses the associations between itemsets and class labels (called class association rules), has proven to be more accurate than traditional methods such as C4.5 (Quinlan, 1993) and ILA (Tolun & Abu-Soud, 1998; Tolun, Sever, Uludag, & Abu-Soud, 1999). The problem of classification based on class association rules is to find the complete set of CARs which satisfy the user-defined minimum support and minimum confidence thresholds from the training dataset. A subset of CARs is then selected to form the classifier. Since its first introduction in (Liu, Hsu, & Ma, 1998), numerous approaches have been proposed to solve this problem. Examples include the classification based on multiple association rules (Li, Han, & Pei, 2001), the classification model based on predictive association rules (Yin & Han, 2003), the classification based on the maximum entropy (Thabtah, Cowling, & Peng, 2005), the classification based on the information gain measure (Chen, Liu, Yu, Wei, & Zhang, 2006), the lazy-based approach for classification (Baralis, Chiusano, & Garza, 2008), the use of an equivalence class rule tree (Vo & Le, 2009), the classifier based on Galois connections between objects and rules (Liu, Liu, & Zhang, 2011), the lattice-based approach for classification (Nguyen, Vo, Hong, & Thanh, 2012), and the integration of taxonomy information into classifier construction (Cagliero & Garza, 2013).

However, most existing algorithms for associative classification have primarily concentrated on building an efficient and accurate classifier but have not carefully considered the runtime performance of discovering CARs in the first phase. In fact, finding all CARs is a challenging and time-consuming problem for two reasons. First, it may be hard to find all CARs in dense datasets since there are a huge number of generated rules. For example, in our experiments, some datasets can induce more than 4,000,000 rules. Second, the number of candidate rules to check is very large. Assuming there are d items and k class labels in the dataset, there can be up to k × (2^d − 1) rules to consider. Very few studies, for instance (Nguyen, Vo, Hong, & Thanh, 2013; Nguyen et al., 2012; Vo & Le, 2009; Zhao, Cheng, & He, 2009), have discussed the execution time efficiency of the CAR mining process. Nevertheless, all of these algorithms have been implemented with sequential strategies. Consequently, their runtime performance has not been satisfactory on large datasets, especially recently emerged dense datasets. Researchers have begun switching to parallel and distributed computing techniques to accelerate the computation. Two parallel algorithms for mining CARs were recently proposed on distributed memory systems (Mokeddem & Belbachir, 2010; Thakur & Ramesh, 2008).


With the advent of computers with multi-core processors, more memory and computing power have become available, so that larger datasets can be tackled in main memory at lower cost compared with distributed or mainframe systems. Therefore, this study aims to propose three efficient strategies for parallel mining CARs on multi-core processor computers. The proposed approaches overcome two disadvantages of existing methods for parallel mining CARs. They eliminate communication and collaboration among computing nodes, which introduces the overhead of synchronization. They also avoid data replication and do not require data transfer among processing units. As a result, the proposals significantly improve the response time compared to the sequential counterpart and existing parallel methods. The proposed parallel algorithm is theoretically proven to be more efficient than existing parallel algorithms. The experimental results also show that the proposed parallel algorithm can achieve up to a 2.1× speedup compared to a recent sequential CAR mining algorithm.

The rest of this paper is organized as follows. In Section 2, some preliminary concepts of the class association rule problem and the multi-core processor architecture are briefly given. The benefits of parallel mining on multi-core processor computers are also discussed in this section. Work related to sequential and parallel mining of class association rules is reviewed in Section 3. Our previous sequential CAR mining algorithm is summarized in Section 4 because it forms the basic framework of our proposed parallel algorithm. The primary contributions are presented in Section 5, in which three proposed strategies for efficiently mining classification rules in a high-performance parallel computing context are described. The time complexity of the proposed algorithm is analyzed in Section 6. Section 7 presents the experimental results, while conclusions and future work are discussed in Section 8.

2. Preliminary concepts

This section provides some preliminary concepts of the class association rule problem and the multi-core processor architecture. It also discusses the benefits of parallel mining on the multi-core processor architecture.

Table 1. Example of a dataset.

OID  A   B   C   Class
1    a1  b1  c1  1
2    a1  b1  c1  2
3    a2  b1  c1  2

2.1. Class association rule

One of the main goals of data mining is to discover important relationships among items, such that the presence of some items in a transaction is associated with the presence of some other items. To achieve this purpose, Agrawal and his colleagues proposed the Apriori algorithm to find association rules in a transactional dataset (Agrawal & Srikant, 1994). An association rule has the form X → Y where X, Y are frequent itemsets and X ∩ Y = ∅. The problem of mining association rules is to find all association rules in a dataset having support and confidence no less than user-defined minimum support and minimum confidence thresholds.

A class association rule is a special case of an association rule in which only the class attribute is considered in the rule's right-hand side (consequent). Mining class association rules is to find the set of rules which satisfy the minimum support and minimum confidence thresholds specified by end-users. Let us define the CAR problem as follows.

Let D be a dataset with n attributes {A1, A2, ..., An} and |D| records (objects), where each record has an object identifier (OID). Let C = {c1, c2, ..., ck} be a list of class labels. A specific value of an attribute Ai and of class C are denoted by the lower-case letters aim and cj, respectively.

Definition 1. An item is described as an attribute and a specific value for that attribute, denoted by ⟨(Ai, aim)⟩, and an itemset is a set of items.

Definition 2. Let I = {⟨(A1, a11)⟩, ..., ⟨(A1, a1m1)⟩, ⟨(A2, a21)⟩, ..., ⟨(A2, a2m2)⟩, ..., ⟨(An, an1)⟩, ..., ⟨(An, anmn)⟩} be a finite set of items. Dataset D is a finite set of objects, D = {OID1, OID2, ..., OID|D|}, in which each object OIDx has the form OIDx = attr(OIDx) ∧ class(OIDx) (1 ≤ x ≤ |D|) with attr(OIDx) ⊆ I and class(OIDx) ∈ C. For example, OID1 for the dataset shown in Table 1 is {⟨(A, a1)⟩, ⟨(B, b1)⟩, ⟨(C, c1)⟩} ∧ {1}.

Definition 3. A class association rule R has the form itemset → cj, where cj ∈ C is a class label.

Definition 4. The actual occurrence ActOcc(R) of rule R in D is the number of objects of D that match R's antecedent, i.e., ActOcc(R) = |{OID | OID ∈ D ∧ itemset ⊆ attr(OID)}|.

Definition 5. The support of rule R, denoted by Supp(R), is the number of objects of D that match R's antecedent and are labeled with R's class. Supp(R) is defined as:

Supp(R) = |{OID | OID ∈ D ∧ itemset ⊆ attr(OID) ∧ cj = class(OID)}|

Definition 6. The confidence of rule R, denoted by Conf(R), is defined as:

Conf(R) = Supp(R) / ActOcc(R)

A sample dataset is shown in Table 1. It contains three objects, three attributes (A, B, and C), and two classes (1 and 2). Consider rule R: ⟨(A, a1)⟩ → 1. We have ActOcc(R) = 2 and Supp(R) = 1 since there are two objects with A = a1, of which one (object 1) also contains class 1. We also have Conf(R) = Supp(R) / ActOcc(R) = 1/2.
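To make the definitions concrete, the computation for rule R can be sketched in Python (our illustrative sketch, not the authors' code; the dataset literal mirrors Table 1):

```python
# Table 1 as a list of (attributes, class) pairs; attributes map name -> value.
dataset = [
    ({"A": "a1", "B": "b1", "C": "c1"}, 1),  # OID 1
    ({"A": "a1", "B": "b1", "C": "c1"}, 2),  # OID 2
    ({"A": "a2", "B": "b1", "C": "c1"}, 2),  # OID 3
]

def act_occ(antecedent, data):
    # Number of objects whose attributes contain the rule's antecedent.
    return sum(1 for attrs, _ in data
               if all(attrs.get(a) == v for a, v in antecedent.items()))

def supp(antecedent, label, data):
    # Number of objects matching the antecedent AND carrying the rule's class.
    return sum(1 for attrs, cls in data
               if cls == label
               and all(attrs.get(a) == v for a, v in antecedent.items()))

def conf(antecedent, label, data):
    return supp(antecedent, label, data) / act_occ(antecedent, data)

# Rule R: <(A, a1)> -> 1
rule = {"A": "a1"}
print(act_occ(rule, dataset))  # 2
print(supp(rule, 1, dataset))  # 1
print(conf(rule, 1, dataset))  # 0.5
```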

2.2. Multi-core processor architecture

A multi-core processor (shown in Fig. 1) is a single computing component with two or more independent central processing units (cores) in the same physical package (Andrew, 2008). Processors were originally designed with only one core. However, multi-core processors became mainstream when Intel and AMD introduced their commercial multi-core chips in 2008 (Casali & Ernst, 2013). A multi-core processor computer has different specifications from either a computer cluster (Fig. 2) or an SMP (Symmetric Multi-Processor) system (Fig. 3): the memory is not distributed like in a cluster but rather is shared, similar to the SMP architecture. Many SMP systems, however, have a NUMA (Non-Uniform Memory Access) architecture: there are several memory blocks which are accessed at different speeds from each processor, depending on the distance between the memory block and the processor. On the contrary, multi-core processors usually follow the UMA (Uniform Memory Access) architecture: there is only one memory block, so all cores have an equal access time to the memory (Laurent, Négrevergne, Sicard, & Termier, 2012).

Fig. 1. Multi-core processor: one chip, two cores, two threads (Source: http://software.intel.com/en-us/articles/multi-core-processor-architecture-explained).

2.3. Parallel mining on the multi-core processor architecture

Obviously, the multi-core processor architecture has many desirable properties: each core has direct and equal access to all the system's memory, and the multi-core chip also allows higher performance at lower energy and cost. Therefore, numerous researchers in the data mining literature have developed parallel algorithms for the multi-core processor architecture. One of the first algorithms targeting multi-core processor computers was FP-array, proposed by Liu and his colleagues in 2007 (Liu, Li, Zhang, & Tang, 2007). The authors proposed two techniques, namely a cache-conscious FP-array and a lock-free dataset tiling parallelism mechanism, for parallel discovery of frequent itemsets on multi-core processor machines. Yu and Wu (2011) proposed an efficient load balancing strategy in order to reduce massive duplicated generated candidates. Their main contribution was to enhance the task of candidate generation in the Apriori algorithm on multi-core processor computers. Schlegel, Karnagel, Kiefer, and Lehner (2013) recently adapted the well-known Eclat algorithm to a highly parallel version which runs on multi-core processor systems. They proposed three parallel approaches for Eclat: independent class, shared class, and shared itemset. Parallel mining has also been widely adopted in many other research fields, such as closed frequent itemset mining (Negrevergne, Termier, Méhaut, & Uno, 2010), gradual pattern mining (Laurent et al., 2012), correlated pattern mining (Casali & Ernst, 2013), generic pattern mining (Negrevergne, Termier, Rousset, & Méhaut, 2013), and tree-structured data mining (Tatikonda & Parthasarathy, 2009).

Fig. 2. Computer cluster (Source: http://en.wikipedia.org/wiki/Distributed_computing).

While much research has been devoted to developing parallel pattern mining and association rule mining algorithms that rely on the multi-core processor architecture, no studies have been published regarding the parallel class association rule mining problem. Thus, this paper proposes the first algorithm for parallel mining CARs which can be executed efficiently on the multi-core processor architecture.

3. Related work

This section begins with an overview of some sequential versions of the CAR mining algorithm and then provides details about two parallel versions of it.

3.1. Sequential CAR mining algorithms

The first algorithm for mining CARs was proposed by Liu et al. (1998) based on the Apriori algorithm (Agrawal & Srikant, 1994). After its introduction, several other algorithms adopted its approach, including CAAR (Xu, Han, & Min, 2004) and PCAR (Chen, Hsu, & Hsu, 2012). However, these methods are time-consuming because they generate a lot of candidates and scan the dataset several times. Another approach for mining CARs is to build the frequent pattern tree (FP-tree) (Han, Pei, & Yin, 2000) to discover rules, which was presented in algorithms such as CMAR (Li et al., 2001) and L3 (Baralis, Chiusano, & Garza, 2004). The mining process used by the FP-tree does not generate candidate rules. However, its significant weakness lies in the fact that the FP-tree does not always fit in main memory. Several algorithms, MMAC (Thabtah, Cowling, & Peng, 2004), MCAR (Thabtah et al., 2005), and MCAR (Zhao et al., 2009), utilized the vertical layout of the dataset to improve the efficiency of the rule discovery phase by employing a method that extends the tidset intersection method mentioned in (Zaki, Parthasarathy, Ogihara, & Li, 1997). Vo and Le proposed another method for mining CARs by using an equivalence class rule tree (ECR-tree) (Vo & Le, 2009). An efficient algorithm, called ECR-CARM, was also proposed in their paper. The two strong features demonstrated by ECR-CARM are that it scans the dataset only once and uses the intersection of object identifiers to determine the support of itemsets quickly. However, it needs to generate and test a huge number of candidates because each node in the tree contains all values of a set of attributes. Nguyen et al. (2013) modified the ECR-tree structure to speed up the mining process. In their enhanced tree, named MECR-tree, each node contains only one value instead of the whole group. They also provided theorems to identify the support of child nodes and prune unnecessary nodes quickly. Based on the MECR-tree and these theorems, they presented the CAR-Miner algorithm for effectively mining CARs.

Fig. 3. Symmetric multi-processor system (Source: http://en.wikipedia.org/wiki/Symmetric_multiprocessing).

It can be seen that many sequential CAR mining algorithms have been developed, but very few parallel versions have been proposed. The next section reviews two parallel CAR mining algorithms which have been mentioned in the associative classification literature.

3.2. Parallel CAR mining algorithms

One of the primary weaknesses of sequential versions of CAR mining is that they are unable to provide scalability in terms of data dimension, size, or runtime performance for large datasets. Consequently, some researchers have recently tried to apply parallelism to current sequential CAR mining algorithms to release the sequential bottleneck and improve the response time. Thakur and Ramesh (2008) proposed a parallel version of the CBA algorithm (Liu et al., 1998). Their proposed algorithm was implemented on a distributed memory system and based on data parallelism. The parallel CAR mining phase is an adaptation of the CD approach which was originally proposed for parallel mining of frequent itemsets (Agrawal & Shafer, 1996). The training dataset was partitioned into P parts which were computed on P processors. Each processor worked on its local data to mine CARs with the same global minimum support and minimum confidence. However, this algorithm has three big weaknesses. First, it uses a static load balance which partitions work among processors by using a heuristic cost function. This causes a high load imbalance. Second, high synchronization happens at the end of each step. Finally, each site must keep a duplicate of the entire set of candidates. Additionally, the authors did not provide any experiments to illustrate the performance of the proposed algorithm. Mokeddem and Belbachir (2010) proposed a distributed version of FP-Growth (Han et al., 2000) to discover CARs. Their proposed algorithm was also employed on a distributed memory system and based on data parallelism. Data were partitioned into P parts which were computed on P processors to discover subsets of classification rules in parallel. Inter-communication was established to make global decisions. Consequently, their approach faces the big problem of high synchronization among nodes. In addition, the authors did not conduct any experiments to compare their proposed algorithm with others.

The two existing parallel algorithms for mining CARs, both employed on distributed memory systems, have two significant problems: high synchronization among nodes and data replication. In this paper, a parallel CAR mining algorithm based on the multi-core processor architecture is thus proposed to solve those problems.

4. A sequential class association rule mining algorithm

In this section, we briefly summarize our previous sequential CAR mining algorithm as it forms the basic framework of our proposed parallel algorithm.

In (Nguyen & Vo, 2014), we proposed a tree structure to mine CARs quickly and directly. Each node in the tree contains one itemset along with:

(1) (Obidset1, Obidset2, ..., Obidsetk) – a list of Obidsets in which each Obidseti is a set of object identifiers that contain both the itemset and class ci. Note that k is the number of classes in the dataset.

(2) pos – a positive integer storing the position of the class with the maximum cardinality of Obidseti, i.e., pos = argmax_{i ∈ [1,k]} {|Obidseti|}.

(3) total – a positive integer which stores the sum of the cardinalities of all Obidseti, i.e., total = Σ_{i=1}^{k} |Obidseti|.

For ease of programming, the itemset is converted to the form att × values, where

(1) att – a positive integer representing a list of attributes.
(2) values – a list of values, each of which is contained in one attribute in att.
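As an illustration (our hedged sketch, not the paper's implementation), the derived fields pos and total for the node 6 × b1c1 of Table 1 can be computed as:

```python
# Obidsets of node 6 × b1c1 in Table 1: class 1 -> {1}, class 2 -> {2, 3}.
obidsets = [{1}, {2, 3}]

# pos: 1-based index of the class whose Obidset has maximum cardinality.
pos = max(range(len(obidsets)), key=lambda i: len(obidsets[i])) + 1

# total: sum of the cardinalities of all Obidsets.
total = sum(len(o) for o in obidsets)

print(pos, total)  # 2 3
```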


Fig. 5. Tree generated by Sequential-CAR-Mining for the dataset in Table 1:

Root:    { }
Level 1: 1×a1(1,2)   1×a2(∅,3)   2×b1(1,23)   4×c1(1,23)
Level 2: 3×a1b1(1,2)   5×a1c1(1,2)   3×a2b1(∅,3)   5×a2c1(∅,3)   6×b1c1(1,23)
Level 3: 7×a1b1c1(1,2)   7×a2b1c1(∅,3)


For example, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is denoted as X = 6 × b1c1. A bit representation is used to store itemset attributes in order to save memory. Attributes BC can be represented as 110 in bits, so the value of these attributes is 6. Bitwise operations are then used to quickly join itemsets.
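A minimal sketch of this bit encoding (our illustration; the bit assignment A → 1, B → 2, C → 4 follows the node labels in the example):

```python
# One bit per attribute: A -> 001 (1), B -> 010 (2), C -> 100 (4).
ATT_BIT = {"A": 1, "B": 2, "C": 4}

def att_mask(attributes):
    # Encode a set of attribute names as an integer bitmask.
    mask = 0
    for a in attributes:
        mask |= ATT_BIT[a]
    return mask

print(att_mask(["B", "C"]))               # 6  (binary 110, i.e. attributes BC)
print(att_mask(["A"]) | att_mask(["B"]))  # 3  (joining itemsets is one bitwise OR)
print(att_mask(["A"]) & att_mask(["A"]))  # 1  (non-zero AND -> shared attribute)
```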

In Table 1, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is contained in objects 1, 2, and 3. Thus, the node which contains itemset X has the form 6 × b1c1(1, 23), in which Obidset1 = {1} (or Obidset1 = 1 for short), i.e., object 1 contains both itemset X and class 1, and Obidset2 = {2, 3} (or Obidset2 = 23 for short), i.e., objects 2 and 3 contain both itemset X and class 2; pos = 2 (denoted by a line under Obidset2, i.e., 23) and total = 3. pos is 2 because the cardinality of Obidset2 for class 2 is maximum (2 versus 1).

Obtaining the support and confidence of a rule then reduces to computing |Obidsetpos| and |Obidsetpos| / total, respectively. For example, node 6 × b1c1(1, 23) generates rule {⟨(B, b1)⟩, ⟨(C, c1)⟩} → 2 (i.e., if B = b1 and C = c1, then Class = 2) with Supp = |Obidset2| = |23| = 2 and Conf = 2/3.

Based on the tree structure, we also proposed a sequential algorithm for mining CARs, called Sequential-CAR-Mining, as shown in Fig. 4. First, we find all frequent 1-itemsets and add them to the root node of the tree (Line 1). Second, we recursively discover the other frequent k-itemsets based on the depth-first search strategy (procedure Sequential-CAR-Mining). Third, while traversing nodes in the tree, we also generate rules which satisfy the minimum confidence threshold (procedure Generate-Rule). The pseudo code of the algorithm is shown in Fig. 4.

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

Sequential-CAR-Mining(Lr, minSup, minConf)
2.  CARs = ∅;
3.  for all lx ∈ Lr.children do
4.    Generate-Rule(lx, minConf);
5.    Pi = ∅;
6.    for all ly ∈ Lr.children, with y > x do
7.      if ly.att ≠ lx.att then            // two nodes are combined only if their attributes are different
8.        O.att = lx.att | ly.att;         // using bitwise operation
9.        O.values = lx.values ∪ ly.values;
10.       O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i;   // ∀i ∈ [1, k]
11.       O.pos = argmax_{i ∈ [1,k]} {|O.Obidset_i|};
12.       O.total = Σ_{i=1}^{k} |O.Obidset_i|;
13.       if |O.Obidset_pos| ≥ minSup then // node O satisfies minSup
14.         Pi = Pi ∪ O;
15.    Sequential-CAR-Mining(Pi, minSup, minConf);

Generate-Rule(l, minConf)
16. conf = |l.Obidset_pos| / l.total;
17. if conf ≥ minConf then
18.   CARs = CARs ∪ {l.itemset → c_pos (|l.Obidset_pos|, conf)};

Fig. 4. Sequential algorithm for mining CARs.

Fig. 5 shows the tree structure generated by the sequential CAR mining algorithm for the dataset shown in Table 1. For details on the tree generation, please refer to the study by Nguyen and Vo (2014).
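For readers who prefer runnable code, the following is our compact Python rendering of the sequential algorithm (a sketch under stated assumptions: attribute masks use A = 1, B = 2, C = 4; minSup is an absolute count; node field names are ours, not the authors' implementation):

```python
K = 2  # number of classes in Table 1

def make_node(att, values, obidsets):
    # A tree node: attribute bitmask, attribute values, per-class Obidsets,
    # plus the derived fields pos (0-based here) and total.
    pos = max(range(K), key=lambda i: len(obidsets[i]))
    total = sum(len(o) for o in obidsets)
    return {"att": att, "values": values, "obidsets": obidsets,
            "pos": pos, "total": total}

def generate_rule(node, min_conf, cars):
    supp = len(node["obidsets"][node["pos"]])
    conf = supp / node["total"] if node["total"] else 0.0
    if conf >= min_conf:  # classes reported 1-based, as in the paper
        cars.append((tuple(node["values"]), node["pos"] + 1, supp, conf))

def mine(children, min_sup, min_conf, cars):
    # Depth-first traversal: emit a rule per node, combine sibling nodes.
    for x, lx in enumerate(children):
        generate_rule(lx, min_conf, cars)
        p = []
        for ly in children[x + 1:]:
            if ly["att"] != lx["att"]:       # combine only if attributes differ
                att = lx["att"] | ly["att"]  # bitwise join of attribute masks
                values = sorted(set(lx["values"]) | set(ly["values"]))
                obidsets = [lx["obidsets"][i] & ly["obidsets"][i] for i in range(K)]
                node = make_node(att, values, obidsets)
                if len(node["obidsets"][node["pos"]]) >= min_sup:
                    p.append(node)
        mine(p, min_sup, min_conf, cars)

# Frequent 1-itemsets of Table 1 (attribute bits: A = 1, B = 2, C = 4).
root = [make_node(1, ["a1"], [{1}, {2}]),
        make_node(1, ["a2"], [set(), {3}]),
        make_node(2, ["b1"], [{1}, {2, 3}]),
        make_node(4, ["c1"], [{1}, {2, 3}])]
cars = []
mine(root, 1, 0.5, cars)
print(len(cars))  # 11 rules, one per node of the tree in Fig. 5
```

Run on Table 1 with minSup = 1 and minConf = 0.5, this produces one rule per node of the tree in Fig. 5.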

5. The proposed parallel class association rule mining algorithm

Although Sequential-CAR-Mining is an efficient algorithm for mining all CARs, its runtime performance degrades significantly on large datasets due to the computational complexity. As a result,



Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

PMCAR(Lr, minSup, minConf)
2.  totalCARs = CARs = ∅;
3.  for all lx ∈ Lr.children do
4.    Generate-Rule(CARs, lx, minConf);
5.    Pi = ∅;
6.    for all ly ∈ Lr.children, with y > x do
7.      if ly.att ≠ lx.att then            // two nodes are combined only if their attributes are different
8.        O.att = lx.att | ly.att;         // using bitwise operation
9.        O.values = lx.values ∪ ly.values;
10.       O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i;   // ∀i ∈ [1, k]
11.       O.pos = argmax_{i ∈ [1,k]} {|O.Obidset_i|};
12.       O.total = Σ_{i=1}^{k} |O.Obidset_i|;
13.       if |O.Obidset_pos| ≥ minSup then // node O satisfies minSup
14.         Pi = Pi ∪ O;
15.    Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi, minSup, minConf); });
16. for each task in the list of created tasks do
17.   collect the set of rules (tCARs) returned by each task;
18.   totalCARs = totalCARs ∪ tCARs;
19. totalCARs = totalCARs ∪ CARs;

Sub-PMCAR(tCARs, Lr, minSup, minConf)
20. for all lx ∈ Lr.children do
21.   Generate-Rule(tCARs, lx, minConf);
22.   Pi = ∅;
23.   for all ly ∈ Lr.children, with y > x do
24.     if ly.att ≠ lx.att then           // two nodes are combined only if their attributes are different
25.       O.att = lx.att | ly.att;        // using bitwise operation
26.       O.values = lx.values ∪ ly.values;
27.       O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i;   // ∀i ∈ [1, k]
28.       O.pos = argmax_{i ∈ [1,k]} {|O.Obidset_i|};
29.       O.total = Σ_{i=1}^{k} |O.Obidset_i|;
30.       if |O.Obidset_pos| ≥ minSup then // node O satisfies minSup
31.         Pi = Pi ∪ O;
32.   Sub-PMCAR(tCARs, Pi, minSup, minConf);

Fig. 6. PMCAR with independent branch strategy.


Fig. 7. Illustration of the independent branch strategy (tasks t1, t2, and t3 mine branches a1, a2, and b1 in parallel):

Root:    { }
Level 1: 1×a1(1,2) [t1]   1×a2(∅,3) [t2]   2×b1(1,23) [t3]   4×c1(1,23)
Level 2: 3×a1b1(1,2)   5×a1c1(1,2)   3×a2b1(∅,3)   5×a2c1(∅,3)   6×b1c1(1,23)
Level 3: 7×a1b1c1(1,2)   7×a2b1c1(∅,3)


we have tried to apply parallel computing techniques to the sequential algorithm to speed up the mining process.

Schlegel et al. (2013) recently adapted the well-known Eclat algorithm to a highly parallel version which runs on multi-core processor systems. They proposed three parallel approaches for Eclat: independent class, shared class, and shared itemset. In the "independent class" strategy, each equivalence class is distributed to a single thread which mines its assigned class independently from other threads. This approach has an important advantage in that the synchronization cost is low. It, however, consumes much more memory than the sequential counterpart because all threads hold their entire tidsets at the same time. Additionally, this strategy often causes high load imbalances when a large number of threads are used: threads mining light classes often finish sooner than threads mining heavier classes. In the "shared class" strategy, a single class is assigned to multiple threads. This can reduce the memory consumption but increases the cost of synchronization since one thread has to communicate with others to obtain their tidsets. In the final strategy, "shared itemset", multiple threads concurrently perform the intersection of two tidsets for a new itemset. In this strategy, threads have to synchronize with each other at a high cost.

Basically, the proposed algorithm, Parallel Mining Class Association Rules (PMCAR), is a combination of Sequential-CAR-Mining and the parallel ideas mentioned in (Schlegel et al., 2013). It has the same core steps as Sequential-CAR-Mining: it scans the dataset once to obtain all frequent 1-itemsets along with their Obidsets, and then starts mining recursively. It also adopts the two parallel strategies "independent class" and "shared class". However, PMCAR has some differences. PMCAR is a parallel algorithm for mining class association rules, while the work by Schlegel et al. focuses on mining frequent itemsets only. Additionally, we propose a third parallel strategy, shared Obidset, for PMCAR. PMCAR is employed on a single system with a multi-core processor, where the main memory is shared by and equally accessible to all cores. Hence, PMCAR does not require synchronization among computing nodes like other parallel CAR mining algorithms employed on distributed memory systems. The main differences between PMCAR and Sequential-CAR-Mining in terms of parallel CAR mining strategies are discussed in the following sections.

5.1. Independent branch strategy

The first strategy, independent branch, distributes each branch of the tree to a single task, which mines its assigned branch independently from all other tasks to generate CARs. Generally speaking, this strategy is similar to the "independent class" strategy mentioned in (Schlegel et al., 2013), except that PMCAR uses a different tree structure for the purpose of CAR mining and is implemented using tasks instead of threads. As mentioned above, this strategy has some limitations, such as high load imbalances and high memory consumption. However, its primary advantage is that each task is executed independently from other tasks without any synchronization. In our implementation, the algorithm is built on the parallelism model in .NET Framework 4.0. Instead of threads, our algorithm uses tasks, which have several advantages over threads. First, a task consumes less memory than a thread. Second, while a single thread runs on a single core, tasks are designed to be aware of the multi-core processor, and multiple tasks can be executed on a single core. Finally, using threads takes much time because the operating system must allocate data structures for threads, initialize and destroy them, and also perform context switches between threads. Consequently, our implementation addresses two problems: high memory consumption and high load imbalance.

The pseudo code of PMCAR with the independent branch strategy is shown in Fig. 6.

We apply the algorithm to the sample dataset shown in Table 1 to illustrate its basic ideas. First, PMCAR finds all frequent 1-itemsets as done in Sequential-CAR-Mining (Line 1). After this step, we have Lr = {1 × a1(1, 2), 1 × a2(∅, 3), 2 × b1(1, 23), 4 × c1(1, 23)}. Second, PMCAR calls procedure PMCAR to generate frequent 2-itemsets (Lines 3–14). For example, consider node 1 × a1(1, 2). This node combines with the two nodes 2 × b1(1, 23) and 4 × c1(1, 23) to generate two new nodes 3 × a1b1(1, 2) and 5 × a1c1(1, 2). Note that node 1 × a1(1, 2) does not combine with node 1 × a2(∅, 3) since they have the same attribute (attribute A), which would make the support of the new node zero according to Theorem 1 mentioned in (Nguyen & Vo, 2014). After these steps, we have Pi = {3 × a1b1(1, 2), 5 × a1c1(1, 2)}. Then, PMCAR creates a new task ti and calls procedure Sub-PMCAR inside that task with four parameters: tCARs, Pi, minSup, and minConf. The first parameter tCARs is used to store the set of rules returned by Sub-PMCAR in a task (Line 15). For instance, task t1 is created and procedure Sub-PMCAR is executed inside t1. Procedure Sub-PMCAR is recursively called inside a task to mine all CARs (Lines 20–32). For example, task t1 also generates node 7 × a1b1c1(1, 2) and its rule. Finally, after all created tasks have completely mined all assigned branches, their results are collected to form the complete set of rules (Lines 16–19). In Fig. 7, three tasks t1, t2, and t3, represented by solid blocks, mine the three branches a1, a2, and b1 independently in parallel.
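The strategy can be approximated as follows (a hedged Python sketch; the paper uses .NET 4.0 tasks, for which we substitute concurrent.futures, and the per-branch rule lists are illustrative stand-ins for Sub-PMCAR's output on the tree of Fig. 5):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative per-branch outputs: the rules each Sub-PMCAR task would return
# for the tree in Fig. 5 (branch c1 has no children, only its own rule).
branches = {
    "a1": ["a1->1", "a1b1->1", "a1c1->1", "a1b1c1->1"],
    "a2": ["a2->2", "a2b1->2", "a2c1->2", "a2b1c1->2"],
    "b1": ["b1->2", "b1c1->2"],
    "c1": ["c1->2"],
}

def mine_branch(name):
    # Each task touches only its private branch: no locks, no synchronization
    # while mining; results are merged once all tasks have finished.
    return branches[name]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(mine_branch, branches))  # one task per branch

total_cars = [rule for branch_rules in results for rule in branch_rules]
print(len(total_cars))  # 11
```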

5.2. Shared branch strategy

The second strategy, shared branch, adopts the ideas of the "shared class" strategy mentioned in Schlegel et al. (2013). In this strategy, each branch is mined in parallel by multiple tasks. The pseudo code of PMCAR with the shared branch strategy is shown in Fig. 8. First, the algorithm initializes the root node Lr (Line 1). Then, the procedure PMCAR is recursively called to generate CARs. When node lx combines with node ly, the algorithm creates a new task ti and performs the combination code inside that task (Lines 7–17). Note that because multiple tasks concurrently mine the same branch, synchronization is needed to collect the necessary information for the new node (Line 18). Additionally, to avoid a data race (i.e., two or more tasks performing operations that update a shared piece of data) (Netzer & Miller, 1989), we use a lock object to coordinate the tasks' access to the shared data Pi (Lines 15 and 16).
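The lock-protected update of the shared candidate list Pi can be sketched as follows (an illustrative Python sketch, not the paper's C# `lock(lockObject)` code; the pair list and placeholder node are assumptions):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

P_i = []                 # candidate list shared by all tasks on the branch
lock = threading.Lock()  # coordinates access, preventing a data race

def combine_pair(pair):
    x, y = pair
    node = (x, y)        # placeholder for the real node combination
    with lock:           # critical section: only one task updates P_i
        P_i.append(node)

pairs = [("a1", "b1"), ("a1", "c1")]
with ThreadPoolExecutor() as pool:       # one task per combination
    list(pool.map(combine_pair, pairs))  # blocks like Task.WaitAll
print(sorted(P_i))
```

Without the lock, two tasks appending concurrently could interleave their updates; the lock serializes exactly the shared-state mutation while the combination work itself stays parallel.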

We also apply the algorithm to the dataset in Table 1 to demonstrate how it works. As an example, consider node 1 × a1(1, 2). The algorithm creates task t1 to combine node 1 × a1(1, 2) with node 2 × b1(1, 23) to generate node 3 × a1b1(1, 2); in parallel, it creates task t2 to combine node 1 × a1(1, 2) with node 4 × c1(1, 23) to generate node 5 × a1c1(1, 2). However, before the algorithm continues by creating task t3 to generate node 7 × a1b1c1(1, 2), it has to wait until tasks t1 and t2 finish their work. Therefore, this strategy is slower than the first one in execution time. In Fig. 9, three tasks t1, t2, and t3 mine the same branch a1 in parallel.

5.3. Shared Obidset strategy

The third strategy, shared Obidset, is different from the "shared itemset" strategy discussed in Schlegel et al. (2013). Each task has a


[Figure: enumeration tree with root { }; first level: 1 × a1(1, 2), 1 × a2(∅, 3), 2 × b1(1, 23), 4 × c1(1, 23); second level: 3 × a1b1(1, 2), 5 × a1c1(1, 2), 3 × a2b1(∅, 3), 5 × a2c1(∅, 3), 6 × b1c1(1, 23); third level: 7 × a1b1c1(1, 2), 7 × a2b1c1(∅, 3). Tasks t1, t2, and t3 mine branch a1 in parallel.]

Fig. 9. Illustration of the shared branch strategy.

Input: Dataset D, minSup and minConf

Output: All CARs satisfying minSup and minConf

Procedure:

1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

PMCAR(Lr, minSup, minConf)
2.  CARs = ∅;
3.  for all lx ∈ Lr.children do
4.    Generate-Rule(lx, minConf);
5.    Pi = ∅;
6.    for all ly ∈ Lr.children, with y > x do
7.      Task ti = new Task(() => {
8.        if ly.att ≠ lx.att then
9.          O.att = lx.att | ly.att; // using bitwise operation
10.         O.values = lx.values ∪ ly.values;
11.         O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
12.         O.pos = argmax{|O.Obidseti|} over i ∈ [1, k];
13.         O.total = ∑ |O.Obidseti| for i = 1 to k;
14.         if |O.ObidsetO.pos| ≥ minSup then // node O satisfies minSup
15.           lock(lockObject)
16.             Pi = Pi ∪ O;
17.      });
18.    Task.WaitAll(ti);
19.    PMCAR(Pi, minSup, minConf);

Fig. 8. PMCAR with shared branch strategy.

D. Nguyen et al. / Expert Systems with Applications 41 (2014) 4716–4729 4723

different branch assigned, and its child tasks together process a node in the branch. The pseudo code of PMCAR with the shared Obidset strategy is shown in Fig. 10. The algorithm first finds all frequent 1-itemsets and adds them to the root node (Line 1). It then calls procedure PMCAR to generate frequent 2-itemsets (Lines 2–14). For each branch of the tree, it creates a task and calls procedure Sub-PMCAR inside that task (Line 15). Sub-PMCAR is recursively called to generate frequent k-itemsets (k > 2) and their rules (Lines 20–34). The functions of procedures PMCAR and Sub-PMCAR resemble those in PMCAR with the independent branch strategy. However, this algorithm provides a more complicated parallel strategy. In Sub-PMCAR, the algorithm creates a list of child tasks to intersect the Obidseti of two nodes in parallel (Lines 27–28). This makes the work distribution the most fine-grained. Nevertheless, all child tasks have to finish their work before the two properties pos and total can be calculated for the new node (Lines 29–31). Consequently, there is a high cost of synchronization among child tasks and between child tasks and their parent task.

Let us illustrate the basic ideas of the shared Obidset strategy with Fig. 11. Branch a1 is assigned to task t1. In procedure Sub-PMCAR, tasks t2 and t3, which are child tasks of t1, together process node 3 × a1b1(1, 2): tasks t2 and t3 intersect in parallel the Obidset1 and Obidset2 of the two nodes 3 × a1b1(1, 2) and 5 × a1c1(1, 2), respectively. However, task t2 must wait until task t3 finishes intersecting the two Obidset2 sets to obtain Obidset1 and Obidset2 of the new node 7 × a1b1c1(1, 2). Additionally, the parent task t1, represented by the solid block, must wait until tasks t2, t3, and all other child tasks finish their work.
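The per-class intersections for one combination can be sketched as follows. This is an illustrative Python sketch (the paper's implementation uses C#/.NET tasks), using the Obidsets of nodes 2 × b1(1, 23) and 4 × c1(1, 23) from Table 1 to form 6 × b1c1(1, 23):

```python
from concurrent.futures import ThreadPoolExecutor

k = 2  # number of classes
# Per-class Obidsets of nodes 2 × b1(1, 23) and 4 × c1(1, 23):
# Obidset1 holds ids of class-1 objects, Obidset2 those of class 2.
obidsets_x = [{1}, {2, 3}]
obidsets_y = [{1}, {2, 3}]

def intersect(i):
    # Child task i intersects the Obidsets of class i only.
    return obidsets_x[i] & obidsets_y[i]

with ThreadPoolExecutor(max_workers=k) as pool:
    # map() returns only after every child task finishes, mirroring
    # Task.WaitAll: pos and total need all k intersections first.
    new_obidsets = list(pool.map(intersect, range(k)))

pos = max(range(k), key=lambda i: len(new_obidsets[i]))
total = sum(len(s) for s in new_obidsets)
print(pos, total)  # index of the largest per-class Obidset, and its sum
```

The barrier before computing pos and total is exactly the synchronization cost discussed above: the finer the work items, the more often the parent must wait.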

6. Time complexity analysis

In this section, we analyze the time complexities of both the sequential and the proposed parallel CAR mining algorithms. We then derive the speedup of the parallel algorithm. We also compare the time complexity of our parallel algorithm with those of existing parallel algorithms.


Input: Dataset D, minSup and minConf

Output: All CARs satisfying minSup and minConf

Procedure:

1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

PMCAR(Lr, minSup, minConf)
2.  totalCARs = CARs = ∅;
3.  for all lx ∈ Lr.children do
4.    Generate-Rule(CARs, lx, minConf);
5.    Pi = ∅;
6.    for all ly ∈ Lr.children, with y > x do
7.      if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8.        O.att = lx.att | ly.att; // using bitwise operation
9.        O.values = lx.values ∪ ly.values;
10.       O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
11.       O.pos = argmax{|O.Obidseti|} over i ∈ [1, k];
12.       O.total = ∑ |O.Obidseti| for i = 1 to k;
13.       if |O.ObidsetO.pos| ≥ minSup then // node O satisfies minSup
14.         Pi = Pi ∪ O;
15.   Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi, minSup, minConf); });
16. for each task in the list of created tasks do
17.   collect the set of rules (tCARs) returned by each task;
18.   totalCARs = totalCARs ∪ tCARs;
19. totalCARs = totalCARs ∪ CARs;

Sub-PMCAR(tCARs, Lr, minSup, minConf)
20. for all lx ∈ Lr.children do
21.   Generate-Rule(tCARs, lx, minConf);
22.   Pi = ∅;
23.   for all ly ∈ Lr.children, with y > x do
24.     if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
25.       O.att = lx.att | ly.att; // using bitwise operation
26.       O.values = lx.values ∪ ly.values;
27.       for i = 1 to k do // k is the number of classes
28.         Task childi = new Task(() => { O.Obidseti = lx.Obidseti ∩ ly.Obidseti; });
29.       Task.WaitAll(childi);
30.       O.pos = argmax{|O.Obidseti|} over i ∈ [1, k];
31.       O.total = ∑ |O.Obidseti| for i = 1 to k;
32.       if |O.ObidsetO.pos| ≥ minSup then // node O satisfies minSup
33.         Pi = Pi ∪ O;
34.   Sub-PMCAR(tCARs, Pi, minSup, minConf);

Fig. 10. PMCAR with shared Obidset strategy.


We can see that the sequential CAR mining algorithm described in Section 4 scans the dataset once and uses a main loop to mine all CARs. Based on the cost model in Skillicorn (1999), the time complexity of this algorithm is:


[Figure: the same enumeration tree as in Fig. 9; task t1 owns branch a1, and its child tasks t2 and t3 process nodes within that branch.]

Fig. 11. Illustration of shared Obidset strategy.

Table 2. Characteristics of the experimental datasets.

Dataset      # Attributes   # Classes   # Distinctive values   # Objects
Poker-hand   11             10          95                     1,000,000
Chess        37             2           76                     3196
Connect-4    43             3           130                    67,557
Pumsb        74             5           2113                   49,046


TS = kS · m + a

where TS is the execution time of the sequential CAR mining algorithm, kS is the number of iterations in the main loop, m is the execution time of generating nodes and rules in each iteration, and a is the execution time of accessing the dataset.

The proposed parallel algorithm distributes node and rule generation to multiple tasks executed on multiple cores. Thus, the execution time of generating nodes and rules in each iteration is m/(t · c), where t is the number of tasks and c is the number of cores. The time complexity of the parallel algorithm is:

TP = kP · m/(t · c) + a

where TP is the execution time of the proposed parallel CAR mining algorithm and kP is the number of iterations in the main loop.

The speedup is thus:

Sp = TS / TP = (kS · m + a) / (kP · m/(t · c) + a)

In our experiments, the execution time of the sequential code (for example, the code that scans the dataset) is very small. In addition, the number of iterations in the main loop is similar in both the sequential and parallel algorithms. Therefore, the speedup equation can be simplified as follows:

Sp = (kS · m + a) / (kP · m/(t · c) + a) ≈ (kS · m) / (kP · m/(t · c)) ≈ m / (m/(t · c)) = t · c

Thus, we can achieve up to a t · c speedup over the sequential algorithm.

Now we analyze the time complexity of the parallel CBA algorithm proposed in Thakur and Ramesh (2008). Since this algorithm is based on the Apriori algorithm, it must scan the dataset many times. Additionally, this algorithm was deployed on a distributed memory system, which means that it needs additional computation time for communication and information exchange among nodes. Consequently, the time complexity of this algorithm is:

TC = kC · (m/p + a + d)

where TC is the execution time of the parallel CBA algorithm, kC is the number of iterations required by the parallel CBA algorithm, p is the number of processors, and d is the execution time for communication and data exchange among computing nodes.

Assume that kP ≈ kC and t · c ≈ p. We have:

TC = (kC · m/p + a) + (kC − 1) · a + kC · d ≈ TP + (kC − 1) · a + kC · d

Obviously, TP < TC, which implies that our proposed algorithm is, in theory, faster than the parallel version of CBA.

Similarly, the time complexity of the parallel FP-Growth algorithm proposed in Mokeddem and Belbachir (2010) is as follows:

TF = kF · (m/p + d) + a

where TF is the execution time of the parallel FP-Growth algorithm and kF is the number of iterations required by the parallel FP-Growth algorithm.

The parallel FP-Growth algorithm scans the dataset once and then partitions it into P parts according to the number of processors. Each processor scans its local data partition to count the local support of each item. Therefore, the execution time of accessing the dataset in this algorithm is only a. However, the computing nodes need to broadcast the local support of each item across the group so that each processor can calculate the global count. Thus, this algorithm also needs additional computation time d for data transfer.

Assume that kP ≈ kF and t · c ≈ p. We have:

TF = (kF · m/p + a) + kF · d ≈ TP + kF · d

We can conclude that our proposed parallel algorithm is, in theory, also faster than the parallel FP-Growth algorithm, and that TP < TF < TC.
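The local-counting-then-broadcast step of the parallel FP-Growth described above can be pictured as a map/reduce over partitions. A minimal sketch, assuming in-process workers rather than the distributed nodes of the original algorithm (the toy transactions are invented for illustration):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

transactions = [["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]

# Partition the dataset into P parts, one per "processor".
P = 2
parts = [transactions[i::P] for i in range(P)]

def local_support(part):
    # Each worker scans only its partition and counts local supports.
    return Counter(item for t in part for item in t)

with ThreadPoolExecutor(max_workers=P) as pool:
    local_counts = list(pool.map(local_support, parts))

# The broadcast/exchange step: merge local counts into global supports.
# In a distributed setting this merge is where the cost d is paid.
global_counts = sum(local_counts, Counter())
print(global_counts["a"])  # 3 transactions contain item a
```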

7. Experimental results

This section provides the results of our experiments, including the testing environment, the results of the scalability experiments for the three proposed parallel strategies, and the performance of the proposed parallel algorithm with variation in the number of objects and attributes. It finally compares the execution time of PMCAR with that of the recent sequential CAR mining algorithm, CAR-Miner (Nguyen et al., 2013).

7.1. Testing environment

All experiments were conducted on a multi-core processor computer with one Intel i7-2600 processor. The processor has 4 cores and an 8 MB L3-cache, runs at a core frequency of 3.4 GHz, and also supports Hyper-threading. The computer has 4 GB of memory and runs Windows 7 Enterprise (64-bit) SP1. The algorithms were coded in C# using MS Visual Studio .NET 2010 Express. The parallel algorithm was implemented based on the parallelism model supported in Microsoft .NET Framework 4.0 (version 4.0.30319).

The experimental datasets were obtained from the University of California Irvine (UCI) Machine Learning Repository (http://mlearn.ics.uci.edu) and the Frequent Itemset Mining (FIM) Dataset Repository (http://fimi.ua.ac.be/data/). The four datasets used in the experiments are Poker-hand, Chess, Connect-4, and Pumsb, with the characteristics shown in Table 2. The table shows the number of attributes (including the class attribute), the number of class labels, the number of distinctive values (i.e., the total number of distinct values over all attributes), and the number of objects (or records) in each dataset. The Chess, Connect-4, and Pumsb datasets are dense and have many attributes, whereas the Poker-hand dataset is sparse and has few attributes.

7.2. Scalability experiments

We evaluated the scalability of PMCAR by running it on the computer, which had been configured to utilize a different number


Table 3. Characteristics of the synthetic datasets.

Dataset       # Attributes   # Classes   Density (%)   # Objects   File size (KB)
C50R100KD55   50             2           55            100,000     9961
C50R200KD55   50             2           55            200,000     19,992
C50R300KD55   50             2           55            300,000     29,883
C50R400KD55   50             2           55            400,000     39,844
C50R500KD55   50             2           55            500,000     49,805
C10R500KD55   10             2           55            500,000     10,743
C20R500KD55   20             2           55            500,000     20,508
C30R500KD55   30             2           55            500,000     30,247
C40R500KD55   40             2           55            500,000     40,040

Fig. 14. Performance comparison between PMCAR and CAR-Miner with variation on the number of attributes. Other parameters are set to: # Objects = 500 K, Density = 55%, and minSup = 50%.

Fig. 13. Performance comparison between PMCAR and CAR-Miner with variation on the number of objects. Other parameters are set to: # Attributes = 50, Density = 55%, and minSup = 70%.

(a) Scalability of PMCAR for the Poker-hand dataset (minSup = 0.01%)
(b) Scalability of PMCAR for the Chess dataset (minSup = 30%)
(c) Scalability of PMCAR for the Connect-4 dataset (minSup = 80%)
(d) Scalability of PMCAR for the Pumsb dataset (minSup = 70%)

Fig. 12. Speedup performance of PMCAR with two parallel strategies.


of cores. The configuration was adjusted in the BIOS setup; the number of supported cores was set to 1, 2, and 4 core(s) in turn. The performances of PMCAR and CAR-Miner were then compared. We observed that the performance of CAR-Miner was nearly identical regardless of the number of cores the computer utilized. It can be said that sequential algorithms cannot take advantage of the multi-core processor architecture. On the contrary, PMCAR scaled much better than CAR-Miner as the number of running cores was increased. In the experiments, we used the runtime performance of CAR-Miner as the baseline for computing the speedups. Fig. 12(a)–(d) illustrate the speedup performance of PMCAR with two parallel strategies for the Poker-hand, Chess, Connect-4, and Pumsb datasets, respectively. Note that minConf = 50% was used for all experiments.


(a) # CARs produced (b) Runtime for PMCAR and CAR-Miner

Fig. 15. Comparative results between PMCAR and CAR-Miner for the Poker-hand dataset with various minSup values.


Obviously, PMCAR is slower than the sequential algorithm CAR-Miner when they are executed on a single core because the tasks compete for the processor. This situation is known as processor oversubscription. However, when the number of cores in use is increased, PMCAR is much faster than CAR-Miner. As shown in Fig. 12(c) for the Connect-4 dataset, PMCAR with the independent branch and shared branch strategies achieves speedups of up to 2.1× and 1.4×, respectively. Interestingly, the shared branch strategy is not beneficial for the Chess dataset. Fig. 12(b) shows that PMCAR with shared branch is always slower than the sequential CAR-Miner. As discussed before, the shared branch strategy incurs a high synchronization cost between tasks. As a result, the huge number of tasks (4,253,728 tasks) generated for the Chess dataset with minSup = 30% significantly reduces the runtime performance. We also conducted the scalability experiments for the shared Obidset strategy. However, it did not obtain good scalability results because of the high costs of synchronization among child tasks and between child tasks and their parent task. Therefore, we do not show its performance on the charts.

1 Executable files of the CAR-Miner and PMCAR algorithms and experimental datasets can be downloaded from http://goo.gl/hIrDtl.

7.3. Influence of the number of dimensions and the size of the dataset

To obtain a clear understanding of how PMCAR is affected by the dataset dimension and size, we conducted experiments on synthetic datasets with varying numbers of attributes and objects. Based on the ideas from Coenen (2007), we developed a tool for generating synthetic datasets. First, we fixed the parameters of the dataset generator as follows: (1) the number of attributes is 50; (2) the density is 55%. We then generated test datasets with a number of objects ranging between 100,000 and 500,000. Second, we fixed the number of objects and the density at 500,000 and 55%, respectively. We then generated datasets with a number of attributes ranging between 10 and 40. The details of the synthetic datasets are shown in Table 3.
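The shape of such a generator can be sketched in a few lines of Python. This is purely illustrative and is not the tool we built or the LUCS-KDD generator from Coenen (2007); in particular, interpreting "density" as the probability of drawing an attribute value from a small, frequently repeated pool is an assumption made only for this sketch:

```python
import random

def generate_dataset(n_attrs, n_objects, n_classes=2, density=0.55,
                     values_per_attr=10, seed=42):
    """Illustrative synthetic-dataset generator (not the LUCS-KDD tool).
    'density' is loosely modeled here as the probability that an
    attribute value comes from a small, repeated subset of values."""
    rng = random.Random(seed)  # fixed seed -> reproducible dataset
    rows = []
    for _ in range(n_objects):
        row = []
        for a in range(n_attrs):
            if rng.random() < density:
                v = rng.randint(0, 1)  # dense part: few repeated values
            else:
                v = rng.randint(2, values_per_attr - 1)  # sparse tail
            row.append(f"A{a}={v}")
        row.append(f"class={rng.randint(0, n_classes - 1)}")
        rows.append(row)
    return rows

data = generate_dataset(n_attrs=10, n_objects=1000)
print(len(data), len(data[0]))  # 1000 objects, 10 attributes + 1 class
```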

Fig. 13 illustrates the performance results with respect to the number of objects in the dataset. As shown, PMCAR scales well with the dataset size compared to CAR-Miner. For example, when the dataset size reaches 500 K, PMCAR with the shared branch strategy is up to 1.6× faster than CAR-Miner. However, the two other strategies, independent branch and shared Obidset, failed to complete at the 500 K dataset size because they ran out of memory. This problem happens because each task in these strategies holds an entire branch of the tree, which consumes a large amount of memory on dense datasets.

Fig. 14 demonstrates the performance results with respect to the number of attributes in the dataset. Again, PMCAR scales well with the dataset dimension compared to the sequential algorithm. For instance, the execution time of PMCAR with shared branch was only 1,003.694 s, while that of CAR-Miner was 1,572.754 s, when the number of dimensions was 40. However, the two strategies independent branch and shared Obidset failed to complete at dimension 40 because they ran out of memory.

7.4. Comparison with sequential algorithms1

In this section, we compare the execution time of PMCAR with that of the sequential algorithm CAR-Miner. These experiments aim to show that PMCAR is competitive with the existing algorithm. Figs. 15–18 show the number of generated CARs and the execution times of PMCAR and CAR-Miner for the Poker-hand, Chess, Connect-4, and Pumsb datasets with various minSup values on the computer configured to utilize 4 cores with Hyper-threading enabled. It can be observed that CAR-Miner performs poorly except on the Chess dataset. It is slower than PMCAR because it cannot utilize the computing power of the multi-core processor. On the contrary, PMCAR is optimized for mining the dataset in parallel; thus its performance is superior to CAR-Miner's. PMCAR with the independent branch strategy is always the fastest of all tested algorithms. For example, consider the Connect-4 dataset with minSup = 65%. Independent branch took only 1,776.892 s to finish its work, while shared Obidset, shared branch, and CAR-Miner took 1,924.081 s, 2,477.279 s, and 2,772.470 s, respectively. The runtime of shared Obidset was worst on the Poker-hand, Chess, and Pumsb datasets; thus, we do not show it on the charts.

8. Conclusions and future work

In this paper, we have proposed three strategies for parallel mining of class association rules on the multi-core processor architecture. Unlike sequential CAR mining algorithms, our parallel algorithm distributes the generation of frequent itemsets and rules to multiple tasks executed on multiple cores. The framework of the proposed method is based on our previous sequential CAR mining method and three parallel strategies: independent branch, shared branch, and shared Obidset. The time complexities of both the sequential and parallel CAR mining algorithms have been analyzed, with results showing the good effect of the proposed algorithm. A speedup of up to t · c can be achieved in theory. We have also theoretically proven that the execution time of our parallel CAR mining algorithm is lower than those of existing parallel CAR mining algorithms. Additionally, a series of experiments has been conducted on both real and synthetic datasets. The experimental results have also shown that the three proposed parallel methods are competitive with the sequential CAR mining method. However, the first and third strategies currently consume higher



(a) # CARs produced (b) Runtime for PMCAR and CAR-Miner

Fig. 17. Comparative results between PMCAR and CAR-Miner for the Connect-4 dataset with various minSup values.

(a) # CARs produced (b) Runtime for PMCAR and CAR-Miner

Fig. 16. Comparative results between PMCAR and CAR-Miner for the Chess dataset with various minSup values.

(a) # CARs produced (b) Runtime for PMCAR and CAR-Miner

Fig. 18. Comparative results between PMCAR and CAR-Miner for the Pumsb dataset with various minSup values.


memory than the sequential counterpart, which prevents them from coping with very dense datasets. Thus, we will research how to reduce the memory consumption of these strategies in the future. We will also investigate the applicability of the proposed methods on other platforms such as multiple graphics processors or clouds.

Acknowledgements

This work was funded by Vietnam's National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2012.17.

References

Agrawal, R., & Shafer, J. (1996). Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering, 8, 962–969.

Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules in large databases. In The 20th International Conference on Very Large Data Bases (pp. 487–499). Morgan Kaufmann Publishers Inc.

Andrew, B. (2008). Multi-core processor architecture explained. Intel. http://software.intel.com/en-us/articles/multi-core-processor-architecture-explained.

Baralis, E., Chiusano, S., & Garza, P. (2004). On support thresholds in associative classification. In The 2004 ACM Symposium on Applied Computing (pp. 553–558). ACM.

Baralis, E., Chiusano, S., & Garza, P. (2008). A lazy approach to associative classification. IEEE Transactions on Knowledge and Data Engineering, 20, 156–171.

Cagliero, L., & Garza, P. (2013). Improving classification models with taxonomy information. Data & Knowledge Engineering, 86, 85–101.

Casali, A., & Ernst, C. (2013). Extracting correlated patterns on multicore architectures. Availability, Reliability, and Security in Information Systems and HCI (Vol. 8127, pp. 118–133). Springer.

Chen, W.-C., Hsu, C.-C., & Hsu, J.-N. (2012). Adjusting and generalizing CBA algorithm to handling class imbalance. Expert Systems with Applications, 39, 5907–5919.

Chen, G., Liu, H., Yu, L., Wei, Q., & Zhang, X. (2006). A new approach to classification based on association rule mining. Decision Support Systems, 42, 674–689.

Coenen, F. (2007). Test set generator (version 3.2). http://cgi.csc.liv.ac.uk/~frans/KDD/Software/LUCS-KDD-DataGen/generator.html.

Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. ACM SIGMOD Record (Vol. 29, pp. 1–12). ACM.

Laurent, A., Négrevergne, B., Sicard, N., & Termier, A. (2012). Efficient parallel mining of gradual patterns on multicore processors. Advances in Knowledge Discovery and Management (Vol. 398, pp. 137–151). Springer.

Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In IEEE International Conference on Data Mining (ICDM 2001) (pp. 369–376). IEEE.

Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule mining. In The 4th International Conference on Knowledge Discovery and Data Mining (KDD 1998) (pp. 80–86).

Liu, L., Li, E., Zhang, Y., & Tang, Z. (2007). Optimization of frequent itemset mining on multiple-core processor. In The 33rd International Conference on Very Large Data Bases (pp. 1275–1285). VLDB Endowment.

Liu, H., Liu, L., & Zhang, H. (2011). A fast pruning redundant rule method using Galois connection. Applied Soft Computing, 11, 130–137.

Mokeddem, D., & Belbachir, H. (2010). A distributed associative classification algorithm. Intelligent Distributed Computing IV (Vol. 315, pp. 109–118). Springer.

Negrevergne, B., Termier, A., Méhaut, J.-F., & Uno, T. (2010). Discovering closed frequent itemsets on multicore: Parallelizing computations and optimizing memory accesses. In International Conference on High Performance Computing and Simulation (HPCS 2010) (pp. 521–528). IEEE.

Negrevergne, B., Termier, A., Rousset, M.-C., & Méhaut, J.-F. (2013). ParaMiner: A generic pattern mining algorithm for multi-core architectures. Data Mining and Knowledge Discovery, 1–41.

Netzer, R., & Miller, B. (1989). Detecting data races in parallel program executions. University of Wisconsin-Madison.

Nguyen, D., & Vo, B. (2014). Mining class-association rules with constraints. Knowledge and Systems Engineering (Vol. 245, pp. 307–318). Springer.

Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2012). Classification based on association rules: A lattice-based approach. Expert Systems with Applications, 39, 11357–11366.

Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2013). CAR-Miner: An efficient algorithm for mining class-association rules. Expert Systems with Applications, 40, 2305–2311.

Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann Publishers Inc.

Schlegel, B., Karnagel, T., Kiefer, T., & Lehner, W. (2013). Scalable frequent itemset mining on many-core processors. In The 9th International Workshop on Data Management on New Hardware. ACM. Article No. 3.

Skillicorn, D. (1999). Strategies for parallel data mining. IEEE Concurrency, 7, 26–35.

Tatikonda, S., & Parthasarathy, S. (2009). Mining tree-structured data on multicore systems. Proceedings of the VLDB Endowment, 2, 694–705.

Thabtah, F., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label associative classification approach. In The 4th IEEE International Conference on Data Mining (ICDM 2004) (pp. 217–224). IEEE.

Thabtah, F., Cowling, P., & Peng, Y. (2005). MCAR: Multi-class classification based on association rule. In The 3rd ACS/IEEE International Conference on Computer Systems and Applications (pp. 33–39). IEEE.

Thakur, G., & Ramesh, C. J. (2008). A framework for fast classification algorithms. International Journal Information Theories and Applications, 15, 363–369.

Tolun, M., & Abu-Soud, S. (1998). ILA: An inductive learning algorithm for rule extraction. Expert Systems with Applications, 14, 361–370.

Tolun, M., Sever, H., Uludag, M., & Abu-Soud, S. (1999). ILA-2: An inductive learning algorithm for knowledge discovery. Cybernetics & Systems, 30, 609–628.

Vo, B., & Le, B. (2009). A novel classification algorithm based on association rules mining. Knowledge Acquisition: Approaches, Algorithms and Applications (Vol. 5465, pp. 61–75). Springer.

Xu, X., Han, G., & Min, H. (2004). A novel algorithm for associative classification of image blocks. In The 4th International Conference on Computer and Information Technology (CIT 2004) (pp. 46–51). IEEE.

Yin, X., & Han, J. (2003). CPAR: Classification based on predictive association rules. The 3rd SIAM International Conference on Data Mining (SDM 2003) (Vol. 3, pp. 331–335). SIAM.

Yu, K.-M., & Wu, S.-H. (2011). An efficient load balancing multi-core frequent patterns mining algorithm. In The IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2011) (pp. 1408–1412). IEEE.

Zaki, M., Parthasarathy, S., Ogihara, M., & Li, W. (1997). New algorithms for fast discovery of association rules. In The 3rd International Conference on Knowledge Discovery and Data Mining (Vol. 20, pp. 283–286).

Zhao, M., Cheng, X., & He, Q. (2009). An algorithm of mining class association rules. Advances in Computation and Intelligence (Vol. 5821, pp. 269–275). Springer.