COMPARING ASSOCIATION RULES AND DECISION TREES FOR DISEASE PREDICTION
Carlos Ordonez
MOTIVATION
Three main issues with mining association rules in medical data sets:
1. A significant fraction of association rules is irrelevant
2. Most relevant rules with high quality metrics appear only at low support
3. The number of discovered rules becomes extremely large at low support
Search constraints:
Find only medically significant association rules
Make the search more efficient
MOTIVATION
Decision trees are a well-known machine learning algorithm
Association rules vs. decision trees: accuracy, interpretability, applicability
ASSOCIATION RULES
Support Confidence Lift
Lift quantifies the predictive power of x ⇒ y. Rules such that lift(x ⇒ y) > 1 are interesting!
confidence(x ⇒ y) = support(x ∪ y) / support(x)
lift(x ⇒ y) = confidence(x ⇒ y) / support(y)
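The two definitions above can be computed directly over a set of binary transactions. A minimal sketch follows; the tiny transaction list and item names are made-up illustration data, not the paper's data set.

```python
# Support, confidence and lift for a candidate rule x -> y over
# transactions represented as sets of items.

def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """confidence(x -> y) = support(x U y) / support(x)"""
    return support(x | y, transactions) / support(x, transactions)

def lift(x, y, transactions):
    """lift(x -> y) = confidence(x -> y) / support(y)"""
    return confidence(x, y, transactions) / support(y, transactions)

# Hypothetical transactions (one per patient) over binary items.
transactions = [
    {"AGE>=60", "CHOL>=250", "LAD>=70"},
    {"AGE>=60", "LAD>=70"},
    {"AGE<40", "CHOL<200"},
    {"AGE>=60", "CHOL>=250", "LAD>=70"},
]

x, y = {"AGE>=60"}, {"LAD>=70"}
print(confidence(x, y, transactions))  # 1.0
print(lift(x, y, transactions))        # 4/3 > 1, so the rule is interesting
```

With lift above 1, x occurring makes y more likely than its base rate, which is exactly the "predictive power" the slide refers to.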
CONSTRAINED ASSOCIATION RULES
Transforming the medical data set: data must be transformed to binary dimensions
Numeric attributes → intervals; each interval is mapped to an item
Categorical attributes → each categorical value is an item
If an attribute has a negation, add that as an item
Each item corresponds to the presence or absence of one categorical value or one numeric interval
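A minimal sketch of this binary transformation: each numeric attribute is cut at its cutpoints and the resulting interval becomes an item, and each categorical value becomes an item. The LAD and age cutpoints follow the slides; the record layout itself is a made-up illustration.

```python
# Map one patient record to a set of binary items.

def bin_numeric(name, value, cutpoints):
    """Map a numeric value to one interval item, e.g. 63 -> 'AGE:>=60'."""
    lo = None
    for cp in cutpoints:
        if value < cp:
            return f"{name}:{lo}-{cp}" if lo is not None else f"{name}:<{cp}"
        lo = cp
    return f"{name}:>={lo}"

def to_items(record):
    items = set()
    items.add(bin_numeric("LAD", record["LAD"], [50, 70]))  # % narrowing
    items.add(bin_numeric("AGE", record["AGE"], [40, 60]))
    items.add(f"SEX:{record['SEX']}")  # categorical value -> item
    return items

print(to_items({"LAD": 75, "AGE": 63, "SEX": "M"}))
# {'LAD:>=70', 'AGE:>=60', 'SEX:M'}
```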
CONSTRAINED ASSOCIATION RULES
Search constraints:
1. Maximum itemset size (k): reduces the combinatorial explosion of large itemsets and helps find simple rules
2. Group: gi > 0 means Aj belongs to a group; gi = 0 means Aj is not group-constrained at all. This avoids finding trivial or redundant rules
3. Antecedent/consequent: ci = 1 means Ai is an antecedent; ci = 2 means Ai is a consequent
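The three constraints can be sketched as a filter on candidate rules. The attribute-to-group and attribute-to-role maps below are hypothetical; the paper assigns its own values per medical attribute, and "at most one item per group" is one reading of how the group constraint avoids redundant rules.

```python
# Check a candidate rule against the three search constraints.

MAX_K = 4  # 1. maximum itemset size k

# 2. gi > 0: attribute belongs to group gi; gi = 0: unconstrained
group = {"LAD": 1, "RCA": 1, "AGE": 2, "CHOL": 3}

# 3. ci = 1: antecedent; ci = 2: consequent
role = {"AGE": 1, "CHOL": 1, "LAD": 2, "RCA": 2}

def satisfies_constraints(antecedent, consequent):
    items = antecedent | consequent
    if len(items) > MAX_K:                 # size constraint
        return False
    # group constraint: at most one item from any group
    groups = [group[a] for a in items if group.get(a, 0) > 0]
    if len(groups) != len(set(groups)):
        return False
    # role constraint
    return all(role[a] == 1 for a in antecedent) and \
           all(role[c] == 2 for c in consequent)

print(satisfies_constraints({"AGE", "CHOL"}, {"LAD"}))  # True
print(satisfies_constraints({"LAD"}, {"RCA"}))          # False
```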
DATA SET
Patients: 655; attributes: 25
Percentage of vessel narrowing: LAD, LCX and RCA are binned at 50% and 70%; LM is binned at 30% and 50%
9 heart regions (2 ranges with 0.2 as cutoff)
Age binned at 40 (adult) and 60 (old)
Cholesterol binned at 200 and 250
PARAMETERS
k = 4; minimum support = 1% (≈ 7 patients); minimum confidence = 70%; minimum lift = 1.2
The lift threshold selects rules where there is a stronger implication dependence between X and Y
Rules with confidence ≥ 90% and lift ≥ 2, with 2 or more items in the consequent, were considered medically significant
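The significance filter above is a simple post-processing step over already-mined rules. A sketch, where each rule is a (antecedent, consequent, confidence, lift) tuple; the example rules are made up, not mined results.

```python
# Keep only rules meeting the slide's medical-significance thresholds:
# confidence >= 90%, lift >= 2, and at least 2 items in the consequent.

def medically_significant(rule, min_conf=0.90, min_lift=2.0, min_cons=2):
    antecedent, consequent, conf, lift = rule
    return conf >= min_conf and lift >= min_lift and len(consequent) >= min_cons

rules = [
    ({"AGE:>=60"}, {"LAD:>=70", "RCA:>=70"}, 0.93, 2.4),    # passes
    ({"CHOL:>=250"}, {"LAD:>=70"}, 0.95, 2.1),              # 1-item consequent
    ({"AGE:>=60"}, {"LAD:>=70", "LCX:>=70"}, 0.80, 2.5),    # confidence too low
]

significant = [r for r in rules if medically_significant(r)]
print(len(significant))  # 1
```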
HEALTHY ARTERIES
9,595 associations; 771 rules
DISEASED ARTERIES
Several unneeded items were filtered out (those with values in the lower, healthy ranges)
10,218 associations; 552 rules
PREDICTIVE RULES FROM DECISION TREES
C4.5 using gain ratio (CART gives similar results)
Threshold on the height of the tree to produce simple rules
Percentage of patients (ls): fraction of patients where the antecedent appears
Confidence factor (cf)
Focus on predicting LAD disease
PREDICTIVE RULES FROM DECISION TREES
1. All measurements without binning as independent variables; numeric variables are split automatically
Without any threshold on height: 181 nodes, 90% accuracy, height = 14
Most rules involve more than 5 attributes; except for 5 rules, the others involve less than 2% of the patients
More than 80% of rules refer to less than 1% of patients
Many rules involve attributes with missing information
Many rules split the same variable several times
Few rules had cf = 1, but their splits included borderline cases and involved few patients
PREDICTIVE RULES FROM DECISION TREES
With threshold = 10 on height: 83 nodes, 77% accuracy; most rules have repeated attributes and more than 5 attributes; perfusion cutoffs higher than 0.5; low cf, involving less than 1% of the population
With threshold = 3 on height: 65% accuracy, simpler rules
PREDICTIVE RULES FROM DECISION TREES
2. Items (binary variables), like those used for association rules, as independent variables
With threshold = 3 on height: 10 nodes; most of the rules were much closer to the prediction requirements
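The comparison rests on reading each root-to-leaf path of a tree as an if-then rule. A minimal sketch of that extraction; the tiny hand-built tree (nested dicts over binary items) is a made-up illustration, not the paper's C4.5 model.

```python
# Convert a small decision tree over binary items into if-then rules:
# every root-to-leaf path yields (antecedent tests, predicted class).

def tree_to_rules(node, path=()):
    """Walk the tree; each leaf yields (antecedent, prediction)."""
    if "leaf" in node:
        yield (path, node["leaf"])
        return
    attr = node["split"]
    yield from tree_to_rules(node["yes"], path + ((attr, 1),))
    yield from tree_to_rules(node["no"],  path + ((attr, 0),))

tree = {
    "split": "AGE>=60",
    "yes": {"split": "CHOL>=250",
            "yes": {"leaf": "LAD diseased"},
            "no":  {"leaf": "LAD healthy"}},
    "no":  {"leaf": "LAD healthy"},
}

for antecedent, pred in tree_to_rules(tree):
    tests = " and ".join(f"{a}={v}" for a, v in antecedent)
    print(f"IF {tests} THEN {pred}")
```

Note how the paths partition the patients: each patient matches exactly one rule, which is the key structural difference from association rules discussed next.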
DISCUSSION
Decision trees are not as powerful as association rules in this case:
They do not work well with combinations of several target variables
They fail to identify many medically relevant combinations of independent numeric variable ranges and categorical values
They tend to find complex and long rules if the height is unlimited
They find few predictive rules with reasonably sized (>1%) sets of patients in such cases
Rules sometimes repeat the same attribute
DISCUSSION - ALTERNATIVES
Build many decision trees with different independent attributes: error-prone, difficult to interpret, and slow for a higher number of attributes
Create a family of small trees, each with a weight; each tree becomes similar to a small set of association rules
Constraints for association rules could be adapted to decision trees (future work)
DISCUSSION – DECISION TREE ADVANTAGES
A DT partitions the data set, while ARs on the same target attributes may refer to overlapping sets of patients
A DT represents a predictive model of the data set, while ARs are disconnected from one another
A DT is guaranteed at least 50% prediction accuracy, and generally above 80% for binary target variables, while ARs require trial and error to find the best thresholds