Data Mining: Separating fact from fiction
Tony Hirst
Department of Communication and Systems
The Open University
@psychemedia
Knowledge Discovery in Databases
Pattern & Structure
Seasonal Subseries
Emergent Social Positioning
@MediaWeek
[Correlation & concordance]
Concordance plot…
Lexical dispersion plot…
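A lexical dispersion plot marks where chosen words occur along the length of a text. As a minimal sketch (the sample sentence and the `dispersion` helper are illustrative assumptions, not material from the slides), the word offsets behind such a plot can be collected with the standard library alone:

```python
import re

def dispersion(text, targets):
    """Return {word: [token offsets]} and the total token count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    wanted = {t.lower() for t in targets}
    offsets = {t: [] for t in wanted}
    for i, tok in enumerate(tokens):
        if tok in wanted:
            offsets[tok].append(i)
    return offsets, len(tokens)

# Hypothetical sample text, invented for this sketch.
sample = ("Data mining finds patterns in data. "
          "Patterns are not facts, and data can mislead.")

offsets, n_tokens = dispersion(sample, ["data", "patterns"])

# Crude text rendering: one row per word, '|' where the word occurs.
for word in sorted(offsets):
    row = "".join("|" if i in offsets[word] else "." for i in range(n_tokens))
    print(f"{word:>10} {row}")
```

Plotting libraries draw the same offsets as tick marks on a shared horizontal axis, one row per word.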
Pattern & Structure
Problems of…
Identity
Partial (Fuzzy) String Matching
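One way to sketch partial (fuzzy) string matching is with Python's standard `difflib` module, which scores how similar two strings are. The name variants below are hypothetical, and the 0.6 cutoff is an arbitrary assumption rather than a recommended threshold:

```python
from difflib import SequenceMatcher, get_close_matches

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

# Hypothetical name variants of the kind that cause identity problems.
known = ["The Open University", "University of Oxford", "Oxford Brookes University"]
query = "Open Universty"  # misspelt on purpose

# Best candidate above the cutoff, or an empty list if nothing is close enough.
best = get_close_matches(query, known, n=1, cutoff=0.6)
print(best, round(similarity(query, known[0]), 2))
```

A score near 1.0 suggests the same entity; where to draw the line is a judgement call that depends on the data.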
Not What You Thought?
Cautionary Examples
Anscombe’s Quartet
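The quartet consists of four small datasets with nearly identical summary statistics that look completely different when plotted. The sketch below assumes the commonly reproduced values from Anscombe's 1973 paper (worth checking against a trusted copy) and computes the means and Pearson correlation for each set:

```python
from math import sqrt

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def mean(values):
    return sum(values) / len(values)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# name -> (mean of x, mean of y, correlation)
summary = {name: (mean(x), mean(y), pearson(x, y)) for name, (x, y) in quartet.items()}
for name, (mx, my, r) in summary.items():
    print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The near-identical rows are the point: only a plot reveals the curve, the outlier and the single high-leverage point.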
Simpson’s Paradox
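Simpson's paradox arises when a trend that holds within every subgroup reverses once the subgroups are pooled. A worked sketch, using illustrative counts chosen to show the effect (they are not taken from the slides):

```python
# (successes, trials) for treatments A and B within each subgroup.
# Illustrative counts only, chosen so that the reversal appears.
groups = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

# Success rate of each treatment inside each subgroup.
within = {g: {t: rate(*counts) for t, counts in arms.items()}
          for g, arms in groups.items()}

# Pool the subgroups and recompute the rate for each treatment.
totals = {t: tuple(sum(groups[g][t][i] for g in groups) for i in (0, 1))
          for t in ("A", "B")}
overall = {t: rate(*totals[t]) for t in totals}

for g, arms in within.items():
    print(g, {t: round(r, 3) for t, r in arms.items()})
print("overall", {t: round(r, 3) for t, r in overall.items()})
```

A does better in both subgroups yet worse overall, because A was mostly applied to the harder "large" cases and B to the easier "small" ones.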
Looking for Anomalies - Funnel plots
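A funnel plot sets each unit's rate against its sample size, with control limits that narrow as the sample grows. Units falling outside the funnel are candidate anomalies. The sketch below assumes a normal approximation to the binomial, hypothetical unit counts, and limits of roughly 99.8% (z = 3.09); other constructions of the limits are possible:

```python
from math import sqrt

# Hypothetical (events, cases) counts per unit; not taken from the slides.
units = {
    "U1": (12, 100),
    "U2": (95, 1000),
    "U3": (30, 120),
    "U4": (4, 50),
    "U5": (210, 2000),
}
Z = 3.09  # roughly 99.8% limits under a normal approximation

total_events = sum(e for e, _ in units.values())
total_cases = sum(n for _, n in units.values())
p_bar = total_events / total_cases  # pooled rate: the funnel's centre line

def limits(n, p=p_bar, z=Z):
    """Lower and upper control limits for a unit with n cases."""
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

flagged = []
for name, (events, n) in units.items():
    lo, hi = limits(n)
    unit_rate = events / n
    if not lo <= unit_rate <= hi:
        flagged.append(name)
    print(f"{name}: rate={unit_rate:.3f} limits=({lo:.3f}, {hi:.3f})")

print("outside the funnel:", flagged)
```

Small units get wide limits, so an extreme rate from a small sample is not automatically treated as an anomaly.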
Ethics & Value Judgements
Algorithms have biases and prejudices too…
Predictive vs Descriptive Tasks
Predictive Tasks
Classification
Regression
Descriptive Tasks
Clustering
Anomaly Detection
Decision Trees
Rule Discovery
Predictive Tasks
k-Nearest Neighbour
[Supervised Training Task]
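As a minimal sketch of the idea, k-nearest neighbour classifies a new point by majority vote among the k labelled training points closest to it. The training data, labels and the `knn_predict` helper are hypothetical, and Euclidean distance is an assumed choice of metric:

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled examples: (features, label).
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.2), "b"), ((4.1, 3.9), "b"), ((3.8, 4.0), "b"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 4.0)))
```

The task is supervised because the labels must be supplied up front; there is no model fitting beyond storing the examples.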
Descriptive Tasks
k-means clustering
[Unsupervised Training Task]
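A sketch of the standard iterative k-means procedure (often called Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The points and the choice of starting centroids are hypothetical. Real implementations usually pick starting centroids randomly or with a seeding heuristic:

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, centroids, max_iter=100):
    """Run k-means from the given starting centroids; return (centroids, clusters)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical unlabelled points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied, which is what makes the task unsupervised; the result can depend on the starting centroids.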
Adaptive Algorithms
TM351 Data Management & Analysis