01. Introduction/Overview of Data Mining
Uploaded by: febrian-arsy
8/20/2019 01. IntroductionOverview of Data Mining
-
Introduction to Data Mining
Session 01
Course: M0824 - Data Mining
Year: Sep 2011
-
Bina Nusantara University
Learning Outcomes
Explain data mining concepts and techniques.
-
Acknowledgments
These slides have been adapted from Han, J., Kamber, M., & Pei, J. (2006). Data Mining: Concepts and Techniques, 2nd edition. Morgan Kaufmann, San Francisco.
-
Outline of Material
• What Motivated Data Mining?
• Data Mining: On What Kind of Data?
• Data Mining Functionalities
• Classification of Data Mining Systems
• Integration of a Data Mining System with a Database or Data Warehouse System
• Major Issues in Data Mining
-
What Motivated Data Mining?
-
Why Data Mining?
• The explosive growth of data: from terabytes to petabytes
  – Data collection and data availability
    Automated data collection tools, database systems, Web, computerized society
  – Major sources of abundant data
    Business: Web, e-commerce, transactions, stocks, …
    Science: remote sensing, bioinformatics, scientific simulation, …
    Society and everyone: news, digital cameras, YouTube
• We are drowning in data, but starving for knowledge!
• "Necessity is the mother of invention": data mining, the automated analysis of massive data sets
-
What Is Data Mining?
• Data mining
-
March 6, 2016
Data Mining: Concepts and Techniques
Knowledge Discovery
-
What Is (Not) Data Mining?
What is Data Mining?
– Certain names are more prevalent in certain US locations (O'Brien, O'Rourke, O'Reilly in the Boston area)
– Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
What is not Data Mining?
– Look up a phone number in a phone directory
– Query a Web search engine for information about "Amazon"
-
Data Mining in Business Intelligence
(Pyramid diagram; the potential to support business decisions increases toward the top. Layers from top to bottom:)
– Decision Making
– Data Presentation: Visualization Techniques
– Data Mining: Information Discovery
– Data Exploration: Statistical Summary, Querying, and Reporting
– Data Preprocessing/Integration, Data Warehouses
– Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems
Roles, from top to bottom: End User, Business Analyst, Data Analyst, DBA
-
Data Mining Tasks...
• Classification [Predictive]
• Clustering [Descriptive]
• Association Rule Discovery [Descriptive]
• Sequential Pattern Discovery [Descriptive]
• Regression [Predictive]
• Deviation Detection [Predictive]
-
Data Mining: On What Kind of Data?
-
Data Mining Function:
-
Data Mining Function:
• How to use such patterns for classification, clustering, and other applications?
-
Association Rule Discovery: Application 1
• Marketing and Sales Promotion:
  Let the rule discovered be {Bagels, … } --> {Potato Chips}
  – Potato Chips as consequent => Can be used to determine what should be done to boost its sales.
  – Bagels in the antecedent => Can be used to see which products would be affected if the store discontinues selling bagels.
  – Bagels in antecedent and Potato Chips in consequent => Can be used to see what products should be sold with Bagels to promote sale of Potato Chips!
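How strongly a rule such as {Bagels} --> {Potato Chips} holds is usually judged by its support, confidence, and lift. The sketch below computes the three measures on a handful of made-up baskets. The data, the function names, and the reduction of the antecedent to the single item "bagels" are assumptions for illustration, not part of the slides.

```python
# Toy market baskets (made-up data for illustration only).
transactions = [
    {"bagels", "potato chips", "milk"},
    {"bagels", "potato chips"},
    {"bagels", "cream cheese"},
    {"potato chips", "soda"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence divided by the baseline support of the consequent."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

antecedent = {"bagels"}
consequent = {"potato chips"}
print("support:   ", support(antecedent | consequent, transactions))
print("confidence:", confidence(antecedent, consequent, transactions))
print("lift:      ", lift(antecedent, consequent, transactions))
```

A lift above 1 suggests the two items appear together more often than their individual popularity alone would predict, which is what makes the rule interesting for promotion decisions.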
-
Association Rule Discovery: Application 2
• Supermarket shelf management.
  Goal: To identify items that are bought together by sufficiently many customers.
  Approach: Process the point-of-sale data collected with barcode scanners to find dependencies among items.
  A classic rule --
  – If a customer buys diapers and milk, then he is very likely to buy beer.
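"Bought together by sufficiently many customers" can be read as a minimum count on item pairs. The following sketch counts every pair by brute force and keeps the frequent ones. It is not the Apriori algorithm or any specific method from the slides, and the baskets and the threshold of 3 are invented.

```python
from collections import Counter
from itertools import combinations

# Made-up point-of-sale baskets, one set of items per checkout.
baskets = [
    {"diaper", "milk", "beer"},
    {"diaper", "milk", "beer", "bread"},
    {"diaper", "milk"},
    {"milk", "bread"},
    {"diaper", "beer"},
]

def frequent_pairs(baskets, min_count):
    """Count every unordered item pair and keep those bought together often enough."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical (alphabetical) form.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

pairs = frequent_pairs(baskets, min_count=3)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

Brute-force pair counting is fine for a toy example; on real scanner data the number of candidate itemsets grows quickly, which is why pruning strategies exist.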
-
Data Mining Function:
-
Classification: Application 1
• Direct Marketing
  Goal: Reduce cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.
  Approach:
  – Use the data for a similar product introduced before.
  – We know which customers decided to buy and which decided otherwise. This {buy, don't buy} decision forms the class attribute.
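As a rough illustration of learning from the earlier product's {buy, don't buy} labels, the sketch below uses a k-nearest-neighbour vote. The slides do not name a classifier, so k-NN is a stand-in. The two attributes (age and monthly phone spend) and all values are invented.

```python
from collections import Counter
import math

# Made-up history from a similar earlier product:
# (age, monthly phone spend) -> decision, which is the class attribute.
history = [
    ((22, 60.0), "buy"),
    ((25, 55.0), "buy"),
    ((31, 70.0), "buy"),
    ((47, 20.0), "don't buy"),
    ((52, 15.0), "don't buy"),
    ((60, 25.0), "don't buy"),
]

def predict(customer, labelled, k=3):
    """Majority class label among the k labelled customers closest to `customer`."""
    nearest = sorted(labelled, key=lambda row: math.dist(row[0], customer))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(predict((28, 65.0), history))
print(predict((55, 18.0), history))
```

Only customers predicted as "buy" would be mailed, which is where the cost saving comes from. In practice the attributes would need scaling so that no single one dominates the distance.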
-
Classification: Application 2
• Fraud Detection
  Goal: Predict fraudulent cases in credit card transactions.
  Approach:
  – Use credit card transactions and the information on the account holder as attributes.
    When does a customer buy, what does he buy, how often does he pay on time, etc.
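Before any classifier can be trained, raw transactions have to be summarised into per-account attributes of the kind listed above (when, what, how often paid on time). This is a minimal sketch of that step. The `Transaction` fields and the attribute names are hypothetical, and the sample data is invented.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Transaction:
    hour: int            # hour of day of the purchase, 0-23
    category: str        # what was bought
    paid_on_time: bool   # whether the related bill was paid on time

def account_attributes(transactions):
    """Summarise one account's transactions into classifier input attributes."""
    hours = Counter(t.hour for t in transactions)
    categories = Counter(t.category for t in transactions)
    on_time = sum(t.paid_on_time for t in transactions)
    return {
        "usual_hour": hours.most_common(1)[0][0],
        "usual_category": categories.most_common(1)[0][0],
        "on_time_rate": on_time / len(transactions),
    }

# Made-up transactions for a single account.
sample = [
    Transaction(hour=12, category="groceries", paid_on_time=True),
    Transaction(hour=12, category="fuel", paid_on_time=True),
    Transaction(hour=19, category="groceries", paid_on_time=False),
    Transaction(hour=12, category="groceries", paid_on_time=True),
]
attributes = account_attributes(sample)
print(attributes)
```

Each account then becomes one row of attributes, and past transactions known to be fraudulent or legitimate supply the class labels.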
-
Classification: Application 3
• Customer Attrition/Churn:
  Goal: To predict whether a customer is likely to be lost to a competitor.
  Approach:
  – Use detailed records of transactions with each of the past and present customers to find attributes.
    How often the customer calls, where he calls, what time of the day he calls most, his financial status, marital status, etc.
  – Label the customers as loyal or disloyal.
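Once customers carry a loyal/disloyal label, even a one-attribute rule can be learned from them. The sketch below fits a decision stump on call frequency. The stump, the data, and the assumption that frequent callers tend to be loyal are all illustrative choices, not something the slides prescribe.

```python
# Made-up labelled customers: (calls per month, label).
customers = [
    (42, "loyal"),
    (35, "loyal"),
    (30, "loyal"),
    (12, "disloyal"),
    (8, "disloyal"),
    (15, "disloyal"),
]

def predict(calls, threshold):
    """Stump rule: customers at or above the threshold are predicted loyal."""
    return "loyal" if calls >= threshold else "disloyal"

def fit_stump(rows):
    """Pick the call-count threshold that misclassifies the fewest training rows."""
    best_threshold, best_errors = None, len(rows) + 1
    for threshold in sorted({calls for calls, _ in rows}):
        errors = sum(predict(calls, threshold) != label for calls, label in rows)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors

threshold, training_errors = fit_stump(customers)
print("threshold:", threshold, "training errors:", training_errors)
print(predict(10, threshold))
```

A real churn model would combine many attributes (call destinations, time of day, financial and marital status), but the labelling step and the learn-then-predict structure stay the same.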
-
Data Mining Function:
-
Clustering: Application 1
• Market Segmentation:
  Goal: Subdivide a market into distinct subsets of customers where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.
  Approach:
  – Collect different attributes of customers based on their geographical and lifestyle related information.
  – Find clusters of similar customers.
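"Find clusters of similar customers" can be sketched with plain k-means. The slides do not name a clustering algorithm, so this is one common choice. The attributes and values are invented, and the starting centroids are fixed by hand to keep the run reproducible.

```python
import math

# Made-up customers: (age, yearly spend in thousands).
customers = [
    (23, 5.0), (25, 6.0), (27, 5.5),
    (48, 20.0), (52, 22.0), (50, 21.0),
]

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Starting centroids are picked by hand here; real code would usually randomise them.
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

Each resulting cluster is a candidate market segment, and its centroid gives a compact description of the typical customer in that segment.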
-
Clustering: Application 2
• Document Clustering:
  Goal: To find groups of documents that are similar to each other based on the important terms appearing in them.
  Approach: Identify frequently occurring terms in each document. Form a similarity measure based on the frequencies of different terms.
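One widely used similarity measure built from term frequencies is cosine similarity. The slides do not say which measure they intend, so treat this as one possible instance. The three mini documents are invented.

```python
import math
from collections import Counter

# Made-up mini documents.
documents = {
    "d1": "amazon rainforest river rainforest wildlife",
    "d2": "rainforest wildlife conservation amazon",
    "d3": "amazon online shopping orders shipping",
}

def term_frequencies(text):
    """Count how often each whitespace-separated term occurs."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (1.0 = same direction)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

vectors = {name: term_frequencies(text) for name, text in documents.items()}
sim_12 = cosine_similarity(vectors["d1"], vectors["d2"])
sim_13 = cosine_similarity(vectors["d1"], vectors["d3"])
print("d1 vs d2:", sim_12)
print("d1 vs d3:", sim_13)
```

The two rainforest documents score higher against each other than against the shopping document, even though all three contain "amazon". A clustering algorithm would exploit exactly this separation.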
-
Data Mining: Confluence of Multiple Disciplines
(Diagram: Data Mining at the centre, drawing on the surrounding disciplines:)
– Machine Learning
– Statistics
– Applications
– Algorithm
– Pattern Recognition
– High-Performance Computing
– Visualization
– Database Technology
-
Integration of a Data Mining System with a Database or Data Warehouse System
-
Integration of Data Mining and Data Warehousing
• Data mining systems, DBMS, Data warehouse systems coupling
  – No coupling, loose-coupling, semi-tight-coupling, tight-coupling
• On-line analytical mining data
  – Integration of mining and OLAP technologies
• Interactive mining multi-level knowledge
  – Necessity of mining knowledge and patterns at different levels of abstraction by drilling/rolling, pivoting, slicing/dicing, etc.
• Integration of multiple mining functions
  – Characterized classification, first clustering and then association
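The OLAP operations named above change the level of abstraction at which data is viewed. As a rough sketch, the code below shows a roll-up (city level to country level) and a slice (fixing one quarter) on invented sales facts. The tuple layout and function names are assumptions, and a real warehouse would run these operations inside the OLAP engine.

```python
from collections import defaultdict

# Made-up sales facts: (city, country, quarter, amount).
sales = [
    ("Jakarta", "Indonesia", "Q1", 100),
    ("Bandung", "Indonesia", "Q1", 40),
    ("Jakarta", "Indonesia", "Q2", 120),
    ("Osaka", "Japan", "Q1", 80),
    ("Osaka", "Japan", "Q2", 90),
]

def roll_up(facts):
    """Roll up from city level to country level: total amount per (country, quarter)."""
    totals = defaultdict(int)
    for _city, country, quarter, amount in facts:
        totals[(country, quarter)] += amount
    return dict(totals)

def slice_quarter(facts, quarter):
    """Slice: fix one value of the quarter dimension and keep the matching facts."""
    return [fact for fact in facts if fact[2] == quarter]

by_country = roll_up(sales)
q1_only = slice_quarter(sales, "Q1")
print(by_country)
print(roll_up(q1_only))
```

Mining at multiple levels then means running the same mining function on the city-level facts, on the rolled-up country-level totals, or on a single slice, and comparing the patterns found.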
-
Architecture: Typical Data Mining System
(Layered diagram, listed from the data side up to the user:)
– Data cleaning, integration, and selection
– Database or Data Warehouse Server
– Data Mining Engine
– Pattern Evaluation
– Graphical User Interface
-
Major Issues in Data Mining
-
Major Challenges in Data Mining
• Efficiency and scalability of data mining algorithms
• Parallel, distributed, stream, and incremental mining methods
• Handling high-dimensionality
• Handling noise, uncertainty, and incompleteness of data
• Incorporation of constraints, expert knowledge, and background knowledge in data mining
• Pattern evaluation and knowledge integration
• Mining diverse and heterogeneous kinds of data: e.g., bioinformatics, Web, software/system engineering, information networks
• Application-oriented and domain-specific data mining
• Invisible data mining
-
Continued in Session 02: Data Pre-processing