Quick R Tutorial for Data Mining - cis.csuohio.edu

Posted on 23-Dec-2021

Quick R Tutorial for Data Mining

CIS 660 Data Mining

Sunnie Chung

Getting data from Twitter Streaming API

• Install Tweepy

• Install “pip”

To install pip, first download “get-pip.py”, then run the following command:

• “sudo python get-pip.py”

• Then install Tweepy with pip: “sudo pip install tweepy”

• Now create a Twitter account.

• Go to https://apps.twitter.com/ and log in with your Twitter credentials.

• Click "Create New App".

• Fill out the form, agree to the terms, and click "Create your Twitter application".

• On the next page, click the "API keys" tab, and copy your "API key" and "API secret".

• Scroll down and click "Create my access token", and copy your "Access token" and "Access token secret".
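The four values copied above are needed every time a script connects to Twitter. One possible way to keep them out of the source code is to read them from environment variables. This is only a sketch: the variable names and the helper `load_twitter_credentials` are hypothetical and are not part of Tweepy or of the original slides.

```python
import os

# Hypothetical environment variable names, one per value copied from the app page.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, or raise KeyError if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The returned dict can then be passed to whichever authentication call your installed Tweepy version provides.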

Connecting to Twitter Streaming API and downloading data

Extract user-specific data

• Each JSON line contains user data, for example:

• { ,User:{ID:{ },Name:{ },------ }, }

• Extract the User portion from the JSON data:

• User:{ID:{ }, Name:{ },----------- }

• From that portion, extract individual attributes such as:

• ID:{ }

• Name:{ }

• Language:{ }

• Finally, read the value of each attribute, such as 12345 and Sunnie in the example below:

• ID:{12345}

• Name:{Sunnie}
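The extraction steps above can be sketched in Python with the standard `json` module. This is a minimal illustration, not the code from the original slides: the sample line is made up, and the lowercase field names `user`, `id`, `name`, and `lang` are assumed to match the layout of the downloaded tweets.

```python
import json


def extract_user_fields(json_line, fields=("id", "name", "lang")):
    """Parse one JSON line and return the selected attributes of its user object."""
    tweet = json.loads(json_line)
    # Step 1: take the user portion of the record (empty dict if absent).
    user = tweet.get("user", {})
    # Step 2: pull out the individual attributes; missing ones become None.
    return {field: user.get(field) for field in fields}


# Made-up sample line for illustration only.
sample_line = '{"text": "hello", "user": {"id": 12345, "name": "Sunnie", "lang": "en"}}'
user_info = extract_user_fields(sample_line)
print(user_info["id"], user_info["name"])  # prints: 12345 Sunnie
```

To process a whole download, call `extract_user_fields` once per non-empty line of the file.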

Data Mining with Twitter Data

Twitter Data

Association Rule Mining

• Data Preprocessing

• Removing Redundancy

Decision Tree

• Prune decision tree:

Neural Networks

K-Nearest Neighbor Classification

• Data Preprocessing

Bayesian Classifier

Support Vector Machine

K-means Clustering

• Data Preprocessing

K-Medoids Clustering

Hierarchical Clustering

Cluster Validation

Density-based Clustering

Outlier Detection

Term Frequency – Inverse Document Frequency Weighting
