Association Rule Mining in Social Network Data

PRESENTED BY: HOSSEIN MOBASHER
COURSE: DATA MINING

Contents

• Introduction

• Related Works

• The proposed Framework

• Experimental Evaluation

• Conclusion

Introduction

• Over the last decade, social networks have changed the way online communities live.

• Social data is used in:

• Academic applications

• E-commerce

• Discovering the habits and interests of online communities in different geographical regions

• Sentiment analysis of users

• Purpose: Support analysts in decision-making and optimal resource management in businesses as well as web maintenance.

Introduction (continued)

• Social data is a powerful source of data:

• To gain knowledge about social communities

• To investigate the behavior and other aspects of online communities

• User-generated content (UGC) helps online organizations enhance their services based on user perspectives.

• Data mining techniques can be used to discover hidden, interesting, and meaningful knowledge in social data.

Related Works

• TwitterEcho

• Collects data through a distributed architecture (Portuguese Twittosphere)

• Uses micro-blogging as a means to predict political sentiment

• TWICALL

• Discovers important events, then categorizes and classifies them

• NIF-T

• Explores data published on micro-blogging websites (e.g. Twitter)

The proposed Framework

• An environment for association rule mining that discovers hidden patterns in tweets.

Collecting and preprocessing of tweets

• Access tweets using Twitter API.

• Received tweets are unsuitable for the subsequent processing steps.

• They include information that is not required for the problem under consideration.

• Unnecessary information is removed, and the tweets are transformed into items and related contextual features.

Pipeline: Access data using Twitter API → Remove unnecessary information → Transform into suitable format → Map into a transactional database

Collecting and preprocessing of tweets (continued)

• Transformed tweets are then mapped into a transactional database.

• Each transaction is composed of a set of stems.

• e.g. “Imagination is more important than knowledge” may be mapped into {imagination, important, knowledge}
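
The mapping from a tweet to a transaction can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the stop-word list and the function name `tweet_to_transaction` are assumptions, and real stemming is omitted (a full pipeline would add a stemmer after tokenization).

```python
import re

# Assumed, illustrative stop-word list; the slides do not specify one.
STOP_WORDS = {"is", "more", "than", "the", "a", "an", "of", "and", "to", "in"}

def tweet_to_transaction(text):
    """Map raw tweet text to a set of lowercase terms (one transaction)."""
    text = text.lower()
    # Drop URLs and @mentions, which carry no keyword information here.
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    # Keep alphabetic tokens only; this also strips the '#' of hashtags.
    tokens = re.findall(r"[a-z]+", text)
    return {t for t in tokens if t not in STOP_WORDS}

transaction = tweet_to_transaction("Imagination is more important than knowledge")
print(sorted(transaction))  # ['imagination', 'important', 'knowledge']
```

Using a set per tweet discards word order and repetition, which is what a transactional database for itemset mining expects.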

Discovery of Correlations

• The Apriori algorithm is used to mine frequent itemsets.

• An association rule is usually represented as: If Body then Head

• If Body happens, Head is more likely to happen as well.

• The rule expresses the relationship between Body and Head.

• The strength of a rule depends on its support (share of transactions containing both Body and Head) and its confidence (share of transactions containing Body that also contain Head).

• The stronger the rule, the stronger the association between its terms.

• imagination ⇒ knowledge

• Support = 40%

• Confidence = 70%
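
As a rough sketch of how support, confidence, and Apriori-style frequent itemset mining fit together, consider the toy example below. The five transactions are invented for illustration, so the resulting numbers differ from the 40% and 70% quoted on the slide, and the function names are assumptions rather than the framework's actual code.

```python
from itertools import combinations

def count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(transactions, body, head):
    """Share of all transactions containing both Body and Head."""
    return count(transactions, set(body) | set(head)) / len(transactions)

def confidence(transactions, body, head):
    """Share of transactions containing Body that also contain Head."""
    return count(transactions, set(body) | set(head)) / count(transactions, body)

def apriori(transactions, min_support):
    """Level-wise search for all itemsets with support >= min_support."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if count(transactions, [i]) / n >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = count(transactions, itemset) / n
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c for c in candidates
                   if count(transactions, c) / n >= min_support}
    return frequent

# Toy transactional database (invented data).
transactions = [
    {"imagination", "important", "knowledge"},
    {"imagination", "knowledge"},
    {"imagination", "knowledge", "science"},
    {"imagination", "art"},
    {"knowledge", "science"},
]

print(support(transactions, {"imagination"}, {"knowledge"}))     # 0.6
print(confidence(transactions, {"imagination"}, {"knowledge"}))  # 0.75
frequent = apriori(transactions, min_support=0.4)
print(len(frequent))  # 5
```

The prune step is what makes the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without scanning the database.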

Taxonomy Generation

• Automatically generates a taxonomy from tweet attributes (i.e. the frequent keywords produced in the previous phase).

• Allows more general, higher-level concepts and correlations to be extracted.

• The taxonomy nodes represent distinct terms extracted from tweet contents.

• Graph extraction

• Graph partitioning and pruning

Taxonomy Generation (Graph extraction)

• Strong correlations are detected from the results of the previous phase.

• The generated correlations are represented as a graph:

• Edges: the implications present in the rules

• Vertices: items of tweet contents

• country ⇒ World

• society, people ⇒ country

• peace ⇒ World

• society ⇒ World

• society ⇒ country
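
The graph extraction step can be illustrated with the rules listed on this slide. The sketch below is hedged: the slides do not say how a rule with several body items is drawn, so it assumes one directed edge from each body item to the head, and `rules_to_graph` is a hypothetical helper name.

```python
# Rules from the slide, written as (body items, head item).
rules = [
    (("country",), "World"),
    (("society", "people"), "country"),
    (("peace",), "World"),
    (("society",), "World"),
    (("society",), "country"),
]

def rules_to_graph(rules):
    """Build a directed graph: vertices are items, edges are implications."""
    graph = {}
    for body, head in rules:
        graph.setdefault(head, set())  # make sure the head is a vertex
        for item in body:
            # Assumption: each body item gets its own edge to the head.
            graph.setdefault(item, set()).add(head)
    return graph

graph = rules_to_graph(rules)
for vertex in sorted(graph):
    print(vertex, "->", sorted(graph[vertex]))
```

In this toy graph "World" has no outgoing edges, which is consistent with it being the most general term.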

Taxonomy Generation (Graph partitioning and pruning)

• Makes the graph compact

• Prunes edges that do not represent a strongly relevant relationship by performing vertex labeling (a label represents a level of the taxonomy).

Analyzing Correlations

• Selects and ranks the significant correlations

• The selection is made using:

• A rule schema: <Keyword, *> ⇒ <Place, *>

• Given interesting rule items: <Keyword, School> ⇒ <Place, London>

• The results are ranked by their support and confidence quality indexes.
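
One way to picture schema-based selection and ranking is the sketch below. The rule values (support and confidence figures, the Paris and Peace entries) are made up for illustration, the `Rule` type and function names are hypothetical, and sorting by confidence first and support second is an assumption, since the slides only say that both indexes are used.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    body: tuple        # (attribute, value), e.g. ("Keyword", "School")
    head: tuple        # (attribute, value), e.g. ("Place", "London")
    support: float
    confidence: float

def matches(item, pattern):
    """True if `item` fits `pattern`; '*' in the pattern matches any value."""
    attribute, value = pattern
    return item[0] == attribute and (value == "*" or item[1] == value)

def select_and_rank(rules, body_pattern, head_pattern):
    """Keep rules that fit the schema, strongest first."""
    selected = [r for r in rules
                if matches(r.body, body_pattern) and matches(r.head, head_pattern)]
    # Assumed ordering: confidence first, support as tie-breaker.
    return sorted(selected, key=lambda r: (r.confidence, r.support), reverse=True)

# Invented example rules.
mined_rules = [
    Rule(("Keyword", "School"), ("Place", "London"), 0.10, 0.80),
    Rule(("Keyword", "School"), ("Place", "Paris"), 0.05, 0.60),
    Rule(("Keyword", "Peace"), ("Place", "London"), 0.20, 0.90),
    Rule(("Place", "London"), ("Keyword", "School"), 0.10, 0.50),
]

by_schema = select_and_rank(mined_rules, ("Keyword", "*"), ("Place", "*"))
by_items = select_and_rank(mined_rules, ("Keyword", "School"), ("Place", "London"))
print([r.body[1] + " => " + r.head[1] for r in by_schema])
```

The wildcard schema keeps every Keyword-to-Place rule, while fixing both values narrows the result to a single rule of interest.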

Experimental Evaluation

• The proposed framework highlights well-known topical subjects (e.g. the European Union).

• The results include 58 transactions with 209 distinct items (i.e. keywords).

• First, effectiveness is shown in two scenarios:

• User behavior analysis

• Topic trend analysis

• Second, effectiveness is shown through the quality of the generated taxonomies.

User Behavior Analysis

• Extracted correlations allow experts to highlight hidden and potentially interesting user behaviors.

• peace ⇒ World, society ⇒ country, country ⇒ World

• The proposed framework automatically generates the taxonomy from the mined rules.

• The taxonomy clearly highlights the behavior of people toward peace.

Topic Trend Analysis

• Discovery and analysis of current matters of contention on Twitter.

• A domain expert wants to discover subjects of topical interest for Twitter users.

• The taxonomy suggests that society in general, and people in particular, are concerned with peace in the World.

Quality of generated taxonomies

• Taxonomy generation is evaluated with:

• Global quality (using the geometric mean)

• Local quality (degree of correlation between non-leaf and leaf nodes)

• Spread (number of nodes traversed in the taxonomy to move from a node to its root node)

• The results are compared with the approach of “Evolutionary Taxonomy Construction from Dynamic Tag Space”, 2010.
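
To make the spread measure more concrete, here is a minimal sketch under stated assumptions: the taxonomy is a toy tree built from the example terms used earlier, stored as a child-to-parent map, and "number of nodes to move to the root" is read as the number of parent steps. The slides do not give the exact formula, so treat `steps_to_root` as an illustration only.

```python
# Toy taxonomy as a child -> parent map (assumed shape; None marks the root).
parent = {
    "World": None,
    "country": "World",
    "peace": "World",
    "society": "country",
    "people": "country",
}

def steps_to_root(parent, node):
    """Count how many parent links separate `node` from the root."""
    steps = 0
    while parent[node] is not None:
        node = parent[node]
        steps += 1
    return steps

for term in sorted(parent):
    print(term, steps_to_root(parent, term))
```

A taxonomy whose terms all sit one or two steps from the root is flat, while large values indicate deep, narrow branches.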

Quality of generated taxonomies (continued)

• Global quality was the same for both approaches.

• The proposed approach produced fairly balanced local quality and spread indexes.

• The proposed approach takes slightly less time than the approach it was compared against.

Conclusion

• Presented a mechanism for extracting hidden correlations between contents.

• Generated correlations are helpful to understand the hidden associations among the textual and contextual features of the UGC.

• The proposed approach automatically generates a taxonomy.

• The experimental results validate the efficiency and effectiveness of the proposed framework.

Thanks for your attention

Questions?