Mining the Connected World
Ee-Peng LIM
Director, Living Analytics Research Centre
Professor, School of Information Systems
http://larc.smu.edu.sg
Fraunhofer IDM@NTU Workshop, 20 February 2012
Simple Statistics
• How many of us are on Facebook today?
> 845 million (December 31, 2011)
• How many of us are on Twitter today?
> 300 million (June 2011)
Living Analytics Research Centre
Living Analytics =
Consumer & Social Insights From
Experiment-Driven Closed-Loop Analytics + Societal Scale Human Networks
LARC Data Settings
A Glimpse of LARC Research:
(a) Mining Link Formation Rules
Link Formation Rule Mining: Do relationships lead to other relationships?
• Local structures for understanding and predicting the dynamics of large complex networks
All possible triads in a directed graph
• Previous research, however, does not consider the formation order of links
• We therefore study local structures for link formation in directed, labeled, temporal social networks
Link Formation Rules (LF-Rules)
• LF-rule: Rule of a node (user) forming new links to other nodes (users) based on pre-existing local link structures.
Figure: precondition and postcondition. The link from s to e is formed as a postcondition.
Mining Methodology
• Mine LF-rules from a social network with temporal links.
• Apply randomizing technique to the network, for estimating the expected support of LF-rules in a random graph
• Evaluate interesting rules with higher-than-expected support
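The three steps above can be sketched on toy data. This is only an illustration under stated assumptions, not the method used in the talk: the single rule checked here (a link from s to e that closes an earlier path s -> m -> e), the definition of support as the fraction of links matching the rule, and the randomization by shuffling timestamps are all assumptions, and link labels are ignored for brevity. The names `rule_support` and `expected_support` and the link data are hypothetical.

```python
import random

# Toy temporal links as (source, target, label, time). The data is made up.
LINKS = [
    ("a", "b", "friend", 1),
    ("b", "c", "friend", 2),
    ("a", "c", "friend", 3),
    ("c", "d", "friend", 4),
    ("b", "d", "friend", 5),
    ("a", "d", "friend", 6),
]

def rule_support(links):
    """Fraction of links s->e formed after some path s->m->e already existed.

    Assumption: this one rule and this notion of support stand in for the
    LF-rules of the talk, whose exact definitions are not given in the slides.
    """
    out_neighbors = {}  # node -> set of nodes it already links to
    matches = 0
    for s, e, _label, _time in sorted(links, key=lambda link: link[3]):
        # Precondition: s already links to some m, and m already links to e.
        if any(e in out_neighbors.get(m, ()) for m in out_neighbors.get(s, ())):
            matches += 1
        out_neighbors.setdefault(s, set()).add(e)
    return matches / len(links)

def expected_support(links, trials=200, seed=0):
    """Average support after shuffling timestamps (assumed randomization)."""
    rng = random.Random(seed)
    times = [link[3] for link in links]
    total = 0.0
    for _ in range(trials):
        shuffled = times[:]
        rng.shuffle(shuffled)
        # Keep the same links but reassign their formation times at random.
        randomized = [(s, e, label, t)
                      for (s, e, label, _), t in zip(links, shuffled)]
        total += rule_support(randomized)
    return total / trials

observed = rule_support(LINKS)      # 3 of the 6 toy links match, so 0.5
expected = expected_support(LINKS)
surprise = observed / expected if expected > 0 else float("inf")
print(f"support={observed:.2f} expected={expected:.2f} surprise={surprise:.2f}")
```

A rule would be flagged as interesting when its observed support is clearly higher than the expected support, that is, when the ratio is well above 1.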
Interesting LF-rules in myGamma
• Based on the Dec 2009 snapshot
– ~690k users with at least 1 link
– > 9 million links (~93% friend links)
• Top 5 rules in terms of support
Interestingness scores
support   expected support   surprise (supp/exp. supp)   confidence
28.91%    22.41%             1.29                        43.22%
28.38%    22.37%             1.27                        43.1%
25.42%    13.54%             1.88                        39.15%
24.37%    1.22%              20.06                       31.98%
20.55%    11.49%             1.79                        27.52%
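The surprise column is defined on the slide as support divided by expected support, so it can be recomputed from the other two columns. The short check below does that for the five rows; the variable names are hypothetical, and the tolerance of 0.1 is an assumption chosen to absorb rounding in the percentages shown.

```python
# (support %, expected support %, surprise as reported on the slide)
ROWS = [
    (28.91, 22.41, 1.29),
    (28.38, 22.37, 1.27),
    (25.42, 13.54, 1.88),
    (24.37, 1.22, 20.06),
    (20.55, 11.49, 1.79),
]

# surprise = support / expected support, as stated in the column header
recomputed = [support / expected for support, expected, _ in ROWS]

for (support, expected, reported), value in zip(ROWS, recomputed):
    print(f"{support:6.2f} / {expected:6.2f} = {value:6.2f} (reported {reported})")
```

The fourth row recomputes to slightly under 20 rather than 20.06, presumably because the slide shows rounded percentages.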
Major Observations
• Users tend to rely more on mutually trusted friends in forming new friendship links.
– R12 (right) has much higher confidence (~34% vs. ~22%) and surprise values (5.32 vs. 3.52) than R11 (left)
• 3.45% of users reciprocated a friend link with a foe link.
A Glimpse of LARC Research:
(b) Palanteer: A Data Analytics Engine for Twitter Data
Palanteer
http://palanteer.sis.smu.edu.sg
transportation
Search Box
Trending items
Event on July 12, 2011
MRT Event
How do Singapore users feel?
How popular is Starbucks?
Palanteer – Taiwan Edition
Palanteer – Thai Edition
Conclusions
• Interesting research problems in the connected world
• Living analytics focuses on discovering user preferences, friendship patterns, and trends
• Living analytics is multidisciplinary
• LARC looks forward to exciting collaborations with industry partners and other researchers
LARC Activities
Thank you
Ee-Peng LIM
http://larc.smu.edu.sg
Acknowledgment
Faculty Members: Jing JIANG, Feida ZHU, David LO, Hady LAUW
Collaborators (NTU): Aixin SUN, Marko SKORIC, Anwitaman DATTA
Researchers: Cane LEUNG, Aek Palakorn, Bingtian DAI, Agus, Nelman
PhD Students: Freddy, Hanbo, Tuan Anh, Minh Duc