Research Literature on Social Scores

SOCIAL SCORES

Supervised by Dr. Dilum Bandara

E.A.M.M Edirisinghe 138211N

Outline

• TunkRank – A Twitter Analog to PageRank
• TwitterRank – Finding Topic-sensitive Influential Twitterers
• InfluenceRank – An Efficient Social Influence Measurement
• Why your Klout score is meaningless
• Research Questions Addressed

Research Questions
• How do existing systems calculate social scores?
• Which parameters are representative of a user’s true social influence?
• What are the desirable properties of a social score?
• How to vary the parameter weights across different topics and applications?
• How to come up with a performance-efficient algorithm?
• How to calculate a social score and update it in real time?

TunkRank – A Twitter Analog to PageRank

TunkRank
• Proposed by Daniel Tunkelang in 2009
• Implemented by Jason Adams
• Assumptions:
– Influence(X): the expected number of people who will read a tweet that X tweets
– The probability that X will read a tweet posted by Y is 1/||Following(X)||, where X is a member of Followers(Y) and Following(X) is the set of people that X follows
– If X reads a tweet from Y, there’s a constant probability p that X will retweet it.
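These assumptions lead to the recursive definition given in Tunkelang's post: Influence(X) is the sum, over every follower Y of X, of (1 + p · Influence(Y)) / ||Following(Y)||. A minimal Python sketch of a fixed-point iteration over that formula is shown here. The toy graph, the value p = 0.05, the iteration count, and the function name `tunkrank` are illustrative assumptions, not details of the original implementation.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Iterate Influence(X) = sum over followers Y of X of
    (1 + p * Influence(Y)) / |Following(Y)|.

    `following` maps each user to the set of users they follow.
    `p` (retweet probability) and `iterations` are assumed values.
    """
    users = set(following)
    for followed in following.values():
        users |= set(followed)
    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        updated = {u: 0.0 for u in users}
        for follower, followed in following.items():
            if not followed:
                continue
            # Each follower splits attention evenly among the accounts followed.
            share = (1.0 + p * influence[follower]) / len(followed)
            for target in followed:
                updated[target] += share
        influence = updated
    return influence


# Hypothetical toy graph: alice and bob follow carol; carol follows alice.
toy_graph = {
    "alice": {"carol"},
    "bob": {"carol"},
    "carol": {"alice"},
}
scores = tunkrank(toy_graph)
```

In this toy graph carol ends up with the highest score because she has two followers, and bob scores zero because nobody follows him.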

TunkRank
• Hard to game
• Addresses the inflation that occurs from people who follow in the hopes of reciprocity
• Doesn’t consider how a person allocates his attention among the people he follows

TwitterRank – Finding Topic-sensitive Influential Twitterers

TwitterRank
Two main contributions
• Report homophily in Twitter
• Introduce TwitterRank to measure the topic-sensitive influence of twitterers

Framework for the Proposed Approach
• Topic Distillation
• Topic-specific Relationship Network Construction
• Topic-sensitive User Influence Ranking

Dataset

• Consider a set of top-1000 Singapore-based twitterers S, |S| = 996.
• Crawled all followers and friends of each s ∈ S and stored them in set S’.
• Let S’’ = S ∪ S’ and S* = {s | s ∈ S’’, s is from Singapore}. |S*| = 6748.
• For each s ∈ S*, crawled all the tweets published, T. |T| = 1,021,039.

Reciprocity in Following Relationships

• 72.4% of the twitterers follow more than 80% of their followers

• 80.5% of the twitterers have 80% of their friends follow them back

Casual following or homophily?


Homophily in Twitter
• Question 1: Are twitterers with “following” relationships more similar than those without, according to the topics they are interested in?
• Question 2: Are twitterers with reciprocal “following” relationships more similar than those without, according to the topics they are interested in?


Topic Modeling
• Goal: automatically identify the topics that twitterers are interested in, based on the tweets they published.
• The Latent Dirichlet Allocation (LDA) model is applied.

Topic Modeling Results

DT: D×T matrix
D: number of users
T: number of topics
DTij: number of times a word in user si’s tweets has been assigned to topic tj

Hypothesis Testing
• Applied on the set of twitterers who publish more than 10 tweets in total, $S^*_u$, with $|S^*_u| = 4050$.
• Row-normalize the DT matrix as DT’ such that $\|DT'_{i\cdot}\|_1 = 1$ for each row $DT'_{i\cdot}$.
• Thus each row of DT’ is the probability distribution of twitterer si’s interest over the T topics.
• Measure the topical difference between twitterers.
• Formalize each question as a two-sample t-test; the results support the existence of homophily in the Twitter dataset.

There are twitterers who are serious in following others.
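The row normalization step can be sketched in a few lines of Python. The count matrix here is made-up illustrative data, and `row_normalize` is a hypothetical helper name. The sketch assumes every retained user has at least one topic-assigned word.

```python
def row_normalize(dt):
    """Return DT' where each row of the count matrix DT sums to 1."""
    normalized = []
    for row in dt:
        total = sum(row)
        if total == 0:
            raise ValueError("user has no topic-assigned words")
        normalized.append([count / total for count in row])
    return normalized


# Hypothetical DT matrix: 3 users (rows) x 2 topics (columns).
dt = [
    [8, 2],
    [1, 9],
    [5, 5],
]
dt_prime = row_normalize(dt)
# dt_prime[0] is [0.8, 0.2]: user 0 devotes 80% of topic-assigned words to topic 0.
```

Each row of `dt_prime` can then be read as one user's interest distribution over topics, which is the input the hypothesis tests compare.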


Topic-specific TwitterRank
• Forms a directed graph D(V,E)
– There is an edge between two twitterers if there is a “following” relationship between them
– The edge is directed from follower to friend
• A topic-specific random walk model is applied to calculate each user’s influence score.
• The transition matrix for topic t is denoted as Pt. The transition probability of the random surfer from follower si to friend sj is:

$$P_t(i,j) = \frac{|T_j|}{\sum_{a:\, s_i \text{ follows } s_a} |T_a|} \cdot sim_t(i,j)$$

$$sim_t(i,j) = 1 - |DT'_{it} - DT'_{jt}|$$

where |Tj| is the number of tweets published by sj and the sum runs over all friends sa of si.
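As a rough illustration of how the transition probability combines tweet volume with topical similarity, the following Python sketch builds Pt for a tiny hand-made network. The data, the dictionary layout, and the function names are assumptions chosen for the example; only the two formulas come from the slides.

```python
def topic_similarity(dt_prime, i, j, t):
    """sim_t(i, j) = 1 - |DT'_it - DT'_jt|."""
    return 1.0 - abs(dt_prime[i][t] - dt_prime[j][t])


def transition_matrix(friends, tweet_counts, dt_prime, t):
    """Build P_t, where P_t[i][j] is the probability of moving from
    follower i to friend j under topic t.

    friends[i] lists the users that i follows; tweet_counts[j] is |T_j|.
    """
    n = len(tweet_counts)
    p_t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(tweet_counts[a] for a in friends[i])
        if total == 0:
            continue  # i follows nobody (or only silent users): row stays zero
        for j in friends[i]:
            p_t[i][j] = tweet_counts[j] / total * topic_similarity(dt_prime, i, j, t)
    return p_t


# Hypothetical data: 3 users, 2 topics.
friends = {0: [1, 2], 1: [2], 2: []}
tweet_counts = [10, 30, 60]
dt_prime = [[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]
p_topic0 = transition_matrix(friends, tweet_counts, dt_prime, t=0)
```

Note that a row of Pt does not necessarily sum to 1, because each entry is scaled down by the similarity term.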


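Topic-specific TwitterRank
• Topic-specific teleportation:

$$E_t = DT''_{\cdot t}$$

where DT’’ is the column-normalized form of DT.

• The influence scores of twitterers are calculated iteratively:

$$TR_t = \gamma P_t \times TR_t + (1 - \gamma) E_t$$

where γ is a parameter between 0 and 1 that controls the probability of teleportation.

• Aggregation of topic-specific TwitterRank:

$$TR = \sum_t r_t \cdot TR_t$$

where rt is the weight assigned to topic t.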
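The iterative calculation and the aggregation step can be sketched as follows. This is a minimal illustration under stated assumptions: the matrices and teleportation vectors are made-up values, γ = 0.85 and the iteration count are assumed, and the product of Pt with TRt is read in the random-walk sense (a user's score is collected from their followers).

```python
def twitterrank(p_t, e_t, gamma=0.85, iterations=100):
    """Iterate TR_t = gamma * P_t x TR_t + (1 - gamma) * E_t.

    Assumption: the product is read in the random-walk sense, so the score
    of user j collects P_t[i][j] * TR_t[i] from every follower i.
    """
    n = len(e_t)
    tr = list(e_t)
    for _ in range(iterations):
        tr = [
            gamma * sum(p_t[i][j] * tr[i] for i in range(n)) + (1 - gamma) * e_t[j]
            for j in range(n)
        ]
    return tr


def aggregate(topic_ranks, weights):
    """TR = sum over topics t of r_t * TR_t."""
    n = len(topic_ranks[0])
    return [sum(r * tr[u] for r, tr in zip(weights, topic_ranks)) for u in range(n)]


# Hypothetical inputs for 3 users and 2 topics.
p_by_topic = [
    [[0.0, 0.1, 0.5], [0.0, 0.0, 0.6], [0.0, 0.0, 0.0]],
    [[0.0, 0.1, 0.5], [0.0, 0.0, 0.6], [0.0, 0.0, 0.0]],
]
e_by_topic = [
    [0.5, 0.1, 0.4],  # column t=0 of the column-normalized DT''
    [0.1, 0.6, 0.3],  # column t=1
]
ranks = [twitterrank(p, e) for p, e in zip(p_by_topic, e_by_topic)]
overall = aggregate(ranks, weights=[0.5, 0.5])
```

The weights passed to `aggregate` play the role of rt; equal weights are used here only for simplicity.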


Review
• Homophily does exist
• Still, some users follow for reasons other than topical similarity
• Easy to game
• Need to discuss an incremental approach to topic distillation

InfluenceRank – An Efficient Social Influence Measurement

InfluenceRank
• Defines the influence of a user from two perspectives
– User’s Relative Influence
– User’s Network Global Influence
• Defines the microblog network as SN = (G, B, I)
– G: link network structure
– B: set of interactive behaviors between each pair of associated users
– I: set of profile information of each user
• Graph G = (V, E)
– V: set of nodes represented by the user’s index
– E: set of directed edges
• Defines the set of behaviors B = (R, C, M)
– R: retweets
– C: comments
– M: mentions
• Defines the profile information set I = (P, T, K)
– P: set of number of postings
– T: set of users’ interest tags
– K: set of users’ content keywords


Metrics Explored
• No of followers
• Quality of followers
• Quality of tweets
• Similarity of interests
– Similarity of user interest tags: the function TS(vi,vj) calculates the similarity of interest tags between nodes vi and vj
– Similarity of user content keywords: the function KS(vi,vj) calculates the similarity of interests between nodes vi and vj based on their content keyword sets
– Similarity of two users’ interests: Sim(vi,vj) = TS(vi,vj) + KS(vi,vj)


User Relative Influence Rank
• Considers three aspects
– The quality of postings
– Ratio of retweeting behavior
– Similarity of interests
• User’s Relative Influence function RI(vi,vj):
RI(vi,vj) = Q(vi) + R(vi,vj) + Sim(vi,vj)
where R(vi,vj) represents the ratio of retweets of user vj to vi.
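A small Python sketch of the equally weighted sum RI = Q + R + Sim follows. The slides do not say how TS and KS are computed, so Jaccard similarity is used here purely as a stand-in; the input values and the function names are likewise hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; a stand-in, since the slides do not
    specify how TS and KS are computed."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def relative_influence(quality_i, retweet_ratio_ji, tags_i, tags_j, keywords_i, keywords_j):
    """RI(vi, vj) = Q(vi) + R(vi, vj) + Sim(vi, vj), with equal weights.

    Sim(vi, vj) = TS(vi, vj) + KS(vi, vj).
    """
    ts = jaccard(tags_i, tags_j)          # TS: interest-tag similarity (assumed Jaccard)
    ks = jaccard(keywords_i, keywords_j)  # KS: content-keyword similarity (assumed Jaccard)
    return quality_i + retweet_ratio_ji + ts + ks


# Hypothetical values for one pair of users.
ri = relative_influence(
    quality_i=0.6,
    retweet_ratio_ji=0.25,
    tags_i={"music", "tech"},
    tags_j={"tech", "sports"},
    keywords_i={"python", "ai"},
    keywords_j={"python", "ai"},
)
```

Because every term enters with the same weight, a pair with identical keywords can dominate the score regardless of posting quality, which is one reason the equal weighting is worth questioning.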


User’s Network Global Influence Rank
• Combines structural features and users’ behavior characteristics
• User Network Global Influence Rank function: Influence(v)
• Damping factor (λ) = 0.85

Influence Rank Algorithm

Time complexity - O(e)


InfluenceRank Algorithm
• Evaluated on a Tencent Weibo dataset and contrasted with the TunkRank algorithm
• Emphasis on users’ interactive behaviors
• The metrics used to measure a user’s relative influence are all given equal weight
• Considers the similarity of keywords instead of the similarity of topics
• Ignores the impact of negative comments and conversations
• The model is based on a snapshot of current relationships and interactions

Why your Klout score is meaningless

Why your Klout score is meaningless?
• Klout is far more similar to a derived measurement
• Inconsistent and not trustworthy individually

What should the Klout score satisfy?
• Ordering by Klout should make sense in the real world
• The score should not be easy to game
• The score should be monotonic

Klout score comparisons

• A set of individuals with Klout in the 40-49 range
• A set of individuals with Klout in the 55-64 range
• A set of individuals with Klout in the 70-79 range
• A set of individuals with Klout >= 80

Group 3 (Klout 70-79)
• U1: Tim Ferriss – Author of The 4-Hour Workweek and The 4-Hour Body
• U2: Jack Dorsey – Executive Chairman of Twitter and CEO of Square
• U3: Matt Cutts – Head of the web spam team at Google
• U4: MG Siegler – Writer for TechCrunch
• U5: Klout – Influence score service
• U6: David Pogue – Tech guy from the NYT
• U7: Jeffrey Zeldman – Designer, writer, and publisher

Group 3 (Klout 70-79): [Figure: comparison of users U1 to U7, as of 29 May 2011]

Klout violates the Desirable Properties
• Connecting an additional account will always increase the Klout score.
• The degree to which followers are influential seems to be irrelevant or to matter very little.
• The differential between the number of followers and the number of people someone follows seems to be irrelevant or to matter very little.
• In terms of value to the Klout score, the ordering follow < retweets < unique retweeters < unique mentions can be inconsistent.
• In terms of value to the Klout score, the ordering like < comment can be inconsistent.

Research Questions Addressed

How do existing systems calculate social scores?

• TunkRank
– Probability that the follower will read a tweet posted by the followee
– Probability that a tweet read will be retweeted
– Number of followers and their influence
• TwitterRank
– Measures the topic-sensitive influence of twitterers
– Considers the similarity between friends on topics
– Number of tweets published by all friends
• InfluenceRank
– Defines a user’s relative and global influence
– Number of followers
– Quality of followers
– Quality of tweets
– Similarity of interests
• Influence Measure with a Network Amplification Score
– Accounts for the content and conversation generated by considering the indegree and outdegree of the social network across multiple levels

Which parameters are representative of a user’s true social influence?
• It’s not just the number of followers or the number of friends
• Link structure
• Following relationship
• Similarity
• Interactions
• Topics and communities
• Quality of followers, tweets, etc.



What are the desirable properties of a social score?
• Ordering by the score should make sense in the real world
• The score should not be easy to game
• The score should be monotonic
• The equation should be simple and easy to understand and interpret
• The score should be meaningful



References

• Daniel Tunkelang. (2009, Jan 13). A Twitter Analog to PageRank [Online]. Available: http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/
• Neal Richter. (2009, Feb 18). TunkRank Scoring Improvement [Online]. Available: http://aicoder.blogspot.com/2009/02/tunkrank-scoring-improvement.html
• Jianshu Weng et al., “TwitterRank: Finding Topic-sensitive Influential Twitterers,” in WSDM Conf., New York, USA, 2010, pp. 261-270
• Wenlong Chen et al., “InfluenceRank: An Efficient Social Influence Measurement for Millions of Users in Microblog,” in 2nd Int. Conf. on CGC, Xiangtan, 2012, pp. 563-570
• Alex Braunstein. (2011, June 01). Why your Klout score is meaningless [Online]. Available: http://alexbraunstein.com/2011/06/01/why-your-klout-score-is-meaningless/
• Sean Golliher. (2011, June 27). How I Reverse Engineered Klout Score to an R2 = 0.94 [Online]. Available: http://www.seangolliher.com/2011/uncategorized/how-i-reversed-engineered-klout-score-to-an-r2-094/



Thank You
