Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

Page 2: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

computer-mediated networks as social networks [Wellman, 2001]

Page 3: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

social media landscape

social web amplifies social network effects

Page 4: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

overwhelming flow of social data

Page 5: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

social network analysis

proposes graph algorithms to characterize the structure of a social network, strategic positions, and networking activities

Page 6: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

social network analysis: global metrics and structure

community detection: distribution of actors and activities

density and diameter: cohesion of the network
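As a rough illustration of the two cohesion metrics named above, the sketch below computes density and diameter on a small graph. The toy network, the function names and the choice of an undirected, unweighted model are assumptions made for the example; they do not come from the slides.

```python
from collections import deque

def density(adj):
    """Share of the possible undirected edges that are actually present."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return edges / (n * (n - 1) / 2)

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def diameter(adj):
    """Longest shortest path between any two mutually reachable actors."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

# hypothetical toy network: a path a - b - c - d plus a shortcut b - d
toy = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c"},
}
print(density(toy), diameter(toy))
```

A dense network with a small diameter is cohesive; a sparse one with a large diameter is not.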

Page 7: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

social network analysis: strategic positions and actors

degree centrality: local attention

Page 8: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

social network analysis: strategic positions and actors

betweenness centrality: reveal broker

"A place for good ideas" [Burt, 2004]
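To make the broker idea concrete, here is a minimal sketch of betweenness centrality using Brandes' algorithm on an unweighted, undirected graph. The algorithm choice, the function name and the example network are assumptions for illustration; the slides do not say how the metric is computed at this point.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm on an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # walk back from the farthest nodes, accumulating dependencies
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # each undirected pair was counted once from each endpoint
    return {v: c / 2 for v, c in score.items()}

# hypothetical network: two triangles joined only through "broker"
two_groups = {
    "a1": {"a2", "broker"},
    "a2": {"a1", "broker"},
    "b1": {"b2", "broker"},
    "b2": {"b1", "broker"},
    "broker": {"a1", "a2", "b1", "b2"},
}
scores = betweenness(two_groups)
```

Every shortest path between the two groups goes through "broker", so it is the only actor with a non-zero score.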

Page 9: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

semantic social networks

http://sioc-project.org/node/158

Page 10: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

[Figure: graph of Guillaume's typed relations (father, sister, mother, colleague, colleague) to Gérard, Fabien, Mylène, Michel and Yvonne; degree d(guillaume) = 5]

Page 11: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

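[Figure: the same graph (Gérard, Fabien, Mylène, Michel, Yvonne; father, sister, mother, colleague, colleague) next to a hierarchy of relation types: knows, colleague, parent, sibling, mother, father, brother, sister; degree restricted to family relations d<family>(guillaume) = 3]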
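The difference between d(guillaume) = 5 and d<family>(guillaume) = 3 can be sketched as a degree parameterized by a relation type, where subtypes count toward their supertypes. In the sketch below, the exact shape of the hierarchy (in particular the position of family), the assignment of relations to people and all function names are assumptions made for illustration.

```python
# hypothetical hierarchy of relation types: child -> parent
SUPER = {
    "mother": "parent", "father": "parent",
    "brother": "sibling", "sister": "sibling",
    "parent": "family", "sibling": "family",
    "family": "knows", "colleague": "knows",
}

def is_a(rel, target):
    """True if rel equals target or is a (transitive) specialisation of it."""
    while rel is not None:
        if rel == target:
            return True
        rel = SUPER.get(rel)
    return False

# assumed assignment of relations to people, for illustration only
TRIPLES = [
    ("guillaume", "father", "gerard"),
    ("guillaume", "sister", "mylene"),
    ("guillaume", "mother", "yvonne"),
    ("guillaume", "colleague", "fabien"),
    ("guillaume", "colleague", "michel"),
]

def degree(actor, rel_type="knows"):
    """Number of distinct neighbours linked by rel_type or one of its subtypes."""
    nbrs = set()
    for s, p, o in TRIPLES:
        if is_a(p, rel_type):
            if s == actor:
                nbrs.add(o)
            if o == actor:
                nbrs.add(s)
    return len(nbrs)

print(degree("guillaume"), degree("guillaume", "family"))
```

Querying with the most general type (knows) counts every link; querying with family keeps only father, mother and sister.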

Page 12: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

but…

SPARQL is not expressive enough to meet SNA requirements for global metric querying of social networks (density, betweenness centrality, etc.).

[San Martin & Gutierrez 2009]

Page 13: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

classic SNA on semantic web

rich graph representations reduced to simple untyped graphs [Paolillo & Wright, 2006]

foaf:knows

foaf:interest

Page 14: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

semantic SNA stack

exploit the semantics of social networks

Page 15: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

SPARQL extensions

CORESE: semantic search engine implementing semantic web languages using graph-based representations

Page 16: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

grouping results

number of followers of a twitter user

select ?y count(?x) as ?indegree where {
  ?x twitter:follow ?y
} group by ?y

Page 17: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

path extraction

people knowing, knowing, (...) colleagues of someone

?x sa (foaf:knows*/rel:worksWith)::$path ?y
filter(pathLength($path) <= 4)

Regular expression operators are: / (sequence) ; | (or) ; * (0 or more) ; ? (optional) ; ! (not)

Path characteristics: i to allow inverse properties, s to retrieve only one shortest path, sa to retrieve all shortest paths.
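What the path expression knows*/worksWith with a length bound asks for can be sketched outside the query engine: follow zero or more knows edges, then exactly one worksWith edge, and keep the shortest match per target. The sketch below is only an approximation of that idea under stated assumptions (directed edges, plain tuples, invented names and data); it is not how CORESE evaluates paths.

```python
from collections import deque

def knows_star_works_with(edges, start, max_len=4):
    """Length of the shortest path start -knows*-> u -worksWith-> y, per y."""
    knows, works = {}, {}
    for s, p, o in edges:
        if p == "knows":
            knows.setdefault(s, set()).add(o)
        elif p == "worksWith":
            works.setdefault(s, set()).add(o)
    # breadth-first search along knows edges only (the knows* part)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if hops[u] + 1 >= max_len:      # keep room for the final worksWith step
            continue
        for v in knows.get(u, ()):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    # close every prefix with exactly one worksWith edge
    best = {}
    for u, d in hops.items():
        for y in works.get(u, ()):
            best[y] = min(best.get(y, d + 1), d + 1)
    return best

# hypothetical data
sample = [
    ("x", "knows", "a"), ("a", "knows", "b"), ("b", "worksWith", "c"),
    ("x", "worksWith", "d"),
    ("b", "knows", "e"), ("e", "knows", "f"), ("f", "worksWith", "g"),
]
result = knows_star_works_with(sample, "x", max_len=4)
```

Here "g" is left out because reaching it needs a path of length 5, which the filter on pathLength would reject.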

Page 18: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

full example

closeness centrality through knows and worksWith

select distinct ?y ?to pathLength($path) as ?length (1/sum(?length)) as ?centrality
where {
  ?y s (foaf:knows*/rel:worksWith)::$path ?to
} group by ?y

\[
C^{c}_{\langle knows*/worksWith \rangle}(k) = \left[ \sum_{x \in E_G} length\left( g_{\langle knows*/worksWith \rangle}(k, x) \right) \right]^{-1}
\]
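The same quantity, the inverse of the summed geodesic lengths from one actor, can be sketched in a few lines. The version below ignores relation types and works on a plain undirected adjacency map, which is a simplification; the function name and the star-shaped example are assumptions for illustration.

```python
from collections import deque

def closeness(adj, actor):
    """1 / (sum of geodesic lengths from actor to every reachable actor)."""
    dist = {actor: 0}
    queue = deque([actor])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return 1 / total if total else 0.0

# hypothetical star network: "hub" is linked to everyone else
star = {"hub": {"p", "q", "r"}, "p": {"hub"}, "q": {"hub"}, "r": {"hub"}}
print(closeness(star, "hub"), closeness(star, "p"))
```

The hub is one step from everyone, so its closeness is higher than that of any peripheral actor.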

Page 19: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

Qualified component

Qualified in-degree

Qualified diameter

Closeness Centrality

Betweenness Centrality

Number of geodesics between from and to

Qualified degree

Number of geodesics between from and to going through b

Page 20: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

SemSNA: an ontology of SNA

http://ns.inria.fr/semsna/2009/06/21/voc

Page 21: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

add to the RDF graph

saving the computed degrees for incremental calculations

CONSTRUCT {
  ?y semsna:hasSNAConcept _:b0 .
  _:b0 rdf:type semsna:Degree .
  _:b0 semsna:hasValue ?degree .
  _:b0 semsna:isDefinedForProperty rel:family
}
SELECT ?y count(?x) as ?degree where {
  { ?x rel:family ?y } UNION { ?y rel:family ?x }
} group by ?y
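The effect of this query, computing a degree per actor and writing it back as annotation triples, can be mimicked on plain tuples. The sketch below reuses the SemSNA property names from the query as strings; the function name, the ex: identifiers and the sample data are assumptions, and no RDF library is involved.

```python
from collections import Counter
from itertools import count

def annotate_degrees(triples, prop="rel:family"):
    """Return SemSNA-style triples recording each actor's degree for prop."""
    degree = Counter()
    for s, p, o in triples:
        if p == prop:
            degree[s] += 1   # matches { ?y rel:family ?x }
            degree[o] += 1   # matches { ?x rel:family ?y }
    fresh = count()          # source of blank-node labels
    out = []
    for actor, value in sorted(degree.items()):
        node = f"_:b{next(fresh)}"
        out += [
            (actor, "semsna:hasSNAConcept", node),
            (node, "rdf:type", "semsna:Degree"),
            (node, "semsna:hasValue", value),
            (node, "semsna:isDefinedForProperty", prop),
        ]
    return out

# hypothetical input graph
graph = [
    ("ex:guillaume", "rel:family", "ex:gerard"),
    ("ex:mylene", "rel:family", "ex:guillaume"),
    ("ex:guillaume", "foaf:knows", "ex:fabien"),
]
annotations = annotate_degrees(graph)
```

Once stored, these annotations can be read back by later computations instead of recounting the links.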

Page 22: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

[Figure: example social graph (Guillaume, Gérard, Fabien, Mylène, Michel, Yvonne, Ivan, Peter, Philippe) with relations father, mother, sister, colleague and supervisor, annotated with SemSNA: hasSNAConcept pointing to a Degree, isDefinedForProperty, hasValue 4, hasCentralityDistance 2]

Page 23: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

Ipernity

Page 24: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

using real data

extracting a real dataset from a relational database

construct { ?person1 rel:friendOf ?person2 }
select sql(<server>, <driver>, <user>, <pwd>,
  'select user1_id, user2_id from relations where rel = 1 ') as (?person1, ?person2)
where {}

Page 25: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

importing data with SemSNI

http://ns.inria.fr/semsni/

Page 26: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

using real data

ipernity.com dataset extracted in RDF

61 937 actors & 494 510 relationships
– 18 771 family links between 8 047 actors
– 136 311 friend links involving 17 441 actors
– 339 428 favorite links for 61 425 actors
– 2 874 170 comments from 7 627 actors
– 795 949 messages exchanged by 22 500 actors

Page 27: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

performances & limits

indicator       relation   time       projections
Comp<rel>(G)    Knows      0.71 s     494 510
                Favorite   0.64 s     339 428
                Friend     0.31 s     136 311
                Family     0.03 s     18 771
                Message    1.98 s     795 949
                Comment    9.67 s     2 874 170
D<rel,1>(y)     Knows      20.59 s    989 020
                Favorite   18.73 s    678 856
                Friend     1.31 s     272 622
                Family     0.42 s     37 542
                Message    16.03 s    1 591 898
                Comment    28.98 s    5 748 340

Shortest paths used to calculate Cb<rel>(b)

relation   path length   time             projections
Knows      <= 2          14m 50.69s       100 000
           <= 2          2h 56m 34.13s    1 000 000
           <= 2          7h 19m 15.18s    2 000 000
Favorite   <= 2          5h 33m 18.43s    2 000 000
Friend     <= 2          1m 12.18s        1 000 000
           <= 2          2m 7.98s         2 000 000
Family     <= 2          27.23s           1 000 000
           <= 2          2m 9.73s         3 681 626
           <= 3          1m 10.71s        1 000 000
           <= 4          1m 9.06s         1 000 000

Page 28: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

some interpretations

validated with managers of ipernity.com

friendOf, favorite, message, comment: small diameter, high density

family as expected: large diameter, low density

favorite: highly centralized around Ipernity animator.

friendOf, family, message, comment: power law of degrees and betweenness centralities, different strategic actors

knows: analyze all relations using subsumption

Page 29: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

some interpretations

existence of a largest component in all sub networks

"the effectiveness of the social network at doing its job" [Newman 2003]

[Figure: bar chart comparing the number of actors with the size of the largest component for the knows, favorite, friend, family, message and comment networks; axis from 0 to 70 000]
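The comparison between the number of actors and the size of the largest component can be reproduced on any edge list. The following sketch uses a union-find structure and treats links as undirected; the function name and the sample edges are assumptions, and the ipernity.com data itself is not reproduced here.

```python
def largest_component(edges):
    """Return (number of actors, size of the largest connected component)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # merge the components of the two endpoints of every link
    for a, b in edges:
        parent[find(a)] = find(b)

    # count the actors attached to each root
    sizes = {}
    for actor in parent:
        root = find(actor)
        sizes[root] = sizes.get(root, 0) + 1
    return len(parent), max(sizes.values(), default=0)

# hypothetical links: one group of three actors and one isolated pair
links = [("a", "b"), ("b", "c"), ("d", "e")]
print(largest_component(links))
```

When the second number is close to the first, most actors can reach each other through the relation being analyzed.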

Page 30: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

conclusion

directed typed graph structure of RDF/S well suited to represent social knowledge & socially produced metadata spanning both internet and intranet networks.

definition of SNA operators in SPARQL (using extensions and OWL Lite entailment) enables exploiting the semantic structure of social data.

SemSNA: organize and structure social data.

Page 31: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

perspectives

semantic-based community detection algorithm

SemSNA Ontology: extract complex SNA features reusing past results

support iterative or parallel approaches in the computations

a semantic SNA to foster a semantic intranet of people

structure overwhelming flows of corporate social data

foster and strengthen social interactions

efficient access to the social capital [Krebs, 2008] built through online collaboration

http://twitter.com/isicil

Page 32: Guillaume Erétéo, Michel Buffa, Fabien Gandon, Olivier Corby.

[Figure: contact details drawn as a graph with the properties name, holdsAccount, organization, mentorOf, manage, contribute and answers]

Guillaume Erétéo

twitter.com/ereteog

slideshare.net/ereteog