semantic and social intraweb for corporate intelligence and watch
ISICIL semantic and social intraweb for corporate intelligence and watch
ANR project CONTINT 2009-2011
Fabien Gandon, http://fabien.info
Leader Wimmics research team (INRIA, CNRS, Univ. Nice)
W3C AC Rep. for INRIA
isicil.inria.fr
• enterprise social networking
• business intelligence, watching, monitoring
• communities of interest, of practice, of experts
reconcile latest viral applications of the web with formal models and business processes
new tools to support business intelligence and technological watch
interfaces of web 2.0 app. for interaction (blog, wikis, social bookmarking, feeds, etc.)
semantic web formalisms and processing
social epistemology as theoretical framework
ISICIL
consortium
INRIA - Wimmics team (leader)
CNRS - I3S – KEWI
Telecom ParisTech
UTT - Tech-CICO
ADEME
Orange Labs R&D
proposed overview… integrating requirement analysis methods
examples of challenges and derived functionalities
overview of this open-source platform
http://isicil.inria.fr
analyze and model key business processes
Analyze interactions between members of the group ADEME « roadmap for urban mobility »
campaigns of questionnaires at Orange Labs
trend analysis of intelligence market and watch
comparison of the APIs, widgets and other applications
usage analysis and specification
shared online referential
ARIS portal for ISICIL: http://aris2.utt.fr:9090/businesspublisher/
convergence matrix
detections of needs or redundancies in key scenarios
Scenario steps | Identified functionalities | IS functions
Present the problem to the SVIC | Mailing, Q&A | send
Ask what is essential and what the other engineers are doing | Expert consultation | extract, filter
Take the request into account | Workflow, collaboration tools | communicate
Prepare queries | Search engine, search equation | search
Collect results | Subscription, push… | extract, annotate
Check the relevance of the results | Analyses, filtering tools | filter
Inform the engineer | Email, chat, video conference | send
Take ownership of the results and the queries | Search equation, profile, tags | annotate, organize
Become the recipient of alerts | Profile-based dissemination | disseminate
proposing functionalities & prototypes
Prioritization of functionalities Frequent functionalities and dependencies
assisted structuring of folksonomies
flat folksonomies web 2.0
[Limpens et al.]
[Figure: a flat folksonomy (pollution, soil pollution, pollutant, energy) being structured into a SKOS thesaurus with "has narrower" and "related" links]
global giant graph link users, actions, knowledge, resources, groups, etc.
#Freddy hasBookmark #bk81
#bk81 hasTag #tag27
#tag27 hasLabel "industry"
#Fabien hasBookmark #bk34
#bk34 hasTag #tag92
#tag92 hasLabel "industries"
folksonomies → ontologies contributions…
… [Mika, 2005] hierarchies / community inclusion.
… [Heymann et al., 2006] hierarchies / centrality in graph Tag-Resource
… [Schmitz, 2006] hierarchies / conditional probabilities & co-occurrence
… [Cattuto et al., 2008] [Markines et al., 2009] different metrics
… [Specia et al., 2007] [Begelman et al., 2006] tag clustering
variations around metrics & space (tag-resource-user).
[Limpens et al.]
folksonomies + ontologies contributions…
... [Gruber, 2005] [Tanasescu et al., 2007] tagging tags
… [Specia et al., 2007][Cattuto et al., 2008][Giannakidou et al., 2008] [Ronzano et al., 2008] [Tesconi et al., 2008] automated structuring using external linguistic resources.
... [Good et al., 2007] manual disambiguation referencing a vocabulary
… [Passant et al., 2007] manual disambiguation referencing a thesaurus
… [Huynh-Kim Bang et al. , 2008] structured tagging “Paris<France”
[Limpens et al.]
ontologies → folksonomies contributions…
… [Gruber, 2005] [Newman et al., 2005] ontology of the tagging act
… [Breslin et al., 2005] SIOC resources shared on social web sites
… [Kim et al., 2007] SCOT representing tags and their cloud
… [Passant et al., 2008] MOAT, associating a meaning to a tag.
[Limpens et al.]
SoA… you are here
Columns: Computed tag similarity | Tag-concept mapping | Users' contributions | Sem-Web formalism | Multiple points of view
Angeletou et al. (2008): ✓ ✓ ✓
Huynh-Kim Bang et al. (2008): ✓ ✓
Passant & Laublet (2008): ✓ ✓ ✓
Lin & Davis (2010): ✓ ✓ ✓ ✓
Braun et al. (2007): ✓ ✓
Limpens et al. (2010): ✓ ✓ ✓ ✓
[Figure: spelling variants of one tag: pollution, pollutant, Soil pollutions]
edit distances
[Figure: chart on a scale from 0 to 1 showing the area under f for three relation types: close match (cm_A), Tag1 broader than Tag2 (t1bt2_A), related (rel_A)]
evaluating distances c.f. [Limpens et al.]
determine thesaurus relations
Comparison of the mean value of the JaroWinkler metric for each type of semantic relation
Mean value of the difference s(t1,t2) - s(t2,t1) with s being the Monge-Elkan QGram metric for each set of tag pairs.
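The asymmetry s(t1,t2) - s(t2,t1) can be illustrated with a small Monge-Elkan sketch. This is only an assumed reading of the metric: the inner similarity (Jaccard overlap of character bigrams) and all function names are illustrative choices, not the ISICIL implementation.

```python
def qgrams(token, q=2):
    """Return the set of character q-grams of a token."""
    return {token[i:i + q] for i in range(len(token) - q + 1)}


def qgram_similarity(a, b, q=2):
    """Jaccard overlap of q-gram sets (assumed inner similarity)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga or not gb:
        # Tokens shorter than q have no q-grams: fall back to equality.
        return 1.0 if a == b else 0.0
    return len(ga & gb) / len(ga | gb)


def monge_elkan(tag1, tag2):
    """Average, over the tokens of tag1, of the best match found in tag2.

    Assumes both tags contain at least one token.
    """
    tokens1, tokens2 = tag1.split(), tag2.split()
    best = [max(qgram_similarity(a, b) for b in tokens2) for a in tokens1]
    return sum(best) / len(tokens1)


def asymmetry(tag1, tag2):
    """s(t1, t2) - s(t2, t1): positive when tag1 is fully covered by tag2."""
    return monge_elkan(tag1, tag2) - monge_elkan(tag2, tag1)


difference = asymmetry("pollution", "soil pollution")
```

Here "pollution" is entirely contained in "soil pollution" while the reverse is only half true, so the difference is positive, which hints that the first tag may be the broader one.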
tag1 tag2 tag3
tag1 freq (tag1) cooc (tag1, tag2) cooc (tag1, tag3)
tag2 cooc (tag2, tag1) freq (tag2) cooc (tag2, tag3)
tag3 cooc (tag3, tag1) cooc (tag3, tag2) freq (tag3)
contextual distance: co-occurrence vector cosine distance to detect related tags
$\cos(\vec{tag_1}, \vec{tag_2}) = \frac{\vec{tag_1} \cdot \vec{tag_2}}{\|\vec{tag_1}\| \, \|\vec{tag_2}\|}$
[Cattuto et al. 2008]
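A minimal sketch of the contextual distance, assuming each post is simply a list of tags. The data, the function names and the choice to leave the diagonal freq(tag) out of the vectors are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_vectors(posts):
    """Map each tag to a Counter of the tags it co-occurs with."""
    vectors = {}
    for tags in posts:
        for t1, t2 in combinations(sorted(set(tags)), 2):
            vectors.setdefault(t1, Counter())[t2] += 1
            vectors.setdefault(t2, Counter())[t1] += 1
    return vectors


def cosine(v1, v2):
    """Cosine of the angle between two sparse co-occurrence vectors."""
    dot = sum(v1[k] * v2[k] for k in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum(x * x for x in v1.values()))
    norm2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


# Hypothetical posts: two tags used in the same contexts, one unrelated pair.
posts = [
    ["pollution", "soil", "ademe"],
    ["pollutant", "soil", "ademe"],
    ["football", "sport"],
]
vectors = cooccurrence_vectors(posts)
related = cosine(vectors["pollution"], vectors["pollutant"])
unrelated = cosine(vectors["pollution"], vectors["football"])
```

Tags that never appear together can still score high when they share the same neighbours, which is what makes this measure useful for detecting "related" links.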
environment
interest community inclusion: detecting narrower tags [Mika et al.]
agriculture
$user(tag_2) \subset user(tag_1) \Rightarrow tag_1 \; hasNarrower \; tag_2$
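The community-inclusion idea can be sketched as follows, assuming taggings are available as (user, tag) pairs. The `min_ratio` parameter and the sample users are hypothetical additions for illustration.

```python
def users_by_tag(taggings):
    """Group users into the community of interest of each tag."""
    communities = {}
    for user, tag in taggings:
        communities.setdefault(tag, set()).add(user)
    return communities


def narrower_candidates(taggings, min_ratio=1.0):
    """Return (broader, narrower) pairs where the users of the narrower tag
    are included in the users of the broader tag.

    min_ratio=1.0 requires full inclusion; a lower value tolerates noise.
    """
    communities = users_by_tag(taggings)
    pairs = []
    for broad, broad_users in communities.items():
        for narrow, narrow_users in communities.items():
            if broad == narrow or len(narrow_users) >= len(broad_users):
                continue
            overlap = len(narrow_users & broad_users) / len(narrow_users)
            if overlap >= min_ratio:
                pairs.append((broad, narrow))
    return sorted(pairs)


# Hypothetical taggings: every user of "agriculture" also uses "environment".
taggings = [
    ("ann", "environment"), ("bob", "environment"), ("cat", "environment"),
    ("ann", "agriculture"), ("bob", "agriculture"),
    ("dan", "sport"),
]
candidates = narrower_candidates(taggings)
```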
combining metrics
edit distances: Monge-Elkan, Soundex, JaroWinkler
asymmetry: Monge-Elkan QGram
contextual metric: cosine of co-occurring tag vectors
social metric: inclusion of communities of interest
football
sport
+
+
handling conflicts arbitration rules
IF num(narrower)/num(broader) ≥ c
THEN narrower/broader
ELSE related
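One possible reading of this arbitration rule, as a sketch only: users disagree on the direction of a hierarchical link between two tags, and the link is kept only when one side outweighs the other by a factor c. The symmetric treatment of both directions, the default threshold and the function name are assumptions.

```python
def arbitrate(votes, c=2.0):
    """Decide the relation between an ordered pair of tags.

    votes maps "narrower" / "broader" to the number of users asserting it.
    The hierarchical relation wins if it outweighs the opposite one by a
    factor of at least c; otherwise the pair is only marked as related.
    """
    n = votes.get("narrower", 0)
    b = votes.get("broader", 0)
    if n == 0 and b == 0:
        return "related"
    if n >= b:
        winner, strong, weak = "narrower", n, b
    else:
        winner, strong, weak = "broader", b, n
    if weak == 0 or strong / weak >= c:
        return winner
    return "related"


clear_case = arbitrate({"narrower": 6, "broader": 2})   # ratio 3 >= 2
debated_case = arbitrate({"narrower": 3, "broader": 2})  # ratio 1.5 < 2
```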
purely automatic
conflicting
arbitrated conflict
debated
consensual
folksonomy enrichment lifecycle
ADDING TAGS
Automatic processing
User-centric structuring
Detect conflicts
Global structuring
Flat folksonomy
Structured folksonomy
[Limpens et al.]
social networks networking is not that new e.g. commerce
social network analysis beginning of the 20th century
Graphs, graphs, graphs
Fabien owner
author
Researcher
type
doc.html author
Semantic web is not antisocial
Adult
Researcher
sub property sub class
semantic web
title
Fabien
Marco Guillaume
Nicolas
Michel
Rémi
social network analysis
$d_{in}(p) = |\{x \; ; \; rel(x, p)\}|$
$d_{in}(Guillaume) = 4$
owner
Adult
type
semantic social network analysis contributions…
… [Golbeck et al 2003] propagating trust
… [Finin et al 2005] power law of degrees & community structure
… [Paolillo et al 2006] classical SNA on FOAF from LiveJournal
… [Golbeck & Rothstein 2008] merging FOAF profiles
… [Anyanwu et al 2007] [Kochut et al 2007] [Corby et al 2004] [Corby 2008] [Baget et al, 2007] paths in SPARQL
… [Ereteo et al 2009] type-parameterized SNA and SemTagP
… [Rowe et al. 2011] User Behaviour in Online Communities
[Ereteo et al ]
Columns: Directed networks | Weighted networks | Labelled networks | Parametrized operators | Network size
Graph Theory: ✔ ✔ ✔ | 10^6 nodes, 10^7 edges
[Brandes 2009]: ✔ ✔ ✔ | 10^4 nodes
[Paolillo & Wright 2006]: ✔ ✔ | ~10^4 nodes, ~10^5 edges
[San Martin & Gutierrez 2009]: ✔ ✔ | ~10^4 nodes, ~10^4 - 10^5 edges
SEMSNA: ✔ … ✔ ✔ | 10^4 nodes, ~10^5 edges
[Erétéo et al.]
parent sibling
mother father brother sister
colleague
knows Gérard
Fabien
Mylène
Michel
Yvonne
$d(guillaume) = 5 \qquad d_{\langle family \rangle}(guillaume) = 3$
c.f. [Erétéo et al.]
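A type-parameterized degree can be sketched with a small relation hierarchy. The sub-property table below is an assumed reading of the relation names on the slide, and the edge list is made up for illustration.

```python
# Assumed hierarchy: each relation points to its direct super-relation.
SUB_PROPERTY_OF = {
    "mother": "parent", "father": "parent",
    "brother": "sibling", "sister": "sibling",
    "parent": "family", "sibling": "family",
    "family": "knows", "colleague": "knows",
}


def is_a(rel, target):
    """True if rel equals target or is a (transitive) sub-relation of it."""
    while rel is not None:
        if rel == target:
            return True
        rel = SUB_PROPERTY_OF.get(rel)
    return False


def typed_degree(edges, node, rel_type="knows"):
    """Number of distinct neighbours linked to node by rel_type or a sub-relation."""
    neighbours = set()
    for subject, rel, obj in edges:
        if not is_a(rel, rel_type):
            continue
        if subject == node:
            neighbours.add(obj)
        elif obj == node:
            neighbours.add(subject)
    return len(neighbours)


# Hypothetical edges around one person.
edges = [
    ("guillaume", "father", "gerard"),
    ("guillaume", "mother", "yvonne"),
    ("guillaume", "sister", "mylene"),
    ("guillaume", "colleague", "fabien"),
    ("michel", "colleague", "guillaume"),
]
family_degree = typed_degree(edges, "guillaume", "family")
overall_degree = typed_degree(edges, "guillaume")
```

Restricting the same operator to a relation type is what distinguishes this kind of analysis from a degree computed on an unlabelled graph.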
e.g. typed closeness centrality
select distinct ?y ?to
pathLength($path) as ?length
(1/sum(?length)) as ?centrality
where{
?y s (foaf:knows*/rel:worksWith)::$path ?to
}group by ?y
$C^{c}_{\langle knows*/worksWith \rangle}(k) = \left[ \sum_{x \in E_G} length\left(g_{\langle knows*/worksWith \rangle}(k, x)\right) \right]^{-1}$
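Outside a SPARQL engine, the same quantity (the inverse of the summed shortest-path lengths) can be approximated with a breadth-first search. This sketch simplifies the path expression to a plain set of allowed relation types and follows edges in their stated direction; both simplifications, like the sample edges, are assumptions.

```python
from collections import deque


def typed_closeness(edges, source, allowed):
    """Closeness of source over directed edges whose type is in allowed.

    Returns 1 / (sum of shortest-path lengths to reachable nodes),
    or 0.0 when no other node is reachable.
    """
    adjacency = {}
    for subject, rel, obj in edges:
        if rel in allowed:
            adjacency.setdefault(subject, set()).add(obj)

    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    total = sum(distance.values())
    return 1 / total if total else 0.0


# Hypothetical typed edges.
edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("a", "worksWith", "d"),
    ("c", "likes", "a"),  # ignored: type not allowed below
]
centrality = typed_closeness(edges, "a", {"knows", "worksWith"})
```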
ipernity.com dataset in RDF
61 937 actors & 494 510 relationships
– 18 771 family links between 8 047 actors
– 136 311 friend links involving 17 441 actors
– 339 428 favorite links for 61 425 actors
– etc.
c.f. [Erétéo et al.]
some interpretations validated with managers of ipernity.com
friendOf, favorite, message, comment: small diameter, high density
family, as expected: large diameter, low density
favorite: highly centralized around Ipernity animator.
friendOf, family, message, comment: power law of
some interpretations existence of a largest component in all sub networks "the effectiveness of the social network at doing its job" [Newman 2003]
[Figure: number of actors and size of the largest component for each relation: knows, favorite, friend, family, message, comment (scale from 0 to 70 000)]
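PERFORMANCES & LIMITS
Indicator | Relation | Time | Projections
Comp_<rel>(G) | Knows | 0.71 s | 494 510
Comp_<rel>(G) | Favorite | 0.64 s | 339 428
Comp_<rel>(G) | Friend | 0.31 s | 136 311
Comp_<rel>(G) | Family | 0.03 s | 18 771
Comp_<rel>(G) | Message | 1.98 s | 795 949
Comp_<rel>(G) | Comment | 9.67 s | 2 874 170
D_<rel,1>(y) | Knows | 20.59 s | 989 020
D_<rel,1>(y) | Favorite | 18.73 s | 678 856
D_<rel,1>(y) | Friend | 1.31 s | 272 622
D_<rel,1>(y) | Family | 0.42 s | 37 542
D_<rel,1>(y) | Message | 16.03 s | 1 591 898
D_<rel,1>(y) | Comment | 28.98 s | 5 748 340
Shortest paths used to calculate C^b_<rel>(b) | Knows | Path length <= 2: 14m 50.69s | 100 000
Shortest paths used to calculate C^b_<rel>(b) | Knows | Path length <= 2: 2h 56m 34.13s | 1 000 000
Shortest paths used to calculate C^b_<rel>(b) | Knows | Path length <= 2: 7h 19m 15.18s | 2 000 000
Shortest paths used to calculate C^b_<rel>(b) | Favorite | Path length <= 2: 5h 33m 18.43s | 2 000 000
Shortest paths used to calculate C^b_<rel>(b) | Friend | Path length <= 2: 1m 12.18 s | 1 000 000
Shortest paths used to calculate C^b_<rel>(b) | Friend | Path length <= 2: 2m 7.98 s | 2 000 000
Shortest paths used to calculate C^b_<rel>(b) | Family | Path length <= 2: 27.23 s | 1 000 000
Shortest paths used to calculate C^b_<rel>(b) | Family | Path length <= 2: 2m 9.73 s | 3 681 626
Shortest paths used to calculate C^b_<rel>(b) | Family | Path length <= 3: 1m 10.71 s | 1 000 000
Shortest paths used to calculate C^b_<rel>(b) | Family | Path length <= 4: 1m 9.06 s | 1 000 000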
[Figure: social graph of Philippe, Guillaume, Gérard, Fabien, Mylène, Michel, Yvonne, Ivan and Peter, with colleague and supervisor links and Degree annotations of 4 and 2]
example of SemSNA
ADD {
?y semsna:hasInDegree _:b0
_:b0 semsna:forProperty param[type]
_:b0 rdf:value ?indegree
_:b0 semsna:hasLength param[length]
}
SELECT ?y count(?x) as ?indegree {
?x $path ?y
filter(match($path, star(param[type])))
filter(pathLength($path)<= param[length])
} group by ?y
parameterized in-degree $d^{in}_{\langle type, length \rangle}(y)$
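A sketch of an in-degree parameterized by relation type and maximum path length, assuming the graph is given as (subject, relation, object) triples. Unlike the query above, it counts distinct source nodes and does not expand sub-relations; the names and the sample graph are hypothetical.

```python
def parameterized_in_degree(edges, y, rel_type, max_length=1):
    """Count distinct nodes that reach y through at most max_length
    consecutive edges of type rel_type."""
    incoming = {}
    for subject, rel, obj in edges:
        if rel == rel_type:
            incoming.setdefault(obj, set()).add(subject)

    seen = {y}
    frontier = {y}
    for _ in range(max_length):
        # Walk one step backwards along edges of the requested type.
        frontier = {s for node in frontier for s in incoming.get(node, ())} - seen
        seen |= frontier
    return len(seen) - 1  # do not count y itself


# Hypothetical edges.
edges = [
    ("ivan", "colleague", "guillaume"),
    ("peter", "colleague", "guillaume"),
    ("philippe", "colleague", "ivan"),
    ("fabien", "supervisor", "guillaume"),
]
direct = parameterized_in_degree(edges, "guillaume", "colleague", 1)
within_two = parameterized_in_degree(edges, "guillaume", "colleague", 2)
```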
[PhD Guillaume Erétéo]
conceptual and software framework for a semantic analysis of social networks using semantic web frameworks
hierarchical algorithms
output dendrograms of larger and larger communities from top to bottom.
• agglomerative algorithms [Donetti & Munoz 2004] [Zhou & Lipowsky 2004] [Xu et al 2007] [Newman 2004]
• divisive algorithms [Girvan & Newman 2002] [Radicchi et al 2004]
[Eretéo et al., 2011]
heuristic based algorithms
• similarity with electrical networks [Wu 2004]
• random walk [Dongen 2000] [Pons et al 2005]
• label propagation [Raghavan et al 2007]
[Eretéo et al., 2011]
tags to detect and label communities
extension of algorithm RAK/LP : from random labels to structured tags
salt, water
pepper, wine
mustard
rugby, foot
foot, movie
hockey sport sport
sport
condiment
condiment condiment
[Eretéo et al., 2011]
experimented algorithm
Algorithm SemTagP(RDFGraph network, Type relationType)
  DO
    old_network = copy(network)
    FOREACH user in network.users
      user.tag = mostUsedNeighborTag(user, relationType)
    END FOREACH
  WHILE modularity(network) > modularity(old_network)
  RETURN old_network
[Eretéo et al., 2011]
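The loop above can be sketched in Python without the semantic part: tags are propagated from neighbours until modularity stops improving. This is an illustrative approximation, not the published SemTagP code. It assumes the edges are already filtered by relation type and undirected, and it breaks ties by keeping the current tag (the original RAK breaks them randomly).

```python
from collections import Counter


def modularity(edges, label):
    """Newman modularity of the partition induced by node labels."""
    m = len(edges)
    if m == 0:
        return 0.0
    inside, degree_sum = Counter(), Counter()
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


def most_used_neighbor_tag(node, neighbors, label):
    """Most frequent tag among neighbours; keep the current tag on ties."""
    counts = Counter(label[n] for n in neighbors[node])
    if not counts:
        return label[node]
    best = max(counts.values())
    winners = sorted(t for t, c in counts.items() if c == best)
    return label[node] if label[node] in winners else winners[0]


def tag_propagation(edges, initial_label, max_rounds=20):
    """Propagate tags while modularity improves; return the last good labelling."""
    neighbors = {n: set() for n in initial_label}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    label = dict(initial_label)
    for _ in range(max_rounds):
        new_label = dict(label)
        for node in sorted(neighbors):  # asynchronous update, fixed order
            new_label[node] = most_used_neighbor_tag(node, neighbors, new_label)
        if modularity(edges, new_label) <= modularity(edges, label):
            return label
        label = new_label
    return label


# Hypothetical network: two triangles joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
start = {"a": "wiki", "b": "wiki", "c": "inria",
         "d": "inria", "e": "mobile", "f": "mobile"}
communities = tag_propagation(edges, start)
```

On this toy graph the "inria" tag is absorbed and two communities remain, labelled "wiki" and "mobile".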
inject semantics here
semantic tag propagation exploit folksonomy for label assignment
[Figure: network of users a to g; tags shown on the nodes: wiki, mobile, sweetwiki, mobile, inria, mobile]
[Eretéo et al., 2011]
semantic tag propagation apply social pressure of RAK/LP
[Figure: network of users a to g; tags shown on the nodes: wiki, mobile, sweetwiki, mobile, inria, mobile]
[Eretéo et al., 2011]
semantic tag propagation take thesaurus into account in propagating
[Figure: network of users a to g; tags shown on the nodes: wiki, mobile, sweetwiki, mobile, inria, mobile]
thesaurus: wiki skos:narrower sweetwiki, mediawiki
[Eretéo et al., 2011]
semantic tag propagation take thesaurus into account in propagating
[Figure: network of users a to g; tags shown on the nodes: wiki, mobile, sweetwiki, wiki, inria, mobile]
thesaurus: wiki skos:narrower sweetwiki, mediawiki
[Eretéo et al., 2011]
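Taking the thesaurus into account can be sketched as a variant of the neighbour vote in which a tag also collects the votes of its narrower tags. The dictionary layout, the function name and the alphabetical tie-break are assumptions; only direct skos:narrower links are followed here.

```python
from collections import Counter

# Thesaurus fragment from the slide: broader tag -> its narrower tags.
NARROWER = {"wiki": {"sweetwiki", "mediawiki"}}


def semantic_most_used_tag(neighbor_tags, narrower=NARROWER):
    """Pick the tag with the highest score among the neighbours' tags,
    where a broader tag also counts the occurrences of its narrower tags."""
    counts = Counter(neighbor_tags)
    if not counts:
        return None
    present = set(counts)
    # Candidate tags: those used by neighbours plus broader tags covering them.
    candidates = present | {b for b, ns in narrower.items() if ns & present}
    scores = {
        tag: counts[tag] + sum(counts[n] for n in narrower.get(tag, ()))
        for tag in candidates
    }
    best = max(scores.values())
    return min(tag for tag, score in scores.items() if score == best)


with_thesaurus = semantic_most_used_tag(["wiki", "sweetwiki", "mobile"])
without_thesaurus = semantic_most_used_tag(["wiki", "sweetwiki", "mobile"], narrower={})
```

With the thesaurus, "wiki" gathers two votes and wins; without it, all three tags tie and the outcome depends only on the tie-break.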
semantic tag propagation etc. leading to 2 communities
[Figure: network of users a to g; tags shown on the nodes: wiki, mobile, wiki, wiki, mobile, mobile]
[Eretéo et al., 2011]
applied to Ademe Ph.D. network 1 853 agents
1 597 academic supervisors
256 ADEME engineers.
13 982 relationships
10 246 rel:worksWith
3 736 rel:colleagueOf
6 583 tags
3 570 skos:narrower relations between 2 785 tags
MODULARITY COMPARISONS (X axis: propagation iterations, Y axis: modularity)
[Figure: modularity over 20 propagation iterations for RAK, TagP, SemTagP, controlled SemTagP (env), controlled SemTagP (env, energ) and controlled SemTagP (env, energ, model)]
results
1. pollution
2. sustainable development
3. energy
4. chemistry
5. air pollution
6. metals
7. biomass
8. wastes
to know more deployment & test campaign (4… 20… +) .
deliverables and publications http://isicil.inria.fr
open source code on INRIA forge https://gforge.inria.fr/projects/isicil/
models http://ns.inria.fr/
social web 2.0 epistemology semantic web
theoretical framework
extensible models
process and interaction
services and interfaces