Entity-Based Semantics Emerging from Personal Awareness Streams

Post on 18-Dec-2014


Capturing Entity-Based Semantics Emerging from Personal Awareness Streams

A.E. Cano, S. Tucker, F. Ciravegna

The Oak Group, Department of Computer Science, The University of Sheffield

Outline

• Introduction
• Related Work
• Social Stream Aggregation and Entity-Based Concept Induction
  – Modelling Context with Personal Awareness Streams
  – Methodology
  – Evaluation
• Conclusions

Introduction

Social Awareness Streams

[1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM.

A collection of semi-public, natural-language messages produced by different users and characterised by their brevity [1].


People talk a lot about themselves!!


Personal Awareness Streams

[2] C. Wagner and M. Strohmaier. The wisdom in tweetonomies: Acquiring latent conceptual structures from social awareness streams. In Proc. of the Semantic Search 2010 Workshop (SemSearch2010), April 2010.

A collection of semi-public, natural-language messages produced by a single user and characterised by their brevity [2].


Can personal awareness streams convey meaningful information for modelling user context?


Modelling User Context

• People
• Location
• Things

Relationships:

- Semantic
- Spatial
- Social
- Temporal

Modelling User Context: what for?

[Figure: a user's timeline for weekdays (M-F) and weekends (S-S), marked at 8:00, 9:00, 13:00 and 17:00-20:00, with context-triggered annotations: "BLT offer, 500m", "Tuna" and "SuggestedBy a,b,c"]

Related Work

Social Awareness Streams

• Java et al. [3] present an analysis of Twitter which suggests that the differences in users' network connection structures can be explained by the following types of user activity: information seeking, information sharing and social activity.

• Ramage et al. [4] apply labelled Latent Dirichlet Allocation (LDA) to map the content of the public Twitter feed into four dimensions, including style and substance.

• Krishnamurthy et al. [5] present a characterisation of the Twitter social network, which includes patterns in geographic growth and users' social activity.

Social Awareness Streams Using Linked Data

• Wagner and Strohmaier [2] introduce the Tweetonomy model:
  – A formalisation of social awareness streams.
  – Based on lightweight associative ontologies.

• Stankovic et al. [6] study conference-related tweets:
  – They map tweets to the talks and sub-events that they refer to.
  – Using linked data, they derive additional knowledge about event dynamics and user activities.

Our work differs from existing work in that we:

• Focus on deriving person-based lightweight ontologies from personal awareness streams, which enrich concepts and reveal structures that are meaningful to the owner of the stream.

• Analyse the content of the messages not only in terms of traditional resources such as hashtags and links, but also in terms of entities (e.g. location, people, organisations and time).

• Present a methodology based on tensor analysis that allows the definition of entity-based context for deriving person-based ontologies.

Social Stream Aggregation and Entity-Based Concept Induction

Defining a Tweetonomy

A tweetonomy is a tuple $S = (U_{q_1}, M_{q_2}, R_{q_3}, T, f_t)$, where:

• $U_{q_1}$ is the set of users, with $q_1 = \{\text{author}\}$
• $M_{q_2}$ is the set of messages, with $q_2 = \{\text{direct message}\}$
• $R_{q_3}$ is the set of resources, with $q_3 = \{\text{Links, Hash tags, Location, People, Places, Organisation}\}$
• $T \subseteq U \times M \times R$ is a ternary relation over users, messages and resources
• $f_t$ is a function that assigns a temporal marker to each ternary edge
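A minimal Python sketch of the tweetonomy tuple may make the definition more concrete. The class and field names (`Tweetonomy`, `Edge`, `add`) and the sample user `@alice` are hypothetical and are not taken from the slides; the qualifiers q1 to q3 are left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Edge:
    """One ternary edge (user, message, resource) of the relation T."""
    user: str
    message: str
    resource: str


@dataclass
class Tweetonomy:
    """Sketch of S = (U, M, R, T, ft); the qualifiers q1..q3 are omitted."""
    users: set = field(default_factory=set)          # U
    messages: set = field(default_factory=set)       # M
    resources: set = field(default_factory=set)      # R
    edges: set = field(default_factory=set)          # T, a subset of U x M x R
    timestamps: dict = field(default_factory=dict)   # ft: edge -> temporal marker

    def add(self, user, message, resource, when):
        """Register one ternary edge together with its temporal marker."""
        edge = Edge(user, message, resource)
        self.users.add(user)
        self.messages.add(message)
        self.resources.add(resource)
        self.edges.add(edge)
        self.timestamps[edge] = when
        return edge


# Hypothetical message m1 by @alice mentioning a hashtag and a location.
s = Tweetonomy()
s.add("@alice", "m1", "#gig", datetime(2010, 7, 1, 9, 30))
s.add("@alice", "m1", "Sheffield", datetime(2010, 7, 1, 9, 30))
```

Storing ft as a dictionary keyed by edge keeps the temporal marker attached to each ternary edge, as the definition requires.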

Modelling User Context with a Tweetonomy

| Location | People | Keyword |
|---|---|---|
| Sheffield | @gigsandtours, @officialcallumw | tickets, centre, visit, retail, destination |
| Leeds | @gigsandtours, @officialcallumw | tickets, centre, visit, retail, destination |

Modelling User Context with a Tweetonomy

| Keyword | @Johnbinns | @tony |
|---|---|---|
| therapy | 0.045 | 0 |
| alcohol | 0.034 | 0 |
| fan | 0.012 | 0 |
| work | 0 | 0.08 |

$O_{kp} = (R_k M)(M R_p)$

[Graph: @Johnbinns linked to therapy, alcohol and fan; @tony linked to work]
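The product behind the keyword-person ontology can be sketched in plain Python. This assumes that R_k M is read as a keyword-by-message incidence matrix and M R_p as a message-by-person incidence matrix; the three messages and their contents are hypothetical, and the result holds raw co-occurrence counts rather than the weights shown on the slide, whose normalisation is not stated.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


keywords = ["therapy", "work"]
people = ["@Johnbinns", "@tony"]

# Hypothetical incidence matrices over three messages m1..m3.
# keyword_by_message[k][m] is 1 if keyword k occurs in message m.
keyword_by_message = [
    [1, 1, 0],  # therapy
    [0, 0, 1],  # work
]
# message_by_person[m][p] is 1 if message m mentions person p.
message_by_person = [
    [1, 0],  # m1
    [1, 0],  # m2
    [0, 1],  # m3
]

# o_kp[k][p] counts the messages in which keyword k co-occurs with person p.
o_kp = matmul(keyword_by_message, message_by_person)
```

The same product with a message-by-location or message-by-time matrix on the right would give the other two bipartite ontologies.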

Modelling User Context with a Tweetonomy

| Keyword | Sheffield | Leeds |
|---|---|---|
| therapy | 0.045 | 0.023 |
| alcohol | 0.034 | 0.012 |
| fan | 0.012 | 0 |
| work | 0.056 | 0 |

$O_{kl} = (R_k M)(M R_l)$

[Graph: bipartite graph linking the locations Sheffield and Leeds to the keywords therapy, alcohol, fan and work]

Modelling User Context with a Tweetonomy

| Keyword | Morning (7am-12:00pm) | Rest of the Day (12:00pm-6:59am) |
|---|---|---|
| therapy | 0.0015 | 0.023 |
| alcohol | 0 | 0.062 |
| fan | 0.0012 | 0.03 |
| work | 0.066 | 0 |

$O_{kt} = (R_k M)(M R_t)$

[Graph: bipartite graph linking the time contexts Morning and Rest of the Day to the keywords therapy, alcohol, fan and work]
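The two time contexts can be derived from a message timestamp with a small helper. This is only a sketch: the function name `time_context` is hypothetical, and the boundaries are the ones given in the table (Morning from 7:00 up to, but not including, 12:00).

```python
from datetime import datetime


def time_context(ts):
    """Map a timestamp to one of the two time contexts used in the table."""
    return "Morning" if 7 <= ts.hour < 12 else "Rest of the Day"


# Hypothetical timestamps for two messages.
early = time_context(datetime(2010, 7, 1, 8, 15))
late = time_context(datetime(2010, 7, 1, 21, 0))
```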

Modelling User Context with a Tweetonomy

What are the concepts that emerge when analysing BigGayShaun in the context of Sheffield (Location), @Johnbinns (Person), during the evening?


Modelling User Context with a Tweetonomy

Given $P$ lightweight ontologies characterising a user's social stream consisting of $N$ messages, we define a tensor $\mathcal{O} \in \mathbb{R}^{N \times N \times P}$ consisting of frontal slices of the form $O_p = B_p B_p^T$, with $p = 1, \dots, P$, where $B_p$ is a bipartite ontology.

What are the concepts that emerge when analysing BigGayShaun in the context of Sheffield (Location (1)), @Johnbinns (Person (2)), during the evening (Time (3))?

Each context contributes one frontal slice:

$O^{(1)} = O_{kl} O_{kl}^T$

$O^{(2)} = O_{kp} O_{kp}^T$

$O^{(3)} = O_{kt} O_{kt}^T$

The slides show the same partially elided keyword-by-keyword matrix for each of the three slices:

| | therapy | alcohol | fan | work |
|---|---|---|---|---|
| therapy | 0.002 | 0 | .. | .. |
| alcohol | .. | 0.0011 | .. | .. |
| fan | .. | .. | 0.0001 | .. |
| work | .. | .. | .. | 0.0004 |
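A short plain-Python sketch of how the frontal slices could be assembled follows. The input values are the keyword-location, keyword-person and keyword-time weights from the earlier tables. Storing the tensor as a list of slices (indexed `tensor[p][i][j]`) and the helper name `frontal_slice` are assumptions of this sketch, and the computed entries are not claimed to reproduce the illustrative numbers shown on the slides.

```python
def frontal_slice(b):
    """Return B * B^T for a matrix B given as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in b] for row_i in b]


# Rows: therapy, alcohol, fan, work (weights copied from the earlier tables).
o_kl = [[0.045, 0.023], [0.034, 0.012], [0.012, 0.0], [0.056, 0.0]]   # Sheffield, Leeds
o_kp = [[0.045, 0.0], [0.034, 0.0], [0.012, 0.0], [0.0, 0.08]]        # @Johnbinns, @tony
o_kt = [[0.0015, 0.023], [0.0, 0.062], [0.0012, 0.03], [0.066, 0.0]]  # Morning, Rest of the Day

# One keyword-by-keyword slice per context: location (1), person (2), time (3).
tensor = [frontal_slice(b) for b in (o_kl, o_kp, o_kt)]
```

Each slice is symmetric by construction, and entry (i, j) is large when keywords i and j are attached to the same entities in that context.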

Evaluation

• Data set:
  – Four active microbloggers
  – From Jul to Sep 2010
  – From each message, entities were extracted using the OpenCalais service.
• User-based evaluation: consulting the author of the social stream whose context-induced concepts are being mapped.
• Evaluated contexts: hashtag-time, location-people, and organisation-people

[Figure: Concepts in the context of Hashtags-Places-Time]

Higher lexical diversity (K/M) leads to better MAP results (see Figure 3(b)). This is an expected result, since CSISSA explores the way in which an entity is linked to another one through keywords.
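For readers unfamiliar with the metric, here is a sketch of the textbook definition of mean average precision (MAP). The slides do not say which variant was used, so this is an assumption; the ranked concept lists and relevance judgements below are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k at which a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of the average precision over (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical judgements: concepts ranked by the method vs. those a user confirmed.
runs = [
    (["therapy", "work", "alcohol"], {"therapy", "alcohol"}),
    (["fan", "work"], {"work"}),
]
ap_first = average_precision(*runs[0])
map_score = mean_average_precision(runs)
```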

Evaluation Highlights

- Users tend to forget what they've tweeted about.
- Entity relationships decay with time.
- The relevance of users' streaming topics was in many cases volatile; further research is necessary to address these issues.

Conclusions

• Awareness streams can be used to model context by leveraging the user's entity affiliations.
• In our experiments a fairly naive approach was taken, which does not consider the ambiguity with which users can relate two entities through a keyword.
• Future work considers:
  – Introducing concept disambiguation to tackle this issue.
  – Using this approach for merging user contexts in pervasive environments.

References

[1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM.

[2] C. Wagner and M. Strohmaier. The wisdom in tweetonomies: Acquiring latent conceptual structures from social awareness streams. In Proc. of the Semantic Search 2010 Workshop (SemSearch2010), April 2010.

[3] A. Java, X. Song, T. Finin, and B. Tseng. Why we twitter: understanding microblogging usage and communities. In WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, pages 56–65, New York, NY, USA, 2007. ACM.

[4] D. Ramage, D. Hall, R. Nallapati, and C. D. Manning. Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In EMNLP ’09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 248–256, Morristown, NJ, USA, 2009. Association for Computational Linguistics.

[5] B. Krishnamurthy, P. Gill, and M. Arlitt. A few chirps about twitter. In WOSP ’08: Proceedings of the first workshop on Online social networks, pages 19–24, New York, NY, USA, 2008. ACM.

[6] M. R. M. Stankovic and P. Laublet. Mapping tweets to conference talks: A goldmine for semantics. In Proceedings of Social Data on the Web workshop, ISWC 2010, Shanghai, China, 2010.

SlideShare

http://www.slideshare.net/ampaeli/modellingContext