Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008
![Page 1: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/1.jpg)
TAGora: Semiotic Dynamics of Online Social Communities EU-IST-2006-034721
Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis
Martin Szomszor, Harith Alani, Kieron O’Hara, Nigel Shadbolt
University of Southampton
Iván Cantador
Universidad Autónoma de Madrid
![Page 2: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/2.jpg)
Outline
• Introduction and Motivation
– Why is your folksonomy interaction useful?
– How could it be exploited?
• Architecture
– Matching user accounts
– Collecting Data
– Tag Filtering
– Profile Building
• Experiment and Evaluation
• Conclusions and Future Work
![Page 3: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/3.jpg)
Introduction
delicious.com
http://slashdot.org/
http://news.bbc.co.uk/
Dream Theater
Metallica
Rush
![Page 4: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/4.jpg)
Increasing number of online identities
• A recent Ofcom study found that UK adults have on average 1.6 profiles; 39% of those who have a profile have at least 2
• Many predict that in the near future, individuals will have in excess of 10 profiles
– [Ofcom 2008] Social Networking: A quantitative and qualitative research report into attitudes, behaviours, and use.
![Page 5: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/5.jpg)
Profile of Interests
The Big Picture
delicious.com
![Page 6: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/6.jpg)
delicious.com
Profiles could be exported to other sites to improve recommendation quality
Profile of Interests
Personalisation
Profiles could be used to support personalised searching
Better user experience
![Page 7: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/7.jpg)
Consolidation and Integration
currency
travel
hotels
cuba
http://dbpedia.org/resource/Cuba
cuba
holiday
2008
http://dbpedia.org/resource/Travel
http://dbpedia.org/resource/Holiday
http://dbpedia.org/resource/Category:Tourism
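The consolidation pictured on this slide can be sketched as a lookup from raw tags to shared concept URIs. This is only an illustration: the `TAG_TO_URI` table is hand-written and hypothetical, it covers just three tags, and the slide does not say how each tag was actually resolved. The names `site_a_tags` and `site_b_tags` are placeholders for the two tag sets shown.

```python
# Hypothetical, hand-written tag-to-URI table (an assumption for illustration).
TAG_TO_URI = {
    "cuba": "http://dbpedia.org/resource/Cuba",
    "travel": "http://dbpedia.org/resource/Travel",
    "holiday": "http://dbpedia.org/resource/Holiday",
}


def consolidate(*tag_sets):
    """Merge tag sets from several sites into one set of concept URIs."""
    uris = set()
    for tags in tag_sets:
        for tag in tags:
            uri = TAG_TO_URI.get(tag.lower())
            if uri is not None:  # tags with no known concept are skipped
                uris.add(uri)
    return uris


site_a_tags = ["currency", "travel", "hotels", "cuba"]
site_b_tags = ["cuba", "holiday", "2008"]
profile = consolidate(site_a_tags, site_b_tags)
```

Because both tag sets resolve "cuba" to the same URI, the merged profile holds that concept once rather than twice.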
![Page 8: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/8.jpg)
User Tagging
delicious.com
![Page 9: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/9.jpg)
delicious.com
Tag Clouds
![Page 10: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/10.jpg)
Tagging Variation
[1] Szomszor, M., Cantador, I. and Alani, H. (2008). Correlating User Profiles from Multiple Folksonomies. In: ACM Conference on Hypertext and Hypermedia, 2008, Pittsburgh, Pennsylvania.
Raw Tags
Filtered Tags
![Page 11: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/11.jpg)
Architecture for Building Profiles of Interests
![Page 12: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/12.jpg)
Account Correlation
• Using Google’s Social Graph API
delicious.com
account homepage
http://users.ecs.soton.ac.uk/mns2
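One way to picture account correlation is as grouping URLs that are joined by identity links. The sketch below does not call the Social Graph API; it assumes the links have already been retrieved as `(url_a, url_b)` pairs, and the account URLs in the example are made up. The function name `correlate_accounts` is hypothetical.

```python
from collections import defaultdict


def correlate_accounts(links):
    """Group URLs that are connected through identity links.

    `links` is an iterable of (url_a, url_b) pairs, each meaning that one
    page is claimed to belong to the same person as the other.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen, groups = set(), []
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first walk collects every URL reachable from `start`.
        stack, group = [start], set()
        while stack:
            url = stack.pop()
            if url in group:
                continue
            group.add(url)
            stack.extend(graph[url] - group)
        seen |= group
        groups.append(group)
    return groups


# Illustrative input only: these account names are invented.
links = [
    ("http://delicious.com/example_user", "http://users.ecs.soton.ac.uk/mns2"),
    ("http://www.flickr.com/people/example_user", "http://users.ecs.soton.ac.uk/mns2"),
]
groups = correlate_accounts(links)
```

Both accounts point at the same homepage, so they end up in a single group together with it.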
![Page 13: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/13.jpg)
Data Collection
• Delicious
– Custom Python scripts
• Flickr
– Using public API
• Only public information is harvested
![Page 14: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/14.jpg)
Tag Filtering Process
![Page 15: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/15.jpg)
Creating User Profiles
• Three stage process:
1. Identify Wikipedia page
• London is matched with http://en.wikipedia.org/wiki/London
2. Extract Category list
• Host cities of the Summer Olympic Games | Host cities of the Commonwealth Games | London | 1st century establishments | British capitals | Capitals in Europe | Port cities and towns in the United Kingdom
3. Select representative Categories
• Only choose categories that match the tag string
• Excludes spurious categories such as:
– Host cities of the Summer Olympic Games
– Needs more sources
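The three stages can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "match the tag string" means case-insensitive equality, and it replaces stages 1 and 2 with a hypothetical `lookup_categories` callable instead of querying Wikipedia. The category list is the one quoted on the slide.

```python
def select_categories(tag, categories):
    """Stage 3: keep only categories whose name matches the tag string."""
    wanted = tag.strip().lower()
    return [c for c in categories if c.strip().lower() == wanted]


def build_profile(tags, lookup_categories):
    """Count how often each selected category occurs across a user's tags.

    `lookup_categories` stands in for stages 1 and 2: it takes a tag and
    returns the category list of the matched page, or an empty list.
    """
    profile = {}
    for tag in tags:
        for category in select_categories(tag, lookup_categories(tag)):
            profile[category] = profile.get(category, 0) + 1
    return profile


LONDON_CATEGORIES = [
    "Host cities of the Summer Olympic Games",
    "Host cities of the Commonwealth Games",
    "London",
    "1st century establishments",
    "British capitals",
    "Capitals in Europe",
    "Port cities and towns in the United Kingdom",
]


def toy_lookup(tag):
    """Hypothetical stand-in for the Wikipedia page and category lookup."""
    return LONDON_CATEGORIES if tag.lower() == "london" else []


profile = build_profile(["london", "London", "toread"], toy_lookup)
```

Under these assumptions the two London tags contribute to a single "London" category, while "toread" resolves to nothing and the Olympic host-city category is dropped.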
![Page 16: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/16.jpg)
Profile of Interest
![Page 17: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/17.jpg)
Experiment Setup
• Bootstrapped using 667,141 delicious profiles obtained in previous work
• Only accounts with a matching Flickr profile and > 50 distinct tags were added
• Final list contains 1,392 users

| | Delicious | Flickr |
|---|---|---|
| Total Posts | 1,134,527 | 2,215,913 |
| Distinct Tags | 138,028 | 307,182 |
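The selection rule can be sketched as a simple filter. The slide does not say whether the 50-tag threshold applies to one account or to both; this sketch assumes both, and the function and argument names are hypothetical.

```python
def eligible_users(delicious_tags, flickr_tags, min_distinct=50):
    """Return users present in both datasets with enough distinct tags.

    Both arguments map a user id to the list of tags that user has applied.
    Assumption: the threshold must be exceeded on both sites.
    """
    selected = []
    for user, d_tags in delicious_tags.items():
        f_tags = flickr_tags.get(user)
        if f_tags is None:  # no matching Flickr profile
            continue
        if len(set(d_tags)) > min_distinct and len(set(f_tags)) > min_distinct:
            selected.append(user)
    return selected


# Toy data: "a" qualifies, "b" has no Flickr profile, "c" has too few tags.
many = [f"tag{i}" for i in range(60)]
few = [f"tag{i}" for i in range(10)]
delicious = {"a": many, "b": many, "c": few}
flickr = {"a": many, "c": few}
users = eligible_users(delicious, flickr)
```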
![Page 18: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/18.jpg)
Evaluation
• Four evaluation procedures:
– The performance of the tag filtering and matching to Wikipedia Entries
– The difference between the most common categories found in delicious and Flickr
– The amount learnt from merging profiles from the two folksonomies
– The accuracy of matching tags to Wikipedia categories
![Page 19: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/19.jpg)
Tag Filtering and Matching
![Page 20: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/20.jpg)
Global Category View
• What are the differences in the interests that are learnt from each domain?

| Delicious: Wikipedia Category | Total Freq | Flickr: Wikipedia Category | Total Freq |
|---|---|---|---|
| Design | 69,215 | Travel | 51,674 |
| Blogs | 68,319 | Australia | 51,617 |
| Music | 45,063 | London | 46,623 |
| Photography | 41,356 | Festivals | 42,504 |
| Tools | 35,795 | Music | 40,943 |
| Video | 34,318 | Cats | 38,230 |
| Arts | 29,966 | Holidays | 37,610 |
| Software | 28,746 | Family | 37,100 |
| Maps | 26,912 | Japan | 36,513 |
| Teaching | 22,120 | Concerts | 35,374 |
| Games | 21,549 | Surnames | 34,947 |
| How-to | 19,533 | Washington | 33,924 |
| Technology | 18,032 | Given Names | 32,843 |
| News | 17,737 | Dogs | 32,206 |
| Humor | 15,816 | Birthdays | 22,290 |
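A ranking like the one above could be produced by summing category frequencies over all user profiles. This sketch assumes that "Total Freq" is such a sum, which the slide does not state; the profile data is a toy example.

```python
from collections import Counter


def global_category_view(user_profiles, top_n=15):
    """Sum per-user category frequencies and return the most frequent ones.

    `user_profiles` maps a user id to a {category: frequency} dict.
    """
    total = Counter()
    for profile in user_profiles.values():
        total.update(profile)  # adds the counts for each category
    return total.most_common(top_n)


toy_profiles = {
    "u1": {"Design": 3, "Music": 1},
    "u2": {"Design": 2},
}
ranking = global_category_view(toy_profiles)
```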
![Page 21: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/21.jpg)
Learning More About Users
• How much more can we learn by using multiple profiles?
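One way to quantify the gain from a second profile is to count the categories that appear in it but not in the first. The sketch below takes Delicious as the base and Flickr as the addition; that direction is an assumption, as are the function name and the toy data.

```python
def average_new_concepts(profiles):
    """Average number of categories the second profile adds to the first.

    `profiles` maps a user id to a pair
    (delicious_categories, flickr_categories), each an iterable of names.
    """
    gains = [len(set(f) - set(d)) for d, f in profiles.values()]
    return sum(gains) / len(gains) if gains else 0.0


toy = {
    "u1": ({"Design", "Music"}, {"Music", "Travel", "Cats"}),
    "u2": ({"Blogs"}, {"Blogs"}),
}
gain = average_new_concepts(toy)
```

Here the first user gains two new categories and the second gains none, so the average is one new concept per user.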
![Page 22: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/22.jpg)
Category Matching
• How good is the category matching?
• Take 100 random users and choose 1 Delicious tag and 1 Flickr tag
• Classify tag into one of 3 classes:
– Correct
– Unresolved (not matched to any category)
– Ambiguous (Disambiguation required)

| | Correct | Unresolved | Ambiguous |
|---|---|---|---|
| Delicious | 66% | 20% | 14% |
| Flickr | 63% | 25% | 12% |
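Averaging the two rows gives overall rates for the sample. The resulting unresolved and ambiguous figures appear to be the 22.5% and 13% quoted under Future Work, although the slides do not state how those numbers were derived.

```python
# Percentages copied from the table above.
results = {
    "Delicious": {"Correct": 66, "Unresolved": 20, "Ambiguous": 14},
    "Flickr": {"Correct": 63, "Unresolved": 25, "Ambiguous": 12},
}

classes = ["Correct", "Unresolved", "Ambiguous"]

# Unweighted mean of the two sites for each class.
averages = {
    c: sum(row[c] for row in results.values()) / len(results)
    for c in classes
}
```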
![Page 23: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/23.jpg)
Conclusions
• We have proposed a novel method for the creation of Profiles of Interest by exploiting an individual’s tagging activities across two popular folksonomy sites
• Frequently used tags often specify areas of interest but not always!
– Common delicious tags are daily, toread, howto
– Flickr tags often include names of people
• Expanding the analysis across folksonomies increases the amount learnt
– On average 15 new concepts per user
![Page 24: Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis @ ISWC2008](https://reader038.fdocuments.us/reader038/viewer/2022103109/53ff49908d7f7250208b4651/html5/thumbnails/24.jpg)
Future Work
• Improve page matching
– 22.5% of sample tags unresolved
• Handle disambiguation
– 13% of sample tags refer to ambiguous terms
• Co-occurrence networks
• Category hierarchy
• Increase network coverage
– Already have the data to include Last.fm
• Understand which tags actually specify an interest of the individual
– Filter out categories such as ‘Surname’