An Epistemic Dynamic Model for Tagging Systems

24
<is web> Information Systems & Semantic Web University of Koblenz ▪ Landau, Germany An Epistemic Dynamic Model for Tagging Systems Klaas Dellschaft and Steffen Staab {klaasd, staab}@uni-koblenz.de

description

This presentation is about our paper which was presented at the Hypertext conference 2008. Abstract: In recent literature, several models were proposed for reproducing and understanding the tagging behavior of users. They all assume that the tagging behavior is influenced by the previous tag assignments of other users. But they are only partially successful in reproducing characteristic properties found in tag streams. We argue that this inadequacy of existing models results from their inability to include user’s background knowledge into their model of tagging behavior. This paper presents a generative tagging model that integrates both components, the background knowledge and the influence of previous tag assignments. Our model successfully reproduces characteristic properties of tag streams. It even explains effects of the user interface on the tag stream. Links to the paper: http://dx.doi.org/10.1145/1379092.1379109 http://www.uni-koblenz.de/~staab/Research/Publications/2008/DellschaftStaabHypertext08.pdf

Transcript of An Epistemic Dynamic Model for Tagging Systems

Page 1: An Epistemic Dynamic Model for Tagging Systems

<is web> Information Systems & Semantic Web

University of Koblenz ▪ Landau, Germany

An Epistemic Dynamic Model for Tagging Systems

Klaas Dellschaft and Steffen Staab{klaasd, staab}@uni-koblenz.de

Page 2: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20082 of 24

Delicious Tagging Interface

How do users influence each other in tagging systems? Hypothesis: Tagging involves imitation of other users AND

selection of tags from background knowledge of users.

Page 3: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20083 of 24

Understanding Dynamics in Tagging Systems

Conceptualization

Own Background Knowledge

Shared terminology

Something else?User interface

Tagging Behavior

Joint Stochastic Model

Model of Own Background Knowledge

Model of Sharing

Model of User Interface Influence

Simulated Tagging Behavior

Co

mp

ariso

n o

f Sta

tistics

Page 4: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20084 of 24

Folksonomies

Vertexes: Users, tags, resources Hyperedges: Tag assignments (user X tag X resource) Postings:

Tag assignments of a user to a single resource Can be ordered according to their time-stamp

Page 5: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20085 of 24

Co-occurrence Streams

Co-occurrence Streams: All tags co-occurring with a given tag in a posting Ordered by posting time

Example tag assignments for ‘ajax': {mackz, r1, {ajax, javascript}, 13:25} {klaasd, r2, {ajax, rss, web2.0}, 13:26} {mackz, r2, {ajax, php, javascript}, 13:27}

Resulting co-occurrence stream:

Tag |Y| |U| |T| |R|ajax 2.949.614 88.526 41.898 71.525blog 6.098.471 158.578 186.043 557.017xml 974.866 44.326 31.998 61.843

javascript rss web2.0 php javascript

time

Page 6: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20086 of 24

Co-occurrence Streams – Tag Growth

Linear scaling of axes

Logarithmic scaling of axes

Page 7: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20087 of 24

Co-occurrence Streams – Tag Frequencies

Logarithmic scaling of axes

Page 8: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20088 of 24

Resource Streams

Resource Streams: All tags assigned to a resource Ordered by posting time

Example tag assignments: {mackz, r1, {ajax, javascript}, 13:25} {klaasd, r2, {ajax, rss, web2.0}, 13:26} {mackz, r2, {ajax, php, javascript}, 13:27}

Resulting resource stream:

ajax rss web2.0 ajax php javascripttime

Page 9: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 20089 of 24

Resource Streams – Tag Frequencies

Logarithmic scaling of axes

Page 10: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200810 of 24

Resource Streams – Tag Frequencies

Taken from Halpin, Robu and Shepard, WWW 2007

Logarithmic scaling of axes

Page 11: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200811 of 24

Existing Models

Frequency Rank

Co-occur. Streams Resource Streams Tag Growth

Polya Urn Model O O Fixed size

Simon Model O O Linear

YS Model w/ Memory + O Linear

Halpin et al. Model O O Linear

Observed Properties Long tail w/ cut-off Drop around tag 7 Sublinear

Page 12: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200812 of 24

Stochastic Models of Influence

Conceptualization

Own Background Knowledge

Shared terminology

Something else?User interface

Tagging Behavior

Joint Stochastic Model

Model of Own Background Knowledge

Model of Sharing

Model of User Interface Influence

Simulated Tagging Behavior

Co

mp

ariso

n o

f Sta

tistics

Page 13: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200813 of 24

Modeling Tag Imitation

Delicious Tagging Interface Imitation of previous tag assignments Free input of new tags (taken from background knowledge of a user)

Page 14: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200814 of 24

Simulating a Tag Stream

p(w|t): Probability of selecting word w for topic t. Modeled by word distributions in a topic centered text corpus.

n: Number of visible previous tags.

h: Maximal number of previous tag assignments used for determining ranking of the n distinct tags.

P BK

1-PBK

Start with empty tag stream Each simulation step appends a new tag assignment:

Page 15: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200815 of 24

Modeling Background Knowledge

PBK: Probability of selecting from background knowledge p(w|t): Probability of selecting word w for topic t. Modeled by

word distributions in a topic centered text corpus. p(w|r): Probability of selecting word w for resource r.

Text Corpora Del.icio.usText Corpora

Page 16: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200816 of 24

Modeling Tag Imitation

PI = 1 - PBK: Probability of imitating a previous tag assignment n: Number of visible top-ranked tags h: Maximal number of previous tag assignments used for determining

ranking of the n distinct tags

PBK t t-1 t-2 t-3 t-4 t-5 … …t-h

1-PBK

1 2 3 … n

Page 17: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200817 of 24

Simulation Results

Conceptualization

Own Background Knowledge

Shared terminology

Something else?User interface

Tagging Behavior

Joint Stochastic Model

Model of Own Background Knowledge

Model of Sharing

Model of User Interface Influence

Simulated Tagging Behavior

Co

mp

ariso

n o

f Sta

tistics

Page 18: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200818 of 24

Simulating Co-occurrence Streams

Tag growth: Influenced by PBK and p(w|t)

Tag Frequencies: Influenced by PBK, p(w|t), n, h n: Semantic breadth of a topic (blog: 100 tags,

ajax: 50 tags, xml: 50 tags; Cattuto et al. 2007) h: No hint for realistic values. Good guess may be a value

around 1000.

Page 19: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200819 of 24

Co-occ. Streams – Simulated Tag Growth

Logarithmic scaling of axes

PI PBKPI: Imitation

PBK: Background Knowl.

Page 20: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200820 of 24

Co-occ. Streams – Simulated Tag Frequencies

Logarithmic scaling of axes

PI PBK

PI PBK

PI PBK

PI: Imitation

PBK: Background Knowl.

n: Visible Tags

h: Available History

Page 21: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200821 of 24

Simulating Ajax, Blog and XML

blog

ajax

xml

Logarithmic scaling of axes

ObservationSimulation

(PI

(PI

(PI

PI: Imitation

PBK: Background Knowl.

n: Visible Tags

h: Available History

Page 22: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200822 of 24

Res. Streams – Simulated Tag Frequencies

Logarithmic scaling of axes

PI PBK

PI PBK

PI: Imitation

PBK: Background Knowl.

n: Visible Tags

h: Available History

Page 23: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200823 of 24

Conclusions

• Imitation AND background knowledge are needed for explaining properties of tag streams

• Probability of imitating previous tag assignments: ~60-90%

Frequency Rank

Co-occur. Streams Resource Streams Tag Growth

Polya Urn Model O O Fixed size

Simon Model O O Linear

YS Model w/ Memory + O Linear

Halpin et al. Model O O Linear

Epistemic Model + + Sublinear

Observed Properties Long tail w/ cut-off Drop around tag 7 Sublinear

Page 24: An Epistemic Dynamic Model for Tagging Systems

<is web>

ISWeb - Information Systems & Semantic Web

Klaas [email protected]

Hypertext 200824 of 24

Data sets and simulation software:http://isweb.uni-koblenz.de/Research/Tagdataset

http://www.tagora-project.eu