Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k =...

39
Part 4 Comparing Hypotheses about Sequential Data

Transcript of Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k =...

Page 1: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

Part 4

Comparing Hypotheses about Sequential Data

Page 2: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

2

Example: Human Navigation

● Humans prefer to navigate…– H1: over semantically similar websites

– H2: via self-loops (e.g., refreshing)

– H3: by using the structural link network

– H4: by preferring similar categories

– H5: by utilizing structural properties

– H6: by information scent

[West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

Page 3: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

3

Example: Human Navigation

● Humans prefer to navigate…– H1: over semantically similar websites

– H2: via self-loops (e.g., refreshing)

– H3: by using the structural link network

– H4: by preferring similar categories

– H5: by utilizing structural properties

– H6: by information scent

[West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

What is the relative plausibility of these

hypotheses given data?

Page 4: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

4

Example: Human Navigation

● Humans prefer to navigate…– H1: over semantically similar websites

– H2: via self-loops (e.g., refreshing)

– H3: by using the structural link network

– H4: by preferring similar categories

– H5: by utilizing structural properties

– H6: by information scent

[West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

HypTrails[Singer et al. WWW 2015]

Page 5: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

5

HypTrails in a nutshell

● Goal: Express and compare hypotheses about sequences in a coherent research approach

● Method:– First-order Markov chain model

– Bayesian inference

● Idea:– Incorporate hypotheses as priors

– Utilize sensitivity of marginal likelihood on the prior

● Outcome: Partial ordering of hypotheses

Page 6: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

Structure of HypTrails

Page 7: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

7

Structure of HypTrails

MC ModelMC Model S1S1

S2S2 S3S3

1/2 1/2

1/32/3

1

Page 8: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

How to express hypotheses?

Page 9: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

How to express hypotheses?

As assumptions in parameters of Markov Chain model.

Page 10: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

10

Structural hypothesis

1/3

1

1/3

1

1/3

Page 11: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

11

Uniform hypothesis

1/3

Page 12: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

12

Structure of HypTrails

MC ModelMC Model

Hypothesis (H1)

Hypothesis (H1)

Belief in parameters

12

3

1

2

3

1

2

3

0.00 0.33 0.00 h3

1.00 0.33 1.00 h2

0.00 0.33 0.00 h1

h3 h2 h1

12

3

1

2

3

1

2

3

0.00 0.99 0.00 h3

3.01 0.99 3.01 h2

0.00 0.99 0.00 h1

h3 h2 h1

12

3

1

2

3

1

2

3

0.00 0.99 0.00 h3

0.01 0.99 0.01 h2

0.00 0.99 0.00 h1

h3 h2 h1

Belief in parameters

Page 13: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

13

Empirical observations

1.0

2/3

1/3

1

Page 14: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

14

Which hypothesis is the most plausible one?

Page 15: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

15

Bayesian model comparison:marginal likelihood

Page 16: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

16

Bayesian model comparison:marginal likelihood

Probability of parametersbefore observing data

Page 17: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

17

Bayesian model comparison:marginal likelihood

Probability of parametersbefore observing data

Hypothesis

Page 18: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

18

Structure of HypTrails

MC ModelMC Model

Hypothesis (H1)

Hypothesis (H1)

Belief in parameters

Prior (H1)Prior (H1)

Elicitation

Data (Trails)Data (Trails)

Marginal likelihood (H1)

Marginal likelihood (H1)

Influence

Influence

Page 19: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

19

How to elicit priors from expressed hypotheses?

Page 20: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

20

Conjugate Dirichlet prior

● Hyperparameters: pseudo counts

Page 21: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

21

Conjugate Dirichlet prior

● Hyperparameters: pseudo counts

Hypothesis parameters Dirichlet hyperparameters

Page 22: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

22

Elicitation

● Multiply row-normalized hypothesis matrix with concentration parameter k

● Higher k → stronger belief

● Additional proto-prior

Hypothesis parameters Dirichlet hyperparameters

Page 23: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

23

2 state example: Beta priorHypothesis:

Page 24: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

24

2 state example: Beta priorHypothesis:

k = 0

Page 25: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

25

2 state example: Beta priorHypothesis:

k = 1

Page 26: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

26

2 state example: Beta priorHypothesis:

k = 10

Page 27: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

27

2 state example: Beta priorHypothesis:

k = 100

Page 28: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

28

Example: Structural hypothesisprotoprior

Page 29: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

29

Structure of HypTrails

MC ModelMC Model

Hypothesis (H1)

Hypothesis (H1)

Dirichlet Prior (H1)

Dirichlet Prior (H1)

Data (Trails)Data (Trails)

Marginal likelihood (H1)

Marginal likelihood (H1)

Hypothesis (H2)

Hypothesis (H2)

Dirichlet Prior (H2)

Dirichlet Prior (H2)

Marginal likelihood (H2)

Marginal likelihood (H2)

Compare

Belief in parameters

Elicitation

Influence

Influence

Page 30: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

30

Example result: Last.fm

0 1 2 3 4

hypothesis weighting factor k

−1.55

−1.50

−1.45

−1.40

−1.35

−1.30

−1.25

−1.20

−1.15

−1.10

evid

ence

1e5

uniform

self-loop

track date

similarity

Higher plausibility

Higher belief

Page 31: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

31

Example result: Last.fm

0 1 2 3 4

hypothesis weighting factor k

−1.55

−1.50

−1.45

−1.40

−1.35

−1.30

−1.25

−1.20

−1.15

−1.10

evid

ence

1e5

uniform

self-loop

track date

similarity

Page 32: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

Hands-on jupyter notebook

Page 33: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

33

Further applications

● Ontology engineering – edit sequences [Walk et al. ISWC 2015]

● Real-world navigational trails– Flickr [Becker et al. SocialCom 2015]

– Taxi data [Espín-Noboa et al. WWW 2016]

– Car data [Atzmüller et al. WWW 2016]

● Wikipedia co-editing patterns[Samoilenko et al. 2016]

Page 34: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

34

Methodological extensions

● Detect and model heterogeneity in data

● Higher-order Markov chain models

● Adaption for other models

Page 35: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

35

What have we learned?

● Comparing hypotheses about sequential data

● Bayesian approach: HypTrails

● Applications

Page 36: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

Questions?

Page 37: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

for your attention!

TT

HH

AA

NN

KK

SS

@ph_singerwww.philippsinger.info florian.lemmerich.net

Page 38: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

38

References 1/2

[West et al. WWW 2015] Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference (WWW'15), Florence, Italy, 2015.

[Singer et al. IJSWIS 2013] Philipp Singer, Thomas Niebler, Markus Strohmaier and Andreas Hotho, Computing Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia, International Journal on Semantic Web and Information Systems (IJSWIS), vol 9(4), 41-70, 2013

[West & Leskovec WWW 2012] Robert West and Jure Leskovec: Human Wayfinding in Information Networks 21st International World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.

[Chi et al. CHI 2001] Chi, Ed H., et al. "Using information scent to model user information needs and actions and the Web." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2001.

[Singer et al. WWW 2015] Singer, P., Helic, D., Hotho, A., and Strohmaier, M. (2015, May). Hyptrails: A bayesian approach for comparing hypotheses about human trails on the web. In Proceedings of the 24th International Conference on World Wide Web (pp. 1003-1013). International World Wide Web Conferences Steering Committee.

[Walk et al. ISWC 2015] Simon Walk, Philipp Singer, Lisette Espín Noboa, Tania Tudorache, Mark A. Musen and Markus Strohmaier, Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World Projects, 14th International Semantic Web Conference, Betlehem, Pennsylvania, USA, 2015

Page 39: Part 4 Comparing Hypotheses about Sequential Data · 27 2 state example: Beta prior Hypothesis: k = 100. 28 Example: Structural hypothesis proto prior. 29 ... 33 Further applications

39

References 2/2

[Becker et al. SocialCom 2015] Martin Becker, Philipp Singer, Florian Lemmerich, Andreas Hotho, Denis Helic and Markus Strohmaier, Photowalking the City: Comparing Hypotheses About Urban Photo Trails on Flickr, 7th International Conference on Social Informatics, Beijing, China, 2015

[Espín-Noboa et al. WWW 2016] Lisette Espín-Noboa, Florian Lemmerich, Philipp Singer and Markus Strohmaier, Discovering and Characterizing Mobility Patterns in Urban Spaces: A Study of Manhattan Taxi Data, 6th International Workshop on Location and the Web at WWW2016, Montreal, Canada, 2016

[Samoilenko et al. 2016] Samoilenko, A., Karimi, F., Edler, D., Kunegis, J., & Strohmaier, M. (2016). Linguistic neighbourhoods: explaining cultural borders on Wikipedia through multilingual co-editing activity. EPJ Data Science, 5(1), 1., 2016

[Atzmüller et al. WWW 2016] Atzmueller, M., Schmidt, A., & Kibanov, M. (2016). DASHTrails: An Approach for Modeling and Analysis of Distribution-Adapted Sequential Hypotheses and Trails. In Proceedings of the World Wide Web Conference Companion, 2016