
A Markov Random Field Model for Term Dependencies

Chetan Mishra
CS 6501 Paper Presentation

Ideas, graphs, charts, and results are from the paper of the same name by Metzler and Croft, 2005 (SIGIR)


Agenda

1. Motivation behind the work
2. Background – What is a Markov Random Field (MRF)?
3. Research Insight – How did the authors use an MRF to model term dependencies? Results?
4. Future Work – If you thought this was interesting, how could you build on this?
5. Conclusion


Motivation

• Terms are not independently distributed
– A model incorporating term dependencies should therefore outperform a model that ignores them
• One problem: models incorporating term dependencies seemed to do no better, or even worse
– Statistical models weren't effectively modeling term dependencies
– Why?


Motivation

• Two problems (from the authors' perspective):
– Problem 1: Most models have taken bag-of-words-like approaches (which have tremendous data requirements)
– Solution 1: We need a new type of model
– Problem 2: Term dependency modeling (even with a reasonable model) requires a significant corpus
– Solution 2: Add large, web-scraped corpora to the standard research test collections


Background

• What is a Markov random field (MRF) model?
– A fancy name for an undirected, graph-based model
– Often used in machine learning to succinctly model joint distributions (general form sketched below)
• MRF models are used in the paper to tackle the problem of document retrieval in response to a query
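For reference, the general MRF joint distribution factorizes over the cliques of an undirected graph (the standard Hammersley–Clifford form; the notation here is mine, chosen to match the paper's):

P(x_1, \ldots, x_n) = \frac{1}{Z} \prod_{c \in C(G)} \psi_c(x_c),
\qquad
Z = \sum_{x_1, \ldots, x_n} \prod_{c \in C(G)} \psi_c(x_c)

where $C(G)$ is the set of cliques of the graph $G$, $x_c$ are the variables in clique $c$, each potential $\psi_c$ is non-negative, and $Z$ is the normalizing (partition) constant. The paper instantiates this with the query terms and the document as the graph's nodes.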


Model Overview

• Problem: Find documents that are relevant to a query
– Imagine there is a set of documents relevant to each query
– We will model the probability of a document $D$ being relevant to a query $Q$ with $P_\Lambda(D \mid Q)$, where $\Lambda$ is the set of model parameters
– The model will provide the user a ranked list of documents, ordered by this probability


Model Overview

• We will be modeling the joint distribution of query and document:

$P_\Lambda(Q, D) = \frac{1}{Z_\Lambda} \prod_{c \in C(G)} \psi(c; \Lambda)$

where $C(G)$ is the set of cliques in the MRF $G$ and $Z_\Lambda$ normalizes the distribution.
• $\psi(c; \Lambda)$, also called the “potential function”
– Its exact form depends widely on the problem one is solving
– Non-negative
– A measure of the compatibility of the term(s) in clique $c$ with the topic of the document


Model Overview

• Model output needs to be ranked
– Since only the order matters, we can make the following simplifications:

$P_\Lambda(D \mid Q) = \frac{P_\Lambda(Q, D)}{P_\Lambda(Q)}$ (by the joint probability law)

$\stackrel{rank}{=} \log P_\Lambda(Q, D) - \log P_\Lambda(Q) \stackrel{rank}{=} \sum_{c \in C(G)} \log \psi(c; \Lambda)$

– Since $P_\Lambda(Q)$ and $Z_\Lambda$ are the same for every document and $\log$ is monotonic, both drop out of the ranking


The Markov Random Field Model

• What is $G$?
– The Markov random field graph
• What does $G$ contain?
– A document node $D$ and a node for each term in the query ($q_1, \ldots, q_n$)
– Edges between every $q_i$ and $D$
– Edges between each pair $q_i$, $q_j$ that is not sufficiently independent


The Markov Random Field Model

• “Edges between each $q_i$ and $q_j$ that are not sufficiently independent.”
• Recall $P_\Lambda(D \mid Q) \stackrel{rank}{=} \sum_{c \in C(G)} \log \psi(c; \Lambda)$
– This means each clique $c$ represents a set of completely mutually dependent words, plus the document
– Ranking sums one log-potential for each of these mutually dependent subsets of the query


The Markov Random Field Model

• The paper looks at the performance of three general types of dependencies:
– Independence
– Sequential dependence
– Full dependence
• Visual depiction (see also the clique sketch below):

[Figure: the MRF graph structure for each dependence type; from Metzler and Croft ’05]
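To make the three structures concrete, here is a minimal Python sketch (my own illustration, not code from the paper) of which sets of query terms form a clique with the document node under each assumption:

from itertools import combinations

def independence_cliques(terms):
    # Full independence: each query term forms a clique with D alone.
    return [(t,) for t in terms]

def sequential_cliques(terms):
    # Sequential dependence: single terms plus adjacent-term pairs.
    pairs = [tuple(terms[i:i + 2]) for i in range(len(terms) - 1)]
    return [(t,) for t in terms] + pairs

def full_cliques(terms):
    # Full dependence: every non-empty subset of the query terms.
    return [c for k in range(1, len(terms) + 1)
            for c in combinations(terms, k)]

query = ["markov", "random", "field"]
print(independence_cliques(query))  # 3 singletons
print(sequential_cliques(query))    # 3 singletons + 2 adjacent pairs
print(len(full_cliques(query)))     # 7 non-empty subsets

Under sequential dependence, for example, a three-term query contributes three single-term cliques and two adjacent-pair cliques, each of which gets its own potential in the ranking sum.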


Potential Functions

• How do we measure how relevant a set of mutually dependent words is to a document?
– If one word:

$f_T(q_i, D) = \log \frac{tf_{q_i,D} + \mu \, cf_{q_i}/|C|}{|D| + \mu}$

– If more than one word and the ordered sequence is unobserved (unordered match):

$f_U(q_i, \ldots, q_j, D) = \log \frac{tf_{\#uwN(q_i \ldots q_j),D} + \mu \, cf_{\#uwN(q_i \ldots q_j)}/|C|}{|D| + \mu}$

– If more than one word and the ordered sequence is observed:

$f_O(q_i, \ldots, q_{i+k}, D) = \log \frac{tf_{\#1(q_i \ldots q_{i+k}),D} + \mu \, cf_{\#1(q_i \ldots q_{i+k})}/|C|}{|D| + \mu}$

• Here $tf_{q_i,D}$ is the term frequency of $q_i$ in $D$; $tf_{\#uwN(\cdot),D}$ is the count of the terms appearing unordered within a window of $N$ terms; $tf_{\#1(\cdot),D}$ is the count of exact ordered occurrences of the sequence in $D$; $cf$ is the corresponding collection frequency, $|C|$ the collection length, $|D|$ the document length, and $\mu$ the Dirichlet smoothing parameter
• All log scale!
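A minimal sketch of these features, assuming toy in-memory statistics (the window-matching semantics here are simplified relative to a real Indri-style #uwN operator):

import math

def f_T(tf, doc_len, cf, coll_len, mu=2500.0):
    # Dirichlet-smoothed, log-scale single-term feature.
    return math.log((tf + mu * cf / coll_len) / (doc_len + mu))

def ordered_count(doc, phrase):
    # tf_#1(phrase): exact, contiguous, in-order occurrences.
    k = len(phrase)
    return sum(doc[i:i + k] == phrase for i in range(len(doc) - k + 1))

def unordered_count(doc, terms, window):
    # tf_#uwN(terms): sliding windows of `window` tokens that contain
    # all of the terms in any order (overlapping windows each count).
    need = set(terms)
    return sum(need <= set(doc[i:i + window])
               for i in range(len(doc) - window + 1))

doc = "a markov random field model for term dependencies".split()
print(ordered_count(doc, ["markov", "random"]))      # 1
print(unordered_count(doc, ["field", "markov"], 4))  # 2 (overlapping windows)
print(f_T(tf=2, doc_len=len(doc), cf=500, coll_len=10**7))

The multi-term features $f_O$ and $f_U$ then plug these window counts into the same Dirichlet-smoothed log form as $f_T$.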


Parameter Training

• Don’t use maximum likelihood estimation. Why?
– The sample space is extremely large compared to the training data
– An MLE estimate would be unlikely to be accurate
• Instead, let’s directly maximize our accuracy metric (mean average precision)
– And let’s say: $\lambda_T + \lambda_O + \lambda_U = 1$, where these are the weights on the term, ordered-window, and unordered-window features


Parameter Training

• What optimization technique do we use?
– Via a parameter sweep, the authors found a shape common to the metric surface across collections
– Given that well-behaved shape, a hill-climbing search should work well (see the sketch below)
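A minimal sketch of such a search, assuming a mean_average_precision callback (hypothetical here; in the paper it would run retrieval over TREC topics and relevance judgments) and the constraint that the weights sum to one:

def hill_climb(mean_average_precision, step=0.05):
    # Greedy hill climb on the simplex lam_T + lam_O + lam_U = 1:
    # keep shifting `step` of weight between two components while the
    # metric improves.
    lam = (1.0, 0.0, 0.0)
    best = mean_average_precision(lam)
    moves = [(i, j) for i in range(3) for j in range(3) if i != j]
    improved = True
    while improved:
        improved = False
        for i, j in moves:
            cand = list(lam)
            cand[i] -= step
            cand[j] += step
            if cand[i] < -1e-9:
                continue  # would leave the simplex
            score = mean_average_precision(tuple(cand))
            if score > best:
                lam, best, improved = tuple(cand), score, True
                break
    return lam, best

# Toy single-peaked surrogate surface (a stand-in for real MAP), peaked
# at (0.85, 0.10, 0.05) -- the sequential-dependence weights the paper
# reports working well.
toy_map = lambda l: -sum((a - b) ** 2 for a, b in zip(l, (0.85, 0.10, 0.05)))
print(hill_climb(toy_map))  # approximately ((0.85, 0.10, 0.05), 0.0)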


Results

• Did MRFs help?
– I’d say so. Significant gains across data sets


[Table: retrieval results for the independent, sequential dependence, and full dependence variants across test collections; from Metzler and Croft ’05]


Future Work

• Query expansion
– If we know a document relates to a query we just received, how can we expand the query?
• Statistical techniques to indicate which terms should be declared “dependent” (see the sketch below)
– Perhaps based on the expected mutual information measure
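As a rough sketch of that last idea (my own illustration; the paper leaves this as future work): estimate the expected mutual information between two terms from document co-occurrence counts, and connect only high-scoring pairs with an edge in the MRF:

import math

def emim(n_ab, n_a, n_b, n_docs):
    # Expected mutual information between the binary events "term a
    # appears in a document" and "term b appears in a document",
    # estimated from document counts.
    cells = {
        (1, 1): n_ab,
        (1, 0): n_a - n_ab,
        (0, 1): n_b - n_ab,
        (0, 0): n_docs - n_a - n_b + n_ab,
    }
    total = 0.0
    for (a, b), n in cells.items():
        p = n / n_docs
        pa = (n_a if a else n_docs - n_a) / n_docs
        pb = (n_b if b else n_docs - n_b) / n_docs
        if p > 0:
            total += p * math.log(p / (pa * pb))
    return total

# A pair that co-occurs far more often than chance scores much higher:
print(emim(n_ab=80, n_a=100, n_b=120, n_docs=10_000))  # strongly dependent
print(emim(n_ab=1, n_a=100, n_b=120, n_docs=10_000))   # near independence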


Conclusion

1. Motivation behind the work
2. Background – What is a Markov Random Field (MRF)?
3. Research Insight – How did the authors use an MRF to model term dependencies? Results?
4. Future Work – If you thought this was interesting, how could you build on this?
5. Conclusion


Questions?
